Science.gov

Sample records for cis-regulatory motif directs

  1. Computational discovery of cis-regulatory modules in Drosophila without prior knowledge of motifs

    PubMed Central

    Ivan, Andra; Halfon, Marc S; Sinha, Saurabh

    2008-01-01

    We consider the problem of predicting cis-regulatory modules without knowledge of motifs. We formulate this problem in a pragmatic setting, and create over 30 new data sets, using Drosophila modules, to use as a 'benchmark'. We propose two new methods for the problem, and evaluate these, as well as two existing methods, on our benchmark. We find that the challenge of predicting cis-regulatory modules ab initio, without any input of relevant motifs, is a realizable goal. PMID:18226245

  2. Comparative genomics of metabolic capacities of regulons controlled by cis-regulatory RNA motifs in bacteria

    PubMed Central

    2013-01-01

    Background: In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels. An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. Results: A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. Conclusions: The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and

  3. Imogene: identification of motifs and cis-regulatory modules underlying gene co-regulation

    PubMed Central

    Rouault, Hervé; Santolini, Marc; Schweisguth, François; Hakim, Vincent

    2014-01-01

    Cis-regulatory modules (CRMs) and motifs play a central role in tissue- and condition-specific gene expression. Here we present Imogene, an ensemble of statistical tools that we have developed to facilitate their identification and implemented in publicly available software. Starting from a small training set of mammalian or fly CRMs that drive similar gene expression profiles, Imogene determines de novo cis-regulatory motifs that underlie this co-expression. It can then predict on a genome-wide scale other CRMs with a regulatory potential similar to the training set. Imogene bypasses the need for large datasets for statistical analyses by making central use of the information provided by the sequenced genomes of multiple species, based on the developed statistical tools and explicit models for transcription factor binding site evolution. We test Imogene on characterized tissue-specific mouse developmental CRMs. Its ability to identify CRMs with the same specificity based on its de novo created motifs is comparable to that of previously evaluated ‘motif-blind’ methods. We further show, both in flies and in mammals, that Imogene de novo generated motifs are sufficient to discriminate CRMs related to different developmental programs. Notably, purely relying on sequence data, Imogene performs as well in this discrimination task as a previously reported learning algorithm based on Chromatin Immunoprecipitation (ChIP) data for multiple transcription factors at multiple developmental stages. PMID:24682824

  4. On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

    NASA Astrophysics Data System (ADS)

    Tarpine, Ryan; Istrail, Sorin

    The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.

  5. Bacterial regulon modeling and prediction based on systematic cis regulatory motif analyses

    PubMed Central

    Liu, Bingqiang; Zhou, Chuan; Li, Guojun; Zhang, Hanyuan; Zeng, Erliang; Liu, Qi; Ma, Qin

    2016-01-01

    Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying the bacterial global transcriptional regulation network. In this study, we designed a novel co-regulation score between a pair of operons based on accurate operon identification and cis regulatory motif analyses, which can capture their co-regulation relationship much better than other scores. Taking full advantage of this discovery, we developed a new computational framework and built a novel graph model for regulon prediction. This model integrates the motif comparison and clustering and makes the regulon prediction problem substantially more solvable and accurate. To evaluate our prediction, a regulon coverage score was designed based on the documented regulons and their overlap with our prediction; and a modified Fisher Exact test was implemented to measure how well our predictions match the co-expressed modules derived from E. coli microarray gene-expression datasets collected under 466 conditions. The results indicate that our program consistently performed better than others in terms of the prediction accuracy. This suggests that our algorithms substantially improve the state-of-the-art, leading to a computational capability to reliably predict regulons for any bacteria. PMID:26975728
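    The overlap evaluation mentioned in this abstract can be illustrated with an ordinary one-sided Fisher exact test. The sketch below is the standard, unmodified test (a hypergeometric tail), not the authors' modified version; the function name, argument names, and example counts are hypothetical.

```python
from math import comb

def fisher_exact_enrichment(overlap, regulon_size, module_size, universe_size):
    """One-sided Fisher exact p-value for set overlap.

    Returns P(X >= overlap), where X is the number of regulon members
    drawn when module_size items are sampled without replacement from
    universe_size items, of which regulon_size belong to the regulon.
    All names here are illustrative assumptions, not the paper's API.
    """
    max_overlap = min(regulon_size, module_size)
    total = comb(universe_size, module_size)
    tail = sum(
        comb(regulon_size, k) * comb(universe_size - regulon_size, module_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# Hypothetical example: a 3-member predicted regulon that lies entirely
# inside a 3-member co-expression module, in a universe of 10 operons.
p_value = fisher_exact_enrichment(3, 3, 3, 10)
```

    A small p-value means the predicted regulon overlaps the co-expressed module more than expected by chance under this simple null model.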

  6. Bacterial regulon modeling and prediction based on systematic cis regulatory motif analyses.

    PubMed

    Liu, Bingqiang; Zhou, Chuan; Li, Guojun; Zhang, Hanyuan; Zeng, Erliang; Liu, Qi; Ma, Qin

    2016-01-01

  7. Bacterial regulon modeling and prediction based on systematic cis regulatory motif analyses

    NASA Astrophysics Data System (ADS)

    Liu, Bingqiang; Zhou, Chuan; Li, Guojun; Zhang, Hanyuan; Zeng, Erliang; Liu, Qi; Ma, Qin

    2016-03-01

  8. Mutagenesis of GATA motifs controlling the endoderm regulator elt-2 reveals distinct dominant and secondary cis-regulatory elements.

    PubMed

    Du, Lawrence; Tracy, Sharon; Rifkin, Scott A

    2016-04-01

    Cis-regulatory elements (CREs) are crucial links in developmental gene regulatory networks, but in many cases, it can be difficult to discern whether similar CREs are functionally equivalent. We found that despite similar conservation and binding capability to upstream activators, different GATA cis-regulatory motifs within the promoter of the C. elegans endoderm regulator elt-2 play distinctive roles in activating and modulating gene expression throughout development. We fused wild-type and mutant versions of the elt-2 promoter to a gfp reporter and inserted these constructs as single copies into the C. elegans genome. We then counted early embryonic gfp transcripts using single-molecule RNA FISH (smFISH) and quantified gut GFP fluorescence. We determined that a single primary dominant GATA motif located 527bp upstream of the elt-2 start codon was necessary for both embryonic activation and later maintenance of transcription, while nearby secondary GATA motifs played largely subtle roles in modulating postembryonic levels of elt-2. Mutation of the primary activating site increased low-level spatiotemporally ectopic stochastic transcription, indicating that this site acts repressively in non-endoderm cells. Our results reveal that CREs with similar GATA factor binding affinities in close proximity can play very divergent context-dependent roles in regulating the expression of a developmentally critical gene in vivo. PMID:26896592

  9. Predicting tissue specific cis-regulatory modules in the human genome using pairs of co-occurring motifs

    PubMed Central

    2012-01-01

    Background: Researchers seeking to unlock the genetic basis of human physiology and diseases have been studying gene transcription regulation. The temporal and spatial patterns of gene expression are controlled by mainly non-coding elements known as cis-regulatory modules (CRMs) and epigenetic factors. CRMs modulating related genes share the regulatory signature which consists of transcription factor (TF) binding sites (TFBSs). Identifying such CRMs is a challenging problem due to the prohibitive number of sequence sets that need to be analyzed. Results: We formulated the challenge as a supervised classification problem even though experimentally validated CRMs were not required. Our efforts resulted in a software system named CrmMiner. The system mines for CRMs in the vicinity of related genes. CrmMiner requires two sets of sequences: a mixed set and a control set. Sequences in the vicinity of the related genes comprise the mixed set, whereas the control set includes random genomic sequences. CrmMiner assumes that a large percentage of the mixed set is made of background sequences that do not include CRMs. The system identifies pairs of closely located motifs representing vertebrate TFBSs that are enriched in the training mixed set consisting of 50% of the gene loci. In addition, CrmMiner selects a group of the enriched pairs to represent the tissue-specific regulatory signature. The mixed and the control sets are searched for candidate sequences that include any of the selected pairs. Next, an optimal Bayesian classifier is used to distinguish candidates found in the mixed set from their control counterparts. Our study proposes 62 tissue-specific regulatory signatures and putative CRMs for different human tissues and cell types. These signatures consist of assortments of ubiquitously expressed TFs and tissue-specific TFs. Under controlled settings, CrmMiner identified known CRMs in noisy sets up to 1:25 signal-to-noise ratio. CrmMiner was 21-75% more precise than a

  10. Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation

    PubMed Central

    Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P. M.; Zhu, Xin-Guang

    2016-01-01

    Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5′UTR, 3′UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5′UTR, 3′UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. PMID:27436282

  11. Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules

    PubMed Central

    Boeva, Valentina; Clément, Julien; Régnier, Mireille; Roytberg, Mikhail A; Makeev, Vsevolod J

    2007-01-01

    Background: cis-Regulatory modules (CRMs) of eukaryotic genes often contain multiple binding sites for transcription factors. The phenomenon that binding sites form clusters in CRMs is exploited in many algorithms to locate CRMs in a genome. This gives rise to the problem of calculating the statistical significance of the event that multiple sites, recognized by different factors, would be found simultaneously in a text of a fixed length. The main difficulty comes from overlapping occurrences of motifs. So far, no tools have been developed allowing the computation of p-values for simultaneous occurrences of different motifs which can overlap. Results: We developed and implemented an algorithm computing the p-value that s different motifs occur respectively $k_1, \ldots, k_s$ or more times, possibly overlapping, in a random text. Motifs can be represented with a majority of popular motif models, but in all cases, without indels. Zero or first order Markov chains can be adopted as a model for the random text. The computational tool was tested on the set of cis-regulatory modules involved in D. melanogaster early development, for which there exists an annotation of binding sites for transcription factors. Our test allowed us to correctly identify transcription factors cooperatively/competitively binding to DNA. Method: The algorithm that precisely computes the probability of simultaneous motif occurrences is inspired by the Aho-Corasick automaton and employs a prefix tree together with a transition function. The algorithm runs with the $O\bigl(n|\Sigma|(m|\mathcal{H}| + K|\sigma|^{K})\prod_i k_i\bigr)$ time complexity, where n is the length of the text, |Σ| is the alphabet size, m is the maximal motif length, |
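    To make the quantity in this abstract concrete, the sketch below computes the same kind of p-value by brute-force enumeration of every text of length n under a zero-order (independent letters) model. It is only feasible for very short texts and is not the automaton-based algorithm the abstract describes; all function and parameter names are hypothetical.

```python
from itertools import product

def count_overlapping(text, word):
    """Count occurrences of word in text, allowing overlaps."""
    return sum(text.startswith(word, i) for i in range(len(text) - len(word) + 1))

def exact_pvalue(motifs, min_counts, n, base_probs):
    """P(every motif i occurs at least min_counts[i] times) in a random text.

    motifs     : list of motifs, each given as a set of words (no indels)
    min_counts : required number of occurrences per motif
    n          : text length
    base_probs : letter probabilities of the zero-order background model
    Enumerates all len(base_probs)**n texts, so use small n only.
    """
    alphabet = sorted(base_probs)
    p_total = 0.0
    for letters in product(alphabet, repeat=n):
        text = "".join(letters)
        satisfied = all(
            sum(count_overlapping(text, w) for w in words) >= k
            for words, k in zip(motifs, min_counts)
        )
        if satisfied:
            prob = 1.0
            for c in letters:
                prob *= base_probs[c]
            p_total += prob
    return p_total

# Hypothetical uniform background over the DNA alphabet.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
p_two_overlapping = exact_pvalue([{"AA"}], [2], 3, uniform)
```

    In this toy case the only text of length 3 with two (overlapping) occurrences of AA is AAA, which is why overlaps have to be handled explicitly rather than treated as independent events.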

  12. Promoter analysis reveals cis-regulatory motifs associated with the expression of the WRKY transcription factor CrWRKY1 in Catharanthus roseus.

    PubMed

    Yang, Zhirong; Patra, Barunava; Li, Runzhi; Pattanaik, Sitakanta; Yuan, Ling

    2013-12-01

    WRKY transcription factors (TFs) are emerging as an important group of regulators of plant secondary metabolism. However, the cis-regulatory elements associated with their regulation have not been well characterized. We have previously demonstrated that CrWRKY1, a member of subgroup III of the WRKY TF family, regulates biosynthesis of terpenoid indole alkaloids in the ornamental and medicinal plant, Catharanthus roseus. Here, we report the isolation and functional characterization of the CrWRKY1 promoter. In silico analysis of the promoter sequence reveals the presence of several potential TF binding motifs, indicating the involvement of additional TFs in the regulation of the TIA pathway. The CrWRKY1 promoter can drive the expression of a β-glucuronidase (GUS) reporter gene in native (C. roseus protoplasts and transgenic hairy roots) and heterologous (transgenic tobacco seedlings) systems. Analysis of 5'- or 3'-end deletions indicates that the sequence located between positions -140 to -93 bp and -3 to +113 bp, relative to the transcription start site, is critical for promoter activity. Mutation analysis shows that two overlapping as-1 elements and a CT-rich motif contribute significantly to promoter activity. The CrWRKY1 promoter is induced in response to methyl jasmonate (MJ) treatment and the promoter region between -230 and -93 bp contains a putative MJ-responsive element. The CrWRKY1 promoter can potentially be used as a tool to isolate novel TFs involved in the regulation of the TIA pathway. PMID:23979312

  13. In planta analysis of a cis-regulatory cytokinin response motif in Arabidopsis and identification of a novel enhancer sequence.

    PubMed

    Ramireddy, Eswarayya; Brenner, Wolfram G; Pfeifer, Andreas; Heyl, Alexander; Schmülling, Thomas

    2013-07-01

    The phytohormone cytokinin plays a key role in regulating plant growth and development, and is involved in numerous physiological responses to environmental changes. The type-B response regulators, which regulate the transcription of cytokinin response genes, are a part of the cytokinin signaling system. Arabidopsis thaliana encodes 11 type-B response regulators (type-B ARRs), and some of them were shown to bind in vitro to the core cytokinin response motif (CRM) 5'-(A/G)GAT(T/C)-3' or, in the case of ARR1, to an extended motif (ECRM), 5'-AAGAT(T/C)TT-3'. Here we obtained in planta proof for the functionality of the latter motif. Promoter deletion analysis of the primary cytokinin response gene ARR6 showed that a combination of two extended motifs within the promoter is required to mediate the full transcriptional activation by ARR1 and other type-B ARRs. CRMs were found to be over-represented in the vicinity of ECRMs in the promoters of cytokinin-regulated genes, suggesting their functional relevance. Moreover, an evolutionarily conserved 27 bp long T-rich region between -220 and -193 bp was identified and shown to be required for the full activation by type-B ARRs and the response to cytokinin. This novel enhancer is not bound by the DNA-binding domain of ARR1, indicating that additional proteins might be involved in mediating the transcriptional cytokinin response. Furthermore, genome-wide expression profiling identified genes, among them ARR16, whose induction by cytokinin depends on both ARR1 and other specific type-B ARRs. This together with the ECRM/CRM sequence clustering indicates cooperative action of different type-B ARRs for the activation of particular target genes. PMID:23620480
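    The two motifs quoted in this abstract translate directly into regular expressions. The following is a minimal sketch for locating them on both strands of a promoter sequence; the function names and the example sequence are hypothetical, and the scan says nothing about whether a site is functional.

```python
import re

# Core motif 5'-(A/G)GAT(T/C)-3' and extended motif 5'-AAGAT(T/C)TT-3'.
# Lookaheads are used so that overlapping matches are all reported.
CRM = re.compile(r"(?=([AG]GAT[TC]))")
ECRM = re.compile(r"(?=(AAGAT[TC]TT))")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_motifs(seq, pattern):
    """Return sorted (start, strand, matched_sequence) tuples.

    start is the 0-based forward-strand coordinate of the leftmost base
    covered by the site; for '-' hits the matched sequence is reported
    as read on the reverse strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    n = len(seq)
    for m in pattern.finditer(reverse_complement(seq)):
        site = m.group(1)
        hits.append((n - m.start() - len(site), "-", site))
    return sorted(hits)

# Hypothetical promoter fragment containing one extended motif.
promoter = "CCAAGATTTTCC"
ecrm_hits = find_motifs(promoter, ECRM)
crm_hits = find_motifs(promoter, CRM)
```

    Because every extended motif contains a core motif, the core scan reports a hit inside each extended site as well; counting core motifs near extended ones would need to exclude those embedded hits.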

  14. Modeling DNA sequence-based cis-regulatory gene networks.

    PubMed

    Bolouri, Hamid; Davidson, Eric H

    2002-06-01

    Gene network analysis requires computationally based models which represent the functional architecture of regulatory interactions, and which provide directly testable predictions. The type of model that is useful is constrained by the particular features of developmentally active cis-regulatory systems. These systems function by processing diverse regulatory inputs, generating novel regulatory outputs. A computational model which explicitly accommodates this basic concept was developed earlier for the cis-regulatory system of the endo16 gene of the sea urchin. This model represents the genetically mandated logic functions that the system executes, but also shows how time-varying kinetic inputs are processed in different circumstances into particular kinetic outputs. The same basic design features can be utilized to construct models that connect the large number of cis-regulatory elements constituting developmental gene networks. The ultimate aim of the network models discussed here is to represent the regulatory relationships among the genomic control systems of the genes in the network, and to state their functional meaning. The target site sequences of the cis-regulatory elements of these genes constitute the physical basis of the network architecture. Useful models for developmental regulatory networks must represent the genetic logic by which the system operates, but must also be capable of explaining the real time dynamics of cis-regulatory response as kinetic input and output data become available. Most importantly, however, such models must display in a direct and transparent manner fundamental network design features such as intra- and intercellular feedback circuitry; the sources of parallel inputs into each cis-regulatory element; gene battery organization; and use of repressive spatial inputs in specification and boundary formation. Successful network models lead to direct tests of key architectural features by targeted cis-regulatory analysis. PMID

  15. Experimental validation of predicted mammalian erythroid cis-regulatory modules

    PubMed Central

    Wang, Hao; Zhang, Ying; Cheng, Yong; Zhou, Yuepin; King, David C.; Taylor, James; Chiaromonte, Francesca; Kasturi, Jyotsna; Petrykowska, Hanna; Gibb, Brian; Dorman, Christine; Miller, Webb; Dore, Louis C.; Welch, John; Weiss, Mitchell J.; Hardison, Ross C.

    2006-01-01

    Multiple alignments of genome sequences are helpful guides to functional analysis, but predicting cis-regulatory modules (CRMs) accurately from such alignments remains an elusive goal. We predict CRMs for mammalian genes expressed in red blood cells by combining two properties gleaned from aligned, noncoding genome sequences: a positive regulatory potential (RP) score, which detects similarity to patterns in alignments distinctive for regulatory regions, and conservation of a binding site motif for the essential erythroid transcription factor GATA-1. Within eight target loci, we tested 75 noncoding segments by reporter gene assays in transiently transfected human K562 cells and/or after site-directed integration into murine erythroleukemia cells. Segments with a high RP score and a conserved exact match to the binding site consensus are validated at a good rate (50%–100%, with rates increasing at higher RP), whereas segments with lower RP scores or nonconsensus binding motifs tend to be inactive. Active DNA segments were shown to be occupied by GATA-1 protein by chromatin immunoprecipitation, whereas sites predicted to be inactive were not occupied. We verify four previously known erythroid CRMs and identify 28 novel ones. Thus, high RP in combination with another feature of a CRM, such as a conserved transcription factor binding site, is a good predictor of functional CRMs. Genome-wide predictions based on RP and a large set of well-defined transcription factor binding sites are available through servers at http://www.bx.psu.edu/. PMID:17038566
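    The prediction rule described in this abstract (positive RP score plus a conserved exact match to the GATA-1 binding site consensus) can be sketched as a simple filter. The sketch assumes the consensus is WGATAR, which the abstract does not spell out, and it takes precomputed RP scores and equal-length aligned sequences as input; the function names and threshold default are hypothetical.

```python
import re

# Assumed consensus WGATAR (W = A/T, R = A/G) or its reverse complement
# YTATCW, written as a lookahead so overlapping sites are all found.
GATA_SITE = re.compile(r"(?=([AT]GATA[AG]|[CT]TATC[AT]))")

def site_columns(aligned_seq):
    """Alignment columns at which an ungapped consensus match starts."""
    return {m.start() for m in GATA_SITE.finditer(aligned_seq.upper())}

def has_conserved_gata_site(aligned_seqs):
    """True if some column starts a consensus match in every species."""
    shared = None
    for seq in aligned_seqs:
        cols = site_columns(seq)
        shared = cols if shared is None else shared & cols
    return bool(shared)

def predict_crm(rp_score, aligned_seqs, rp_threshold=0.0):
    """Predict a CRM when RP is above threshold and a site is conserved."""
    return rp_score > rp_threshold and has_conserved_gata_site(aligned_seqs)

# Hypothetical two-species alignment with a shared AGATAA site.
example = predict_crm(0.1, ["CCAGATAACC", "GGAGATAAGG"])
```

    A gap character inside a site breaks the match, which corresponds to requiring an exact, aligned match; relaxing that would be a different rule from the one sketched here.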

  16. A method for using direct injection of plasmid DNA to study cis-regulatory element activity in F0 Xenopus embryos and tadpoles.

    PubMed

    Wang, Chen; Szaro, Ben G

    2015-02-01

    The ability to express exogenous reporter genes in intact, externally developing embryos, such as Xenopus, is a powerful tool for characterizing the activity of cis-regulatory gene elements during development. Although methods exist for generating transgenic Xenopus lines, more simplified methods for use with F0 animals would significantly speed the characterization of these elements. We discovered that injecting 2-cell stage embryos with a plasmid bearing a ϕC31 integrase-targeted attB element and two dual β-globin HS4 insulators flanking a reporter transgene in opposite orientations relative to each other yielded persistent expression with sufficiently high penetrance for characterizing the activity of the promoter without having to coinject integrase RNA. Expression began appropriately during development and persisted into swimming tadpole stages without perturbing the expression of the cognate endogenous gene. Coinjected plasmids having the same elements but expressing different reporter proteins were reliably coexpressed within the same cells, providing a useful control for variations in injections between animals. To overcome the high propensity of these plasmids to undergo recombination, we developed a method for generating them using conventional cloning methods and DH5α cells for propagation. We conclude that this method offers a convenient and reliable way to evaluate the activity of cis-regulatory gene elements in the intact F0 embryo. PMID:25448690

  17. The role of cis regulatory evolution in maize domestication.

    PubMed

    Lemmon, Zachary H; Bukowski, Robert; Sun, Qi; Doebley, John F

    2014-11-01

    Gene expression differences between divergent lineages caused by modification of cis regulatory elements are thought to be important in evolution. We assayed genome-wide cis and trans regulatory differences between maize and its wild progenitor, teosinte, using deep RNA sequencing in F1 hybrid and parent inbred lines for three tissue types (ear, leaf and stem). Pervasive regulatory variation was observed with approximately 70% of ∼17,000 genes showing evidence of regulatory divergence between maize and teosinte. However, many fewer genes (1,079 genes) show consistent cis differences with all sampled maize and teosinte lines. For ∼70% of these 1,079 genes, the cis differences are specific to a single tissue. The number of genes with cis regulatory differences is greatest for ear tissue, which underwent a drastic transformation in form during domestication. As expected from the domestication bottleneck, maize possesses less cis regulatory variation than teosinte with this deficit greatest for genes showing maize-teosinte cis regulatory divergence, suggesting selection on cis regulatory differences during domestication. Consistent with selection on cis regulatory elements, genes with cis effects correlated strongly with genes under positive selection during maize domestication and improvement, while genes with trans regulatory effects did not. We observed a directional bias such that genes with cis differences showed higher expression of the maize allele more often than the teosinte allele, suggesting domestication favored up-regulation of gene expression. Finally, this work documents the cis and trans regulatory changes between maize and teosinte in over 17,000 genes for three tissues. PMID:25375861
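    The F1-hybrid design described in this abstract is commonly analysed by comparing the expression ratio between parents with the allele-specific ratio inside the hybrid, where both alleles share the same trans environment. The sketch below shows that arithmetic for one gene; it assumes already normalized read counts, omits the statistical testing a real analysis needs, and uses hypothetical names throughout. It is not taken from the paper's methods.

```python
from math import log2

def cis_trans_effects(parent_maize, parent_teosinte,
                      f1_maize_allele, f1_teosinte_allele,
                      pseudocount=1.0):
    """Split a parental expression difference into cis and trans parts.

    parental : log2 ratio of maize to teosinte expression in the parents
    cis      : log2 ratio of the two alleles within the F1 hybrid
    trans    : the part of the parental difference not explained by cis
    pseudocount guards against zero counts (an illustrative choice).
    """
    parental = log2((parent_maize + pseudocount) / (parent_teosinte + pseudocount))
    cis = log2((f1_maize_allele + pseudocount) / (f1_teosinte_allele + pseudocount))
    return {"parental": parental, "cis": cis, "trans": parental - cis}

# Hypothetical counts: a 4-fold parental difference, half of which
# (on the log scale) persists between alleles in the hybrid.
effects = cis_trans_effects(400, 100, 200, 100, pseudocount=0.0)
```

    A positive cis value corresponds to higher expression of the maize allele, the direction of bias the abstract reports for genes with cis differences.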

  18. Cis-regulatory mutations in human disease

    PubMed Central

    2009-01-01

    Cis-acting regulatory sequences are required for the proper temporal and spatial control of gene expression. Variation in gene expression is highly heritable and a significant determinant of human disease susceptibility. The diversity of human genetic diseases attributed, in whole or in part, to mutations in non-coding regulatory sequences is on the rise. Improvements in genome-wide methods of associating genetic variation with human disease and predicting DNA with cis-regulatory potential are two of the major reasons for these recent advances. This review will highlight select examples from the literature that have successfully integrated genetic and genomic approaches to uncover the molecular basis by which cis-regulatory mutations alter gene expression and contribute to human disease. The fine mapping of disease-causing variants has led to the discovery of novel cis-acting regulatory elements that, in some instances, are located as far away as 1.5 Mb from the target gene. In other cases, the prior knowledge of the regulatory landscape surrounding the gene of interest aided in the selection of enhancers for mutation screening. The success of these studies should provide a framework for following up on the large number of genome-wide association studies that have identified common variants in non-coding regions of the genome that associate with increased risk of human diseases including diabetes, autism, Crohn's disease, colorectal cancer, and asthma, to name a few. PMID:19641089

  19. Characterization of Putative cis-Regulatory Elements in Genes Preferentially Expressed in Arabidopsis Male Meiocytes

    PubMed Central

    Li, Mingjun

    2014-01-01

    Meiosis is essential for plant reproduction because it is the process during which homologous chromosome pairing, synapsis, and meiotic recombination occur. The meiotic transcriptome is difficult to investigate because of the size of meiocytes and the confines of anther lobes. The recent development of isolation techniques has enabled the characterization of transcriptional profiles in male meiocytes of Arabidopsis. Gene expression in male meiocytes shows unique features. The direct interaction of transcription factors (TFs) with DNA regulatory sequences forms the basis for the specificity of transcriptional regulation. Here, we identified putative cis-regulatory elements (CREs) associated with male meiocyte-expressed genes using in silico tools. The upstream regions (1 kb) of the top 50 genes preferentially expressed in Arabidopsis meiocytes possessed conserved motifs. These motifs are putative binding sites of TFs, some of which share common functions, such as roles in cell division. In combination with cell-type-specific analysis, our findings could be a substantial aid for the identification and experimental verification of the protein-DNA interactions for the specific TFs that drive gene expression in meiocytes. PMID:25250331

  20. Discovering cis-regulatory RNAs in Shewanella genomes by Support Vector Machines.

    PubMed

    Xu, Xing; Ji, Yongmei; Stormo, Gary D

    2009-04-01

    An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our exploration of gene regulation mechanisms and regulatory networks. We present a new computational program named RSSVM (RNA Sampler+Support Vector Machine), which employs Support Vector Machines (SVMs) for efficient identification of functional RNA motifs from random RNA secondary structures. RSSVM uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate common RNA secondary structure prediction, and is trained with functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast, and efficient pipeline for large-scale discovery of regulatory RNA motifs. We applied RSSVM to multiple Shewanella genomes and identified putative regulatory RNA motifs in the 5' untranslated regions (UTRs) in S. oneidensis, an important bacterial organism with extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. From 1002 sets of 5'-UTRs of orthologous operons, we identified 166 putative regulatory RNA motifs, including 17 of the 19 known RNA motifs from Rfam, an additional 21 RNA motifs that are supported by literature evidence, 72 RNA motifs overlapping predicted transcription terminators or attenuators, and other candidate regulatory RNA motifs

  1. Validation of Skeletal Muscle cis-Regulatory Module Predictions Reveals Nucleotide Composition Bias in Functional Enhancers

    PubMed Central

    Kwon, Andrew T.; Chou, Alice Yi; Arenillas, David J.; Wasserman, Wyeth W.

    2011-01-01

    We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs were validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences identifies nucleotide sequence composition as an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions. PMID:22144875

  2. Global identification of the genetic networks and cis-regulatory elements of the cold response in zebrafish

    PubMed Central

    Hu, Peng; Liu, Mingli; Zhang, Dong; Wang, Jinfeng; Niu, Hongbo; Liu, Yimeng; Wu, Zhichao; Han, Bingshe; Zhai, Wanying; Shen, Yu; Chen, Liangbiao

    2015-01-01

    The transcriptional programs of ectothermic teleosts are directly influenced by water temperature. However, the cis- and trans-factors governing cold responses are not well characterized. We profiled transcriptional changes in eight zebrafish tissues exposed to mildly and severely cold temperatures using RNA-Seq. A total of 1943 differentially expressed genes (DEGs) were identified, from which 34 clusters representing distinct tissue and temperature response expression patterns were derived using the k-means fuzzy clustering algorithm. The promoter regions of the clustered DEGs that demonstrated strong co-regulation were analysed for enriched cis-regulatory elements with a motif discovery program, DREME. Seventeen motifs, ten known and seven novel, were identified, which covered 23% of the DEGs. Two motifs predicted to be the binding sites for the transcription factors Bcl6 and Jun, respectively, were chosen for experimental verification, and they demonstrated the expected cold-induced and cold-repressed patterns of gene regulation. Protein interaction modeling of the network components followed by experimental validation suggested that Jun physically interacts with Bcl6 and might be a hub factor that orchestrates the cold response in zebrafish. Thus, the methodology used and the regulatory networks uncovered in this study provide a foundation for exploring the mechanisms of cold adaptation in teleosts. PMID:26227973

  3. A genome-wide cis-regulatory element discovery method based on promoter sequences and gene co-expression networks

    PubMed Central

    2013-01-01

    Background Deciphering cis-regulatory networks has become an attractive yet challenging task. This paper presents a simple method for cis-regulatory network discovery which aims to avoid some of the common problems of previous approaches. Results Using promoter sequences and gene expression profiles as input, rather than clustering the genes by the expression data, our method utilizes co-expression neighborhood information for each individual gene, thereby overcoming the disadvantages of current clustering-based models, which may miss specific information for individual genes. In addition, rather than using a motif database as an input, it implements a simple motif count table for each enumerated k-mer for each gene promoter sequence. Thus, it can be used for species for which no prior knowledge of cis-regulatory motifs is available and has the potential to discover new transcription factor binding sites. Applications to Saccharomyces cerevisiae and Arabidopsis have shown that our method has good prediction accuracy and outperforms a phylogenetic footprinting approach. Furthermore, the top-ranked gene-motif regulatory clusters are evidently functionally co-regulated, and the regulatory relationships between the motifs and the enriched biological functions can often be confirmed by the literature. Conclusions Since this method is simple and gene-specific, it can be readily utilized for insufficiently studied species or flexibly used as an additional step or data source for previous transcription regulatory network discovery models. PMID:23368633
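
    The "motif count table for each enumerated k-mer for each gene promoter sequence" mentioned in this abstract can be pictured with a short sketch. This is only an illustration, not the authors' implementation: the function name kmer_count_table, the choice k=4, the gene names, and the toy sequences are all assumptions, and the co-expression neighborhood step is not shown.

```python
from collections import Counter


def kmer_count_table(promoters, k=6):
    """Count every k-mer in each promoter sequence.

    promoters: dict mapping a gene name to its promoter sequence.
    Returns a dict mapping each gene to a Counter of k-mer counts.
    Windows containing anything other than A, C, G, T are skipped.
    """
    alphabet = set("ACGT")
    table = {}
    for gene, seq in promoters.items():
        seq = seq.upper()
        counts = Counter()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= alphabet:
                counts[kmer] += 1
        table[gene] = counts
    return table


# Hypothetical toy promoters, far shorter than real upstream regions.
promoters = {"geneA": "ACGTACGTAC", "geneB": "TTTTACGTNN"}
table = kmer_count_table(promoters, k=4)
```

    Each row of such a table is a gene and each column a k-mer, so no motif database is needed as input; only the forward strand is counted here, which is a simplification.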

  4. A Cis-Regulatory Map of the Drosophila Genome

    PubMed Central

    Nègre, Nicolas; Brown, Christopher D.; Ma, Lijia; Bristow, Christopher Aaron; Miller, Steven W.; Wagner, Ulrich; Kheradpour, Pouya; Eaton, Matthew L.; Loriaux, Paul; Sealfon, Rachel; Li, Zirong; Ishii, Haruhiko; Spokony, Rebecca F.; Chen, Jia; Hwang, Lindsay; Cheng, Chao; Auburn, Richard P.; Davis, Melissa B.; Domanus, Marc; Shah, Parantu K.; Morrison, Carolyn A.; Zieba, Jennifer; Suchy, Sarah; Senderowicz, Lionel; Victorsen, Alec; Bild, Nicholas A.; Grundstad, A. Jason; Hanley, David; MacAlpine, David M.; Mannervik, Mattias; Venken, Koen; Bellen, Hugo; White, Robert; Russell, Steven; Grossman, Robert L.; Ren, Bing; Gerstein, Mark; Posakony, James W.; Kellis, Manolis; White, Kevin P.

    2011-01-01

    Systematic annotation of gene regulatory elements is a major challenge in genome science. Direct mapping of chromatin modification marks and transcriptional factor binding sites genome-wide [1,2] has successfully identified specific subtypes of regulatory elements [3]. In Drosophila several pioneering studies have provided genome-wide identification of Polycomb-Response Elements [4], chromatin states [5], transcription factor binding sites (TFBS) [6–9], PolII regulation [8], and insulator elements [10]; however, comprehensive annotation of the regulatory genome remains a significant challenge. Here we describe results from the modENCODE cis-regulatory annotation project. We produced a map of the Drosophila melanogaster regulatory genome based on more than 300 chromatin immuno-precipitation (ChIP) datasets for eight chromatin features, five histone deacetylases (HDACs) and thirty-eight site-specific transcription factors (TFs) at different stages of development. Using these data we inferred more than 20,000 candidate regulatory elements and we validated a subset of predictions for promoters, enhancers, and insulators in vivo. We also identified nearly 2,000 genomic regions of dense TF binding associated with chromatin activity and accessibility. We discovered hundreds of new TF co-binding relationships and defined a TF network with over 800 potential regulatory relationships. PMID:21430782

  5. Abundant raw material for cis-regulatory evolution in humans

    NASA Technical Reports Server (NTRS)

    Rockman, Matthew V.; Wray, Gregory A.

    2002-01-01

    Changes in gene expression and regulation--due in particular to the evolution of cis-regulatory DNA sequences--may underlie many evolutionary changes in phenotypes, yet little is known about the distribution of such variation in populations. We present in this study the first survey of experimentally validated functional cis-regulatory polymorphism. These data are derived from more than 140 polymorphisms involved in the regulation of 107 genes in Homo sapiens, the eukaryote species with the most available data. We find that functional cis-regulatory variation is widespread in the human genome and that the consequent variation in gene expression is twofold or greater for 63% of the genes surveyed. Transcription factor-DNA interactions are highly polymorphic, and regulatory interactions have been gained and lost within human populations. On average, humans are heterozygous at more functional cis-regulatory sites (>16,000) than at amino acid positions (<13,000), in part because of an overrepresentation among the former in multiallelic tandem repeat variation, especially (AC)(n) dinucleotide microsatellites. The role of microsatellites in gene expression variation may provide a larger store of heritable phenotypic variation, and a more rapid mutational input of such variation, than has been realized. Finally, we outline the distinctive consequences of cis-regulatory variation for the genotype-phenotype relationship, including ubiquitous epistasis and genotype-by-environment interactions, as well as underappreciated modes of pleiotropy and overdominance. Ordinary small-scale mutations contribute to pervasive variation in transcription rates and consequently to patterns of human phenotypic variation.

  6. A Computational Pipeline for High-Throughput Discovery of cis-Regulatory Noncoding RNA in Prokaryotes

    PubMed Central

    Yao, Zizhen; Barrick, Jeffrey; Weinberg, Zasha; Neph, Shane; Breaker, Ronald; Tompa, Martin; Ruzzo, Walter L

    2007-01-01

    Noncoding RNAs (ncRNAs) are important functional RNAs that do not code for proteins. We present a highly efficient computational pipeline for discovering cis-regulatory ncRNA motifs de novo. The pipeline differs from previous methods in that it is structure-oriented, does not require a multiple-sequence alignment as input, and is capable of detecting RNA motifs with low sequence conservation. We also integrate RNA motif prediction with RNA homolog search, which improves the quality of the RNA motifs significantly. Here, we report the results of applying this pipeline to Firmicute bacteria. Our top-ranking motifs include most known Firmicute elements found in the RNA family database (Rfam). Comparing our motif models with Rfam's hand-curated motif models, we achieve high accuracy in both membership prediction and base-pair–level secondary structure prediction (at least 75% average sensitivity and specificity on both tasks). Of the ncRNA candidates not in Rfam, we find compelling evidence that some of them are functional, and analyze several potential ribosomal protein leaders in depth. PMID:17616982

  7. Cis-regulatory architecture of a brain signaling center predates the origin of chordates.

    PubMed

    Yao, Yao; Minor, Paul J; Zhao, Ying-Tao; Jeong, Yongsu; Pani, Ariel M; King, Anna N; Symmons, Orsolya; Gan, Lin; Cardoso, Wellington V; Spitz, François; Lowe, Christopher J; Epstein, Douglas J

    2016-05-01

    Genomic approaches have predicted hundreds of thousands of tissue-specific cis-regulatory sequences, but the determinants critical to their function and evolutionary history are mostly unknown. Here we systematically decode a set of brain enhancers active in the zona limitans intrathalamica (zli), a signaling center essential for vertebrate forebrain development via the secreted morphogen Sonic hedgehog (Shh). We apply a de novo motif analysis tool to identify six position-independent sequence motifs together with their cognate transcription factors that are essential for zli enhancer activity and Shh expression in the mouse embryo. Using knowledge of this regulatory lexicon, we discover new Shh zli enhancers in mice and a functionally equivalent element in hemichordates, indicating an ancient origin of the Shh zli regulatory network that predates the chordate phylum. These findings support a strategy for delineating functionally conserved enhancers in the absence of overt sequence homologies and over extensive evolutionary distances. PMID:27064252

  8. Genome-wide de novo prediction of cis-regulatory binding sites in prokaryotes

    PubMed Central

    Zhang, Shaoqiang; Xu, Minli; Su, Zhengchang

    2009-01-01

    Although cis-regulatory binding sites (CRBSs) are at least as important as the coding sequences in a genome, our general understanding of them in most sequenced genomes is very limited due to the lack of efficient and accurate experimental and computational methods for their characterization, which has largely hindered our understanding of many important biological processes. In this article, we describe a novel algorithm for genome-wide de novo prediction of CRBSs with high accuracy. We designed our algorithm to circumvent three identified difficulties for CRBS prediction using comparative genomics principles based on a new method for the selection of reference genomes, a new metric for measuring the similarity of CRBSs, and a new graph clustering procedure. When operon structures are correctly predicted, our algorithm can predict 81% of known individual binding sites belonging to 94% of known cis-regulatory motifs in the Escherichia coli K12 genome, while achieving high prediction specificity. Our algorithm has also achieved similar prediction accuracy in the Bacillus subtilis genome, suggesting that it is very robust, and thus can be applied to any other sequenced prokaryotic genome. When compared with the prior state-of-the-art algorithms, our algorithm outperforms them in both prediction sensitivity and specificity. PMID:19383880

  9. CREME: Cis-Regulatory Module Explorer for the Human Genome

    SciTech Connect

    Loots, G G; Sharan, R; Ovcharenko, I; Ben-Hur, A

    2004-02-11

    The binding of transcription factors to specific regulatory sequence elements is a primary mechanism for controlling gene transcription. Eukaryotic genes are often regulated by several transcription factors, whose binding sites are tightly clustered and form cis-regulatory modules. In this paper we present a web-server, CREME, for identifying and visualizing cis-regulatory modules in the promoter regions of a given set of potentially co-regulated genes. CREME relies on a database of putative transcription factor binding sites that have been annotated across the human genome using a library of position weight matrices and evolutionary conservation with the mouse and rat genomes. A search algorithm is applied to this dataset to identify combinations of transcription factors whose binding sites tend to co-occur in close proximity in the promoter regions of the input gene set. The identified cis-regulatory modules are statistically scored and significant combinations are reported and graphically visualized. Our web-server is available at http://creme.dcode.org/.

  10. cis-Regulatory control of the initial neurogenic pattern of onecut gene expression in the sea urchin embryo.

    PubMed

    Barsi, Julius C; Davidson, Eric H

    2016-01-01

    Specification of the ciliated band (CB) of echinoid embryos executes three spatial functions essential for postgastrular organization. These are establishment of a band about 5 cells wide which delimits and bounds other embryonic territories; definition of a neurogenic domain within this band; and generation within it of arrays of ciliary cells that bear the special long cilia from which the structure derives its name. In Strongylocentrotus purpuratus the spatial coordinates of the future ciliated band are initially and exactly determined by the disposition of a ring of cells that transcriptionally activate the onecut homeodomain regulatory gene, beginning in blastula stage, long before the appearance of the CB per se. Thus the cis-regulatory apparatus that governs onecut expression in the blastula directly reveals the genomic sequence code by which these aspects of the spatial organization of the embryo are initially determined. We screened the entire onecut locus and its flanking region for transcriptionally active cis-regulatory elements, and by means of BAC recombineered deletions identified three separated and required cis-regulatory modules that execute different functions. The operating logic of the crucial spatial control module accounting for the spectacularly precise and beautiful early onecut expression domain depends on spatial repression. Previously predicted oral ectoderm and aboral ectoderm repressors were identified by cis-regulatory mutation as the products of goosecoid and irxa genes respectively, while the pan-ectodermal activator SoxB1 supplies a transcriptional driver function. PMID:26522848

  11. A primer on regression methods for decoding cis-regulatory logic

    SciTech Connect

    Das, Debopriya; Pellegrini, Matteo; Gray, Joe W.

    2009-03-03

    The rapidly emerging field of systems biology is helping us to understand the molecular determinants of phenotype on a genomic scale [1]. Cis-regulatory elements are major sequence-based determinants of biological processes in cells and tissues [2]. For instance, during transcriptional regulation, transcription factors (TFs) bind to very specific regions on the promoter DNA [2,3] and recruit the basal transcriptional machinery, which ultimately initiates mRNA transcription (Figure 1A). Learning cis-Regulatory Elements from Omics Data A vast amount of work over the past decade has shown that omics data can be used to learn cis-regulatory logic on a genome-wide scale [4-6]--in particular, by integrating sequence data with mRNA expression profiles. The most popular approach has been to identify over-represented motifs in promoters of genes that are coexpressed [4,7,8]. Though widely used, such an approach can be limiting for a variety of reasons. First, the combinatorial nature of gene regulation is difficult to explicitly model in this framework. Moreover, in many applications of this approach, expression data from multiple conditions are necessary to obtain reliable predictions. This can potentially limit the use of this method to only large data sets [9]. Although these methods can be adapted to analyze mRNA expression data from a pair of biological conditions, such comparisons are often confounded by the fact that primary and secondary response genes are clustered together--whereas only the primary response genes are expected to contain the functional motifs [10]. A set of approaches based on regression has been developed to overcome the above limitations [11-32]. These approaches have their foundations in certain biophysical aspects of gene regulation [26,33-35]. That is, the models are motivated by the expected transcriptional response of genes due to the binding of TFs to their promoters. While such methods have gathered popularity in the computational domain
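
    As a rough illustration of the regression idea described in this record, the sketch below fits expression as a linear function of the number of occurrences of one motif in each promoter. It is a minimal sketch under stated assumptions, not any specific published method: the single-motif linear model, the function name fit_motif_regression, and the toy numbers are all hypothetical.

```python
def fit_motif_regression(motif_counts, expression):
    """Ordinary least squares fit of expression = intercept + slope * count.

    motif_counts: motif occurrences per promoter (one value per gene).
    expression:   expression measurements for the same genes, same order.
    Returns (intercept, slope).
    """
    n = len(motif_counts)
    mean_x = sum(motif_counts) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in motif_counts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(motif_counts, expression))
    slope = sxy / sxx  # assumes the counts are not all identical
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: four genes, motif counts and log expression ratios.
counts = [0, 1, 2, 3]
log_expr = [0.5, 1.5, 2.5, 3.5]
intercept, slope = fit_motif_regression(counts, log_expr)
```

    In this picture a non-zero slope suggests that the motif contributes to the expression response; extending the fit to several motifs at once is what would let a regression model capture combinatorial regulation.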

  12. Epistatic Interactions in the Arabinose Cis-Regulatory Element

    PubMed Central

    Lagator, Mato; Igler, Claudia; Moreno, Anaísa B.; Guet, Călin C.; Bollback, Jonathan P.

    2016-01-01

    Changes in gene expression are an important mode of evolution; however, the proximate mechanism of these changes is poorly understood. In particular, little is known about the effects of mutations within cis binding sites for transcription factors, or the nature of epistatic interactions between these mutations. Here, we tested the effects of single and double mutants in two cis binding sites involved in the transcriptional regulation of the Escherichia coli araBAD operon, a component of arabinose metabolism, using a synthetic system. This system decouples transcriptional control from any posttranslational effects on fitness, allowing a precise estimate of the effect of single and double mutations, and hence epistasis, on gene expression. We found that epistatic interactions between mutations in the araBAD cis-regulatory element are common, and that the predominant form of epistasis is negative. The magnitude of the interactions depended on whether the mutations are located in the same or in different operator sites. Importantly, these epistatic interactions were dependent on the presence of arabinose, a native inducer of the araBAD operon in vivo, with some interactions changing in sign (e.g., from negative to positive) in its presence. This study thus reveals that mutations in even relatively simple cis-regulatory elements interact in complex ways such that selection on the level of gene expression in one environment might perturb regulation in the other environment in an unpredictable and uncorrelated manner. PMID:26589997

  13. Epistatic Interactions in the Arabinose Cis-Regulatory Element.

    PubMed

    Lagator, Mato; Igler, Claudia; Moreno, Anaísa B; Guet, Călin C; Bollback, Jonathan P

    2016-03-01

    Changes in gene expression are an important mode of evolution; however, the proximate mechanism of these changes is poorly understood. In particular, little is known about the effects of mutations within cis binding sites for transcription factors, or the nature of epistatic interactions between these mutations. Here, we tested the effects of single and double mutants in two cis binding sites involved in the transcriptional regulation of the Escherichia coli araBAD operon, a component of arabinose metabolism, using a synthetic system. This system decouples transcriptional control from any posttranslational effects on fitness, allowing a precise estimate of the effect of single and double mutations, and hence epistasis, on gene expression. We found that epistatic interactions between mutations in the araBAD cis-regulatory element are common, and that the predominant form of epistasis is negative. The magnitude of the interactions depended on whether the mutations are located in the same or in different operator sites. Importantly, these epistatic interactions were dependent on the presence of arabinose, a native inducer of the araBAD operon in vivo, with some interactions changing in sign (e.g., from negative to positive) in its presence. This study thus reveals that mutations in even relatively simple cis-regulatory elements interact in complex ways such that selection on the level of gene expression in one environment might perturb regulation in the other environment in an unpredictable and uncorrelated manner. PMID:26589997

  14. Evolution of lineage-specific functions in ancient cis-regulatory modules.

    PubMed

    Pauls, Stefan; Goode, Debbie K; Petrone, Libero; Oliveri, Paola; Elgar, Greg

    2015-11-01

    Morphological evolution is driven both by coding sequence variation and by changes in regulatory sequences. However, how cis-regulatory modules (CRMs) evolve to generate entirely novel expression domains is largely unknown. Here, we reconstruct the evolutionary history of a lens enhancer located within a CRM that not only predates the lens, a vertebrate innovation, but bilaterian animals in general. Alignments of orthologous sequences from different deuterostomes sub-divide the CRM into a deeply conserved core and a more divergent flanking region. We demonstrate that all deuterostome flanking regions, including invertebrate sequences, activate gene expression in the zebrafish lens through the same ancient cluster of activator sites. However, levels of gene expression vary between species due to the presence of repressor motifs in flanking region and core. These repressor motifs are responsible for the relatively weak enhancer activity of tetrapod flanking regions. Ray-finned fish, however, have gained two additional lineage-specific activator motifs which in combination with the ancient cluster of activators and the core constitute a potent lens enhancer. The exploitation and modification of existing regulatory potential in flanking regions but not in the highly conserved core might represent a more general model for the emergence of novel regulatory functions in complex CRMs. PMID:26538567

  15. The Hematopoietic Stem and Progenitor Cell Cistrome: GATA Factor-Dependent cis-Regulatory Mechanisms.

    PubMed

    Hewitt, K J; Johnson, K D; Gao, X; Keles, S; Bresnick, E H

    2016-01-01

    Transcriptional regulators mediate the genesis and function of the hematopoietic system by binding complex ensembles of cis-regulatory elements to establish genetic networks. While thousands to millions of any given cis-element reside in a genome, how transcriptional regulators select these sites and how site attributes dictate functional output is not well understood. An instructive system to address this problem involves the GATA family of transcription factors that control vital developmental and physiological processes and are linked to multiple human pathologies. Although GATA factors bind DNA motifs harboring the sequence GATA, only a very small subset of these abundant motifs are occupied in genomes. Mechanistic studies revealed a unique configuration of a GATA factor-regulated cis-element consisting of an E-box and a downstream GATA motif separated by a short DNA spacer. GATA-1- or GATA-2-containing multiprotein complexes at these composite elements control transcription of genes critical for hematopoietic stem cell emergence in the mammalian embryo, hematopoietic progenitor cell regulation, and erythroid cell maturation. Other constituents of the complex include the basic helix-loop-helix transcription factor Scl/TAL1, its heterodimeric partner E2A, and the Lim domain proteins LMO2 and LDB1. This chapter reviews the structure/function of E-box-GATA composite cis-elements, which collectively constitute an important sector of the hematopoietic stem and progenitor cell cistrome. PMID:27137654
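
    The composite element described above (an E-box, a short spacer, then a downstream GATA motif) lends itself to a simple pattern scan. The sketch below is illustrative only: it uses the canonical E-box consensus CANNTG and the bare GATA core, and the spacer range of 7 to 9 bp is an assumed parameter, since the abstract says only "short DNA spacer". The function name and the toy sequence are hypothetical, and only the forward strand is scanned.

```python
import re


def find_ebox_gata(seq, min_spacer=7, max_spacer=9):
    """Find E-box + spacer + GATA arrangements on the forward strand.

    Returns a list of (ebox_start, ebox, spacer_length, gata) tuples,
    at most one per E-box start position. The spacer bounds are
    assumptions chosen for illustration, not values from the abstract.
    """
    # A lookahead lets candidate sites that overlap each other be reported.
    pattern = re.compile(
        r"(?=(CA[ACGT]{2}TG)([ACGT]{%d,%d})(GATA))" % (min_spacer, max_spacer)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        hits.append((m.start(1), m.group(1), len(m.group(2)), m.group(3)))
    return hits


# Hypothetical toy sequence with one E-box, an 8 bp spacer, and GATA.
hits = find_ebox_gata("TTCAGCTGAAAAAAAAGATAGG")
```

    A scan like this only lists sequence candidates; as the abstract stresses, only a very small subset of GATA motifs are actually occupied in genomes, so occupancy would still need experimental support.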

  16. Conserved cis-regulatory modules in promoters of genes encoding wheat high-molecular-weight glutenin subunits

    PubMed Central

    Ravel, Catherine; Fiquet, Samuel; Boudet, Julie; Dardevet, Mireille; Vincent, Jonathan; Merlino, Marielle; Michard, Robin; Martre, Pierre

    2014-01-01

    The concentration and composition of the gliadin and glutenin seed storage proteins (SSPs) in wheat flour are the most important determinants of its end-use value. In cereals, the synthesis of SSPs is predominantly regulated at the transcriptional level by a complex network involving at least five cis-elements in gene promoters. The high-molecular-weight glutenin subunits (HMW-GS) are encoded by two tightly linked genes located on the long arms of group 1 chromosomes. Here, we sequenced and annotated the HMW-GS gene promoters of 22 electrophoretic wheat alleles to identify putative cis-regulatory motifs. We focused on 24 motifs known to be involved in SSP gene regulation. Most of them were identified in at least one HMW-GS gene promoter sequence. A common regulatory framework was observed in all the HMW-GS gene promoters, as they shared conserved cis-regulatory modules (CCRMs) including all the five motifs known to regulate the transcription of SSP genes. This common regulatory framework comprises a composite box made of the GATA motifs and GCN4-like Motifs (GLMs) and was shown to be functional as the GLMs are able to bind a bZIP transcriptional factor SPA (Storage Protein Activator). In addition to this regulatory framework, each HMW-GS gene promoter had additional motifs organized differently. The promoters of most highly expressed x-type HMW-GS genes contain an additional box predicted to bind R2R3-MYB transcriptional factors. However, the differences in annotation between promoter alleles could not be related to their level of expression. In summary, we identified a common modular organization of HMW-GS gene promoters but the lack of correlation between the cis-motifs of each HMW-GS gene promoter and their level of expression suggests that other cis-elements or other mechanisms regulate HMW-GS gene expression. PMID:25429295

  17. Identification of tissue-specific cis-regulatory modules based on interactions between transcription factors

    PubMed Central

    Yu, Xueping; Lin, Jimmy; Zack, Donald J; Qian, Jiang

    2007-01-01

    Background Evolutionary conservation has been used successfully to help identify cis-acting DNA regions that are important in regulating tissue-specific gene expression. Motivated by increasing evidence that some DNA regulatory regions are not evolutionarily conserved, we have developed an approach for cis-regulatory region identification that does not rely upon evolutionary sequence conservation. Results The conservation-independent approach is based on an empirical potential energy between interacting transcription factors (TFs). In this analysis, the potential energy is defined as a function of the number of TF interactions in a genomic region and the strength of the interactions. By identifying sets of interacting TFs, the analysis locates regions enriched with the binding sites of these interacting TFs. We applied this approach to 30 human tissues and identified 6232 putative cis-regulatory modules (CRMs) regulating 2130 tissue-specific genes. Interestingly, some genes appear to be regulated by different CRMs in different tissues. Known regulatory regions are highly enriched in our predicted CRMs. In addition, DNase I hypersensitive sites, which tend to be associated with active regulatory regions, significantly overlap with the predicted CRMs, but not with more conserved regions. We also find that conserved and non-conserved CRMs regulate distinct gene groups. Conserved CRMs control more essential genes and genes involved in fundamental cellular activities such as transcription. In contrast, non-conserved CRMs, in general, regulate more non-essential genes, such as genes related to neural activity. Conclusion These results demonstrate that identifying relevant sets of binding motifs can help in the mapping of DNA regulatory regions, and suggest that non-conserved CRMs play an important role in gene regulation. PMID:17996093

  18. Computational discovery of soybean promoter cis-regulatory elements for the construction of soybean cyst nematode-inducible synthetic promoters.

    PubMed

    Liu, Wusheng; Mazarei, Mitra; Peng, Yanhui; Fethe, Michael H; Rudis, Mary R; Lin, Jingyu; Millwood, Reginald J; Arelli, Prakash R; Stewart, Charles Neal

    2014-10-01

    Computational methods offer great hope but limited accuracy in the prediction of functional cis-regulatory elements; improvements are needed to enable synthetic promoter design. We applied an ensemble strategy for de novo soybean cyst nematode (SCN)-inducible motif discovery among promoters of 18 co-expressed soybean genes that were selected from six reported microarray studies involving a compatible soybean-SCN interaction. A total of 116 overlapping motif regions (OMRs) were discovered bioinformatically that were identified by at least four out of seven bioinformatic tools. Using synthetic promoters, the inducibility of each OMR or motif itself was evaluated by co-localization of gain of function of an orange fluorescent protein reporter and the presence of SCN in transgenic soybean hairy roots. Among 16 OMRs detected from two experimentally confirmed SCN-inducible promoters, 11 OMRs (i.e. 68.75%) were experimentally confirmed to be SCN-inducible, leading to the discovery of 23 core motifs of 5- to 7-bp length, of which 14 are novel in plants. We found that a combination of the three best tools (i.e. SCOPE, W-AlignACE and Weeder) could detect all 23 core motifs. Thus, this strategy is a high-throughput approach for de novo motif discovery in soybean and offers great potential for novel motif discovery and synthetic promoter engineering for any plant and trait in crop biotechnology. PMID:24893752
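
    The "at least four out of seven tools" criterion can be illustrated with a small voting sketch. The interval representation, the function name overlapping_motif_regions, and the per-position voting rule are assumptions made for illustration; the abstract does not say how the authors actually computed OMRs.

```python
from collections import Counter


def overlapping_motif_regions(tool_hits, min_tools=4):
    """Return merged regions supported by at least `min_tools` tools.

    tool_hits: dict mapping a tool name to a list of (start, end)
    half-open intervals on one promoter (hypothetical input format).
    """
    support = Counter()
    for intervals in tool_hits.values():
        # Count each tool at most once per position.
        covered = set()
        for start, end in intervals:
            covered.update(range(start, end))
        for pos in covered:
            support[pos] += 1

    # Keep positions with enough votes and merge adjacent ones.
    positions = sorted(p for p, n in support.items() if n >= min_tools)
    regions = []
    for pos in positions:
        if regions and pos == regions[-1][1]:
            regions[-1][1] = pos + 1
        else:
            regions.append([pos, pos + 1])
    return [tuple(r) for r in regions]
```

    Calling it with the hits of several tools for one promoter returns only the stretches where the required number of tools agree; raising min_tools trades sensitivity for specificity.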

  19. Deciphering cis-regulatory control in inflammatory cells.

    PubMed

    Ghisletti, Serena; Natoli, Gioacchino

    2013-01-01

    In innate immune system cells, such as macrophages and dendritic cells, deployment of inducible gene expression programmes in response to microbes and danger signals requires highly precise regulatory mechanisms. The inflammatory response has to be tailored based on both the triggering stimulus and its dose, and it has to be unfolded in a kinetically complex manner that suits the different phases of the inflammatory process. Genomic characterization of regulatory elements in this context indicated that transcriptional regulators involved in macrophage specification act as pioneer transcription factors (TFs) that generate regions of open chromatin that enable the recruitment of TFs activated in response to external inputs. Therefore, competence for responses to a specific stimulus is programmed at an early stage of differentiation by factors involved in lineage commitment and maintenance of cell identity, which are responsible for the organization of a cell-type-specific cis-regulatory repertoire. The basic functional and organizational principles that regulate inflammatory gene expression in professional cells of the innate immune system provide general paradigms on the interplay between differentiation and environmental responses. PMID:23650641

  20. Detailed map of a cis-regulatory input function

    NASA Astrophysics Data System (ADS)

    Setty, Y.; Mayo, A. E.; Surette, M. G.; Alon, U.

    2003-06-01

    Most genes are regulated by multiple transcription factors that bind specific sites in DNA regulatory regions. These cis-regulatory regions perform a computation: the rate of transcription is a function of the active concentrations of each of the input transcription factors. Here, we used accurate gene expression measurements from living cell cultures, bearing GFP reporters, to map in detail the input function of the classic lacZYA operon of Escherichia coli, as a function of about a hundred combinations of its two inducers, cAMP and isopropyl β-D-thiogalactoside (IPTG). We found an unexpectedly intricate function with four plateau levels and four thresholds. This result compares well with a mathematical model of the binding of the regulatory proteins cAMP receptor protein (CRP) and LacI to the lac regulatory region. The model is also used to demonstrate that with few mutations, the same region could encode much purer AND-like or even OR-like functions. This possibility means that the wild-type region is selected to perform an elaborate computation in setting the transcription rate. The present approach can be generally used to map the input functions of other genes.
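
    To make the notion of a two-input function concrete, the sketch below evaluates an idealized AND-like function of the two inducers. It is a generic textbook form built from Hill terms, not the model fitted in the paper (which reports a more intricate function with four plateaus); the parameter names k_camp, k_iptg, n and basal and their default values are assumptions.

```python
def hill(x, k, n):
    """Hill function: rises from 0 to 1 as x passes the threshold k."""
    return x**n / (k**n + x**n)


def and_like_input_function(camp, iptg, k_camp=1.0, k_iptg=1.0, n=2.0, basal=0.05):
    """Idealized AND-like promoter activity, normalized to the range [basal, 1].

    Activity is high only when cAMP (activation through CRP) and IPTG
    (relief of LacI repression) are both well above their thresholds.
    """
    activation = hill(camp, k_camp, n)
    derepression = hill(iptg, k_iptg, n)
    return basal + (1.0 - basal) * activation * derepression
```

    Evaluating the function over a grid of cAMP and IPTG concentrations gives the kind of two-dimensional map the abstract describes, with a single plateau in place of the four measured ones.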

  1. CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining

    PubMed Central

    Navarro, Carmen; Lopez, Francisco J.; Cano, Carlos; Garcia-Alcalde, Fernando; Blanco, Armando

    2014-01-01

    Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools can detect significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) scope limited to promoter or conserved regions of the genome; 2) inability to identify combinations involving more than two motifs; 3) a requirement for prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner can perform a blind search of CRMs without any prior information about target CRMs or any limitation on the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent-Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accessible at: http://genome2.ugr.es/cisminer. CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by
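
    The representation of motif combinations as fuzzy itemsets can be sketched as follows. This is a brute-force enumeration that stands in for the Top-Down Fuzzy Frequent-Pattern Tree algorithm named in the abstract, not an implementation of it. The window format (a dict of motif membership degrees in [0, 1]), the use of the minimum as the fuzzy AND, and the function name are assumptions.

```python
from itertools import combinations


def fuzzy_frequent_itemsets(windows, min_support, max_size=3):
    """Enumerate motif combinations whose fuzzy support reaches min_support.

    windows: list of dicts, one per genomic window, mapping a motif name
    to a membership degree in [0, 1] (hypothetical input format).
    The support of an itemset is the mean, over all windows, of the
    smallest membership among its motifs (0 if a motif is absent).
    """
    if not windows:
        return {}
    motifs = sorted({m for w in windows for m in w})
    frequent = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(motifs, size):
            total = sum(min(w.get(m, 0.0) for m in itemset) for w in windows)
            support = total / len(windows)
            if support >= min_support:
                frequent[itemset] = support
    return frequent
```

    Exhaustive enumeration grows combinatorially with the number of motifs, which is the cost a frequent-pattern tree is designed to avoid; the sketch is only practical for small motif sets.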

  2. Overview Article: Identifying transcriptional cis-regulatory modules in animal genomes

    PubMed Central

    Suryamohan, Kushal; Halfon, Marc S.

    2014-01-01

    Gene expression is regulated through the activity of transcription factors and chromatin modifying proteins acting on specific DNA sequences, referred to as cis-regulatory elements. These include promoters, located at the transcription initiation sites of genes, and a variety of distal cis-regulatory modules (CRMs), the most common of which are transcriptional enhancers. Because regulated gene expression is fundamental to cell differentiation and acquisition of new cell fates, identifying, characterizing, and understanding the mechanisms of action of CRMs is critical for understanding development. CRM discovery has historically been challenging, as CRMs can be located far from the genes they regulate, have few readily-identifiable sequence characteristics, and for many years were not amenable to high-throughput discovery methods. However, the recent availability of complete genome sequences and the development of next-generation sequencing methods has led to an explosion of both computational and empirical methods for CRM discovery in model and non-model organisms alike. Experimentally, CRMs can be identified through chromatin immunoprecipitation directed against transcription factors or histone post-translational modifications, identification of nucleosome-depleted “open” chromatin regions, or sequencing-based high-throughput functional screening. Computational methods include comparative genomics, clustering of known or predicted transcription factor binding sites, and supervised machine-learning approaches trained on known CRMs. All of these methods have proven effective for CRM discovery, but each has its own considerations and limitations, and each is subject to a greater or lesser number of false-positive identifications. Experimental confirmation of predictions is essential, although shortcomings in current methods suggest that additional means of validation need to be developed. PMID:25704908

  3. Directed network motifs in Alzheimer's disease and mild cognitive impairment.

    PubMed

    Friedman, Eric J; Young, Karl; Tremper, Graham; Liang, Jason; Landsberg, Adam S; Schuff, Norbert

    2015-01-01

    Directed network motifs are the building blocks of complex networks, such as human brain networks, and capture deep connectivity information that is not contained in standard network measures. In this paper we present the first application of directed network motifs in vivo to human brain networks, utilizing recently developed directed progression networks which are built upon rates of cortical thickness changes between brain regions. This is in contrast to previous studies which have relied on simulations and in vitro analysis of non-human brains. We show that frequencies of specific directed network motifs can be used to distinguish between patients with Alzheimer's disease (AD) and normal control (NC) subjects. Especially interesting from a clinical standpoint, these motif frequencies can also distinguish between subjects with mild cognitive impairment who remained stable over three years (MCI) and those who converted to AD (CONV). Furthermore, we find that the entropy of the distribution of directed network motifs increased from MCI to CONV to AD, implying that the distribution of pathology is more structured in MCI but becomes less so as it progresses to CONV and further to AD. Thus, directed network motifs frequencies and distributional properties provide new insights into the progression of Alzheimer's disease as well as new imaging markers for distinguishing between normal controls, stable mild cognitive impairment, MCI converters and Alzheimer's disease. PMID:25879535
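
    The entropy comparison mentioned above can be illustrated with a short sketch. The abstract does not state which entropy the authors used; Shannon entropy in bits is assumed here, and the motif labels and counts are hypothetical.

```python
import math


def motif_entropy(counts):
    """Shannon entropy (in bits) of a motif frequency distribution.

    counts: dict mapping a motif label to its observed count.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("at least one motif occurrence is required")
    entropy = 0.0
    for count in counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy
```

    A distribution concentrated on a few motif types gives a low value, while counts spread evenly across all motif types give the maximum, log2 of the number of types.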

  4. Putative cis-Regulatory Elements Associated with Heat Shock Genes Activated During Excystation of Cryptosporidium parvum

    PubMed Central

    Lara, Ana M.; Serrano, Myrna; Sheth, Nihar; Buck, Gregory

    2010-01-01

    Background Cryptosporidiosis is a ubiquitous infectious disease, caused by the protozoan parasites Cryptosporidium hominis and C. parvum, leading to acute, persistent and chronic diarrhea worldwide. Although the complications of this disease can be serious, even fatal, in immunocompromised patients of any age, they have also been found to lead to long term effects, including growth inhibition and impaired cognitive development, in infected immunocompetent children. The Cryptosporidium life cycle alternates between a dormant stage, the oocyst, and a highly replicative phase that includes both asexual vegetative stages as well as sexual stages, implying fine genetic regulatory mechanisms. The parasite is extremely difficult to study because it cannot be cultured in vitro and animal models are equally challenging. The recent publication of the genome sequence of C. hominis and C. parvum has, however, significantly advanced our understanding of the biology and pathogenesis of this parasite. Methodology/Principal Findings Herein, our goal was to identify cis-regulatory elements associated with heat shock response in Cryptosporidium using a combination of in silico and real time RT-PCR strategies. Analysis with Gibbs-Sampling algorithms of upstream non-translated regions of twelve genes annotated as heat shock proteins in the Cryptosporidium genome identified a highly conserved over-represented sequence motif in eleven of them. RT-PCR analyses, described herein and also by others, show that these eleven genes bearing the putative element are induced concurrent with excystation of parasite oocysts via heat shock. Conclusions/Significance Our analyses suggest that occurrences of a motif identified in the upstream regions of the Cryptosporidium heat shock genes represent parts of the transcriptional apparatus and function as stress response elements that activate expression of these genes during excystation, and possibly at other stages in the life cycle of the parasite

  5. Direct vs 2-stage approaches to structured motif finding

    PubMed Central

    2012-01-01

    Background The notion of DNA motif is a mathematical abstraction used to model regions of the DNA (known as Transcription Factor Binding Sites, or TFBSs) that are bound by a given Transcription Factor to regulate gene expression or repression. In turn, DNA structured motifs are a mathematical counterpart that models sets of TFBSs that work in concert in the gene regulation processes of higher eukaryotic organisms. Typically, a structured motif is composed of an ordered set of isolated (or simple) motifs, separated by a variable, but somewhat constrained number of “irrelevant” base-pairs. Discovering structured motifs in a set of DNA sequences is a computationally hard problem that has been addressed by a number of authors using either a direct approach, or via the preliminary identification and successive combination of simple motifs. Results We describe a computational tool, named SISMA, for the de-novo discovery of structured motifs in a set of DNA sequences. SISMA is an exact, enumerative algorithm, meaning that it finds all the motifs conforming to the specifications. It does so in two stages: first it discovers all the possible component simple motifs, then combines them in a way that respects the given constraints. We developed SISMA mainly with the aim of understanding the potential benefits of such a 2-stage approach w.r.t. direct methods. In fact, no 2-stage software was available for the general problem of structured motif discovery, but only a few tools that solved restricted versions of the problem. We evaluated SISMA against other published tools on a comprehensive benchmark made of both synthetic and real biological datasets. In a significant number of cases, SISMA outperformed the competitors, exhibiting a good performance also in most of the cases in which it was inferior. Conclusions A reflection on the results obtained led us to conclude that a 2-stage approach can be implemented with many advantages over direct approaches. Some of these
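
    The two-stage idea (locate simple motifs first, then combine them under ordering and spacer constraints) can be sketched as below. This is not SISMA: stage 1 is reduced to exact matching of words that are already given, whereas SISMA discovers the simple motifs de novo. The function names and the (min_gap, max_gap) constraint format are assumptions.

```python
def find_occurrences(seq, motif):
    """Stage 1: start positions of every exact occurrence of one simple motif."""
    return [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]


def structured_occurrences(seq, motifs, gaps):
    """Stage 2: chain simple-motif hits in order, respecting spacer bounds.

    motifs: ordered list of simple motifs (plain strings).
    gaps:   list of (min_gap, max_gap) pairs, one per consecutive motif pair,
            bounding the number of bases between the two motifs.
    Returns a list of tuples holding the start position of each component.
    """
    if len(gaps) != len(motifs) - 1:
        raise ValueError("need exactly one gap constraint per consecutive pair")
    stage1 = [find_occurrences(seq, m) for m in motifs]
    chains = [[p] for p in stage1[0]]
    for idx in range(1, len(motifs)):
        lo, hi = gaps[idx - 1]
        prev_len = len(motifs[idx - 1])
        extended = []
        for chain in chains:
            prev_end = chain[-1] + prev_len
            for p in stage1[idx]:
                if lo <= p - prev_end <= hi:
                    extended.append(chain + [p])
        chains = extended
    return [tuple(c) for c in chains]
```

    For example, two hexamers separated by a 17-base spacer are reported when the allowed gap is (15, 19) and rejected when it is (5, 10).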

  6. Characterization of a putative cis-regulatory element that controls transcriptional activity of the pig uroplakin II gene promoter

    SciTech Connect

    Kwon, Deug-Nam; Park, Mi-Ryung; Park, Jong-Yi; Cho, Ssang-Goo; Park, Chankyu; Oh, Jae-Wook; Song, Hyuk; Kim, Jae-Hwan; Kim, Jin-Hoi

    2011-07-01

    Highlights: • The sequences of -604 to -84 bp of the pUPII promoter contained the region of a putative negative cis-regulatory element. • The core promoter was located in the 5F-1. • Transcription factor HNF4 can directly bind in the pUPII core promoter region, which plays a critical role in controlling promoter activity. • These features of the pUPII promoter are fundamental to development of a target-specific vector. Abstract: Uroplakin II (UPII) is one of the integral membrane proteins synthesized as a major differentiation product of mammalian urothelium. UPII gene expression is bladder specific and differentiation dependent, but little is known about its transcription response elements and molecular mechanism. To identify the cis-regulatory elements in the pig UPII (pUPII) gene promoter region, we constructed pUPII 5' upstream region deletion mutants and demonstrated that each of the deletion mutants participates in controlling the expression of the pUPII gene in human bladder carcinoma RT4 cells. We also identified a new core promoter region and putative negative cis-regulatory element within a minimal promoter region. In addition, we showed that hepatocyte nuclear factor 4 (HNF4) can directly bind in the pUPII core promoter (5F-1) region, which plays a critical role in controlling promoter activity. Transient cotransfection experiments showed that HNF4 positively regulates pUPII gene promoter activity. Thus, the binding element and its binding protein, HNF4 transcription factor, may be involved in the mechanism that specifically regulates pUPII gene transcription.

  7. Characterization and identification of cis-regulatory elements in Arabidopsis based on single-nucleotide polymorphism information.

    PubMed

    Korkuc, Paula; Schippers, Jos H M; Walther, Dirk

    2014-01-01

    Identifying regulatory elements and revealing their role in gene expression regulation remains a central goal of plant genome research. We exploited the detailed genomic sequencing information of a large number of Arabidopsis (Arabidopsis thaliana) accessions to characterize known and to identify novel cis-regulatory elements in gene promoter regions of Arabidopsis by relying on conservation as the hallmark signal of functional relevance. Based on the genomic layout and the obtained density profiles of single-nucleotide polymorphisms (SNPs) in sequence regions upstream of transcription start sites, the average length of promoter regions in Arabidopsis could be established at 500 bp. Genes associated with high degrees of variability of their respective upstream regions are preferentially involved in environmental response and signaling processes, while low levels of promoter SNP density are common among housekeeping genes. Known cis-elements were found to exhibit a lower SNP density than sequence regions not associated with known motifs. For 15 known cis-element motifs, strong positional preferences relative to the transcription start site were detected based on their promoter SNP density profiles. Five novel candidate cis-element motifs were identified as consensus motifs of 17 sequence hexamers exhibiting increased sequence conservation combined with evidence of positional preferences, annotation information, and functional relevance for inducing correlated gene expression. Our study demonstrates that the currently available resolution of SNP data offers novel ways for the identification of functional genomic elements and the characterization of gene promoter sequences. PMID:24204023

  8. Identification and Functional Characterization of Cis-Regulatory Elements Controlling Expression of the Porcine ADRB2 Gene

    PubMed Central

    Jaeger, Alexandra; Fritschka, Stephan; Ponsuksili, Siriluck; Wimmers, Klaus; Muráni, Eduard

    2015-01-01

    The beta-2 adrenergic receptor (beta-2 AR) modulates metabolic processes in skeletal muscle, liver, and adipose tissue in response to catecholamine stimulation. We showed previously that expression of the porcine beta-2 AR gene (ADRB2) is affected by cis-regulatory polymorphisms. These are most likely responsible for the association of ADRB2 with economically relevant muscle-related traits in pigs. The present study focused on characterization of promoter elements involved in basal transcriptional regulation of the porcine ADRB2 in different cell types to aid identification of its cis-regulatory polymorphisms. Based on in silico analysis, luciferase reporter gene assays and gel shift assays were performed using COS-7, HepG2, C2C12, and 3T3-L1 cells. Deletion mapping of the 5´ flanking region (-1324 to +33) of ADRB2 revealed the region between -307 and -269 to be the minimal promoter, including regulatory elements essential for the basal transcriptional activity in all four tested cell types. Directly upstream (-400 to -323) we identified an important enhancer element required for maximal promoter activity. In silico analysis and gel shift assays revealed that this GC-rich element harbors two evolutionarily conserved binding sites of Sp1, a constitutive transcriptional activator. Significant transcriptional activation of the porcine ADRB2 promoter was demonstrated by overexpression of Sp1. Our results demonstrate, for the first time, an important role of Sp1 and of the responsive enhancer element in the regulation of ADRB2 expression. Polymorphisms located in this domain of the porcine ADRB2 promoter represent candidate causal cis-regulatory variants. PMID:26221068

  9. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements

    PubMed Central

    De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

    2015-01-01

    Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254488
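
    Matching a word over the IUPAC alphabet against promoter sequences can be sketched as follows. This is not BLSSpeller code: the branch length score, which requires a species tree, is replaced here by simply listing the species whose promoter contains the word. The function names and the input dict of promoters are assumptions; the degenerate codes are the standard IUPAC nucleotide codes.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_positions(word, seq):
    """Start positions where a degenerate IUPAC word matches a DNA sequence."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(word) + 1):
        if all(seq[i + j] in IUPAC[code] for j, code in enumerate(word)):
            hits.append(i)
    return hits


def species_with_word(word, promoters):
    """Names of the species whose orthologous promoter contains the word.

    promoters: dict mapping a species name to a promoter sequence
    (hypothetical input; a crude stand-in for conservation scoring).
    """
    return sorted(name for name, seq in promoters.items() if iupac_positions(word, seq))
```

    A word found in the promoters of several related species is a candidate conserved element; weighting that set of species by the tree branches connecting them is what a branch length score adds on top of this simple count.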

  10. The search for cis-regulatory driver mutations in cancer genomes.

    PubMed

    Poulos, Rebecca C; Sloane, Mathew A; Hesson, Luke B; Wong, Jason W H

    2015-10-20

    With the advent of high-throughput and relatively inexpensive whole-genome sequencing technology, the focus of cancer research has begun to shift toward analyses of somatic mutations in non-coding cis-regulatory elements of the cancer genome. Cis-regulatory elements play an important role in gene regulation, with mutations in these elements potentially resulting in changes to the expression of linked genes. The recent discoveries of recurrent TERT promoter mutations in melanoma, and recurrent mutations that create a super-enhancer regulating TAL1 expression in T-cell acute lymphoblastic leukaemia (T-ALL), have sparked significant interest in the search for other somatic cis-regulatory mutations driving cancer development. In this review, we look more closely at the TERT promoter and TAL1 enhancer alterations and use these examples to ask whether other cis-regulatory mutations may play a role in cancer susceptibility. In doing so, we make observations from the data emerging from recent research in this field, and describe the experimental and analytical approaches which could be adopted in the hope of better uncovering the true functional significance of somatic cis-regulatory mutations in cancer. PMID:26356674

  11. Study of Cis-regulatory Elements in the Ascidian Ciona intestinalis.

    PubMed

    Irvine, Steven Q

    2013-03-01

    The ascidian (sea squirt) C. intestinalis has become an important model organism for the study of cis-regulation. This is largely due to the technology that has been developed for assessing cis-regulatory activity through the use of transient reporter transgenes introduced into fertilized eggs. This technique allows the rapid and inexpensive testing of endogenous or altered DNA for regulatory activity in vivo. This review examines evidence that C. intestinalis cis-regulatory elements are located more closely to coding regions than in other model organisms. I go on to compare the organization of cis-regulatory elements and conserved non-coding sequences in Ciona, mammals, and other deuterostomes for three representative C. intestinalis genes, Pax6, FoxAa, and the DlxA-B cluster, along with homologs in the other species. These comparisons point out some of the similarities and differences between cis-regulatory elements and their study in the various model organisms. Finally, I provide illustrations of how C. intestinalis lends itself to detailed study of the structure of cis-regulatory elements, which have led, and promise to continue to lead, to important insights into the fundamentals of transcriptional regulation. PMID:23997651

  12. Complex interactions between cis-regulatory modules in native conformation are critical for Drosophila snail expression.

    PubMed

    Dunipace, Leslie; Ozdemir, Anil; Stathopoulos, Angelike

    2011-09-01

    It has been shown in several organisms that multiple cis-regulatory modules (CRMs) of a gene locus can be active concurrently to support similar spatiotemporal expression. To understand the functional importance of such seemingly redundant CRMs, we examined two CRMs from the Drosophila snail gene locus, which are both active in the ventral region of pre-gastrulation embryos. By performing a deletion series in a ∼25 kb DNA rescue construct using BAC recombineering and site-directed transgenesis, we demonstrate that the two CRMs are not redundant. The distal CRM is absolutely required for viability, whereas the proximal CRM is required only under extreme conditions such as high temperature. Consistent with their distinct requirements, the CRMs support distinct expression patterns: the proximal CRM exhibits an expanded expression domain relative to endogenous snail, whereas the distal CRM exhibits almost complete overlap with snail except at the anterior-most pole. We further show that the distal CRM normally limits the increased expression domain of the proximal CRM and that the proximal CRM serves as a 'damper' for the expression levels driven by the distal CRM. Thus, the two CRMs interact in cis in a non-additive fashion and these interactions may be important for fine-tuning the domains and levels of gene expression. PMID:21813571

  13. Analysis of opo cis-regulatory landscape uncovers Vsx2 requirement in early eye morphogenesis.

    PubMed

    Gago-Rodrigues, Ines; Fernández-Miñán, Ana; Letelier, Joaquin; Naranjo, Silvia; Tena, Juan J; Gómez-Skarmeta, José L; Martinez-Morales, Juan R

    2015-01-01

    The self-organized morphogenesis of the vertebrate optic cup entails coupling the activation of the retinal gene regulatory network to the constriction-driven infolding of the retinal epithelium. Yet the genetic mechanisms underlying this coordination remain largely unexplored. Through phylogenetic footprinting and transgenesis in zebrafish, here we examine the cis-regulatory landscape of opo, an endocytosis regulator essential for eye morphogenesis. Among the different conserved enhancers identified, we isolate a single retina-specific element (H6_10137) and show that its activity depends on binding sites for the retinal determinant Vsx2. Gain- and loss-of-function experiments and ChIP analyses reveal that Vsx2 regulates opo expression through direct binding to this retinal enhancer. Furthermore, we show that vsx2 knockdown impairs the primary optic cup folding. These data support a model by which vsx2, operating through the effector gene opo, acts as a central transcriptional node that coordinates neural retina patterning and optic cup invagination in zebrafish. PMID:25963169

  14. Distinct Functional Constraints Partition Sequence Conservation in a cis-Regulatory Element

    PubMed Central

    Ruvinsky, Ilya

    2011-01-01

    Different functional constraints contribute to different evolutionary rates across genomes. To understand why some sequences evolve faster than others in a single cis-regulatory locus, we investigated function and evolutionary dynamics of the promoter of the Caenorhabditis elegans unc-47 gene. We found that this promoter consists of two distinct domains. The proximal promoter is conserved and is largely sufficient to direct appropriate spatial expression. The distal promoter displays little if any conservation between several closely related nematodes. Despite this divergence, sequences from all species confer robustness of expression, arguing that this function does not require substantial sequence conservation. We showed that even unrelated sequences have the ability to promote robust expression. A prominent feature shared by all of these robustness-promoting sequences is an AT-enriched nucleotide composition consistent with nucleosome depletion. Because general sequence composition can be maintained despite sequence turnover, our results explain how different functional constraints can lead to vastly disparate rates of sequence divergence within a promoter. PMID:21655084

  15. Developmental cis-regulatory analysis of the cyclin D gene in the sea urchin Strongylocentrotus purpuratus

    PubMed Central

    McCarty, Christopher M.

    2013-01-01

    Cyclin D genes regulate the cell cycle, growth and differentiation in response to intercellular signaling. While the promoters of vertebrate cyclin D genes have been analyzed, the cis-regulatory sequences across an entire cyclin D locus have not. Doing so would increase understanding of how cyclin D genes respond to the regulatory states established by developmental gene regulatory networks, linking cell cycle and growth control to the ontogenetic program. Therefore, we conducted a cis-regulatory analysis on the cyclin D gene, SpcycD, of the sea urchin, Strongylocentrotus purpuratus, during embryogenesis, identifying upstream and intronic sequences, located within six defined regions bearing one or more cis-regulatory modules each. PMID:24090975

  16. cis-Regulatory Mutations Are a Genetic Cause of Human Limb Malformations

    PubMed Central

    VanderMeer, Julia E.; Ahituv, Nadav

    2011-01-01

    The underlying mutations that cause human limb malformations are often difficult to determine, particularly for limb malformations that occur as isolated traits. Evidence from a variety of studies shows that cis-regulatory mutations, specifically in enhancers, can lead to some of these isolated limb malformations. Here, we provide a review of human limb malformations that have been shown to be caused by enhancer mutations and propose that cis-regulatory mutations will continue to be identified as the cause of additional human malformations as our understanding of regulatory sequences improves. PMID:21509892

  17. Motif-Role-Fingerprints: The Building-Blocks of Motifs, Clustering-Coefficients and Transitivities in Directed Networks

    PubMed Central

    McDonnell, Mark D.; Yaveroğlu, Ömer Nebil; Schmerl, Brett A.; Iannella, Nicolangelo; Ward, Lawrence M.

    2014-01-01

    Complex networks are frequently characterized by metrics for which particular subgraphs are counted. One statistic from this category, which we refer to as motif-role fingerprints, differs from global subgraph counts in that the number of subgraphs in which each node participates is counted. As with global subgraph counts, it can be important to distinguish between motif-role fingerprints that are ‘structural’ (induced subgraphs) and ‘functional’ (partial subgraphs). Here we show mathematically that a vector of all functional motif-role fingerprints can readily be obtained from an arbitrary directed adjacency matrix, and then converted to structural motif-role fingerprints by multiplying that vector by a specific invertible conversion matrix. This result demonstrates that a unique structural motif-role fingerprint exists for any given functional motif-role fingerprint. We demonstrate a similar result for the cases of functional and structural motif-fingerprints without node roles, and global subgraph counts that form the basis of standard motif analysis. We also explicitly highlight that motif-role fingerprints are elemental to several popular metrics for quantifying the subgraph structure of directed complex networks, including motif distributions, directed clustering coefficient, and transitivity. The relationships between each of these metrics and motif-role fingerprints also suggest new subtypes of directed clustering coefficients and transitivities. Our results have potential utility in analyzing directed synaptic networks constructed from neuronal connectome data, such as in terms of centrality. Other potential applications include anomaly detection in networks, identification of similar networks and identification of similar nodes within networks. Matlab code for calculating all stated metrics following calculation of functional motif-role fingerprints is provided as S1 Matlab File. PMID:25486535

  18. Cis-regulatory mechanisms governing stem and progenitor cell transitions

    PubMed Central

    Johnson, Kirby D.; Kong, Guangyao; Gao, Xin; Chang, Yuan-I; Hewitt, Kyle J.; Sanalkumar, Rajendran; Prathibha, Rajalekshmi; Ranheim, Erik A.; Dewey, Colin N.; Zhang, Jing; Bresnick, Emery H.

    2015-01-01

    Cis-element encyclopedias provide information on phenotypic diversity and disease mechanisms. Although cis-element polymorphisms and mutations are instructive, deciphering function remains challenging. Mutation of an intronic GATA motif (+9.5) in GATA2, encoding a master regulator of hematopoiesis, underlies an immunodeficiency associated with myelodysplastic syndrome (MDS) and acute myeloid leukemia (AML). Whereas an inversion relocalizes another GATA2 cis-element (−77) to the proto-oncogene EVI1, inducing EVI1 expression and AML, whether this reflects ectopic or physiological activity is unknown. We describe a mouse strain that decouples −77 function from proto-oncogene deregulation. The −77−/− mice exhibited a novel phenotypic constellation including late embryonic lethality and anemia. The −77 established a vital sector of the myeloid progenitor transcriptome, conferring multipotentiality. Unlike the +9.5−/− embryos, hematopoietic stem cell genesis was unaffected in −77−/− embryos. These results illustrate a paradigm in which cis-elements in a locus differentially control stem and progenitor cell transitions, and therefore the individual cis-element alterations cause unique and overlapping disease phenotypes. PMID:26601269

  19. Evolution of Cis-Regulatory Elements and Regulatory Networks in Duplicated Genes of Arabidopsis

    PubMed Central

    Guo, Xu Qiu; Adams, Keith L.

    2015-01-01

    Plant genomes contain large numbers of duplicated genes that contribute to the evolution of new functions. Following duplication, genes can exhibit divergence in their coding sequence and their expression patterns. Changes in the cis-regulatory element landscape can result in changes in gene expression patterns. High-throughput methods developed recently can identify potential cis-regulatory elements on a genome-wide scale. Here, we use a recent comprehensive data set of DNase I sequencing-identified cis-regulatory binding sites (footprints) at single-base-pair resolution to compare binding sites and network connectivity in duplicated gene pairs in Arabidopsis (Arabidopsis thaliana). We found that duplicated gene pairs vary greatly in their cis-regulatory element architecture, resulting in changes in regulatory network connectivity. Whole-genome duplicates (WGDs) have approximately twice as many footprints in their promoters left by potential regulatory proteins than do tandem duplicates (TDs). The WGDs have a greater average number of footprint differences between paralogs than TDs. The footprints, in turn, result in more regulatory network connections between WGDs and other genes, forming denser, more complex regulatory networks than shown by TDs. When comparing regulatory connections between duplicates, WGDs had more pairs in which the two genes are either partially or fully diverged in their network connections, but fewer genes with no network connections than the TDs. There is evidence of younger TDs and WGDs having fewer unique connections compared with older duplicates. This study provides insights into cis-regulatory element evolution and network divergence in duplicated genes. PMID:26474639

  20. Functional Evolution of cis-Regulatory Modules at a Homeotic Gene in Drosophila

    PubMed Central

    Schiller, Benjamin J.; Bae, Esther; Tran, Diana A.; Shur, Andrey S.; Allen, John M.; Rau, Christoph; Bender, Welcome; Fisher, William W.; Celniker, Susan E.; Drewell, Robert A.

    2009-01-01

    It is a long-held belief in evolutionary biology that the rate of molecular evolution for a given DNA sequence is inversely related to the level of functional constraint. This belief holds true for the protein-coding homeotic (Hox) genes originally discovered in Drosophila melanogaster. Expression of the Hox genes in Drosophila embryos is essential for body patterning and is controlled by an extensive array of cis-regulatory modules (CRMs). How the regulatory modules functionally evolve in different species is not clear. A comparison of the CRMs for the Abdominal-B gene from different Drosophila species reveals relatively low levels of overall sequence conservation. However, embryonic enhancer CRMs from other Drosophila species direct transgenic reporter gene expression in the same spatial and temporal patterns during development as their D. melanogaster orthologs. Bioinformatic analysis reveals the presence of short conserved sequences within defined CRMs, representing gap and pair-rule transcription factor binding sites. One predicted binding site for the gap transcription factor KRUPPEL in the IAB5 CRM was found to be altered in Superabdominal (Sab) mutations. In Sab mutant flies, the third abdominal segment is transformed into a copy of the fifth abdominal segment. A model for KRUPPEL-mediated repression at this binding site is presented. These findings challenge our current understanding of the relationship between sequence evolution at the molecular level and functional activity of a CRM. While the overall sequence conservation at Drosophila CRMs is not distinctive from neighboring genomic regions, functionally critical transcription factor binding sites within embryonic enhancer CRMs are highly conserved. These results have implications for understanding mechanisms of gene expression during embryonic development, enhancer function, and the molecular evolution of eukaryotic regulatory modules. PMID:19893611

  1. Recurrent Modification of a Conserved Cis-Regulatory Element Underlies Fruit Fly Pigmentation Diversity

    PubMed Central

    Rogers, William A.; Salomone, Joseph R.; Tacy, David J.; Camino, Eric M.; Davis, Kristen A.; Rebeiz, Mark; Williams, Thomas M.

    2013-01-01

    The development of morphological traits occurs through the collective action of networks of genes connected at the level of gene expression. As any node in a network may be a target of evolutionary change, the recurrent targeting of the same node would indicate that the path of evolution is biased for the relevant trait and network. Although examples of parallel evolution have implicated recurrent modification of the same gene and cis-regulatory element (CRE), little is known about the mutational and molecular paths of parallel CRE evolution. In Drosophila melanogaster fruit flies, the Bric-à-brac (Bab) transcription factors control the development of a suite of sexually dimorphic traits on the posterior abdomen. Female-specific Bab expression is regulated by the dimorphic element, a CRE that possesses direct inputs from body plan (ABD-B) and sex-determination (DSX) transcription factors. Here, we find that the recurrent evolutionary modification of this CRE underlies both intraspecific and interspecific variation in female pigmentation in the melanogaster species group. By reconstructing the sequence and regulatory activity of the ancestral Drosophila melanogaster dimorphic element, we demonstrate that a handful of mutations were sufficient to create independent CRE alleles with differing activities. Moreover, intraspecific and interspecific dimorphic element evolution proceeded with little to no alterations to the known body plan and sex-determination regulatory linkages. Collectively, our findings represent an example where the paths of evolution appear biased to a specific CRE, and drastic changes in function were accompanied by deep conservation of key regulatory linkages. PMID:24009528

  2. Evolved tooth gain in sticklebacks is associated with a cis-regulatory allele of Bmp6

    PubMed Central

    Cleves, Phillip A.; Ellis, Nicholas A.; Jimenez, Monica T.; Nunez, Stephanie M.; Schluter, Dolph; Kingsley, David M.; Miller, Craig T.

    2014-01-01

    Developmental genetic studies of evolved differences in morphology have led to the hypothesis that cis-regulatory changes often underlie morphological evolution. However, because most of these studies focus on evolved loss of traits, the genetic architecture and possible association with cis-regulatory changes of gain traits are less understood. Here we show that a derived benthic freshwater stickleback population has evolved an approximate twofold gain in ventral pharyngeal tooth number compared with their ancestral marine counterparts. Comparing laboratory-reared developmental time courses of a low-toothed marine population and this high-toothed benthic population reveals that increases in tooth number and tooth plate area and decreases in tooth spacing arise at late juvenile stages. Genome-wide linkage mapping identifies largely separate sets of quantitative trait loci affecting different aspects of dental patterning. One large-effect quantitative trait locus controlling tooth number fine-maps to a genomic region containing an excellent candidate gene, Bone morphogenetic protein 6 (Bmp6). Stickleback Bmp6 is expressed in developing teeth, and no coding changes are found between the high- and low-toothed populations. However, quantitative allele-specific expression assays of Bmp6 in developing teeth in F1 hybrids show that cis-regulatory changes have elevated the relative expression level of the freshwater benthic Bmp6 allele at late, but not early, stages of stickleback development. Collectively, our data support a model where a late-acting cis-regulatory up-regulation of Bmp6 expression underlies a significant increase in tooth number in derived benthic sticklebacks. PMID:25205810

  3. Cis-Regulatory Changes Associated with a Recent Mating System Shift and Floral Adaptation in Capsella.

    PubMed

    Steige, Kim A; Reimegård, Johan; Koenig, Daniel; Scofield, Douglas G; Slotte, Tanja

    2015-10-01

    The selfing syndrome constitutes a suite of floral and reproductive trait changes that have evolved repeatedly across many evolutionary lineages in response to the shift to selfing. Convergent evolution of the selfing syndrome suggests that these changes are adaptive, yet our understanding of the detailed molecular genetic basis of the selfing syndrome remains limited. Here, we investigate the role of cis-regulatory changes during the recent evolution of the selfing syndrome in Capsella rubella, which split from the outcrosser Capsella grandiflora less than 200 ka. We assess allele-specific expression (ASE) in leaves and flower buds at a total of 18,452 genes in three interspecific F1 C. grandiflora x C. rubella hybrids. Using a hierarchical Bayesian approach that accounts for technical variation using genomic reads, we find evidence for extensive cis-regulatory changes. On average, 44% of the assayed genes show evidence of ASE; however, only 6% show strong allelic expression biases. Flower buds, but not leaves, show an enrichment of cis-regulatory changes in genomic regions responsible for floral and reproductive trait divergence between C. rubella and C. grandiflora. We further detected an excess of heterozygous transposable element (TE) insertions near genes with ASE, and TE insertions targeted by uniquely mapping 24-nt small RNAs were associated with reduced expression of nearby genes. Our results suggest that cis-regulatory changes have been important during the recent adaptive floral evolution in Capsella and that differences in TE dynamics between selfing and outcrossing species could be important for rapid regulatory divergence in association with mating system shifts. PMID:26318184
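
    The idea of testing a gene for allele-specific expression in an F1 hybrid can be sketched with a much simpler stand-in for the hierarchical Bayesian approach used in the study: an exact binomial test of whether reads from the two parental alleles depart from a 50:50 ratio. The function name `allelic_imbalance` and the read counts are hypothetical, and this sketch ignores the technical variation that the authors account for with genomic reads.

```python
from math import comb


def allelic_imbalance(reads_a, reads_b):
    """Return (fraction of reads from allele A, exact two-sided binomial p-value).

    Null hypothesis: both alleles are expressed equally (p = 0.5).
    Intended for modest read counts; very large counts would overflow floats.
    """
    n = reads_a + reads_b
    pmf = [comb(n, i) * 0.5 ** n for i in range(n + 1)]
    # Two-sided p-value: total probability of outcomes no more likely
    # than the observed one (small tolerance for floating-point ties).
    cutoff = pmf[reads_a] * (1 + 1e-9)
    p_value = min(1.0, sum(p for p in pmf if p <= cutoff))
    return reads_a / n, p_value


# Hypothetical read counts for one gene in one F1 hybrid.
fraction_a, p_value = allelic_imbalance(reads_a=18, reads_b=2)
```

    In practice such per-gene p-values would need correction for multiple testing across the thousands of genes assayed.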

  4. Dynamic SPR monitoring of yeast nuclear protein binding to a cis-regulatory element

    SciTech Connect

    Mao, Grace; Brody, James P.

    2007-11-09

    Gene expression is controlled by protein complexes binding to short specific sequences of DNA, called cis-regulatory elements. Expression of most eukaryotic genes is controlled by dozens of these elements. Comprehensive identification and monitoring of these elements is a major goal of genomics. In pursuit of this goal, we are developing a surface plasmon resonance (SPR) based assay to identify and monitor cis-regulatory elements. To test whether we could reliably monitor protein binding to a regulatory element, we immobilized a 16 bp region of Saccharomyces cerevisiae chromosome 5 onto a gold surface. This 16 bp region of DNA is known to bind several proteins and thought to control expression of the gene RNR1, which varies through the cell cycle. We synchronized yeast cell cultures, and then sampled these cultures at a regular interval. These samples were processed to purify nuclear lysate, which was then exposed to the sensor. We found that nuclear protein binds this particular element of DNA at a significantly higher rate (as compared to unsynchronized cells) during G1 phase. Other time points show levels of DNA-nuclear protein binding similar to the unsynchronized control. We also measured the apparent association complex of the binding to be 0.014 s⁻¹. We conclude that (1) SPR-based assays can monitor DNA-nuclear protein binding and that (2) for this particular cis-regulatory element, maximum DNA-nuclear protein binding occurs during G1 phase.

  5. The identification of cis-regulatory elements: A review from a machine learning perspective.

    PubMed

    Li, Yifeng; Chen, Chih-Yu; Kaye, Alice M; Wasserman, Wyeth W

    2015-12-01

    The majority of the human genome consists of non-coding regions that have been called junk DNA. However, recent studies have revealed that these regions contain cis-regulatory elements, such as promoters, enhancers, silencers, and insulators. These regulatory elements can play crucial roles in controlling gene expression in specific cell types, conditions, and developmental stages. Disruption to these regions could contribute to phenotype changes. Precisely identifying regulatory elements is key to deciphering the mechanisms underlying transcriptional regulation. Cis-regulatory events are complex processes that involve chromatin accessibility, transcription factor binding, DNA methylation, histone modifications, and the interactions between them. The development of next-generation sequencing techniques has allowed us to capture these genomic features in depth. Applied analysis of genome sequences for clinical genetics has increased the urgency for detecting these regions. However, the complexity of cis-regulatory events and the deluge of sequencing data require accurate and efficient computational approaches, in particular, machine learning techniques. In this review, we describe machine learning approaches for predicting transcription factor binding sites, enhancers, and promoters, primarily driven by next-generation sequencing data. Data sources are provided in order to facilitate testing of novel methods. The purpose of this review is to attract computational experts and data scientists to advance this field. PMID:26499213
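
    For readers new to binding-site prediction, a classical baseline that learned models are commonly compared against is the position weight matrix (PWM). The sketch below is not taken from the review: the example sites, the uniform background of 0.25 per base, the pseudocount, and the function names `build_pwm` and `best_window` are all assumptions chosen for illustration.

```python
from math import log2


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pwm.append({
            base: log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def best_window(sequence, pwm):
    """Return (score, start) of the highest-scoring window in the sequence."""
    k = len(pwm)
    scored = [
        (sum(pwm[i][sequence[start + i]] for i in range(k)), start)
        for start in range(len(sequence) - k + 1)
    ]
    return max(scored)


# Hypothetical aligned binding sites and a sequence to scan.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = build_pwm(sites)
score, start = best_window("GGCTGACTCATTG", pwm)
```

    The best window starts at index 3, where the sequence matches the consensus of the example sites. A PWM treats positions independently, which is one of the limitations that motivates the richer models the review surveys.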

  6. Predominant contribution of cis-regulatory divergence in the evolution of mouse alternative splicing

    PubMed Central

    Gao, Qingsong; Sun, Wei; Ballegeer, Marlies; Libert, Claude; Chen, Wei

    2015-01-01

    Divergence of alternative splicing represents one of the major driving forces to shape phenotypic diversity during evolution. However, the extent to which these divergences could be explained by the evolving cis-regulatory versus trans-acting factors remains unresolved. To globally investigate the relative contributions of the two factors for the first time in mammals, we measured splicing differences between C57BL/6J and SPRET/EiJ mouse strains and allele-specific splicing patterns in their F1 hybrid. Out of 11,818 alternative splicing events expressed in the cultured fibroblast cells, we identified 796 with significant differences between the parental strains. After integrating allele-specific data from the F1 hybrid, we demonstrated that these events could be predominantly attributed to cis-regulatory variants, including those residing at and beyond canonical splicing sites. Contrary to previous observations in Drosophila, such predominant contribution was consistently observed across different types of alternative splicing. Further analysis of liver tissues from the same mouse strains and reanalysis of published datasets on other strains showed similar trends, implying in general the predominant contribution of cis-regulatory changes in the evolution of mouse alternative splicing. PMID:26134616
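
    The logic of separating cis from trans contributions with an F1 hybrid can be sketched as follows. In the hybrid both alleles share the same trans-acting environment, so any difference between them is attributed to cis; whatever part of the parental difference is left over is attributed to trans. This is a common framing rather than the exact statistical procedure of the study, and the function name `cis_trans_split` and the percent-spliced-in (PSI) values are hypothetical.

```python
def cis_trans_split(psi_parent_a, psi_parent_b, psi_f1_allele_a, psi_f1_allele_b):
    """Split a parental splicing difference into cis and trans components.

    All inputs are inclusion levels (PSI) between 0 and 1 for one splicing event.
    """
    parental_diff = psi_parent_a - psi_parent_b
    # Alleles in the F1 share one cellular environment, so their difference is cis.
    cis = psi_f1_allele_a - psi_f1_allele_b
    # The remainder of the parental difference is attributed to trans factors.
    trans = parental_diff - cis
    return cis, trans


# Hypothetical event: parents differ by 0.40, F1 alleles by 0.35.
cis, trans = cis_trans_split(0.80, 0.40, 0.75, 0.40)
```

    In this made-up case most of the parental difference (0.35 of 0.40) persists between the alleles in the hybrid, which is the pattern the abstract describes as predominantly cis-regulatory.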

  7. Role of conserved cis-regulatory elements in the post-transcriptional regulation of the human MECP2 gene involved in autism

    PubMed Central

    2013-01-01

    Background The MECP2 gene codes for methyl CpG binding protein 2 which regulates activities of other genes in the early development of the brain. Mutations in this gene have been associated with Rett syndrome, a form of autism. The purpose of this study was to investigate the role of evolutionarily conserved cis-elements in regulating the post-transcriptional expression of the MECP2 gene and to explore their possible correlations with a mutation that is known to cause mental retardation. Results A bioinformatics approach was used to map evolutionarily conserved cis-regulatory elements in the transcribed regions of the human MECP2 gene and its mammalian orthologs. Cis-regulatory motifs including G-quadruplexes, microRNA target sites, and AU-rich elements have gained significant importance because of their role in key biological processes and as therapeutic targets. We discovered in the 5′-UTR (untranslated region) of MECP2 mRNA a highly conserved G-quadruplex which overlapped a known deletion in Rett syndrome patients with decreased levels of MeCP2 protein. We believe that this 5′-UTR G-quadruplex could be involved in regulating MECP2 translation. We mapped additional evolutionarily conserved G-quadruplexes, microRNA target sites, and AU-rich elements in the key sections of both untranslated regions. Our studies suggest the regulation of translation, mRNA turnover, and development-related alternative MECP2 polyadenylation, putatively involving interactions of conserved cis-regulatory elements with their respective trans factors and complex interactions among the trans factors themselves. We discovered highly conserved G-quadruplex motifs that were more prevalent near alternative splice sites as compared to the constitutive sites of the MECP2 gene. We also identified a pair of overlapping G-quadruplexes at an alternative 5′ splice site that could potentially regulate alternative splicing in a negative as well as a positive way in the MECP2 pre

  8. Variation in vertebrate cis-regulatory elements in evolution and disease.

    PubMed

    Douglas, Adam Thomas; Hill, Robert E

    2014-01-01

    Much of the genetic information that drives animal diversity lies within the vast non-coding regions of the genome. Multi-species sequence conservation in non-coding regions of the genome flags important regulatory elements, and more recently, techniques that look for functional signatures predicted for regulatory sequences have added to the identification of thousands more. For some time, biologists have argued that changes in cis-regulatory sequences create the basic genetic framework for evolutionary change. Recent advances support this notion and show that there is extensive genomic variability in non-coding regulatory elements associated with trait variation, speciation and disease. PMID:25764334

  9. Variation in Vertebrate Cis-Regulatory Elements in Evolution and Disease.

    PubMed

    Douglas, Adam T; Hill, Robert E

    2014-05-01

    Much of the genetic information that drives animal diversity lies within the vast non-coding regions of the genome. Multi-species sequence conservation in non-coding regions of the genome flags important regulatory elements, and more recently, techniques that look for functional signatures predicted for regulatory sequences have added to the identification of thousands more. For some time, biologists have argued that changes in cis-regulatory sequences create the basic genetic framework for evolutionary change. Recent advances support this notion and show that there is extensive genomic variability in non-coding regulatory elements associated with trait variation, speciation and disease. PMID:24802895

  10. Toward a Genome-Wide Reconstruction of Cis-Regulatory Networks in the Human Genome

    PubMed Central

    Cecchini, Katharine R.; Banerjee, A. Raja; Kim, Tae Hoon

    2009-01-01

    The vast amount of recent progress made on the sequence of the human genome has allowed an unprecedented examination of cis-regulatory networks. These networks consist of functional elements such as promoters, enhancers, silencers, and insulators, and their coordinated activity is responsible for regulation of gene expression. Recent studies surveyed the entire genome, identifying novel elements and evaluating functional differences with respect to development. These investigations present the first steps towards a global regulatory map for expression in the human genome. PMID:19560550

  11. Rapid evolution of cis-regulatory sequences via local point mutations

    NASA Technical Reports Server (NTRS)

    Stone, J. R.; Wray, G. A.

    2001-01-01

    Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
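
    One ingredient of this kind of argument is how many positions in an existing sequence are already a single point mutation away from matching a binding site. The sketch below only counts such near-matches; it is not the authors' population simulation, and the motif, the sequence and the function names `hamming` and `near_sites` are hypothetical choices for illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def near_sites(sequence, motif, max_mismatches=1):
    """List (start, window, mismatches) for windows within max_mismatches of the motif."""
    k = len(motif)
    hits = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        distance = hamming(window, motif)
        if distance <= max_mismatches:
            hits.append((start, window, distance))
    return hits


# Hypothetical 6-bp motif and an 18-bp stretch of regulatory sequence.
hits = near_sites("TTCACGTGAACACCTGTT", "CACGTG")
```

    Here the sequence holds one exact match (start 2) and one window (start 10, `CACCTG`) that a single C-to-G substitution would turn into a new site. The more such near-matches a regulatory region contains, the more ways a single mutation can create a site.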

  12. Variation in Vertebrate Cis-Regulatory Elements in Evolution and Disease

    PubMed Central

    Douglas, Adam Thomas; Hill, Robert E

    2014-01-01

    Much of the genetic information that drives animal diversity lies within the vast non-coding regions of the genome. Multi-species sequence conservation in non-coding regions of the genome flags important regulatory elements, and more recently, techniques that look for functional signatures predicted for regulatory sequences have added to the identification of thousands more. For some time, biologists have argued that changes in cis-regulatory sequences create the basic genetic framework for evolutionary change. Recent advances support this notion and show that there is extensive genomic variability in non-coding regulatory elements associated with trait variation, speciation and disease. PMID:25764334

  13. Multiple Dileucine-like Motifs Direct VGLUT1 Trafficking

    PubMed Central

    Foss, Sarah M.; Li, Haiyan; Santos, Magda S.; Edwards, Robert H.

    2013-01-01

    The vesicular glutamate transporters (VGLUTs) package glutamate into synaptic vesicles, and the two principal isoforms VGLUT1 and VGLUT2 have been suggested to influence the properties of release. To understand how a VGLUT isoform might influence transmitter release, we have studied their trafficking and previously identified a dileucine-like endocytic motif in the C terminus of VGLUT1. Disruption of this motif impairs the activity-dependent recycling of VGLUT1, but does not eliminate its endocytosis. We now report the identification of two additional dileucine-like motifs in the N terminus of VGLUT1 that are not well conserved in the other isoforms. In the absence of all three motifs, rat VGLUT1 shows limited accumulation at synaptic sites and no longer responds to stimulation. In addition, shRNA-mediated knockdown of clathrin adaptor proteins AP-1 and AP-2 shows that the C-terminal motif acts largely via AP-2, whereas the N-terminal motifs use AP-1. Without the C-terminal motif, knockdown of AP-1 reduces the proportion of VGLUT1 that responds to stimulation. VGLUT1 thus contains multiple sorting signals that engage distinct trafficking mechanisms. In contrast to VGLUT1, the trafficking of VGLUT2 depends almost entirely on the conserved C-terminal dileucine-like motif: without this motif, a substantial fraction of VGLUT2 redistributes to the plasma membrane and the transporter's synaptic localization is disrupted. Consistent with these differences in trafficking signals, wild-type VGLUT1 and VGLUT2 differ in their response to stimulation. PMID:23804088

  14. Conservation and Evolution of Cis-Regulatory Systems in Ascomycete Fungi

    PubMed Central

    2004-01-01

    Relatively little is known about the mechanisms through which gene expression regulation evolves. To investigate this, we systematically explored the conservation of regulatory networks in fungi by examining the cis-regulatory elements that govern the expression of coregulated genes. We first identified groups of coregulated Saccharomyces cerevisiae genes enriched for genes with known upstream or downstream cis-regulatory sequences. Reasoning that many of these gene groups are coregulated in related species as well, we performed similar analyses on orthologs of coregulated S. cerevisiae genes in 13 other ascomycete species. We find that many species-specific gene groups are enriched for the same flanking regulatory sequences as those found in the orthologous gene groups from S. cerevisiae, indicating that those regulatory systems have been conserved in multiple ascomycete species. In addition to these clear cases of regulatory conservation, we find examples of cis-element evolution that suggest multiple modes of regulatory diversification, including alterations in transcription factor-binding specificity, incorporation of new gene targets into an existing regulatory system, and cooption of regulatory systems to control a different set of genes. We investigated one example in greater detail by measuring the in vitro activity of the S. cerevisiae transcription factor Rpn4p and its orthologs from Candida albicans and Neurospora crassa. Our results suggest that the DNA binding specificity of these proteins has coevolved with the sequences found upstream of the Rpn4p target genes and suggest that Rpn4p has a different function in N. crassa. PMID:15534694

  15. Conservation and evolution of cis-regulatory systems in ascomycete fungi

    SciTech Connect

    Gasch, Audrey P.; Moses, Alan M.; Chiang, Derek Y.; Fraser, Hunter B.; Berardini, Mark; Eisen, Michael B.

    2004-03-15

    Relatively little is known about the mechanisms through which gene expression regulation evolves. To investigate this, we systematically explored the conservation of regulatory networks in fungi by examining the cis-regulatory elements that govern the expression of coregulated genes. We first identified groups of coregulated Saccharomyces cerevisiae genes enriched for genes with known upstream or downstream cis-regulatory sequences. Reasoning that many of these gene groups are coregulated in related species as well, we performed similar analyses on orthologs of coregulated S. cerevisiae genes in 13 other ascomycete species. We find that many species-specific gene groups are enriched for the same flanking regulatory sequences as those found in the orthologous gene groups from S. cerevisiae, indicating that those regulatory systems have been conserved in multiple ascomycete species. In addition to these clear cases of regulatory conservation, we find examples of cis-element evolution that suggest multiple modes of regulatory diversification, including alterations in transcription factor-binding specificity, incorporation of new gene targets into an existing regulatory system, and cooption of regulatory systems to control a different set of genes. We investigated one example in greater detail by measuring the in vitro activity of the S. cerevisiae transcription factor Rpn4p and its orthologs from Candida albicans and Neurospora crassa. Our results suggest that the DNA binding specificity of these proteins has coevolved with the sequences found upstream of the Rpn4p target genes and suggest that Rpn4p has a different function in N. crassa.

  16. The structure and evolution of cis-regulatory regions: the shavenbaby story

    PubMed Central

    Stern, David L.; Frankel, Nicolás

    2013-01-01

    In this paper, we provide a historical account of the contribution of a single line of research to our current understanding of the structure of cis-regulatory regions and the genetic basis for morphological evolution. We revisit the experiments that shed light on the evolution of larval cuticular patterns within the genus Drosophila and the evolution and structure of the shavenbaby gene. We describe the experiments that led to the discovery that multiple genetic changes in the cis-regulatory region of shavenbaby caused the loss of dorsal cuticular hairs (quaternary trichomes) in first instar larvae of Drosophila sechellia. We also discuss the experiments that showed that the convergent loss of quaternary trichomes in D. sechellia and Drosophila ezoana was generated by parallel genetic changes in orthologous enhancers of shavenbaby. We discuss the observation that multiple shavenbaby enhancers drive overlapping patterns of expression in the embryo and that these apparently redundant enhancers ensure robust shavenbaby expression and trichome morphogenesis under stressful conditions. All together, these data, collected over 13 years, provide a fundamental case study in the fields of gene regulation and morphological evolution, and highlight the importance of prolonged, detailed studies of single genes. PMID:24218640

  17. Exonic remnants of whole-genome duplication reveal cis-regulatory function of coding exons

    PubMed Central

    Dong, Xianjun; Navratilova, Pavla; Fredman, David; Drivenes, Øyvind; Becker, Thomas S.; Lenhard, Boris

    2010-01-01

    Using a comparative genomics approach to reconstruct the fate of genomic regulatory blocks (GRBs) and identify exonic remnants that have survived the disappearance of their host genes after whole-genome duplication (WGD) in teleosts, we discover a set of 38 candidate cis-regulatory coding exons (RCEs) with predicted target genes. These elements demonstrate evolutionary separation of overlapping protein-coding and regulatory information after WGD in teleosts. We present evidence that the corresponding mammalian exons are still under both coding and non-coding selection pressure, are more conserved than other protein coding exons in the host gene and several control sets, and share key characteristics with highly conserved non-coding elements in the same regions. Their dual function is corroborated by existing experimental data. Additionally, we show examples of human exon remnants stemming from the vertebrate 2R WGD. Our findings suggest that long-range cis-regulatory inputs for developmental genes are not limited to non-coding regions, but can also overlap the coding sequence of unrelated genes. Thus, exonic regulatory elements in GRBs might be functionally equivalent to those in non-coding regions, calling for a re-evaluation of the sequence space in which to look for long-range regulatory elements and experimentally test their activity. PMID:19969543

  18. Engineering Synthetic cis-Regulatory Elements for Simultaneous Recognition of Three Transcriptional Factors in Bacteria.

    PubMed

    Amores, Gerardo Ruiz; Guazzaroni, María-Eugenia; Silva-Rocha, Rafael

    2015-12-18

    Recognition of cis-regulatory elements by transcription factors (TF) at target promoters is crucial to gene regulation in bacteria. In this process, binding of TFs to their cognate sequences depends on a set of physical interactions between these proteins and specific nucleotides in the operator region. Previously, we showed that in silico optimization algorithms are able to generate short sequences that are recognized by two different TFs of Escherichia coli, namely, CRP and IHF, thus generating an AND logic gate. Here, we expanded this approach in order to engineer DNA sequences that can be simultaneously recognized by three unrelated TFs (CRP, IHF, and Fis). Using in silico optimization and experimental validation strategies, we were able to obtain a candidate promoter (Plac-CFI1) regulated by only two TFs with an AND logic, thus demonstrating a limitation in the design. Subsequently, we modified the algorithm to allow the optimization of extended sequences, and were able to design two synthetic promoters (PCFI20-1 and PCFI22-5) that were functional in vivo. Expression assays in E. coli mutant strains for each TF revealed that while CRP positively regulates the promoter activities, IHF and Fis are strong repressors of both the promoter variants. Taken together, our results demonstrate the potential of in silico strategies in bacterial synthetic promoter engineering. Furthermore, the study also shows how small modifications in cis-regulatory elements can drastically affect the final logic of the resulting promoter. PMID:26305598

  19. Recent mating-system evolution in Eichhornia is accompanied by cis-regulatory divergence.

    PubMed

    Arunkumar, Ramesh; Maddison, Teresa I; Barrett, Spencer C H; Wright, Stephen I

    2016-07-01

    The evolution of predominant self-fertilization from cross-fertilization in plants is accompanied by diverse changes to morphology, ecology and genetics, some of which likely result from regulatory changes in gene expression. We examined changes in gene expression during early stages in the transition to selfing in populations of animal-pollinated Eichhornia paniculata with contrasting mating patterns. We crossed plants from outcrossing and selfing populations and tested for the presence of allele-specific expression (ASE) in floral buds and leaf tissue of F1 offspring, indicative of cis-regulatory changes. We identified 1365 genes exhibiting ASE in floral buds and leaf tissue. These genes preferentially expressed alleles from outcrossing parents. Moreover, we found evidence that genes exhibiting ASE had a greater nonsynonymous diversity compared to synonymous diversity in the selfing parents. Our results suggest that the transition from outcrossing to high rates of self-fertilization may have the potential to shape the cis-regulatory genomic landscape of angiosperm species, but that the changes in ASE may be moderate, particularly during the early stages of this transition. PMID:26990568
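
    The abstract does not state which statistical test was used to call allele-specific expression. As a minimal sketch only, one common approach is an exact two-sided binomial test of the reads carrying one parental allele against an expected 50:50 ratio in the F1 hybrid. The function name and read counts below are hypothetical.

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one (hypothetical helper, not the authors' method).
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance guards against floating-point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical gene: 10 reads in the F1, all from the outcrossing parent's allele.
p_value = binom_two_sided_p(k=10, n=10)
```

    In practice such p-values would also need correction for multiple testing across genes, which this sketch omits.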

  20. The evolution of cichlid fish egg-spots is linked with a cis-regulatory change

    PubMed Central

    Santos, M. Emília; Braasch, Ingo; Boileau, Nicolas; Meyer, Britta S.; Sauteur, Loïc; Böhne, Astrid; Belting, Heinz-Georg; Affolter, Markus; Salzburger, Walter

    2014-01-01

    The origin of novel phenotypic characters is a key component in organismal diversification; yet, the mechanisms underlying the emergence of such evolutionary novelties are largely unknown. Here we examine the origin of egg-spots, an evolutionary innovation of the most species-rich group of cichlids, the haplochromines, where these conspicuous male fin colour markings are involved in mating. Applying a combination of RNAseq, comparative genomics and functional experiments, we identify two novel pigmentation genes, fhl2a and fhl2b, and show that especially the more rapidly evolving b-paralog is associated with egg-spot formation. We further find that egg-spot bearing haplochromines, but not other cichlids, feature a transposable element in the cis-regulatory region of fhl2b. Using transgenic zebrafish, we finally demonstrate that this region shows specific enhancer activities in iridophores, a type of pigment cells found in egg-spots, suggesting that a cis-regulatory change is causally linked to the gain of expression in egg-spot bearing haplochromines. PMID:25296686

  1. Transcription of Mammalian cis-Regulatory Elements Is Restrained by Actively Enforced Early Termination.

    PubMed

    Austenaa, Liv M I; Barozzi, Iros; Simonatto, Marta; Masella, Silvia; Della Chiara, Giulia; Ghisletti, Serena; Curina, Alessia; de Wit, Elzo; Bouwman, Britta A M; de Pretis, Stefano; Piccolo, Viviana; Termanini, Alberto; Prosperini, Elena; Pelizzola, Mattia; de Laat, Wouter; Natoli, Gioacchino

    2015-11-01

    Upon recruitment to active enhancers and promoters, RNA polymerase II (Pol II) generates short non-coding transcripts of unclear function. The mechanisms that control the length and the amount of ncRNAs generated by cis-regulatory elements are largely unknown. Here, we show that the adaptor protein WDR82 and its associated complexes actively limit such non-coding transcription. WDR82 targets the SET1 H3K4 methyltransferases and the nuclear protein phosphatase 1 (PP1) complexes to the initiating Pol II. WDR82 and PP1 also interact with components of the transcriptional termination and RNA processing machineries. Depletion of WDR82, SET1, or the PP1 subunit required for its nuclear import caused distinct but overlapping transcription termination defects at highly expressed genes and active enhancers and promoters, thus enabling the increased synthesis of unusually long ncRNAs. These data indicate that transcription initiated from cis-regulatory elements is tightly coordinated with termination mechanisms that impose the synthesis of short RNAs. PMID:26593720

  2. Distance and Helical Phase Dependence of Synergistic Transcription Activation in cis-Regulatory Module

    PubMed Central

    Huang, Qilai; Gong, Chenguang; Li, Jiahuang; Zhuo, Zhu; Chen, Yuan; Wang, Jin; Hua, Zi-Chun

    2012-01-01

    The spatial and stereospecific constraints on synergistic transcription activation mediated between activators bound to cis-regulatory elements are important for understanding gene regulation but remain largely unknown. It has been commonly believed that two activators will activate transcription most effectively when they are bound on the same face of the DNA double helix and within a boundary distance from the transcription initiation complex attached to the TATA box. In this work, we studied the spatial and stereospecific constraints on activation by multiple copies of bound model activators using a series of engineered relative distances and stereospecific orientations. We observed that multiple copies of the activators GAL4-VP16 and ZEBRA bound to engineered promoters activated transcription more effectively when bound on opposite faces of the DNA double helix. This phenomenon was not affected by the spatial relationship between the proximal activator and initiation complex. To explain these results, we proposed the novel concentration field model, which posits that the effective concentration of bound activators, and therefore the transcription activation potential, is affected by their stereospecific positioning. These results could be used to understand synergistic transcription activation anew and to aid the development of predictive models for the identification of cis-regulatory elements. PMID:22299056
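
    To make "same face" versus "opposite face" concrete, the sketch below converts the spacing between two binding sites (in base pairs) into an angular offset around the helix. It assumes an idealized B-DNA repeat of about 10.5 bp per turn; that value, the function names, and the 45-degree tolerance are illustrative assumptions, not parameters from the study.

```python
BP_PER_TURN = 10.5  # assumed idealized B-DNA helical repeat

def helical_phase_deg(spacing_bp, bp_per_turn=BP_PER_TURN):
    """Angular offset (0 to 360 degrees) between two sites spacing_bp apart."""
    return (spacing_bp / bp_per_turn * 360.0) % 360.0

def face(spacing_bp, tolerance_deg=45.0):
    """Classify two sites as on the same face, opposite faces, or in between."""
    phase = helical_phase_deg(spacing_bp)
    offset = min(phase, 360.0 - phase)  # angular distance from "same face", 0 to 180
    if offset <= tolerance_deg:
        return "same"
    if offset >= 180.0 - tolerance_deg:
        return "opposite"
    return "intermediate"

# 21 bp is two full turns (same face); 16 bp is roughly 1.5 turns (opposite faces).
labels = {bp: face(bp) for bp in (13, 16, 21)}
```

    Real DNA deviates from the idealized repeat depending on sequence and protein binding, so such a calculation is only a first approximation.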

  3. Functionally conserved cis-regulatory elements of COL18A1 identified through zebrafish transgenesis.

    PubMed

    Kague, Erika; Bessling, Seneca L; Lee, Josephine; Hu, Gui; Passos-Bueno, Maria Rita; Fisher, Shannon

    2010-01-15

    Type XVIII collagen is a component of basement membranes, and expressed prominently in the eye, blood vessels, liver, and the central nervous system. Homozygous mutations in COL18A1 lead to Knobloch Syndrome, characterized by ocular defects and occipital encephalocele. However, relatively little has been described on the role of type XVIII collagen in development, and nothing is known about the regulation of its tissue-specific expression pattern. We have used zebrafish transgenesis to identify and characterize cis-regulatory sequences controlling expression of the human gene. Candidate enhancers were selected from non-coding sequence associated with COL18A1 based on sequence conservation among mammals. Although these displayed no overt conservation with orthologous zebrafish sequences, four regions nonetheless acted as tissue-specific transcriptional enhancers in the zebrafish embryo, and together recapitulated the major aspects of col18a1 expression. Additional post-hoc computational analysis on positive enhancer sequences revealed alignments between mammalian and teleost sequences, which we hypothesize predict the corresponding zebrafish enhancers; for one of these, we demonstrate functional overlap with the orthologous human enhancer sequence. Our results provide important insight into the biological function and regulation of COL18A1, and point to additional sequences that may contribute to complex diseases involving COL18A1. More generally, we show that combining functional data with targeted analyses for phylogenetic conservation can reveal conserved cis-regulatory elements in the large number of cases where computational alignment alone falls short. PMID:19895802

  4. Profiling of conserved non-coding elements upstream of SHOX and functional characterisation of the SHOX cis-regulatory landscape

    PubMed Central

    Verdin, Hannah; Fernández-Miñán, Ana; Benito-Sanz, Sara; Janssens, Sandra; Callewaert, Bert; Waele, Kathleen De; Schepper, Jean De; François, Inge; Menten, Björn; Heath, Karen E.; Gómez-Skarmeta, José Luis; Baere, Elfride De

    2015-01-01

    Genetic defects such as copy number variations (CNVs) in non-coding regions containing conserved non-coding elements (CNEs) outside the transcription unit of their target gene can underlie genetic disease. An example of this is the short stature homeobox (SHOX) gene, regulated by seven CNEs located downstream and upstream of SHOX, with proven enhancer capacity in chicken limbs. CNVs of the downstream CNEs have been reported in many idiopathic short stature (ISS) cases; however, only recently have a few CNVs of the upstream enhancers been identified. Here, we set out to provide insight into: (i) the cis-regulatory role of these upstream CNEs in human cells, (ii) the prevalence of upstream CNVs in ISS, and (iii) the chromatin architecture of the SHOX cis-regulatory landscape in chicken and human cells. Firstly, luciferase assays in human U2OS cells, and 4C-seq both in chicken limb buds and human U2OS cells, demonstrated cis-regulatory enhancer capacities of the upstream CNEs. Secondly, CNVs of these upstream CNEs were found in three of 501 ISS patients. Finally, our 4C-seq interaction map of the SHOX region reveals a cis-regulatory domain spanning more than 1 Mb and harbouring putative new cis-regulatory elements. PMID:26631348

  5. The cis-regulatory system of the tbrain gene: alternative use of multiple modules to promote skeletogenic expression in the sea urchin embryo

    PubMed Central

    Wahl, Mary E.; Hahn, Julie; Gora, Kasia; Davidson, Eric H.; Oliveri, Paola

    2009-01-01

    The genomic cis-regulatory systems controlling regulatory gene expression usually include multiple modules. The regulatory output of such systems at any given time depends on which module is directing the function of the basal transcription apparatus, and ultimately on the transcription factor inputs into that module. Here we examine regulation of the S. purpuratus tbrain gene, a required activator of the skeletogenic specification state in the lineage descendant from the embryo micromeres. Alternate cis-regulatory modules were found to convey skeletogenic expression in reporter constructs. To determine their relative developmental functions in context, we made use of recombineered BAC constructs containing a GFP reporter, and of derivatives from which specific modules had been deleted. The outputs of the various constructs were observed spatially by GFP fluorescence and quantitatively over time by QPCR. In the context of the complete genomic locus, early skeletogenic expression is controlled by an intron enhancer plus a proximal region containing a HesC site as predicted from network analysis. From ingression onward, however, a dedicated distal module utilizing positive Ets1/2 inputs contributes to definitive expression in the skeletogenic mesenchyme. This module also mediates a newly-discovered negative Erg input which excludes non-skeletogenic mesodermal expression. PMID:19679118

  6. Establishment of a Developmental Compartment Requires Interactions between Three Synergistic Cis-regulatory Modules

    PubMed Central

    Bieli, Dimitri; Kanca, Oguz; Requena, David; Hamaratoglu, Fisun; Gohl, Daryl; Schedl, Paul; Affolter, Markus; Slattery, Matthew; Müller, Martin; Estella, Carlos

    2015-01-01

    The subdivision of cell populations in compartments is a key event during animal development. In Drosophila, the gene apterous (ap) divides the wing imaginal disc into dorsal vs ventral cell lineages and is required for wing formation. ap function as a dorsal selector gene has been extensively studied. However, the regulation of its expression during wing development is poorly understood. In this study, we analyzed ap transcriptional regulation at the endogenous locus and identified three cis-regulatory modules (CRMs) essential for wing development. Robust ap expression is obtained only when the three CRMs are combined. In addition, we genetically and molecularly analyzed the trans-factors that regulate these CRMs. Our results propose a three-step mechanism for the cell lineage compartment expression of ap that includes initial activation, positive autoregulation and Trithorax-mediated maintenance through separable CRMs. PMID:26468882

  7. [Identification and mapping of cis-regulatory elements within long genomic sequences].

    PubMed

    Akopov, S B; Chernov, I P; Vetchinova, A S; Bulanenkova, S S; Nikolaev, L G

    2007-01-01

    The publication of the human and other metazoan genome sequences opened up the possibility for mapping and analysis of genomic regulatory elements. Unfortunately, experimental data on genomic positions of such sequences as enhancers, silencers, insulators, transcription terminators, and replication origins are very limited, especially at the whole genome level. As most genomic regulatory elements (e.g., enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements in silico is often ambiguous. Therefore, the development of high-throughput experimental approaches for identification and mapping of genomic functional elements is highly desirable. In this review we discuss novel approaches to high-throughput experimental identification of cis-regulatory elements in mammalian genomes, which is a necessary step toward complete genome annotation. PMID:18240562

  8. Lessons from Domestication: Targeting Cis-Regulatory Elements for Crop Improvement.

    PubMed

    Swinnen, Gwen; Goossens, Alain; Pauwels, Laurens

    2016-06-01

    Domestication of wild plant species has provided us with crops that serve our human nutritional needs. Advanced DNA sequencing has propelled the unveiling of underlying genetic changes associated with domestication. Interestingly, many changes reside in cis-regulatory elements (CREs) that control the expression of an unmodified coding sequence. Sequence variation in CREs can impact gene expression levels, but also developmental timing and tissue specificity of expression. When genes are involved in multiple pathways or active in several organs and developmental stages, CRE modifications are favored over mutations in coding regions because they lack detrimental pleiotropic effects. Therefore, learning from domestication, we propose that CREs are interesting targets for genome editing to create new alleles for plant breeding. PMID:26876195

  9. Quantitative Analysis of Cis-Regulatory Element Activity Using Synthetic Promoters in Transgenic Plants.

    PubMed

    Benn, Geoffrey; Dehesh, Katayoon

    2016-01-01

    Synthetic promoters, introduced stably or transiently into plants, are an invaluable tool for the identification of functional regulatory elements and the corresponding transcription factor(s) that regulate the amplitude, spatial distribution, and temporal patterns of gene expression. Here, we present a protocol describing the steps required to identify and characterize putative cis-regulatory elements. These steps include application of computational tools to identify putative elements, construction of a synthetic promoter upstream of luciferase, identification of transcription factors that regulate the element, testing the functionality of the element introduced transiently and/or stably into the species of interest followed by high-throughput luciferase screening assays, and subsequent data processing and statistical analysis. PMID:27557758

  10. BET bromodomain inhibition releases the Mediator complex from select cis-regulatory elements

    PubMed Central

    Bhagwat, Anand S.; Roe, Jae-Seok; Mok, Beverly A.; Hohmann, Anja F.; Shi, Junwei; Vakoc, Christopher R.

    2016-01-01

    The bromodomain and extraterminal (BET) protein BRD4 can physically interact with the Mediator complex, but the relevance of this association to the therapeutic effects of BET inhibitors in cancer is unclear. Here, we show that BET inhibition causes a rapid release of Mediator from a subset of cis-regulatory elements in the genome of acute myeloid leukemia (AML) cells. These sites of Mediator eviction were highly correlated with transcriptional suppression of neighboring genes, which are enriched for targets of the transcription factor MYB and for functions related to leukemogenesis. An shRNA screen of Mediator in AML cells identified the MED12, MED13, MED23, and MED24 subunits as performing a similar regulatory function to BRD4 in this context, including a shared role in sustaining a block in myeloid maturation. These findings suggest that the interaction between BRD4 and Mediator has functional importance for gene-specific transcriptional activation and for AML maintenance. PMID:27068464

  11. Dissecting the Genetic Basis of a Complex cis-Regulatory Adaptation

    PubMed Central

    Artieri, Carlo G.; Zhang, Mian; Zhou, Yiqi; Palmer, Michael E.; Fraser, Hunter B.

    2015-01-01

    Although single genes underlying several evolutionary adaptations have been identified, the genetic basis of complex, polygenic adaptations has been far more challenging to pinpoint. Here we report that the budding yeast Saccharomyces paradoxus has recently evolved resistance to citrinin, a naturally occurring mycotoxin. Applying a genome-wide test for selection on cis-regulation, we identified five genes involved in the citrinin response that are constitutively up-regulated in S. paradoxus. Four of these genes are necessary for resistance, and are also sufficient to increase the resistance of a sensitive strain when over-expressed. Moreover, cis-regulatory divergence in the promoters of these genes contributes to resistance, while exacting a cost in the absence of citrinin. Our results demonstrate how the subtle effects of individual regulatory elements can be combined, via natural selection, into a complex adaptation. Our approach can be applied to dissect the genetic basis of polygenic adaptations in a wide range of species. PMID:26713447

  12. Evolutionarily Assembled cis-Regulatory Module at a Human Ciliopathy Locus

    PubMed Central

    Lee, Jeong Ho; Silhavy, Jennifer L.; Lee, Ji Eun; Al-Gazali, Lihadh; Thomas, Sophie; Davis, Erica E.; Bielas, Stephanie L.; Hill, Kiley J.; Iannicelli, Miriam; Brancati, Francesco; Gabriel, Stacey B.; Russ, Carsten; Logan, Clare V.; Sharif, Saghira Malik; Bennett, Christopher P.; Abe, Masumi; Hildebrandt, Friedhelm; Diplas, Bill H.; Attié-Bitach, Tania; Katsanis, Nicholas; Rajab, Anna; Koul, Roshan; Sztriha, Laszlo; Waters, Elizabeth R.; Ferro-Novick, Susan; Woods, C. Geoffrey; Johnson, Colin A.; Valente, Enza Maria; Zaki, Maha S.; Gleeson, Joseph G.

    2013-01-01

    Neighboring genes are often coordinately expressed within cis-regulatory modules, but evidence that nonparalogous genes share functions in mammals is lacking. Here, we report that mutation of either TMEM138 or TMEM216 causes a phenotypically indistinguishable human ciliopathy, Joubert syndrome. Despite a lack of sequence homology, the genes are aligned in a head-to-tail configuration and joined by chromosomal rearrangement at the amphibian-to-reptile evolutionary transition. Expression of the two genes is mediated by a conserved regulatory element in the noncoding intergenic region. Coordinated expression is important for their interdependent cellular role in vesicular transport to primary cilia. Hence, during vertebrate evolution of genes involved in ciliogenesis, nonparalogous genes were arranged to a functional gene cluster with shared regulatory elements. PMID:22282472

  13. Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord.

    PubMed

    José-Edwards, Diana S; Oda-Ishii, Izumi; Kugler, Jamie E; Passamaneck, Yale J; Katikala, Lavanya; Nibu, Yutaka; Di Gregorio, Anna

    2015-12-01

    A main challenge of modern biology is to understand how specific constellations of genes are activated to differentiate cells and give rise to distinct tissues. This study focuses on elucidating how gene expression is initiated in the notochord, an axial structure that provides support and patterning signals to embryos of humans and all other chordates. Although numerous notochord genes have been identified, the regulatory DNAs that orchestrate development and propel evolution of this structure by eliciting notochord gene expression remain mostly uncharted, and the information on their configuration and recurrence is still quite fragmentary. Here we used the simple chordate Ciona for a systematic analysis of notochord cis-regulatory modules (CRMs), and investigated their composition, architectural constraints, predictive ability and evolutionary conservation. We found that most Ciona notochord CRMs relied upon variable combinations of binding sites for the transcription factors Brachyury and/or Foxa2, which can act either synergistically or independently from one another. Notably, one of these CRMs contains a Brachyury binding site juxtaposed to an (AC) microsatellite, an unusual arrangement also found in Brachyury-bound regulatory regions in mouse. In contrast, different subsets of CRMs relied upon binding sites for transcription factors of widely diverse families. Surprisingly, we found that neither intra-genomic nor interspecific conservation of binding sites were reliably predictive hallmarks of notochord CRMs. We propose that rather than obeying a rigid sequence-based cis-regulatory code, most notochord CRMs are rather unique. Yet, this study uncovered essential elements recurrently used by divergent chordates as basic building blocks for notochord CRMs. PMID:26684323

  14. Massively parallel cis-regulatory analysis in the mammalian central nervous system

    PubMed Central

    Shen, Susan Q.; Myers, Connie A.; Hughes, Andrew E.O.; Byrne, Leah C.; Flannery, John G.; Corbo, Joseph C.

    2016-01-01

    Cis-regulatory elements (CREs, e.g., promoters and enhancers) regulate gene expression, and variants within CREs can modulate disease risk. Next-generation sequencing has enabled the rapid generation of genomic data that predict the locations of CREs, but a bottleneck lies in functionally interpreting these data. To address this issue, massively parallel reporter assays (MPRAs) have emerged, in which barcoded reporter libraries are introduced into cells, and the resulting barcoded transcripts are quantified by next-generation sequencing. Thus far, MPRAs have been largely restricted to assaying short CREs in a limited repertoire of cultured cell types. Here, we present two advances that extend the biological relevance and applicability of MPRAs. First, we adapt exome capture technology to instead capture candidate CREs, thereby tiling across the targeted regions and markedly increasing the length of CREs that can be readily assayed. Second, we package the library into adeno-associated virus (AAV), thereby allowing delivery to target organs in vivo. As a proof of concept, we introduce a capture library of about 46,000 constructs, corresponding to roughly 3500 DNase I hypersensitive (DHS) sites, into the mouse retina by ex vivo plasmid electroporation and into the mouse cerebral cortex by in vivo AAV injection. We demonstrate tissue-specific cis-regulatory activity of DHSs and provide examples of high-resolution truncation mutation analysis for multiplex parsing of CREs. Our approach should enable massively parallel functional analysis of a wide range of CREs in any organ or species that can be infected by AAV, such as nonhuman primates and human stem cell–derived organoids. PMID:26576614
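
    The abstract does not give the quantification details. As a hedged sketch, a common way to summarize MPRA output is to pool barcode counts per CRE and take the log2 ratio of transcript (RNA) counts to library (DNA) counts. The barcodes, counts, function name, and pseudocount below are hypothetical, and library-size normalization is omitted for brevity.

```python
import math
from collections import defaultdict

def cre_activity(barcode_to_cre, rna_counts, dna_counts, pseudocount=1.0):
    """Return log2(RNA/DNA) per CRE after pooling counts over its barcodes."""
    rna_total = defaultdict(float)
    dna_total = defaultdict(float)
    for barcode, cre in barcode_to_cre.items():
        rna_total[cre] += rna_counts.get(barcode, 0)
        dna_total[cre] += dna_counts.get(barcode, 0)
    # The pseudocount avoids division by zero for barcodes that drop out.
    return {
        cre: math.log2((rna_total[cre] + pseudocount) / (dna_total[cre] + pseudocount))
        for cre in dna_total
    }

# Hypothetical toy library: two barcodes tag cre1, one tags cre2.
barcode_to_cre = {"AAAC": "cre1", "AAAG": "cre1", "CCCT": "cre2"}
rna = {"AAAC": 30, "AAAG": 33, "CCCT": 7}
dna = {"AAAC": 15, "AAAG": 16, "CCCT": 15}
activity = cre_activity(barcode_to_cre, rna, dna)
```

    Positive values indicate that a CRE's barcodes are over-represented among transcripts relative to the input library; negative values indicate the opposite.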

  15. Identification of three new cis-regulatory IRF5 polymorphisms: in vitro studies

    PubMed Central

    2013-01-01

    Background Polymorphisms in the interferon regulatory factor 5 (IRF5) gene are associated with susceptibility to systemic lupus erythematosus, rheumatoid arthritis and other diseases through independent risk and protective haplotypes. Several functional polymorphisms are already known, but they do not account for the protective haplotypes that are tagged by the minor allele of rs729302. Methods Polymorphisms in linkage disequilibrium (LD) with rs729302 or particularly associated with IRF5 expression were selected for functional screening, which involved electrophoretic mobility shift assays (EMSAs) and reporter gene assays. Results A total of 54 single-nucleotide polymorphisms in the 5' region of IRF5 were genotyped. Twenty-four of them were selected for functional screening because of their high LD with rs729302 or protective haplotypes. In addition, two polymorphisms were selected for their prominent association with IRF5 expression. Seven of these twenty-six polymorphisms showed reproducible allele differences in EMSA. The seven were subsequently analyzed in gene reporter assays, and three of them showed significant differences between their two alleles: rs729302, rs13245639 and rs11269962. Haplotypes including the cis-regulatory polymorphisms correlated very well with IRF5 mRNA expression in an analysis based on previous data. Conclusion We have found that three polymorphisms in LD with the protective haplotypes of IRF5 have differential allele effects in EMSA and in reporter gene assays. Identification of these cis-regulatory polymorphisms will allow more accurate analysis of transcriptional regulation of IRF5 expression, more powerful genetic association studies and deeper insight into the role of IRF5 in disease susceptibility. PMID:23941291

  16. Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord

    PubMed Central

    José-Edwards, Diana S.; Oda-Ishii, Izumi; Kugler, Jamie E.; Passamaneck, Yale J.; Katikala, Lavanya; Nibu, Yutaka; Di Gregorio, Anna

    2015-01-01

    A main challenge of modern biology is to understand how specific constellations of genes are activated to differentiate cells and give rise to distinct tissues. This study focuses on elucidating how gene expression is initiated in the notochord, an axial structure that provides support and patterning signals to embryos of humans and all other chordates. Although numerous notochord genes have been identified, the regulatory DNAs that orchestrate development and propel evolution of this structure by eliciting notochord gene expression remain mostly uncharted, and the information on their configuration and recurrence is still quite fragmentary. Here we used the simple chordate Ciona for a systematic analysis of notochord cis-regulatory modules (CRMs), and investigated their composition, architectural constraints, predictive ability and evolutionary conservation. We found that most Ciona notochord CRMs relied upon variable combinations of binding sites for the transcription factors Brachyury and/or Foxa2, which can act either synergistically or independently from one another. Notably, one of these CRMs contains a Brachyury binding site juxtaposed to an (AC) microsatellite, an unusual arrangement also found in Brachyury-bound regulatory regions in mouse. In contrast, different subsets of CRMs relied upon binding sites for transcription factors of widely diverse families. Surprisingly, we found that neither intra-genomic nor interspecific conservation of binding sites were reliably predictive hallmarks of notochord CRMs. We propose that rather than obeying a rigid sequence-based cis-regulatory code, most notochord CRMs are rather unique. Yet, this study uncovered essential elements recurrently used by divergent chordates as basic building blocks for notochord CRMs. PMID:26684323

  17. Quantitative comparison of cis-regulatory element (CRE) activities in transgenic Drosophila melanogaster.

    PubMed

    Rogers, William A; Williams, Thomas M

    2011-01-01

    Gene expression patterns are specified by cis-regulatory element (CRE) sequences, which are also called enhancers or cis-regulatory modules. A typical CRE possesses an arrangement of binding sites for several transcription factor proteins that confer a regulatory logic specifying when, where, and at what level the regulated gene(s) is expressed. The full set of CREs within an animal genome encodes the organism's program for development, and empirical as well as theoretical studies indicate that mutations in CREs played a prominent role in morphological evolution. Moreover, human genome-wide association studies indicate that genetic variation in CREs contributes substantially to phenotypic variation. Thus, understanding regulatory logic and how mutations affect such logic is a central goal of genetics. Reporter transgenes provide a powerful method to study the in vivo function of CREs. Here a known or suspected CRE sequence is coupled to heterologous promoter and coding sequences for a reporter gene encoding an easily observable protein product. When a reporter transgene is inserted into a host organism, the CRE's activity becomes visible in the form of the encoded reporter protein. P-element mediated transgenesis in the fruit fly species Drosophila (D.) melanogaster has been used for decades to introduce reporter transgenes into this model organism, though the genomic placement of transgenes is random. Hence, reporter gene activity is strongly influenced by the local chromatin and gene environment, limiting CRE comparisons to being qualitative. In recent years, the phiC31 based integration system was adapted for use in D. melanogaster to insert transgenes into specific genome landing sites. This capability has made the quantitative measurement of gene and, relevant here, CRE activity feasible. The production of transgenic fruit flies can be outsourced, including phiC31-based integration, eliminating the need to purchase expensive equipment and/or have proficiency at

  18. Identification of a novel cis-regulatory element essential for immune tolerance.

    PubMed

    LaFlam, Taylor N; Seumois, Grégory; Miller, Corey N; Lwin, Wint; Fasano, Kayla J; Waterfield, Michael; Proekt, Irina; Vijayanand, Pandurangan; Anderson, Mark S

    2015-11-16

    Thymic central tolerance is essential to preventing autoimmunity. In medullary thymic epithelial cells (mTECs), the Autoimmune regulator (Aire) gene plays an essential role in this process by driving the expression of a diverse set of tissue-specific antigens (TSAs), which are presented and help tolerize self-reactive thymocytes. Interestingly, Aire has a highly tissue-restricted pattern of expression, with only mTECs and peripheral extrathymic Aire-expressing cells (eTACs) known to express detectable levels in adults. Despite this high level of tissue specificity, the cis-regulatory elements that control Aire expression have remained obscure. Here, we identify a highly conserved noncoding DNA element that is essential for Aire expression. This element shows enrichment of enhancer-associated histone marks in mTECs and also has characteristics of being an NF-κB-responsive element. Finally, we find that this element is essential for Aire expression in vivo and necessary to prevent spontaneous autoimmunity, reflecting the importance of this regulatory DNA element in promoting immune tolerance. PMID:26527800

  19. Distal cis-regulatory elements are required for tissue-specific expression of enamelin (Enam)

    PubMed Central

    Hu, Yuanyuan; Papagerakis, Petros; Ye, Ling; Feng, Jerry Q.; Simmer, James P.; Hu, Jan C-C.

    2009-01-01

    Enamel formation is orchestrated by the sequential expression of genes encoding enamel matrix proteins; however, the mechanisms sustaining the spatio–temporal order of gene transcription during amelogenesis are poorly understood. The aim of this study was to characterize the cis-regulatory sequences necessary for normal expression of enamelin (Enam). Several enamelin transcription regulatory regions, showing high sequence homology among species, were identified. DNA constructs containing 5.2 or 3.9 kb regions upstream of the enamelin translation initiation site were linked to a LacZ reporter and used to generate transgenic mice. Only the 5.2-Enam–LacZ construct was sufficient to recapitulate the endogenous pattern of enamelin tooth-specific expression. The 3.9-Enam–LacZ transgenic lines showed no expression in dental cells, but ectopic β-galactosidase activity was detected in osteoblasts. Potential transcription factor-binding sites were identified that may be important in controlling enamelin basal promoter activity and in conferring enamelin tissue-specific expression. Our study provides new insights into regulatory mechanisms governing enamelin expression. PMID:18353004

  20. Close sequence comparisons are sufficient to identify human cis-regulatory elements.

    PubMed

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M; Couronne, Olivier; Pennacchio, Len A

    2006-07-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons. To address this problem, we identified evolutionarily conserved noncoding regions in primate, mammalian, and more distant comparisons using a uniform approach (Gumby) that facilitates unbiased assessment of the impact of evolutionary distance on predictive power. We benchmarked computational predictions against previously identified cis-regulatory elements at diverse genomic loci and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using an in vivo enhancer assay in transgenic mice. Human regulatory elements were identified with acceptable sensitivity (53%-80%) and true-positive rate (27%-67%) by comparison with one to five other eutherian mammals or six other simian primates. More distant comparisons (marsupial, avian, amphibian, and fish) failed to identify many of the empirically defined functional noncoding elements. Our results highlight the practical utility of close sequence comparisons, and the loss of sensitivity entailed by more distant comparisons. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole-genome comparative analysis that explains most of the observations from empirical benchmarking. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for in vivo testing at embryonic time points. PMID:16769978

  1. Subfunctionalization of Duplicated Zebrafish pax6 Genes by cis-Regulatory Divergence

    PubMed Central

    Gautier, Philippe; Dahm, Ralf; Schonthaler, Helia B; Damante, Giuseppe; Seawright, Anne; Hever, Ann M; Yeyati, Patricia L; van Heyningen, Veronica; Coutinho, Pedro

    2008-01-01

    Gene duplication is a major driver of evolutionary divergence. In most vertebrates a single PAX6 gene encodes a transcription factor required for eye, brain, olfactory system, and pancreas development. In zebrafish, following a postulated whole-genome duplication event in an ancestral teleost, duplicates pax6a and pax6b jointly fulfill these roles. Mapping of the homozygously viable eye mutant sunrise identified a homeodomain missense change in pax6b, leading to loss of target binding. The mild phenotype emphasizes role-sharing between the co-orthologues. Meticulous mapping of isolated BACs identified perturbed synteny relationships around the duplicates. This highlights the functional conservation of pax6 downstream (3′) control sequences, which in most vertebrates reside within the introns of a ubiquitously expressed neighbour gene, ELP4, whose pax6a-linked exons have been lost in zebrafish. Reporter transgenic studies in both mouse and zebrafish, combined with analysis of vertebrate sequence conservation, reveal loss and retention of specific cis-regulatory elements, correlating strongly with the diverged expression of co-orthologues, and providing clear evidence for evolution by subfunctionalization. PMID:18282108

  2. Subfunctionalization of duplicated zebrafish pax6 genes by cis-regulatory divergence.

    PubMed

    Kleinjan, Dirk A; Bancewicz, Ruth M; Gautier, Philippe; Dahm, Ralf; Schonthaler, Helia B; Damante, Giuseppe; Seawright, Anne; Hever, Ann M; Yeyati, Patricia L; van Heyningen, Veronica; Coutinho, Pedro

    2008-02-01

    Gene duplication is a major driver of evolutionary divergence. In most vertebrates a single PAX6 gene encodes a transcription factor required for eye, brain, olfactory system, and pancreas development. In zebrafish, following a postulated whole-genome duplication event in an ancestral teleost, duplicates pax6a and pax6b jointly fulfill these roles. Mapping of the homozygously viable eye mutant sunrise identified a homeodomain missense change in pax6b, leading to loss of target binding. The mild phenotype emphasizes role-sharing between the co-orthologues. Meticulous mapping of isolated BACs identified perturbed synteny relationships around the duplicates. This highlights the functional conservation of pax6 downstream (3') control sequences, which in most vertebrates reside within the introns of a ubiquitously expressed neighbour gene, ELP4, whose pax6a-linked exons have been lost in zebrafish. Reporter transgenic studies in both mouse and zebrafish, combined with analysis of vertebrate sequence conservation, reveal loss and retention of specific cis-regulatory elements, correlating strongly with the diverged expression of co-orthologues, and providing clear evidence for evolution by subfunctionalization. PMID:18282108

  3. Genetic Analysis of Transvection Effects Involving Cis-Regulatory Elements of the Drosophila Ultrabithorax Gene

    PubMed Central

    Micol, J. L.; Castelli-Gair, J. E.; Garcia-Bellido, A.

    1990-01-01

    The Ultrabithorax (Ubx) gene of Drosophila melanogaster contains two functionally distinguishable regions: the protein-coding Ubx transcription unit and, upstream of it, the transcribed but non-protein-coding bxd region. Numerous recessive, partial loss-of-function mutations which appear to be regulatory mutations map within the bxd region and within the introns of the Ubx transcription unit. In addition, mutations within the Ubx unit exons are known and most of these behave as null alleles. Ubx(1) is one such allele. We have confirmed that, although the Ubx(1) allele does not produce detectable Ubx proteins (UBX), it does retain other genetic functions detectable by their effects on the expression of a paired, homologous Ubx allele, i.e., by transvection. We have extended previous analyses made by E. B. Lewis by mapping the critical elements of the Ubx gene which participate in transvection effects. Our results show that the Ubx(1) allele retains wild-type functions whose effectiveness can be reduced (1) by additional cis mutations in the bxd region or in introns of the Ubx transcription unit, as well as (2) by rearrangements disturbing pairing between homologous Ubx genes. Our results suggest that those remnant functions in Ubx(1) are able to modulate the activity of the allele located in the homologous chromosome. We discuss the normal cis regulatory role of these functions involved in trans interactions between homologous Ubx genes, as well as the implications of our results for the current models on transvection. PMID:2123161

  4. Extensive conservation of ancient microsynteny across metazoans due to cis-regulatory constraints

    PubMed Central

    Irimia, Manuel; Tena, Juan J.; Alexis, Maria S.; Fernandez-Miñan, Ana; Maeso, Ignacio; Bogdanović, Ozren; de la Calle-Mustienes, Elisa; Roy, Scott W.; Gómez-Skarmeta, José L.; Fraser, Hunter B.

    2012-01-01

    The order of genes in eukaryotic genomes has generally been assumed to be neutral, since gene order is largely scrambled over evolutionary time. Only a handful of exceptional examples are known, typically involving deeply conserved clusters of tandemly duplicated genes (e.g., Hox genes and histones). Here we report the first systematic survey of microsynteny conservation across metazoans, utilizing 17 genome sequences. We identified nearly 600 pairs of unrelated genes that have remained tightly physically linked in diverse lineages across over 600 million years of evolution. Integrating sequence conservation, gene expression data, gene function, epigenetic marks, and other genomic features, we provide extensive evidence that many conserved ancient linkages involve (1) the coordinated transcription of neighboring genes, or (2) genomic regulatory blocks (GRBs) in which transcriptional enhancers controlling developmental genes are contained within nearby bystander genes. In addition, we generated ChIP-seq data for key histone modifications in zebrafish embryos, which provided further evidence of putative GRBs in embryonic development. Finally, using chromosome conformation capture (3C) assays and stable transgenic experiments, we demonstrate that enhancers within bystander genes drive the expression of genes such as Otx and Islet, critical regulators of central nervous system development across bilaterians. These results suggest that ancient genomic functional associations are far more common than previously thought—involving ∼12% of the ancestral bilaterian genome—and that cis-regulatory constraints are crucial in determining metazoan genome architecture. PMID:22722344

  5. Motif-directed flexible backbone design of functional interactions

    PubMed Central

    Havranek, James J; Baker, David

    2009-01-01

    Computational protein design relies on a number of approximations to efficiently search the huge sequence space available to proteins. The fixed backbone and rotamer approximations in particular are important for formulating protein design as a discrete combinatorial optimization problem. However, the resulting coarse-grained sampling of possible side-chain terminal positions is problematic for the design of protein function, which depends on precise positioning of side-chain atoms. Although backbone flexibility can greatly increase the conformation freedom of side-chain functional groups, it is not obvious which backbone movements will generate the critical constellation of atoms responsible for protein function. Here, we report an automated method for identifying protein backbone movements that can give rise to any specified set of desired side-chain atomic placements and interactions, using protein–DNA interfaces as a model system. We use a library of previously observed protein–DNA interactions (motifs) and a rotamer-based description of side-chain conformation freedom to identify placements for the protein backbone that can give rise to a favorable side-chain interaction with DNA. We describe a tree-search algorithm for identifying those combinations of interactions from the library that can be realized with minimal perturbation of the protein backbone. We compare the efficiency of this method with the alternative approach of building and screening alternate backbone conformations. PMID:19472357

  6. Single nucleotide polymorphisms with cis-regulatory effects on long non-coding transcripts in human primary monocytes.

    PubMed

    Almlöf, Jonas Carlsson; Lundmark, Per; Lundmark, Anders; Ge, Bing; Pastinen, Tomi; Goodall, Alison H; Cambien, François; Deloukas, Panos; Ouwehand, Willem H; Syvänen, Ann-Christine

    2014-01-01

    We applied genome-wide allele-specific expression analysis of monocytes from 188 samples. Monocytes were purified from white blood cells of healthy blood donors to detect cis-acting genetic variation that regulates the expression of long non-coding RNAs. We analysed 8929 regions harboring genes for potential long non-coding RNA that were retrieved from data from the ENCODE project. Of these regions, 60% were annotated as intergenic, which implies that they do not overlap with protein-coding genes. Focusing on the intergenic regions, and using stringent analysis of the allele-specific expression data, we detected robust cis-regulatory SNPs in 258 out of 489 informative intergenic regions included in the analysis. The cis-regulatory SNPs that were significantly associated with allele-specific expression of long non-coding RNAs were enriched in enhancer regions marked for active or bivalent, poised chromatin by histone modifications. Out of the lncRNA regions regulated by cis-acting regulatory SNPs, 20% (n = 52) were co-regulated with the closest protein coding gene. We compared the identified cis-regulatory SNPs with those in the catalog of SNPs identified by genome-wide association studies of human diseases and traits. This comparison identified 32 SNPs in loci from genome-wide association studies that displayed a strong association signal with allele-specific expression of non-coding RNAs in monocytes, with p-values ranging from 6.7×10⁻⁷ to 9.5×10⁻⁸⁹. The identified cis-regulatory SNPs are associated with diseases of the immune system, like multiple sclerosis and rheumatoid arthritis. PMID:25025429

  7. Changes in cis-regulatory elements of a key floral regulator are associated with divergence of inflorescence architectures.

    PubMed

    Kusters, Elske; Della Pina, Serena; Castel, Rob; Souer, Erik; Koes, Ronald

    2015-08-15

    Higher plant species diverged extensively with regard to the moment (flowering time) and position (inflorescence architecture) at which flowers are formed. This seems largely caused by variation in the expression patterns of conserved genes that specify floral meristem identity (FMI), rather than changes in the encoded proteins. Here, we report a functional comparison of the promoters of homologous FMI genes from Arabidopsis, petunia, tomato and Antirrhinum. Analysis of promoter-reporter constructs in petunia and Arabidopsis, as well as complementation experiments, showed that the divergent expression of leafy (LFY) and the petunia homolog aberrant leaf and flower (ALF) results from alterations in the upstream regulatory network rather than cis-regulatory changes. The divergent expression of unusual floral organs (UFO) from Arabidopsis, and the petunia homolog double top (DOT), however, is caused by the loss or gain of cis-regulatory promoter elements, which respond to trans-acting factors that are expressed in similar patterns in both species. Introduction of pUFO:UFO causes no obvious defects in Arabidopsis, but in petunia it causes the precocious and ectopic formation of flowers. This provides an example of how a change in a cis-regulatory region can account for a change in the plant body plan. PMID:26220938

  8. Mapping Association between Long-Range Cis-Regulatory Regions and Their Target Genes Using Comparative Genomics

    NASA Astrophysics Data System (ADS)

    Mongin, Emmanuel; Dewar, Ken; Blanchette, Mathieu

    In chordates, long-range cis-regulatory regions are involved in the control of transcription initiation (either as repressors or enhancers). They can be located as far as 1 Mb from the transcription start site of the target gene and can regulate more than one gene. Therefore, proper characterization of functional interactions between long-range cis-regulatory regions and their target genes remains problematic. We present a novel method to predict such interactions based on the analysis of rearrangements between the human and 16 other vertebrate genomes. Our method is based on the assumption that genome rearrangements that would disrupt the functional interaction between a cis-regulatory region and its target gene are likely to be deleterious. Therefore, conservation of synteny through evolution would be an indication of a functional interaction. We use our algorithm to classify a set of 1,406,084 putative associations from the human genome. This genome-wide map of interactions has many potential applications, including the selection of candidate regions prior to in vivo experimental characterization, a better characterization of regulatory regions involved in position effect diseases, and an improved understanding of the mechanisms and importance of long-range regulation.

  9. Balanced polymorphism in bottlenecked populations: the case of the CCR5 5' cis-regulatory region in Amazonian Amerindians.

    PubMed

    Ramalho, Rodrigo F; Santos, Eduardo J M; Guerreiro, João F; Meyer, Diogo

    2010-09-01

    The 5' cis-regulatory region of the CCR5 gene exhibits a strong signature of balancing selection in several human populations. Here we analyze the polymorphism of this region in Amerindians from Amazonia, who have a complex demographic history, including recent bottlenecks that are known to reduce genetic variability. Amerindians show high nucleotide diversity (pi = 0.27%) and significantly positive Tajima's D, and carry haplotypes associated with weak and strong gene expression. To evaluate whether these signatures of balancing selection could be explained by demography, we perform neutrality tests based on empiric and simulated data. The observed Tajima's D was higher than that of other world populations; higher than that found for 18 noncoding regions of South Amerindians, and higher than 99.6% of simulated genealogies, which assume nonequilibrium conditions. Moreover, comparing Amerindians and Asians, the Fst for CCR5 cis-regulatory region was unusually low, in relation to neutral markers. These findings indicate that, despite their complex demographic history, South Amerindians carry a detectable signature of selection on the CCR5 cis-regulatory region. PMID:20538030

  10. Conserved Cis-Regulatory Modules Control Robustness in Msx1 Expression at Single-Cell Resolution

    PubMed Central

    Vance, Keith W.; Woodcock, Dan J.; Reid, John E.; Bretschneider, Till; Ott, Sascha; Koentges, Georgy

    2015-01-01

    The process of transcription is highly stochastic leading to cell-to-cell variations and noise in gene expression levels. However, key essential genes have to be precisely expressed at the correct amount and time to ensure proper cellular development and function. Studies in yeast and bacterial systems have shown that gene expression noise decreases as mean expression levels increase, a relationship that is controlled by promoter DNA sequence. However, the function of distal cis-regulatory modules (CRMs), an evolutionary novelty of metazoans, in controlling transcriptional robustness and variability is poorly understood. In this study, we used live cell imaging of transfected reporters combined with a mathematical modelling and statistical inference scheme to quantify the function of conserved Msx1 CRMs and promoters in modulating single-cell real-time transcription rates in C2C12 mouse myoblasts. The results show that the mean expression–noise relationship is solely promoter controlled for this key pluripotency regulator. In addition, we demonstrate that CRMs modulate single-cell basal promoter rate distributions in a graded manner across a population of cells. This extends the rheostatic model of CRM action to provide a more detailed understanding of CRM function at single-cell resolution. We also identify a novel CRM transcriptional filter function that acts to reduce intracellular variability in transcription rates and show that this can be phylogenetically separable from rate modulating CRM activities. These results are important for understanding how the expression of key vertebrate developmental transcription factors is precisely controlled both within and between individual cells. PMID:26342140

  11. Conserved Cis-Regulatory Modules Control Robustness in Msx1 Expression at Single-Cell Resolution.

    PubMed

    Vance, Keith W; Woodcock, Dan J; Reid, John E; Bretschneider, Till; Ott, Sascha; Koentges, Georgy

    2015-09-01

    The process of transcription is highly stochastic leading to cell-to-cell variations and noise in gene expression levels. However, key essential genes have to be precisely expressed at the correct amount and time to ensure proper cellular development and function. Studies in yeast and bacterial systems have shown that gene expression noise decreases as mean expression levels increase, a relationship that is controlled by promoter DNA sequence. However, the function of distal cis-regulatory modules (CRMs), an evolutionary novelty of metazoans, in controlling transcriptional robustness and variability is poorly understood. In this study, we used live cell imaging of transfected reporters combined with a mathematical modelling and statistical inference scheme to quantify the function of conserved Msx1 CRMs and promoters in modulating single-cell real-time transcription rates in C2C12 mouse myoblasts. The results show that the mean expression-noise relationship is solely promoter controlled for this key pluripotency regulator. In addition, we demonstrate that CRMs modulate single-cell basal promoter rate distributions in a graded manner across a population of cells. This extends the rheostatic model of CRM action to provide a more detailed understanding of CRM function at single-cell resolution. We also identify a novel CRM transcriptional filter function that acts to reduce intracellular variability in transcription rates and show that this can be phylogenetically separable from rate modulating CRM activities. These results are important for understanding how the expression of key vertebrate developmental transcription factors is precisely controlled both within and between individual cells. PMID:26342140

  12. Shuffling of cis-regulatory elements is a pervasive feature of the vertebrate lineage

    PubMed Central

    Sanges, Remo; Kalmar, Eva; Claudiani, Pamela; D'Amato, Maria; Muller, Ferenc; Stupka, Elia

    2006-01-01

    Background All vertebrates share a remarkable degree of similarity in their development as well as in the basic functions of their cells. Despite this, attempts at unearthing genome-wide regulatory elements conserved throughout the vertebrate lineage using BLAST-like approaches have thus far detected noncoding conservation in only a few hundred genes, mostly associated with regulation of transcription and development. Results We used a unique combination of tools to obtain regional global-local alignments of orthologous loci. This approach takes into account shuffling of regulatory regions that are likely to occur over evolutionary distances greater than those separating mammalian genomes. This approach revealed one order of magnitude more vertebrate conserved elements than was previously reported in over 2,000 genes, including a high number of genes found in the membrane and extracellular regions. Our analysis revealed that 72% of the elements identified have undergone shuffling. We tested the ability of the elements identified to enhance transcription in zebrafish embryos and compared their activity with a set of control fragments. We found that more than 80% of the elements tested were able to enhance transcription significantly, prevalently in a tissue-restricted manner corresponding to the expression domain of the neighboring gene. Conclusion Our work elucidates the importance of shuffling in the detection of cis-regulatory elements. It also elucidates how similarities across the vertebrate lineage, which go well beyond development, can be explained not only within the realm of coding genes but also in that of the sequences that ultimately govern their expression. PMID:16859531

  13. Deciphering Cis-Regulatory Element Mediated Combinatorial Regulation in Rice under Blast Infected Condition.

    PubMed

    Deb, Arindam; Kundu, Sudip

    2015-01-01

    Combinations of cis-regulatory elements (CREs) present at the promoters facilitate the binding of several transcription factors (TFs), thereby altering the consequent gene expressions. Due to the eminent complexity of the regulatory mechanism, the combinatorics of CRE-mediated transcriptional regulation has been elusive. In this work, we have developed a new methodology that quantifies the co-occurrence tendencies of CREs present in a set of promoter sequences; these co-occurrence scores are filtered in three consecutive steps to test their statistical significance; and the significantly co-occurring CRE pairs are presented as networks. These networks of co-occurring CREs are further transformed to derive higher order of regulatory combinatorics. We have further applied this methodology on the differentially up-regulated gene-sets of rice tissues under fungal (Magnaporthe) infected conditions to demonstrate how it helps to understand the CRE-mediated combinatorial gene regulation. Our analysis includes a wide spectrum of biologically important results. The CRE pairs having a strong tendency to co-occur often exhibit very similar joint distribution patterns at the promoters of rice. We couple the network approach with experimental results of plant gene regulation and defense mechanisms and find evidence of auto and cross regulation among TF families, cross-talk among multiple hormone signaling pathways, similarities and dissimilarities in regulatory combinatorics between different tissues, etc. Our analyses have pointed to a highly distributed nature of the combinatorial gene regulation facilitating an efficient alteration in response to fungal attack. Altogether, our proposed methodology could be an important approach in understanding the combinatorial gene regulation. It can be further applied to unravel the tissue and/or condition specific combinatorial gene regulation in other eukaryotic systems with the availability of annotated genomic sequences and suitable

  14. Identification of High-Impact cis-Regulatory Mutations Using Transcription Factor Specific Random Forest Models

    PubMed Central

    Svetlichnyy, Dmitry; Imrichova, Hana; Fiers, Mark; Kalender Atak, Zeynep; Aerts, Stein

    2015-01-01

    Cancer genomes contain vast amounts of somatic mutations, many of which are passenger mutations not involved in oncogenesis. Whereas driver mutations in protein-coding genes can be distinguished from passenger mutations based on their recurrence, non-coding mutations are usually not recurrent at the same position. Therefore, it is still unclear how to identify cis-regulatory driver mutations, particularly when chromatin data from the same patient is not available, thus relying only on sequence and expression information. Here we use machine-learning methods to predict functional regulatory regions using sequence information alone, and compare the predicted activity of the mutated region with the reference sequence. This way we define the Predicted Regulatory Impact of a Mutation in an Enhancer (PRIME). We find that the recently identified driver mutation in the TAL1 enhancer has a high PRIME score, representing a “gain-of-target” for MYB, whereas the highly recurrent TERT promoter mutation has a surprisingly low PRIME score. We trained Random Forest models for 45 cancer-related transcription factors, and used these to score variations in the HeLa genome and somatic mutations across more than five hundred cancer genomes. Each model predicts only a small fraction of non-coding mutations with a potential impact on the function of the encompassing regulatory region. Nevertheless, as these few candidate driver mutations are often linked to gains in chromatin activity and gene expression, they may contribute to the oncogenic program by altering the expression levels of specific oncogenes and tumor suppressor genes. PMID:26562774

  15. Identification of High-Impact cis-Regulatory Mutations Using Transcription Factor Specific Random Forest Models.

    PubMed

    Svetlichnyy, Dmitry; Imrichova, Hana; Fiers, Mark; Kalender Atak, Zeynep; Aerts, Stein

    2015-11-01

    Cancer genomes contain vast amounts of somatic mutations, many of which are passenger mutations not involved in oncogenesis. Whereas driver mutations in protein-coding genes can be distinguished from passenger mutations based on their recurrence, non-coding mutations are usually not recurrent at the same position. Therefore, it is still unclear how to identify cis-regulatory driver mutations, particularly when chromatin data from the same patient is not available, thus relying only on sequence and expression information. Here we use machine-learning methods to predict functional regulatory regions using sequence information alone, and compare the predicted activity of the mutated region with the reference sequence. This way we define the Predicted Regulatory Impact of a Mutation in an Enhancer (PRIME). We find that the recently identified driver mutation in the TAL1 enhancer has a high PRIME score, representing a "gain-of-target" for MYB, whereas the highly recurrent TERT promoter mutation has a surprisingly low PRIME score. We trained Random Forest models for 45 cancer-related transcription factors, and used these to score variations in the HeLa genome and somatic mutations across more than five hundred cancer genomes. Each model predicts only a small fraction of non-coding mutations with a potential impact on the function of the encompassing regulatory region. Nevertheless, as these few candidate driver mutations are often linked to gains in chromatin activity and gene expression, they may contribute to the oncogenic program by altering the expression levels of specific oncogenes and tumor suppressor genes. PMID:26562774

  16. Regulation of human PTCH1b expression by different 5' untranslated region cis-regulatory elements

    PubMed Central

    Ozretić, Petar; Bisio, Alessandra; Musani, Vesna; Trnski, Diana; Sabol, Maja; Levanat, Sonja; Inga, Alberto

    2015-01-01

    PTCH1 gene codes for a 12-pass transmembrane receptor with a negative regulatory role in the Hedgehog-Gli signaling pathway. PTCH1 germline mutations cause Gorlin syndrome, a disorder characterized by developmental abnormalities and tumor susceptibility. The autosomal dominant inheritance, and the evidence for PTCH1 haploinsufficiency, suggests that fine-tuning systems of protein patched homolog 1 (PTC1) levels exist to properly regulate the pathway. Given the role of 5' untranslated region (5'UTR) in protein expression, our aim was to thoroughly explore cis-regulatory elements in the 5'UTR of PTCH1 transcript 1b. The (CGG)n polymorphism was the main potential regulatory element studied so far but with inconsistent results and no clear association between repeat number and disease risk. Using luciferase reporter constructs in human cell lines here we show that the number of CGG repeats has no strong impact on gene expression, both at mRNA and protein levels. We observed variability in the length of 5'UTR and changes in abundance of the associated transcripts after pathway activation. We show that upstream AUG codons (uAUGs) present only in longer 5'UTRs could negatively regulate the amount of PTC1 isoform L (PTC1-L). The existence of an internal ribosome entry site (IRES) observed using different approaches and mapped in the region comprising the CGG repeats, would counteract the effect of the uAUGs and enable synthesis of PTC1-L under stressful conditions, such as during hypoxia. Higher relative translation efficiency of PTCH1b mRNA in HEK 293T cultured hypoxia was observed by polysomal profiling and Western blot analyses. All our results point to an exceptionally complex and so far unexplored role of 5'UTR PTCH1b cis-element features in the regulation of the Hedgehog-Gli signaling pathway. PMID:25826662

  17. Deciphering Cis-Regulatory Element Mediated Combinatorial Regulation in Rice under Blast Infected Condition

    PubMed Central

    Deb, Arindam; Kundu, Sudip

    2015-01-01

    Combinations of cis-regulatory elements (CREs) present at the promoters facilitate the binding of several transcription factors (TFs), thereby altering the consequent gene expressions. Due to the eminent complexity of the regulatory mechanism, the combinatorics of CRE-mediated transcriptional regulation has been elusive. In this work, we have developed a new methodology that quantifies the co-occurrence tendencies of CREs present in a set of promoter sequences; these co-occurrence scores are filtered in three consecutive steps to test their statistical significance; and the significantly co-occurring CRE pairs are presented as networks. These networks of co-occurring CREs are further transformed to derive higher order of regulatory combinatorics. We have further applied this methodology on the differentially up-regulated gene-sets of rice tissues under fungal (Magnaporthe) infected conditions to demonstrate how it helps to understand the CRE-mediated combinatorial gene regulation. Our analysis includes a wide spectrum of biologically important results. The CRE pairs having a strong tendency to co-occur often exhibit very similar joint distribution patterns at the promoters of rice. We couple the network approach with experimental results of plant gene regulation and defense mechanisms and find evidence of auto and cross regulation among TF families, cross-talk among multiple hormone signaling pathways, similarities and dissimilarities in regulatory combinatorics between different tissues, etc. Our analyses have pointed to a highly distributed nature of the combinatorial gene regulation facilitating an efficient alteration in response to fungal attack. Altogether, our proposed methodology could be an important approach in understanding the combinatorial gene regulation. It can be further applied to unravel the tissue and/or condition specific combinatorial gene regulation in other eukaryotic systems with the availability of annotated genomic sequences and suitable

  18. Identification and characterization of a cis-regulatory element for zygotic gene expression in Chlamydomonas reinhardtii

    DOE PAGESBeta

    Hamaji, Takashi; Lopez, David; Pellegrini, Matteo; Umen, James

    2016-03-26

    Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. Furthermore, we predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes.

  19. Identification and Characterization of a cis-Regulatory Element for Zygotic Gene Expression in Chlamydomonas reinhardtii.

    PubMed

    Hamaji, Takashi; Lopez, David; Pellegrini, Matteo; Umen, James

    2016-01-01

    Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. We predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes. PMID:27172209

  20. Identification and Characterization of a cis-Regulatory Element for Zygotic Gene Expression in Chlamydomonas reinhardtii

    PubMed Central

    Hamaji, Takashi; Lopez, David; Pellegrini, Matteo; Umen, James

    2016-01-01

    Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. We predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes. PMID:27172209

  1. Numb directs the subcellular localization of EAAT3 through binding the YxNxxF motif.

    PubMed

    Su, Jin-Feng; Wei, Jian; Li, Pei-Shan; Miao, Hong-Hua; Ma, Yong-Chao; Qu, Yu-Xiu; Xu, Jie; Qin, Jie; Li, Bo-Liang; Song, Bao-Liang; Xu, Zheng-Ping; Luo, Jie

    2016-08-15

    Excitatory amino acid transporter type 3 (EAAT3, also known as SLC1A1) is a high-affinity, Na(+)-dependent glutamate carrier that localizes primarily within the cell and at the apical plasma membrane. Although previous studies have reported proteins and sequence regions involved in EAAT3 trafficking, the detailed molecular mechanism by which EAAT3 is distributed to the correct location still remains elusive. Here, we identify that the YVNGGF sequence in the C-terminus of EAAT3 is responsible for its intracellular localization and apical sorting in rat hepatoma cells CRL1601 and Madin-Darby canine kidney (MDCK) cells, respectively. We further demonstrate that Numb, a clathrin adaptor protein, directly binds the YVNGGF motif and regulates the localization of EAAT3. Mutation of Y503, N505 and F508 within the YVNGGF motif to alanine residues or silencing Numb by use of small interfering RNA (siRNA) results in the aberrant localization of EAAT3. Moreover, both Numb and the YVNGGF motif mediate EAAT3 endocytosis in CRL1601 cells. In summary, our study suggests that Numb is a pivotal adaptor protein that mediates the subcellular localization of EAAT3 through binding the YxNxxF (where x stands for any amino acid) motif. PMID:27358480
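
    The YxNxxF pattern described above maps directly onto a regular expression. The following is a minimal sketch for illustration only: the helper name `find_yxnxxf` and the short peptide strings are hypothetical, and a sequence match says nothing about whether Numb actually binds a given site.

```python
import re

# YxNxxF, where x is any amino acid (one-letter code). The lookahead
# lets overlapping occurrences be reported as well.
YXNXXF = re.compile(r"(?=(Y[A-Z]N[A-Z]{2}F))")


def find_yxnxxf(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched hexapeptide) for each YxNxxF hit."""
    return [(m.start() + 1, m.group(1)) for m in YXNXXF.finditer(protein.upper())]


# Hypothetical peptides: one carrying the YVNGGF sequence named in the
# abstract, and one with the tyrosine replaced by alanine.
print(find_yxnxxf("AAYVNGGFAA"))
print(find_yxnxxf("AAAVNGGFAA"))
```

    Replacing the tyrosine with alanine removes the match, which parallels the alanine mutations of Y503, N505 and F508 reported in the abstract.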

  2. cis regulatory requirements for hypodermal cell-specific expression of the Caenorhabditis elegans cuticle collagen gene dpy-7.

    PubMed Central

    Gilleard, J S; Barry, J D; Johnstone, I L

    1997-01-01

    The Caenorhabditis elegans cuticle collagens are encoded by a multigene family of between 50 and 100 members and are the major component of the nematode cuticular exoskeleton. They are synthesized in the hypodermis prior to secretion and incorporation into the cuticle and exhibit complex patterns of spatial and temporal expression. We have investigated the cis regulatory requirements for tissue- and stage-specific expression of the cuticle collagen gene dpy-7 and have identified a compact regulatory element which is sufficient to specify hypodermal cell reporter gene expression. This element appears to be a true tissue-specific promoter element, since it encompasses the dpy-7 transcription initiation sites and functions in an orientation-dependent manner. We have also shown, by interspecies transformation experiments, that the dpy-7 cis regulatory elements are functionally conserved between C. elegans and C. briggsae, and comparative sequence analysis supports the importance of the regulatory sequence that we have identified by reporter gene analysis. All of our data suggest that the spatial expression of the dpy-7 cuticle collagen gene is established essentially by a small tissue-specific promoter element and does not require upstream activator or repressor elements. In addition, we have found the DPY-7 polypeptide is very highly conserved between the two species and that the C. briggsae polypeptide can function appropriately within the C. elegans cuticle. This finding suggests a remarkably high level of conservation of individual cuticle components, and their interactions, between these two nematode species. PMID:9121480

  3. ChIP-Seq-Annotated Heliconius erato Genome Highlights Patterns of cis-Regulatory Evolution in Lepidoptera.

    PubMed

    Lewis, James J; van der Burg, Karin R L; Mazo-Vargas, Anyi; Reed, Robert D

    2016-09-13

    Uncovering phylogenetic patterns of cis-regulatory evolution remains a fundamental goal for evolutionary and developmental biology. Here, we characterize the evolution of regulatory loci in butterflies and moths using chromatin immunoprecipitation sequencing (ChIP-seq) annotation of regulatory elements across three stages of head development. In the process we provide a high-quality, functionally annotated genome assembly for the butterfly, Heliconius erato. Comparing cis-regulatory element conservation across six lepidopteran genomes, we find that regulatory sequences evolve at a pace similar to that of protein-coding regions. We also observe that elements active at multiple developmental stages are markedly more conserved than elements with stage-specific activity. Surprisingly, we also find that stage-specific proximal and distal regulatory elements evolve at nearly identical rates. Our study provides a benchmark for genome-wide patterns of regulatory element evolution in insects, and it shows that developmental timing of activity strongly predicts patterns of regulatory sequence evolution. PMID:27626657

  4. An ancient yet flexible cis-regulatory architecture allows localized Hedgehog tuning by patched/Ptch1

    PubMed Central

    Lorberbaum, David S; Ramos, Andrea I; Peterson, Kevin A; Carpenter, Brandon S; Parker, David S; De, Sandip; Hillers, Lauren E; Blake, Victoria M; Nishi, Yuichi; McFarlane, Matthew R; Chiang, Ason CY; Kassis, Judith A; Allen, Benjamin L; McMahon, Andrew P; Barolo, Scott

    2016-01-01

    The Hedgehog signaling pathway is part of the ancient developmental-evolutionary animal toolkit. Frequently co-opted to pattern new structures, the pathway is conserved among eumetazoans yet flexible and pleiotropic in its effects. The Hedgehog receptor, Patched, is transcriptionally activated by Hedgehog, providing essential negative feedback in all tissues. Our locus-wide dissections of the cis-regulatory landscapes of fly patched and mouse Ptch1 reveal abundant, diverse enhancers with stage- and tissue-specific expression patterns. The seemingly simple, constitutive Hedgehog response of patched/Ptch1 is driven by a complex regulatory architecture, with batteries of context-specific enhancers engaged in promoter-specific interactions to tune signaling individually in each tissue, without disturbing patterning elsewhere. This structure—one of the oldest cis-regulatory features discovered in animal genomes—explains how patched/Ptch1 can drive dramatic adaptations in animal morphology while maintaining its essential core function. It may also suggest a general model for the evolutionary flexibility of conserved regulators and pathways. DOI: http://dx.doi.org/10.7554/eLife.13550.001 PMID:27146892

  5. Differential contribution of cis-regulatory elements to higher order chromatin structure and expression of the CFTR locus.

    PubMed

    Yang, Rui; Kerschner, Jenny L; Gosalia, Nehal; Neems, Daniel; Gorsic, Lidija K; Safi, Alexias; Crawford, Gregory E; Kosak, Steven T; Leir, Shih-Hsing; Harris, Ann

    2016-04-20

    Higher order chromatin structure establishes domains that organize the genome and coordinate gene expression. However, the molecular mechanisms controlling transcription of individual loci within a topological domain (TAD) are not fully understood. The cystic fibrosis transmembrane conductance regulator (CFTR) gene provides a paradigm for investigating these mechanisms. CFTR occupies a TAD bordered by CTCF/cohesin binding sites within which are cell-type-selective cis-regulatory elements for the locus. We showed previously that intronic and extragenic enhancers, when occupied by specific transcription factors, are recruited to the CFTR promoter by a looping mechanism to drive gene expression. Here we use a combination of CRISPR/Cas9 editing of cis-regulatory elements and siRNA-mediated depletion of architectural proteins to determine the relative contribution of structural elements and enhancers to the higher order structure and expression of the CFTR locus. We found the boundaries of the CFTR TAD are conserved among diverse cell types and are dependent on CTCF and cohesin complex. Removal of an upstream CTCF-binding insulator alters the interaction profile, but has little effect on CFTR expression. Within the TAD, intronic enhancers recruit cell-type selective transcription factors and deletion of a pivotal enhancer element dramatically decreases CFTR expression, but has minor effect on its 3D structure. PMID:26673704

  6. FootprintDB: Analysis of Plant Cis-Regulatory Elements, Transcription Factors, and Binding Interfaces.

    PubMed

    Contreras-Moreira, Bruno; Sebastian, Alvaro

    2016-01-01

    FootprintDB is a database and search engine that compiles regulatory sequences from open access libraries of curated DNA cis-elements and motifs, and their associated transcription factors (TFs). It systematically annotates the binding interfaces of the TFs by exploiting protein-DNA complexes deposited in the Protein Data Bank. Each entry in footprintDB is thus a DNA motif linked to the protein sequence of the TF(s) known to recognize it, and in most cases, the set of predicted interface residues involved in specific recognition. This chapter explains step-by-step how to search for DNA motifs and protein sequences in footprintDB and how to focus the search to a particular organism. Two real-world examples are shown where this software was used to analyze transcriptional regulation in plants. Results are described with the aim of guiding users on their interpretation, and special attention is given to the choices users might face when performing similar analyses. PMID:27557773

  7. Combinatorial regulation modules on GmSBP2 promoter: a distal cis-regulatory domain confines the SBP2 promoter activity to the vascular tissue in vegetative organs.

    PubMed

    Waclawovsky, Alessandro J; Freitas, Rejane L; Rocha, Carolina S; Contim, Luis Antônio S; Fontes, Elizabeth P B

    2006-01-01

    The Glycine max sucrose binding protein (GmSBP2) promoter directs phloem-specific expression of reporter genes in transgenic tobacco. Here, we identified cis-regulatory domains (CRD) that contribute positive and negative regulation to the tissue-specific pattern of the GmSBP2 promoter. Negative regulatory elements in the distal CRD-A (-2000 to -700) sequences suppressed expression from the GmSBP2 promoter in tissues other than seed tissues and vascular tissues of vegetative organs. Deletion of this region relieved repression resulting in a constitutive promoter highly active in all tissues analyzed. Further deletions from the strong constitutive -700GmSBP2 promoter delimited several intercalating enhancer-like and repressing domains that function in a context-dependent manner. Histochemical examination revealed that the CRD-C (-445 to -367) harbors both negative and positive elements. This region abolished promoter expression in roots and in all tissues of stems except for the inner phloem. In contrast, it restores root meristem expression when fused to the -132pSBP2-GUS construct, which contains root meristem expression-repressing determinants mapped to the 44-bp CRD-G (-136 to -92). Thus, the GmSBP2 promoter is functionally organized into a proximal region with the combinatorial modular configuration of plant promoters and a distal domain, which restricts gene expression to the vascular tissues in vegetative organs. PMID:16574256

  8. A phylogenetic Gibbs sampler that yields centroid solutions of cis-regulatory sites

    SciTech Connect

    Newberg, Lee A.; Thompson, William A.; Conlan, Sean; Smith, Thomas M.; McCue, Lee Ann; Lawrence, Charles E.

    2007-07-15

    Identification of functionally conserved regulatory elements in sequence data from closely related organisms is becoming feasible, due to the rapid growth of public sequence databases. Closely related organisms are most likely to have common regulatory motifs, however the recent speciation of such organisms results in the high degree of correlation in their genome sequences, confounding the detection of functional elements. Additionally, alignment algorithms that use optimization techniques are limited to the detection of a single alignment that may not be representative. Comparative-genomics studies must be able to address the phylogenetic correlation in the data and efficiently explore the alignment space, in order to make specific and biologically relevant predictions. Results: We describe here a Gibbs sampler that employs a full phylogenetic model and reports an ensemble centroid solution. We describe regulatory motif detection using both simulated and real data, and demonstrate that this approach achieves improved specificity, sensitivity, and positive predictive value over non-phylogenetic algorithms, and over phylogenetic algorithms that report a maximum likelihood solution.

  9. Correlating Gene Expression Variation with cis-Regulatory Polymorphism in Saccharomyces cerevisiae

    PubMed Central

    Chen, Kevin; van Nimwegen, Erik; Rajewsky, Nikolaus; Siegal, Mark L.

    2010-01-01

    Identifying the nucleotides that cause gene expression variation is a critical step in dissecting the genetic basis of complex traits. Here, we focus on polymorphisms that are predicted to alter transcription factor binding sites (TFBSs) in the yeast, Saccharomyces cerevisiae. We assembled a confident set of transcription factor motifs using recent protein binding microarray and ChIP-chip data and used our collection of motifs to predict a comprehensive set of TFBSs across the S. cerevisiae genome. We used a population genomics analysis to show that our predictions are accurate and significantly improve on our previous annotation. Although predicting gene expression from sequence is thought to be difficult in general, we identified a subset of genes for which changes in predicted TFBSs correlate well with expression divergence between yeast strains. Our analysis thus demonstrates both the accuracy of our new TFBS predictions and the feasibility of using simple models of gene regulation to causally link differences in gene expression to variation at individual nucleotides. PMID:20829281

  10. Two RNA-binding motifs in eIF3 direct HCV IRES-dependent translation

    PubMed Central

    Sun, Chaomin; Querol-Audí, Jordi; Mortimer, Stefanie A.; Arias-Palomo, Ernesto; Doudna, Jennifer A.; Nogales, Eva; Cate, Jamie H. D.

    2013-01-01

    The initiation of protein synthesis plays an essential regulatory role in human biology. At the center of the initiation pathway, the 13-subunit eukaryotic translation initiation factor 3 (eIF3) controls access of other initiation factors and mRNA to the ribosome by unknown mechanisms. Using electron microscopy (EM), bioinformatics and biochemical experiments, we identify two highly conserved RNA-binding motifs in eIF3 that direct translation initiation from the hepatitis C virus internal ribosome entry site (HCV IRES) RNA. Mutations in the RNA-binding motif of subunit eIF3a weaken eIF3 binding to the HCV IRES and the 40S ribosomal subunit, thereby suppressing eIF2-dependent recognition of the start codon. Mutations in the eIF3c RNA-binding motif also reduce 40S ribosomal subunit binding to eIF3, and inhibit eIF5B-dependent steps downstream of start codon recognition. These results provide the first connection between the structure of the central translation initiation factor eIF3 and recognition of the HCV genomic RNA start codon, molecular interactions that likely extend to the human transcriptome. PMID:23766293

  11. Multiple cis Regulatory Elements Control RANTES Promoter Activity in Alveolar Epithelial Cells Infected with Respiratory Syncytial Virus

    PubMed Central

    Casola, Antonella; Garofalo, Roberto P.; Haeberle, Helene; Elliott, Todd F.; Lin, Rongtuan; Jamaluddin, Mohammad; Brasier, Allan R.

    2001-01-01

    Respiratory syncytial virus (RSV) produces intense pulmonary inflammation, in part through its ability to induce chemokine synthesis in infected airway epithelial cells. RANTES (regulated upon activation, normally T-cell expressed and presumably secreted) is a CC chemokine which recruits and activates monocytes, lymphocytes, and eosinophils, all cell types present in the lung inflammatory infiltrate induced by RSV infection. In this study, we analyzed the mechanism of RSV-induced RANTES promoter activation in human type II alveolar epithelial cells (A549 cells). Promoter deletion and mutagenesis experiments indicate that RSV requires the presence of five different cis regulatory elements, located in the promoter fragment spanning from −220 to +55 nucleotides, corresponding to NF-κB, C/EBP, Jun/CREB/ATF, and interferon regulatory factor (IRF) binding sites. Although site mutations of the NF-κB, C/EBP, and CREB/AP-1 like sites reduce RSV-induced RANTES gene transcription to 50% or less, only mutations affecting IRF binding completely abolish RANTES inducibility. Supershift and microaffinity isolation assays were used to identify the different transcription factor family members whose DNA binding activity was RSV inducible. Expression of dominant negative mutants of these transcription factors further established their central role in virus-induced RANTES promoter activation. Our finding that the presence of multiple cis regulatory elements is required for full activation of the RANTES promoter in RSV-infected alveolar epithelial cells supports the enhanceosome model for RANTES gene transcription, which is absolutely dependent on binding of IRF transcription factors. The identification of regulatory mechanisms of RANTES gene expression is fundamental for rational design of inhibitors of RSV-induced lung inflammation. PMID:11413310

  12. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

    Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings: A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086

  13. Using machine learning to predict gene expression and discover sequence motifs

    NASA Astrophysics Data System (ADS)

    Li, Xuejing

    Recently, large amounts of experimental data for complex biological systems have become available. We use tools and algorithms from machine learning to build data-driven predictive models. We first present a novel algorithm to discover gene sequence motifs associated with temporal expression patterns of genes. Our algorithm, which is based on partial least squares (PLS) regression, is able to directly model the flow of information, from gene sequence to gene expression, to learn cis regulatory motifs and characterize associated gene expression patterns. Our algorithm outperforms traditional computational methods, e.g., clustering, in motif discovery. We then present a study extending to Caenorhabditis elegans a machine learning model of transcriptional regulation that predicts genetic regulatory response. We show meaningful results both in terms of prediction accuracy on the test experiments and biological information extracted from the regulatory program. The model discovers DNA binding sites ab initio. We also present a case study where we detect a signal of lineage-specific regulation. Finally we present a comparative study on learning predictive models for motif discovery, based on different boosting algorithms: Adaptive Boosting (AdaBoost), Linear Programming Boosting (LPBoost) and Totally Corrective Boosting (TotalBoost). We evaluate and compare the performance of the three boosting algorithms via both statistical and biological validation, for hypoxia response in Saccharomyces cerevisiae.

  14. Separate elements of the TERMINAL FLOWER 1 cis-regulatory region integrate pathways to control flowering time and shoot meristem identity.

    PubMed

    Serrano-Mislata, Antonio; Fernández-Nohales, Pedro; Doménech, María J; Hanzawa, Yoshie; Bradley, Desmond; Madueño, Francisco

    2016-09-15

    TERMINAL FLOWER 1 (TFL1) is a key regulator of Arabidopsis plant architecture that responds to developmental and environmental signals to control flowering time and the fate of shoot meristems. TFL1 expression is dynamic, being found in all shoot meristems, but not in floral meristems, with the level and distribution changing throughout development. Using a variety of experimental approaches we have analysed the TFL1 promoter to elucidate its functional structure. TFL1 expression is based on distinct cis-regulatory regions, the most important being located 3' of the coding sequence. Our results indicate that TFL1 expression in the shoot apical versus lateral inflorescence meristems is controlled through distinct cis-regulatory elements, suggesting that different signals control expression in these meristem types. Moreover, we identified a cis-regulatory region necessary for TFL1 expression in the vegetative shoot and required for a wild-type flowering time, supporting that TFL1 expression in the vegetative meristem controls flowering time. Our study provides a model for the functional organisation of TFL1 cis-regulatory regions, contributing to our understanding of how developmental pathways are integrated at the genomic level of a key regulator to control plant architecture. PMID:27385013

  15. Modular cis-regulatory organization of developmentally expressed genes: two genes transcribed territorially in the sea urchin embryo, and additional examples.

    PubMed Central

    Kirchhamer, C V; Yuh, C H; Davidson, E H

    1996-01-01

    The cis-regulatory systems that control developmental expression of two sea urchin genes have been subjected to detailed functional analysis. Both systems are modular in organization: specific, separable fragments of the cis-regulatory DNA each containing multiple transcription factor target sites execute particular regulatory subfunctions when associated with reporter genes and introduced into the embryo. The studies summarized here were carried out on the CyIIIa gene, expressed in the embryonic aboral ectoderm and on the Endo16 gene, expressed in the embryonic vegetal plate, archenteron, and then midgut. The regulatory systems of both genes include modules that control particular aspects of temporal and spatial expression, and in both the territorial boundaries of expression depend on a combination of negative and positive functions. In both genes different regulatory modules control early and late embryonic expression. Modular cis-regulatory organization is widespread in developmentally regulated genes, and we present a tabular summary that includes many examples from mouse and Drosophila. We regard cis-regulatory modules as units of developmental transcription control, and also of evolution, in the assembly of transcription control systems. Images Fig. 2 PMID:8790328

  16. RAR/RXR binding dynamics distinguish pluripotency from differentiation associated cis-regulatory elements

    PubMed Central

    Chatagnon, Amandine; Veber, Philippe; Morin, Valérie; Bedo, Justin; Triqueneaux, Gérard; Sémon, Marie; Laudet, Vincent; d'Alché-Buc, Florence; Benoit, Gérard

    2015-01-01

    In mouse embryonic cells, ligand-activated retinoic acid receptors (RARs) play a key role in inhibiting pluripotency-maintaining genes and activating some major actors of cell differentiation. To investigate the mechanism underlying this dual regulation, we performed joint RAR/RXR ChIP-seq and mRNA-seq time series during the first 48 h of the RA-induced Primitive Endoderm (PrE) differentiation process in F9 embryonal carcinoma (EC) cells. We show here that this dual regulation is associated with RAR/RXR genomic redistribution during the differentiation process. In-depth analysis of RAR/RXR binding sites occupancy dynamics and composition show that in undifferentiated cells, RAR/RXR interact with genomic regions characterized by binding of pluripotency-associated factors and high prevalence of the non-canonical DR0-containing RA response element. By contrast, in differentiated cells, RAR/RXR bound regions are enriched in functional Sox17 binding sites and are characterized with a higher frequency of the canonical DR5 motif. Our data offer an unprecedentedly detailed view on the action of RA in triggering pluripotent cell differentiation and demonstrate that RAR/RXR action is mediated via two different sets of regulatory regions tightly associated with cell differentiation status. PMID:25897113

  17. RAR/RXR binding dynamics distinguish pluripotency from differentiation associated cis-regulatory elements.

    PubMed

    Chatagnon, Amandine; Veber, Philippe; Morin, Valérie; Bedo, Justin; Triqueneaux, Gérard; Sémon, Marie; Laudet, Vincent; d'Alché-Buc, Florence; Benoit, Gérard

    2015-05-26

    In mouse embryonic cells, ligand-activated retinoic acid receptors (RARs) play a key role in inhibiting pluripotency-maintaining genes and activating some major actors of cell differentiation. To investigate the mechanism underlying this dual regulation, we performed joint RAR/RXR ChIP-seq and mRNA-seq time series during the first 48 h of the RA-induced Primitive Endoderm (PrE) differentiation process in F9 embryonal carcinoma (EC) cells. We show here that this dual regulation is associated with RAR/RXR genomic redistribution during the differentiation process. In-depth analysis of RAR/RXR binding sites occupancy dynamics and composition show that in undifferentiated cells, RAR/RXR interact with genomic regions characterized by binding of pluripotency-associated factors and high prevalence of the non-canonical DR0-containing RA response element. By contrast, in differentiated cells, RAR/RXR bound regions are enriched in functional Sox17 binding sites and are characterized with a higher frequency of the canonical DR5 motif. Our data offer an unprecedentedly detailed view on the action of RA in triggering pluripotent cell differentiation and demonstrate that RAR/RXR action is mediated via two different sets of regulatory regions tightly associated with cell differentiation status. PMID:25897113

  18. Computational Approaches to Identify Promoters and cis-Regulatory Elements in Plant Genomes

    PubMed Central

    Rombauts, Stephane; Florquin, Kobe; Lescot, Magali; Marchal, Kathleen; Rouzé, Pierre; Van de Peer, Yves

    2003-01-01

    The identification of promoters and their regulatory elements is one of the major challenges in bioinformatics and integrates comparative, structural, and functional genomics. Many different approaches have been developed to detect conserved motifs in a set of genes that are either coregulated or orthologous. However, although recent approaches seem promising, in general, unambiguous identification of regulatory elements is not straightforward. The delineation of promoters is even harder, due to its complex nature, and in silico promoter prediction is still in its infancy. Here, we review the different approaches that have been developed for identifying promoters and their regulatory elements. We discuss the detection of cis-acting regulatory elements using word-counting or probabilistic methods (so-called “search by signal” methods) and the delineation of promoters by considering both sequence content and structural features (“search by content” methods). As an example of search by content, we explored in greater detail the association of promoters with CpG islands. However, due to differences in sequence content, the parameters used to detect CpG islands in humans and other vertebrates cannot be used for plants. Therefore, a preliminary attempt was made to define parameters that could possibly define CpG and CpNpG islands in Arabidopsis, by exploring the compositional landscape around the transcriptional start site. To this end, a data set of more than 5,000 gene sequences was built, including the promoter region, the 5′-untranslated region, and the first introns and coding exons. Preliminary analysis shows that promoter location based on the detection of potential CpG/CpNpG islands in the Arabidopsis genome is not straightforward. Nevertheless, because the landscape of CpG/CpNpG islands differs considerably between promoters and introns on the one side and exons (whether coding or not) on the other, more sophisticated approaches can probably be

  19. Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space

    PubMed Central

    Karnik, Rahul; Beer, Michael A.

    2015-01-01

    The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs. PMID:26465884
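
To make the contrast between a full PWM and k-mer words concrete, here is a minimal sketch of log-odds PWM scanning with a score threshold. It is not MotifSpec itself: the count matrix, pseudocount, uniform background, and threshold below are hypothetical illustration values, and the dynamic search space and learned threshold described in the abstract are not reproduced.

```python
import math

BASES = "ACGT"

def counts_to_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log2-odds PWM."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def scan(sequence, pwm, threshold):
    """Return (position, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Hypothetical counts for a 4-bp motif resembling TGAC (illustration only)
EXAMPLE_COUNTS = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 9, "C": 0, "G": 1, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]

if __name__ == "__main__":
    pwm = counts_to_pwm(EXAMPLE_COUNTS)
    for position, score in scan("AATGACGGTGACA", pwm, threshold=4.0):
        print(position, round(score, 2))
```

Unlike a fixed word or regular expression, the PWM assigns a graded score to each window, so the threshold (fixed here, learned in the published method) decides which near-matches count as sites.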

  20. Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.

    PubMed

    Karnik, Rahul; Beer, Michael A

    2015-01-01

    The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs. PMID:26465884

  1. Cis-regulatory Changes at FLOWERING LOCUS T Mediate Natural Variation in Flowering Responses of Arabidopsis thaliana

    PubMed Central

    Schwartz, Christopher; Balasubramanian, Sureshkumar; Warthmann, Norman; Michael, Todd P.; Lempe, Janne; Sureshkumar, Sridevi; Kobayashi, Yasushi; Maloof, Julin N.; Borevitz, Justin O.; Chory, Joanne; Weigel, Detlef

    2009-01-01

    Flowering time, a critical adaptive trait, is modulated by several environmental cues. These external signals converge on a small set of genes that in turn mediate the flowering response. Mutant analysis and subsequent molecular studies have revealed that one of these integrator genes, FLOWERING LOCUS T (FT), responds to photoperiod and temperature cues, two environmental parameters that greatly influence flowering time. As the central player in the transition to flowering, the protein coding sequence of FT and its function are highly conserved across species. Using QTL mapping with a new advanced intercross-recombinant inbred line (AI-RIL) population, we show that a QTL tightly linked to FT contributes to natural variation in the flowering response to the combined effects of photoperiod and ambient temperature. Using heterogeneous inbred families (HIF) and introgression lines, we fine map the QTL to a 6.7 kb fragment in the FT promoter. We confirm by quantitative complementation that FT has differential activity in the two parental strains. Further support for FT underlying the QTL comes from a new approach, quantitative knockdown with artificial microRNAs (amiRNAs). Consistent with the causal sequence polymorphism being in the promoter, we find that the QTL affects FT expression. Taken together, these results indicate that allelic variation at pathway integrator genes such as FT can underlie phenotypic variability and that this may be achieved through cis-regulatory changes. PMID:19652183

  2. Comparative epigenomics in distantly related teleost species identifies conserved cis-regulatory nodes active during the vertebrate phylotypic period

    PubMed Central

    Tena, Juan J.; González-Aguilera, Cristina; Fernández-Miñán, Ana; Vázquez-Marín, Javier; Parra-Acero, Helena; Cross, Joe W.; Rigby, Peter W.J.; Carvajal, Jaime J.; Wittbrodt, Joachim; Gómez-Skarmeta, José L.; Martínez-Morales, Juan R.

    2014-01-01

    The complex relationship between ontogeny and phylogeny has been the subject of attention and controversy since von Baer’s formulations in the 19th century. The classic concept that embryogenesis progresses from clade general features to species-specific characters has often been revisited. It has become accepted that embryos from a clade show maximum morphological similarity at the so-called phylotypic period (i.e., during mid-embryogenesis). According to the hourglass model, body plan conservation would depend on constrained molecular mechanisms operating at this period. More recently, comparative transcriptomic analyses have provided conclusive evidence that such molecular constraints exist. Examining cis-regulatory architecture during the phylotypic period is essential to understand the evolutionary source of body plan stability. Here we compare transcriptomes and key epigenetic marks (H3K4me3 and H3K27ac) from medaka (Oryzias latipes) and zebrafish (Danio rerio), two distantly related teleosts separated by an evolutionary distance of 115–200 Myr. We show that comparison of transcriptome profiles correlates with anatomical similarities and heterochronies observed at the phylotypic stage. Through comparative epigenomics, we uncover a pool of conserved regulatory regions (≈700), which are active during the vertebrate phylotypic period in both species. Moreover, we show that their neighboring genes encode mainly transcription factors with fundamental roles in tissue specification. We postulate that these regulatory regions, active in both teleost genomes, represent key constrained nodes of the gene networks that sustain the vertebrate body plan. PMID:24709821

  3. Extensive cis-Regulatory Variation Robust to Environmental Perturbation in Arabidopsis

    PubMed Central

    Cubillos, Francisco A.; Stegle, Oliver; Grondin, Cécile; Canut, Matthieu; Tisné, Sébastien; Gy, Isabelle

    2014-01-01

    cis- and trans-acting factors affect gene expression and responses to environmental conditions. However, for most plant systems, we lack a comprehensive map of these factors and their interaction with environmental variation. Here, we examined allele-specific expression (ASE) in an F1 hybrid to study how alleles from two Arabidopsis thaliana accessions affect gene expression. To investigate the effect of the environment, we used drought stress and developed a variance component model to estimate the combined genetic contributions of cis- and trans-regulatory polymorphisms, environmental factors, and their interactions. We quantified ASE for 11,003 genes, identifying 3318 genes with consistent ASE in control and stress conditions, demonstrating that cis-acting genetic effects are essentially robust to changes in the environment. Moreover, we found 1618 genes with genotype x environment (GxE) interactions, mostly cis x E interactions with magnitude changes in ASE. We found fewer trans x E interactions, but these effects were relatively less robust across conditions, showing more changes in the direction of the effect between environments; this confirms that trans-regulation plays an important role in the response to environmental conditions. Our data provide a detailed map of cis- and trans-regulation and GxE interactions in A. thaliana, laying the ground for mechanistic investigations and studies in other plants and environments. PMID:25428981
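
As a rough illustration of how allelic imbalance might be flagged from read counts in an F1 hybrid, the sketch below applies an exact two-sided binomial test against a 50:50 expectation. This is a deliberate simplification and not the variance component model described in the abstract; the function name, the read counts, and the 0.5 null ratio are assumptions made for illustration.

```python
from math import comb

def binomial_two_sided_p(ref_reads, alt_reads, p_null=0.5):
    """Exact two-sided binomial p-value for allelic read counts.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one. Intended for modest read depths (roughly below a
    thousand reads), since the exact terms are computed with floats.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0

    def pmf(k):
        return comb(n, k) * p_null ** k * (1 - p_null) ** (n - k)

    observed = pmf(ref_reads)
    # Small relative tolerance guards against floating-point ties
    total = sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)

if __name__ == "__main__":
    # Hypothetical read counts for the two parental alleles of one gene
    print(binomial_two_sided_p(10, 0))
    print(binomial_two_sided_p(5, 5))
```

A per-gene test like this only detects imbalance within one condition; separating cis, trans, and environment effects requires comparing conditions and genotypes jointly, which is what the authors' model is described as doing.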

  4. Novel green tissue-specific synthetic promoters and cis-regulatory elements in rice

    PubMed Central

    Wang, Rui; Zhu, Menglin; Ye, Rongjian; Liu, Zuoxiong; Zhou, Fei; Chen, Hao; Lin, Yongjun

    2015-01-01

    As an important part of synthetic biology, synthetic promoters have gradually become a hotspot in current biology. The purposes of the present study were to synthesize green tissue-specific promoters and to discover green tissue-specific cis-elements. We first assembled several regulatory sequences related to tissue-specific expression in different combinations, aiming to obtain novel green tissue-specific synthetic promoters. GUS assays of the transgenic plants indicated that 5 synthetic promoters showed green tissue-specific expression patterns and different expression efficiencies in various tissues. Subsequently, we scanned and counted the cis-elements in different tissue-specific promoters based on the plant cis-elements database PLACE and the rice cDNA microarray database CREP for green tissue-specific cis-element discovery, resulting in 10 potential cis-elements. The flanking sequence of one potential core element (GEAT) was predicted by bioinformatics. Then, the combination of GEAT and its flanking sequence was functionally identified with a synthetic promoter. GUS assays of the transgenic plants proved its green tissue-specificity. Furthermore, the function of the GEAT flanking sequence was analyzed in detail with site-directed mutagenesis. Our study provides an example for the synthesis of rice tissue-specific promoters and develops a feasible method for screening and functional identification of tissue-specific cis-elements with their flanking sequences at the genome-wide level in rice. PMID:26655679

  5. i-cisTarget 2015 update: generalized cis-regulatory enrichment analysis in human, mouse and fly.

    PubMed

    Imrichová, Hana; Hulselmans, Gert; Atak, Zeynep Kalender; Potier, Delphine; Aerts, Stein

    2015-07-01

    i-cisTarget is a web tool to predict regulators of a set of genomic regions, such as ChIP-seq peaks or co-regulated/similar enhancers. i-cisTarget can also be used to identify upstream regulators and their target enhancers starting from a set of co-expressed genes. Whereas the original version of i-cisTarget was focused on Drosophila data, the 2015 update also provides support for human and mouse data. i-cisTarget detects transcription factor motifs (position weight matrices) and experimental data tracks (e.g. from ENCODE, Roadmap Epigenomics) that are enriched in the input set of regions. As experimental data tracks we include transcription factor ChIP-seq data, histone modification ChIP-seq data and open chromatin data. The underlying processing method is based on a ranking-and-recovery procedure, allowing accurate determination of enrichment across heterogeneous datasets, while also discriminating direct from indirect target regions through a 'leading edge' analysis. We illustrate i-cisTarget on various Ewing sarcoma datasets to identify EWS-FLI1 targets starting from ChIP-seq, differential ATAC-seq, differential H3K27ac and differential gene expression data. Use of i-cisTarget is free and open to all, and there is no login requirement. Address: http://gbiomed.kuleuven.be/apps/lcb/i-cisTarget. PMID:25925574
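
The ranking-and-recovery idea can be sketched as follows: given a genome-wide ranking of regions for one motif or track, walk down the ranking, count how quickly the input regions are recovered, and summarize this as a normalized area under the recovery curve. This is a simplified illustration and not i-cisTarget's implementation; the function name, the rank cutoff, and the toy region identifiers are assumptions.

```python
def recovery_auc(ranked_regions, input_regions, rank_cutoff):
    """Normalized area under the recovery curve over the top ranks.

    ranked_regions: region identifiers, best-scoring first, for one feature.
    input_regions:  the user's region set (for example ChIP-seq peaks).
    rank_cutoff:    how far down the ranking the curve is evaluated.
    """
    targets = set(input_regions)
    if not targets or rank_cutoff <= 0:
        return 0.0
    recovered = 0
    area = 0
    for region in ranked_regions[:rank_cutoff]:
        if region in targets:
            recovered += 1
        area += recovered  # curve height after each rank
    # Best case: all reachable targets sit at the very top of the ranking
    top = min(len(targets), rank_cutoff)
    max_area = sum(min(i, top) for i in range(1, rank_cutoff + 1))
    return area / max_area

if __name__ == "__main__":
    ranking = ["r1", "r2", "r3", "r4", "r5", "r6"]  # toy identifiers
    print(recovery_auc(ranking, {"r1", "r2"}, rank_cutoff=4))
    print(recovery_auc(ranking, {"r5", "r6"}, rank_cutoff=4))
```

A feature whose ranking places the input regions near the top yields a value close to 1, and comparing this value across many motifs and tracks is one way enrichment could be assessed across heterogeneous datasets.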

  6. Precise cis-regulatory control of spatial and temporal expression of the alx-1 gene in the skeletogenic lineage of S. purpuratus.

    PubMed

    Damle, Sagar; Davidson, Eric H

    2011-09-15

    Deployment of the gene-regulatory network (GRN) responsible for skeletogenesis in the embryo of the sea urchin Strongylocentrotus purpuratus is restricted to the large micromere lineage by a double negative regulatory gate. The gate consists of a GRN subcircuit composed of the pmar1 and hesC genes, which encode repressors and are wired in tandem, plus a set of target regulatory genes under hesC control. The skeletogenic cell state is specified initially by micromere-specific expression of these regulatory genes, viz. alx1, ets1, tbrain and tel, plus the gene encoding the Notch ligand Delta. Here we use a recently developed high throughput methodology for experimental cis-regulatory analysis to elucidate the genomic regulatory system controlling alx1 expression in time and embryonic space. The results entirely confirm the double negative gate control system at the cis-regulatory level, including definition of the functional HesC target sites, and add the crucial new information that the drivers of alx1 expression are initially Ets1, and then Alx1 itself plus Ets1. Cis-regulatory analysis demonstrates that these inputs quantitatively account for the magnitude of alx1 expression. Furthermore, the Alx1 gene product not only performs an auto-regulatory role, promoting a fast rise in alx1 expression, but also, when at high levels, it behaves as an auto-repressor. A synthetic experiment indicates that this behavior is probably due to dimerization. In summary, the results we report provide the sequence level basis for control of alx1 spatial expression by the double negative gate GRN architecture, and explain the rising, then falling temporal expression profile of the alx1 gene in terms of its auto-regulatory genetic wiring. PMID:21723273

  7. Precise cis-regulatory control of spatial and temporal expression of the alx-1 gene in the skeletogenic lineage of S. purpuratus

    PubMed Central

    Damle, Sagar; Davidson, Eric H.

    2011-01-01

    Deployment of the gene regulatory network (GRN) responsible for skeletogenesis in the embryo of the sea urchin Strongylocentrotus purpuratus is restricted to the large micromere lineage by a double negative regulatory gate. The gate consists of a GRN subcircuit composed of the pmar1 and hesC genes, which encode repressors and are wired in tandem, plus a set of target regulatory genes under hesC control. The skeletogenic cell state is specified initially by micromere-specific expression of these regulatory genes, viz. alx1, ets1, tbrain and tel, plus the gene encoding the Notch ligand Delta. Here we use a recently developed high throughput methodology for experimental cis-regulatory analysis to elucidate the genomic regulatory system controlling alx1 expression in time and embryonic space. The results entirely confirm the double negative gate control system at the cis-regulatory level, including definition of the functional HesC target sites, and add the crucial new information that the drivers of alx1 expression are initially Ets1, and then Alx1 itself plus Ets1. Cis-regulatory analysis demonstrates that these inputs quantitatively account for the magnitude of alx1 expression. Furthermore, the Alx1 gene product not only performs an auto-regulatory role, promoting a fast rise in alx1 expression, but also, when at high levels, it behaves as an autorepressor. A synthetic experiment indicates that this behavior is probably due to dimerization. In summary, the results we report provide the sequence level basis for control of alx1 spatial expression by the double negative gate GRN architecture, and explain the rising, then falling temporal expression profile of the alx1 gene in terms of its auto-regulatory genetic wiring. PMID:21723273

  8. iRegulon: From a Gene List to a Gene Regulatory Network Using Large Motif and Track Collections

    PubMed Central

    Imrichová, Hana; Van de Sande, Bram; Standaert, Laura; Christiaens, Valerie; Hulselmans, Gert; Herten, Koen; Naval Sanchez, Marina; Potier, Delphine; Svetlichnyy, Dmitry; Kalender Atak, Zeynep; Fiers, Mark; Marine, Jean-Christophe; Aerts, Stein

    2014-01-01

    Identifying master regulators of biological processes and mapping their downstream gene networks are key challenges in systems biology. We developed a computational method, called iRegulon, to reverse-engineer the transcriptional regulatory network underlying a co-expressed gene set using cis-regulatory sequence analysis. iRegulon implements a genome-wide ranking-and-recovery approach to detect enriched transcription factor motifs and their optimal sets of direct targets. We increase the accuracy of network inference by using very large motif collections of up to ten thousand position weight matrices collected from various species, and linking these to candidate human TFs via a motif2TF procedure. We validate iRegulon on gene sets derived from ENCODE ChIP-seq data with increasing levels of noise, and we compare iRegulon with existing motif discovery methods. Next, we use iRegulon on more challenging types of gene lists, including microRNA target sets, protein-protein interaction networks, and genetic perturbation data. In particular, we over-activate p53 in breast cancer cells, followed by RNA-seq and ChIP-seq, and could identify an extensive up-regulated network controlled directly by p53. Similarly we map a repressive network with no indication of direct p53 regulation but rather an indirect effect via E2F and NFY. Finally, we generalize our computational framework to include regulatory tracks such as ChIP-seq data and show how motif and track discovery can be combined to map functional regulatory interactions among co-expressed genes. iRegulon is available as a Cytoscape plugin from http://iregulon.aertslab.org. PMID:25058159

  9. Cis-Regulatory Elements Determine Germline Specificity and Expression Level of an Isopentenyltransferase Gene in Sperm Cells of Arabidopsis.

    PubMed

    Zhang, Jinghua; Yuan, Tong; Duan, Xiaomeng; Wei, Xiaoping; Shi, Tao; Li, Jia; Russell, Scott D; Gou, Xiaoping

    2016-03-01

    Flowering plant sperm cells transcribe a divergent and complex complement of genes. To examine promoter function, we chose an isopentenyltransferase gene known as PzIPT1. This gene is highly selectively transcribed in one sperm cell morphotype of Plumbago zeylanica, which preferentially fuses with the central cell during fertilization and is thus a founding cell of the primary endosperm. In transgenic Arabidopsis (Arabidopsis thaliana), the PzIPT1 promoter displays activity in both sperm cells, and progressive promoter truncation from the 5'-end results in a progressive decrease in reporter production, consistent with the occurrence of multiple enhancer sites. Cytokinin-dependent protein binding motifs are identified in the promoter sequence, which respond with stimulation by cytokinin. Expression of the PzIPT1 promoter in sperm cells confers specificity independently of the previously reported Germline Restrictive Silencer Factor binding sequence. Instead, a cis-acting regulatory region consisting of two duplicated 6-bp Male Gamete Selective Activation (MGSA) motifs occurs near the site of transcription initiation. Disruption of this sequence-specific site inactivates expression of a GFP reporter gene in sperm cells. Multiple copies of the MGSA motif fused with the minimal CaMV35S promoter elements confer reporter gene expression in sperm cells. Similar duplicated MGSA motifs are also identified from promoter sequences of sperm cell-expressed genes in Arabidopsis, suggesting selective activation is possibly a common mechanism for regulation of gene expression in sperm cells of flowering plants. PMID:26739233

  10. RNA-ID, a highly sensitive and robust method to identify cis-regulatory sequences using superfolder GFP and a fluorescence-based assay

    PubMed Central

    Dean, Kimberly M.; Grayhack, Elizabeth J.

    2012-01-01

    We have developed a robust and sensitive method, called RNA-ID, to screen for cis-regulatory sequences in RNA using fluorescence-activated cell sorting (FACS) of yeast cells bearing a reporter in which expression of both superfolder green fluorescent protein (GFP) and yeast codon-optimized mCherry red fluorescent protein (RFP) is driven by the bidirectional GAL1,10 promoter. This method recapitulates previously reported progressive inhibition of translation mediated by increasing numbers of CGA codon pairs, and restoration of expression by introduction of a tRNA with an anticodon that base pairs exactly with the CGA codon. This method also reproduces effects of paromomycin and context on stop codon read-through. Five key features of this method contribute to its effectiveness as a selection for regulatory sequences: The system exhibits greater than a 250-fold dynamic range, a quantitative and dose-dependent response to known inhibitory sequences, exquisite resolution that allows nearly complete physical separation of distinct populations, and a reproducible signal between different cells transformed with the identical reporter, all of which are coupled with simple methods involving ligation-independent cloning, to create large libraries. Moreover, we provide evidence that there are sequences within a 9-nt library that cause reduced GFP fluorescence, suggesting that there are novel cis-regulatory sequences to be found even in this short sequence space. This method is widely applicable to the study of both RNA-mediated and codon-mediated effects on expression. PMID:23097427

  11. A cis-regulatory sequence from a short intergenic region gives rise to a strong microbe-associated molecular pattern-responsive synthetic promoter.

    PubMed

    Lehmeyer, Mona; Hanko, Erik K R; Roling, Lena; Gonzalez, Lilian; Wehrs, Maren; Hehl, Reinhard

    2016-06-01

    The high gene density in Arabidopsis thaliana leaves only relatively short intergenic regions for potential cis-regulatory sequences. To learn more about the regulation of genes harbouring only very short upstream intergenic regions, this study investigates a recently identified novel microbe-associated molecular pattern (MAMP)-responsive cis-sequence located within the 101 bp long intergenic region upstream of the At1g13990 gene. It is shown that the cis-regulatory sequence is sufficient for MAMP-responsive reporter gene activity in the context of its native promoter. The 3' UTR of the upstream gene has a quantitative effect on gene expression. In the context of a synthetic promoter, the cis-sequence is shown to achieve a strong increase in reporter gene activity as a monomer, dimer and tetramer. Mutation analysis of the cis-sequence determined the specific nucleotides required for gene expression activation. In transgenic A. thaliana the synthetic promoter harbouring a tetramer of the cis-sequence not only drives strong pathogen-responsive reporter gene expression but also shows a high background activity. The results of this study contribute to our understanding of how genes with very short upstream intergenic regions are regulated and how these regions can serve as a source for MAMP-responsive cis-sequences for synthetic promoter design. PMID:26833485

  12. 'In silico expression analysis', a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences.

    PubMed

    Bolívar, Julio C; Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated 'in silico expression analysis' was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the 'in silico expression analysis' resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the 'in silico expression analysis' predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. DATABASE URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366
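
The first step described in this abstract, identifying genes whose promoter harbours a submitted cis-sequence, can be sketched as a scan of both DNA strands. The WT-box sequence CGACTTTT is taken from the abstract; the promoter sequences and the GENE_A, GENE_B, and GENE_C identifiers below are invented placeholders, and the comparison with microarray data is not shown.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_motif(promoter, motif):
    """Return sorted (position, strand) hits, 0-based on the forward strand.

    Note: a palindromic motif would be reported once per strand.
    """
    promoter = promoter.upper()
    motif = motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)

def genes_with_motif(promoters, motif):
    """Map each gene identifier to its motif hits, omitting genes without hits."""
    result = {}
    for gene, sequence in promoters.items():
        hits = find_motif(sequence, motif)
        if hits:
            result[gene] = hits
    return result

if __name__ == "__main__":
    # Placeholder promoter sequences (not real A. thaliana promoters)
    promoters = {
        "GENE_A": "TTCGACTTTTGG",
        "GENE_B": "GGAAAAGTCGCC",
        "GENE_C": "ACGTACGTACGT",
    }
    print(genes_with_motif(promoters, "CGACTTTT"))
```

The resulting gene list is what would then be compared against expression data to rank stress conditions; how that ranking is computed is specific to the web tool and is not assumed here.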

  13. ‘In silico expression analysis’, a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences

    PubMed Central

    Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated ‘in silico expression analysis’ was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the ‘in silico expression analysis’ resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the ‘in silico expression analysis’ predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. Database URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366

  14. Mapping gene regulatory networks in Drosophila eye development by large-scale transcriptome perturbations and motif inference.

    PubMed

    Potier, Delphine; Davie, Kristofer; Hulselmans, Gert; Naval Sanchez, Marina; Haagen, Lotte; Huynh-Thu, Vân Anh; Koldere, Duygu; Celik, Arzu; Geurts, Pierre; Christiaens, Valerie; Aerts, Stein

    2014-12-24

    Genome control is operated by transcription factors (TFs) controlling their target genes by binding to promoters and enhancers. Conceptually, the interactions between TFs, their binding sites, and their functional targets are represented by gene regulatory networks (GRNs). Deciphering in vivo GRNs underlying organ development in an unbiased genome-wide setting involves identifying both functional TF-gene interactions and physical TF-DNA interactions. To reverse engineer the GRNs of eye development in Drosophila, we performed RNA-seq across 72 genetic perturbations and sorted cell types and inferred a coexpression network. Next, we derived direct TF-DNA interactions using computational motif inference, ultimately connecting 241 TFs to 5,632 direct target genes through 24,926 enhancers. Using this network, we found network motifs, cis-regulatory codes, and regulators of eye development. We validate the predicted target regions of Grainyhead by ChIP-seq and identify this factor as a general cofactor in the eye network, being bound to thousands of nucleosome-free regions. PMID:25533349

  15. Directional Phosphorylation and Nuclear Transport of the Splicing Factor SRSF1 Is Regulated by an RNA Recognition Motif.

    PubMed

    Serrano, Pedro; Aubol, Brandon E; Keshwani, Malik M; Forli, Stefano; Ma, Chen-Ting; Dutta, Samit K; Geralt, Michael; Wüthrich, Kurt; Adams, Joseph A

    2016-06-01

    Multisite phosphorylation is required for the biological function of serine-arginine (SR) proteins, a family of essential regulators of mRNA splicing. These modifications are catalyzed by serine-arginine protein kinases (SRPKs) that phosphorylate numerous serines in arginine-serine-rich (RS) domains of SR proteins using a directional, C-to-N-terminal mechanism. The present studies explore how SRPKs govern this highly biased phosphorylation reaction and investigate biological roles of the observed directional phosphorylation mechanism. Using NMR spectroscopy with two separately expressed domains of SRSF1, we showed that several residues in the RNA-binding motif 2 interact with the N-terminal region of the RS domain (RS1). These contacts provide a structural framework that balances the activities of SRPK1 and the protein phosphatase PP1, thereby regulating the phosphoryl content of the RS domain. Disruption of the implicated intramolecular RNA-binding motif 2-RS domain interaction impairs both the directional phosphorylation mechanism and the nuclear translocation of SRSF1 demonstrating that the intrinsic phosphorylation bias is obligatory for SR protein biological function. PMID:27091468

  16. A cis-regulatory site downregulates PTHLH in translocation t(8;12)(q13;p11.2) and leads to Brachydactyly Type E

    PubMed Central

    Maass, Philipp G.; Wirth, Jutta; Aydin, Atakan; Rump, Andreas; Stricker, Sigmar; Tinschert, Sigrid; Otero, Miguel; Tsuchimochi, Kaneyuki; Goldring, Mary B.; Luft, Friedrich C.; Bähring, Sylvia

    2010-01-01

    Parathyroid hormone-like hormone (PTHLH) is an important chondrogenic regulator; however, the gene has not been directly linked to human disease. We studied a family with autosomal-dominant Brachydactyly Type E (BDE) and identified a t(8;12)(q13;p11.2) translocation with breakpoints (BPs) upstream of PTHLH on chromosome 12p11.2 and a disrupted KCNB2 on 8q13. We sequenced the BPs and identified a highly conserved Activator protein 1 (AP-1) motif on 12p11.2, together with a C-ets-1 motif translocated from 8q13. AP-1 and C-ets-1 bound in vitro and in vivo at the derivative chromosome 8 breakpoint [der(8) BP], but were differently enriched between the wild-type and BP allele. We differentiated fibroblasts from BDE patients into chondrogenic cells and found that PTHLH and its targets, ADAMTS-7 and ADAMTS-12 were downregulated along with impaired chondrogenic differentiation. We next used human and murine chondrocytes and observed that the AP-1 motif stimulated, whereas der(8) BP or C-ets-1 decreased, PTHLH promoter activity. These results are the first to identify a cis-directed PTHLH downregulation as primary cause of human chondrodysplasia. PMID:20015959

  17. The Significance of Multivalent Bonding Motifs and "Bond Order" in DNA-Directed Nanoparticle Crystallization.

    PubMed

    Thaner, Ryan V; Eryazici, Ibrahim; Macfarlane, Robert J; Brown, Keith A; Lee, Byeongdu; Nguyen, SonBinh T; Mirkin, Chad A

    2016-05-18

    Multivalent oligonucleotide-based bonding elements have been synthesized and studied for the assembly and crystallization of gold nanoparticles. Through the use of organic branching points, divalent and trivalent DNA linkers were readily incorporated into the oligonucleotide shells that define DNA-nanoparticles and compared to monovalent linker systems. These multivalent bonding motifs enable the change of "bond strength" between particles and therefore modulate the effective "bond order." In addition, the improved accessibility of strands between neighboring particles, either due to multivalency or modifications to increase strand flexibility, gives rise to superlattices with less strain in the crystallites compared to traditional designs. Furthermore, the increased availability and number of binding modes also provide a new variable that allows previously unobserved crystal structures to be synthesized, as evidenced by the formation of a thorium phosphide superlattice. PMID:27148838

  18. Microevolution of cis-regulatory elements: an example from the pair-rule segmentation gene fushi tarazu in the Drosophila melanogaster subgroup.

    PubMed

    Bakkali, Mohammed

    2011-01-01

    The importance of non-coding DNAs that control transcription is ever noticeable, but the characterization and analysis of the evolution of such DNAs presents challenges not found in the analysis of coding sequences. In this study of the cis-regulatory elements of the pair rule segmentation gene fushi tarazu (ftz) I report the DNA sequences of ftz's zebra element (promoter) and a region containing the proximal enhancer from a total of 45 fly lines belonging to several populations of the species Drosophila melanogaster, D. simulans, D. sechellia, D. mauritiana, D. yakuba, D. teissieri, D. orena and D. erecta. Both elements evolve at slower rate than ftz synonymous sites, thus reflecting their functional importance. The promoter evolves more slowly than the average for ftz's coding sequence while, on average, the enhancer evolves more rapidly, suggesting more functional constraint and effective purifying selection on the former. Comparative analysis of the number and nature of base substitutions failed to detect significant evidence for positive/adaptive selection in transcription-factor-binding sites. These seem to evolve at similar rates to regions not known to bind transcription factors. Although this result reflects the evolutionary flexibility of the transcription factor binding sites, it also suggests a complex and still not completely understood nature of even the characterized cis-regulatory sequences. The latter seem to contain more functional parts than those currently identified, some of which probably transcription factor binding. This study illustrates ways in which functional assignments of sequences within cis-acting sequences can be used in the search for adaptive evolution, but also highlights difficulties in how such functional assignment and analysis can be carried out. PMID:22073317

  19. A point mutation to Galphai selectively blocks GoLoco motif binding: direct evidence for Galpha.GoLoco complexes in mitotic spindle dynamics.

    PubMed

    Willard, Francis S; Zheng, Zhen; Guo, Juan; Digby, Gregory J; Kimple, Adam J; Conley, Jason M; Johnston, Christopher A; Bosch, Dustin; Willard, Melinda D; Watts, Val J; Lambert, Nevin A; Ikeda, Stephen R; Du, Quansheng; Siderovski, David P

    2008-12-26

    Heterotrimeric G-protein Galpha subunits and GoLoco motif proteins are key members of a conserved set of regulatory proteins that influence invertebrate asymmetric cell division and vertebrate neuroepithelium and epithelial progenitor differentiation. GoLoco motif proteins bind selectively to the inhibitory subclass (Galphai) of Galpha subunits, and thus it is assumed that a Galphai.GoLoco motif protein complex plays a direct functional role in microtubule dynamics underlying spindle orientation and metaphase chromosomal segregation during cell division. To address this hypothesis directly, we rationally identified a point mutation to Galphai subunits that renders a selective loss-of-function for GoLoco motif binding, namely an asparagine-to-isoleucine substitution in the alphaD-alphaE loop of the Galpha helical domain. This GoLoco-insensitivity ("GLi") mutation prevented Galphai1 association with all human GoLoco motif proteins and abrogated interaction between the Caenorhabditis elegans Galpha subunit GOA-1 and the GPR-1 GoLoco motif. In contrast, the GLi mutation did not perturb any other biochemical or signaling properties of Galphai subunits, including nucleotide binding, intrinsic and RGS protein-accelerated GTP hydrolysis, and interactions with Gbetagamma dimers, adenylyl cyclase, and seven transmembrane-domain receptors. GoLoco insensitivity rendered Galphai subunits unable to recruit GoLoco motif proteins such as GPSM2/LGN and GPSM3 to the plasma membrane, and abrogated the exaggerated mitotic spindle rocking normally seen upon ectopic expression of wild type Galphai subunits in kidney epithelial cells. This GLi mutation should prove valuable in establishing the physiological roles of Galphai.GoLoco motif protein complexes in microtubule dynamics and spindle function during cell division as well as to delineate potential roles for GoLoco motifs in receptor-mediated signal transduction. PMID:18984596

  20. Sex Chromosome-wide Transcriptional Suppression and Compensatory Cis-Regulatory Evolution Mediate Gene Expression in the Drosophila Male Germline.

    PubMed

    Landeen, Emily L; Muirhead, Christina A; Wright, Lori; Meiklejohn, Colin D; Presgraves, Daven C

    2016-07-01

    The evolution of heteromorphic sex chromosomes has repeatedly resulted in the evolution of sex chromosome-specific forms of regulation, including sex chromosome dosage compensation in the soma and meiotic sex chromosome inactivation in the germline. In the male germline of Drosophila melanogaster, a novel but poorly understood form of sex chromosome-specific transcriptional regulation occurs that is distinct from canonical sex chromosome dosage compensation or meiotic inactivation. Previous work shows that expression of reporter genes driven by testis-specific promoters is considerably lower, approximately 3-fold or more, for transgenes inserted into X chromosome versus autosome locations. Here we characterize this transcriptional suppression of X-linked genes in the male germline and its evolutionary consequences. Using transgenes and transpositions, we show that most endogenous X-linked genes, not just testis-specific ones, are transcriptionally suppressed several-fold specifically in the Drosophila male germline. In wild-type testes, this sex chromosome-wide transcriptional suppression is generally undetectable, being effectively compensated by the gene-by-gene evolutionary recruitment of strong promoters on the X chromosome. We identify and experimentally validate a promoter element sequence motif that is enriched upstream of the transcription start sites of hundreds of testis-expressed genes; evolutionarily conserved across species; associated with strong gene expression levels in testes; and overrepresented on the X chromosome. These findings show that the expression of X-linked genes in the Drosophila testes reflects a balance between chromosome-wide epigenetic transcriptional suppression and long-term compensatory adaptation by sex-linked genes. Our results have broad implications for the evolution of gene expression in the Drosophila male germline and for genome evolution. PMID:27404402

  1. Sex Chromosome-wide Transcriptional Suppression and Compensatory Cis-Regulatory Evolution Mediate Gene Expression in the Drosophila Male Germline

    PubMed Central

    Landeen, Emily L.; Muirhead, Christina A.; Meiklejohn, Colin D.; Presgraves, Daven C.

    2016-01-01

    The evolution of heteromorphic sex chromosomes has repeatedly resulted in the evolution of sex chromosome-specific forms of regulation, including sex chromosome dosage compensation in the soma and meiotic sex chromosome inactivation in the germline. In the male germline of Drosophila melanogaster, a novel but poorly understood form of sex chromosome-specific transcriptional regulation occurs that is distinct from canonical sex chromosome dosage compensation or meiotic inactivation. Previous work shows that expression of reporter genes driven by testis-specific promoters is considerably lower—approximately 3-fold or more—for transgenes inserted into X chromosome versus autosome locations. Here we characterize this transcriptional suppression of X-linked genes in the male germline and its evolutionary consequences. Using transgenes and transpositions, we show that most endogenous X-linked genes, not just testis-specific ones, are transcriptionally suppressed several-fold specifically in the Drosophila male germline. In wild-type testes, this sex chromosome-wide transcriptional suppression is generally undetectable, being effectively compensated by the gene-by-gene evolutionary recruitment of strong promoters on the X chromosome. We identify and experimentally validate a promoter element sequence motif that is enriched upstream of the transcription start sites of hundreds of testis-expressed genes; evolutionarily conserved across species; associated with strong gene expression levels in testes; and overrepresented on the X chromosome. These findings show that the expression of X-linked genes in the Drosophila testes reflects a balance between chromosome-wide epigenetic transcriptional suppression and long-term compensatory adaptation by sex-linked genes. Our results have broad implications for the evolution of gene expression in the Drosophila male germline and for genome evolution. PMID:27404402

  2. Integrative Modeling of eQTLs and Cis-Regulatory Elements Suggests Mechanisms Underlying Cell Type Specificity of eQTLs

    PubMed Central

    Brown, Christopher D.; Mangravite, Lara M.; Engelhardt, Barbara E.

    2013-01-01

    Genetic variants in cis-regulatory elements or trans-acting regulators frequently influence the quantity and spatiotemporal distribution of gene transcription. Recent interest in expression quantitative trait locus (eQTL) mapping has paralleled the adoption of genome-wide association studies (GWAS) for the analysis of complex traits and disease in humans. Under the hypothesis that many GWAS associations tag non-coding SNPs with small effects, and that these SNPs exert phenotypic control by modifying gene expression, it has become common to interpret GWAS associations using eQTL data. To fully exploit the mechanistic interpretability of eQTL-GWAS comparisons, an improved understanding of the genetic architecture and causal mechanisms of cell type specificity of eQTLs is required. We address this need by performing an eQTL analysis in three parts: first we identified eQTLs from eleven studies on seven cell types; then we integrated eQTL data with cis-regulatory element (CRE) data from the ENCODE project; finally we built a set of classifiers to predict the cell type specificity of eQTLs. The cell type specificity of eQTLs is associated with eQTL SNP overlap with hundreds of cell type specific CRE classes, including enhancer, promoter, and repressive chromatin marks, regions of open chromatin, and many classes of DNA binding proteins. These associations provide insight into the molecular mechanisms generating the cell type specificity of eQTLs and the mode of regulation of corresponding eQTLs. Using a random forest classifier with cell specific CRE-SNP overlap as features, we demonstrate the feasibility of predicting the cell type specificity of eQTLs. We then demonstrate that CREs from a trait-associated cell type can be used to annotate GWAS associations in the absence of eQTL data for that cell type. We anticipate that such integrative, predictive modeling of cell specificity will improve our ability to understand the mechanistic basis of human complex phenotypic

  3. The lncRNA Malat1 is dispensable for mouse development but its transcription plays a cis-regulatory role in the adult

    PubMed Central

    Zhang, Bin; Arun, Gayatri; Mao, Yuntao S.; Lazar, Zsolt; Hung, Gene; Bhattacharjee, Gourab; Xiao, Xiaokun; Booth, Carmen J.; Wu, Jie; Zhang, Chaolin; Spector, David L.

    2012-01-01

    SUMMARY Genome-wide studies have identified thousands of long noncoding RNAs (lncRNAs) lacking protein coding capacity. However, most lncRNAs are expressed at a very low level, and in most cases there is no genetic evidence to support their in vivo function. Malat1 (metastasis associated lung adenocarcinoma transcript 1) is among the most abundant and highly conserved lncRNAs, and it exhibits an uncommon 3′-end processing mechanism. In addition, its specific nuclear localization, developmental regulation, and dysregulation in cancer are suggestive of it having a critical biological function. We have characterized a Malat1 loss-of-function genetic model that indicates Malat1 is not essential for mouse pre- and post-natal development. Furthermore, depletion of Malat1 does not impact global gene expression, splicing factor level and phosphorylation status, or alternative pre-mRNA splicing. However, among a small number of genes that were dysregulated in adult Malat1 knockout mice, many were Malat1 neighboring genes, thus indicating a potential cis regulatory role of Malat1 gene transcription. PMID:22840402

  4. A cis-Regulatory Mutation in Troponin-I of Drosophila Reveals the Importance of Proper Stoichiometry of Structural Proteins During Muscle Assembly

    PubMed Central

    Firdaus, Hena; Mohan, Jayaram; Naz, Sarwat; Arathi, Prabhashankar; Ramesh, Saraf R.; Nongthomba, Upendra

    2015-01-01

    Rapid and high wing-beat frequencies achieved during insect flight are powered by the indirect flight muscles, the largest group of muscles present in the thorax. Any anomaly during the assembly and/or structural impairment of the indirect flight muscles gives rise to a flightless phenotype. Multiple mutagenesis screens in Drosophila melanogaster for defective flight behavior have led to the isolation and characterization of mutations that have been instrumental in the identification of many proteins and residues that are important for muscle assembly, function, and disease. In this article, we present a molecular-genetic characterization of a flightless mutation, flightless-H (fliH), originally designated as heldup-a (hdp-a). We show that fliH is a cis-regulatory mutation of the wings up A (wupA) gene, which codes for the troponin-I protein, one of the troponin complex proteins, involved in regulation of muscle contraction. The mutation leads to reduced levels of troponin-I transcript and protein. In addition to this, there is also coordinated reduction in transcript and protein levels of other structural protein isoforms that are part of the troponin complex. The altered transcript and protein stoichiometry ultimately culminates in unregulated acto-myosin interactions and a hypercontraction muscle phenotype. Our results shed new insights into the importance of maintaining the stoichiometry of structural proteins during muscle assembly for proper function with implications for the identification of mutations and disease phenotypes in other species, including humans. PMID:25747460

  5. A Hox Transcription Factor Collective Binds a Highly Conserved Distal-less cis-Regulatory Module to Generate Robust Transcriptional Outcomes

    PubMed Central

    Uhl, Juli D.; Zandvakili, Arya; Gebelein, Brian

    2016-01-01

    cis-regulatory modules (CRMs) generate precise expression patterns by integrating numerous transcription factors (TFs). Surprisingly, CRMs that control essential gene patterns can differ greatly in conservation, suggesting distinct constraints on TF binding sites. Here, we show that a highly conserved Distal-less regulatory element (DCRE) that controls gene expression in leg precursor cells recruits multiple Hox, Extradenticle (Exd) and Homothorax (Hth) complexes to mediate dual outputs: thoracic activation and abdominal repression. Using reporter assays, we found that abdominal repression is particularly robust, as neither individual binding site mutations nor a DNA binding deficient Hth protein abolished cooperative DNA binding and in vivo repression. Moreover, a re-engineered DCRE containing a distinct configuration of Hox, Exd, and Hth sites also mediated abdominal Hox repression. However, the re-engineered DCRE failed to perform additional segment-specific functions such as thoracic activation. These findings are consistent with two emerging concepts in gene regulation: First, the abdominal Hox/Exd/Hth factors utilize protein-protein and protein-DNA interactions to form repression complexes on flexible combinations of sites, consistent with the TF collective model of CRM organization. Second, the conserved DCRE mediates multiple cell-type specific outputs, consistent with recent findings that pleiotropic CRMs are associated with conserved TF binding and added evolutionary constraints. PMID:27058369

  6. Maps of cis-Regulatory Nodes in Megabase Long Genome Segments are an Inevitable Intermediate Step Toward Whole Genome Functional Mapping

    PubMed Central

    Nikolaev, Lev G; Akopov, Sergey B; Chernov, Igor P; Sverdlov, Eugene D

    2007-01-01

    The availability of complete human and other metazoan genome sequences has greatly facilitated positioning and analysis of various genomic functional elements, with initial emphasis on coding sequences. However, complete functional maps of sequenced eukaryotic genomes should include also positions of all non-coding regulatory elements. Unfortunately, experimental data on genomic positions of a multitude of regulatory sequences, such as enhancers, silencers, insulators, transcription terminators, and replication origins are very limited, especially at the whole genome level. Since most genomic regulatory elements (e.g. enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements by computational methods is difficult and often ambiguous. Therefore, the development of high-throughput experimental approaches for identifying and mapping genomic functional elements is highly desirable. At the same time, the creation of whole-genome map of hundreds of thousands of regulatory elements in several hundreds of tissue/cell types is presently far beyond our capabilities. A possible alternative for the whole genome approach is to concentrate efforts on individual genomic segments and then to integrate the data obtained into a whole genome functional map. Moreover, the maps of polygenic fragments with functional cis-regulatory elements would provide valuable data on complex regulatory systems, including their variability and evolution. Here, we reviewed experimental approaches to the realization of these ideas, including our own developments of experimental techniques for selection of cis-acting functionally active DNA fragments from large (megabase-sized) segments of mammalian genomes. PMID:18660850

  7. Maps of cis-Regulatory Nodes in Megabase Long Genome Segments are an Inevitable Intermediate Step Toward Whole Genome Functional Mapping.

    PubMed

    Nikolaev, Lev G; Akopov, Sergey B; Chernov, Igor P; Sverdlov, Eugene D

    2007-04-01

    The availability of complete human and other metazoan genome sequences has greatly facilitated positioning and analysis of various genomic functional elements, with initial emphasis on coding sequences. However, complete functional maps of sequenced eukaryotic genomes should include also positions of all non-coding regulatory elements. Unfortunately, experimental data on genomic positions of a multitude of regulatory sequences, such as enhancers, silencers, insulators, transcription terminators, and replication origins are very limited, especially at the whole genome level. Since most genomic regulatory elements (e.g. enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements by computational methods is difficult and often ambiguous. Therefore, the development of high-throughput experimental approaches for identifying and mapping genomic functional elements is highly desirable. At the same time, the creation of whole-genome map of hundreds of thousands of regulatory elements in several hundreds of tissue/cell types is presently far beyond our capabilities. A possible alternative for the whole genome approach is to concentrate efforts on individual genomic segments and then to integrate the data obtained into a whole genome functional map. Moreover, the maps of polygenic fragments with functional cis-regulatory elements would provide valuable data on complex regulatory systems, including their variability and evolution. Here, we reviewed experimental approaches to the realization of these ideas, including our own developments of experimental techniques for selection of cis-acting functionally active DNA fragments from large (megabase-sized) segments of mammalian genomes. PMID:18660850

  8. Distinct cis-Regulatory Elements from the Dlx1/Dlx2 Locus Mark Different Progenitor Cell Populations in the Ganglionic Eminences and Different Subtypes of Adult Cortical Interneurons

    PubMed Central

    Ghanem, Noël; Yu, Man; Long, Jason; Hatch, Gary; Rubenstein, John L. R.; Ekker, Marc

    2016-01-01

    Distinct subtypes of cortical GABAergic interneurons provide inhibitory signals that are indispensable for neural network function. The Dlx homeobox genes have a central role in regulating their development and function. We have characterized the activity of three cis-regulatory sequences involved in forebrain expression of vertebrate Dlx genes: upstream regulatory element 2 (URE2), I12b, and I56i. The three regulatory elements display regional and temporal differences in their activities within the lateral ganglionic eminence (LGE), medial ganglionic eminence (MGE), and caudal ganglionic eminence (CGE) and label distinct populations of tangentially migrating neurons at embryonic day 12.5 (E12.5) and E13.5. We provide evidence that the dorsomedial and ventral MGE are distinct sources of tangentially migrating neurons during midgestation. In the adult cortex, URE2 and I12b/I56i are differentially expressed in parvalbumin-, calretinin-, neuropeptide Y-, and neuronal nitric oxide synthase-positive interneurons; I12b and I56i were specifically active in somatostatin-, vasoactive intestinal peptide-, and calbindin-positive interneurons. These data suggest that interneuron subtypes use distinct combinations of Dlx1/Dlx2 enhancers from the time they are specified through adulthood. PMID:17494687

  9. A Hox Transcription Factor Collective Binds a Highly Conserved Distal-less cis-Regulatory Module to Generate Robust Transcriptional Outcomes.

    PubMed

    Uhl, Juli D; Zandvakili, Arya; Gebelein, Brian

    2016-04-01

    cis-regulatory modules (CRMs) generate precise expression patterns by integrating numerous transcription factors (TFs). Surprisingly, CRMs that control essential gene patterns can differ greatly in conservation, suggesting distinct constraints on TF binding sites. Here, we show that a highly conserved Distal-less regulatory element (DCRE) that controls gene expression in leg precursor cells recruits multiple Hox, Extradenticle (Exd) and Homothorax (Hth) complexes to mediate dual outputs: thoracic activation and abdominal repression. Using reporter assays, we found that abdominal repression is particularly robust, as neither individual binding site mutations nor a DNA binding deficient Hth protein abolished cooperative DNA binding and in vivo repression. Moreover, a re-engineered DCRE containing a distinct configuration of Hox, Exd, and Hth sites also mediated abdominal Hox repression. However, the re-engineered DCRE failed to perform additional segment-specific functions such as thoracic activation. These findings are consistent with two emerging concepts in gene regulation: First, the abdominal Hox/Exd/Hth factors utilize protein-protein and protein-DNA interactions to form repression complexes on flexible combinations of sites, consistent with the TF collective model of CRM organization. Second, the conserved DCRE mediates multiple cell-type specific outputs, consistent with recent findings that pleiotropic CRMs are associated with conserved TF binding and added evolutionary constraints. PMID:27058369

  10. Autosomal recessive retinitis pigmentosa with homozygous rhodopsin mutation E150K and non-coding cis-regulatory variants in CRX-binding regions of SAMD7

    PubMed Central

    Van Schil, Kristof; Karlstetter, Marcus; Aslanidis, Alexander; Dannhausen, Katharina; Azam, Maleeha; Qamar, Raheel; Leroy, Bart P.; Depasse, Fanny; Langmann, Thomas; De Baere, Elfride

    2016-01-01

    The aim of this study was to unravel the molecular pathogenesis of an unusual retinitis pigmentosa (RP) phenotype observed in a Turkish consanguineous family. Homozygosity mapping revealed two candidate genes, SAMD7 and RHO. A homozygous RHO mutation c.448G>A, p.E150K was found in two affected siblings, while no coding SAMD7 mutations were identified. Interestingly, four non-coding homozygous variants were found in two SAMD7 genomic regions relevant for binding of the retinal transcription factor CRX (CRX-bound regions, CBRs) in these affected siblings. Three variants are located in a promoter CBR termed CBR1, while the fourth is located more downstream in CBR2. Transcriptional activity of these variants was assessed by luciferase assays and electroporation of mouse retinal explants with reporter constructs of wild-type and variant SAMD7 CBRs. The combined CBR2/CBR1 variant construct showed significantly decreased SAMD7 reporter activity compared to the wild-type sequence, suggesting a cis-regulatory effect on SAMD7 expression. As Samd7 is a recently identified Crx-regulated transcriptional repressor in retina, we hypothesize that these SAMD7 variants might contribute to the retinal phenotype observed here, characterized by unusual, recognizable pigment deposits, differing from the classic spicular intraretinal pigmentation observed in other individuals homozygous for p.E150K, and typically associated with RP in general. PMID:26887858

  11. Germ line and embryonic expression of Fex, a member of the Drosophila F-element retrotransposon family, is mediated by an internal cis-regulatory control region.

    PubMed Central

    Kerber, B; Fellert, S; Taubert, H; Hoch, M

    1996-01-01

    The F elements of Drosophila melanogaster belong to the superfamily of long interspersed nucleotide element retrotransposons. To date, F-element transcription has not been detected in flies. Here we describe the isolation of a member of the F-element family, termed Fex, which is transcribed in specific cells of the female and male germ lines and in various tissues during embryogenesis of D. melanogaster. Sequence analysis revealed that this element contains two complete open reading frames coding for a putative nucleic acid-binding protein and a putative reverse transcriptase. Functional analysis of the 5' region, using germ line transformation of Fex-lacZ reporter gene constructs, demonstrates that major aspects of tissue-specific Fex expression are controlled by internal cis-acting elements that lie in the putative coding region of open reading frame 1. These sequences mediate dynamic gene expression in eight expression domains during embryonic and germ line development. The capacity of the cis-regulatory region of the Fex element to mediate such complex expression patterns is unique among members of the long interspersed nucleotide element superfamily of retrotransposons and is reminiscent of regulatory regions of developmental control genes. PMID:8649411

  12. Direct Imaging of Hippocampal Epileptiform Calcium Motifs Following Kainic Acid Administration in Freely Behaving Mice

    PubMed Central

    Berdyyeva, Tamara K.; Frady, E. Paxon; Nassi, Jonathan J.; Aluisio, Leah; Cherkas, Yauheniya; Otte, Stephani; Wyatt, Ryan M.; Dugovic, Christine; Ghosh, Kunal K.; Schnitzer, Mark J.; Lovenberg, Timothy; Bonaventure, Pascal

    2016-01-01

    Prolonged exposure to abnormally high calcium concentrations is thought to be a core mechanism underlying hippocampal damage in epileptic patients; however, no prior study has characterized calcium activity during seizures in the live, intact hippocampus. We have directly investigated this possibility by combining whole-brain electroencephalographic (EEG) measurements with microendoscopic calcium imaging of pyramidal cells in the CA1 hippocampal region of freely behaving mice treated with the pro-convulsant kainic acid (KA). We observed that KA administration led to systematic patterns of epileptiform calcium activity: a series of large-scale, intensifying flashes of increased calcium fluorescence concurrent with a cluster of low-amplitude EEG waveforms. This was accompanied by a steady increase in cellular calcium levels (>5-fold increase relative to the baseline), followed by an intense spreading calcium wave characterized by a 218% increase in global mean intensity of calcium fluorescence (n = 8, range [114–349%], p < 10⁻⁴; t-test). The wave had no consistent EEG phenotype and occurred before the onset of motor convulsions. Similar changes in calcium activity were also observed in animals treated with 2 different proconvulsant agents, N-methyl-D-aspartate (NMDA) and pentylenetetrazol (PTZ), suggesting the measured changes in calcium dynamics are a signature of seizure activity rather than a KA-specific pathology. Additionally, despite reducing the behavioral severity of KA-induced seizures, the anticonvulsant drug valproate (VA, 300 mg/kg) did not modify the observed abnormalities in calcium dynamics. These results confirm the presence of pathological calcium activity preceding convulsive motor seizures and support calcium as a candidate signaling molecule in a pathway connecting seizures to subsequent cellular damage. Integrating in vivo calcium imaging with traditional assessment of seizures could potentially increase translatability of pharmacological

  13. Novel applications of motif-directed profiling to identify disease resistance genes in plants

    PubMed Central

    2013-01-01

    Background Molecular profiling of gene families is a versatile tool to study diversity between individual genomes in sexual crosses and germplasm. Nucleotide binding site (NBS) profiling, in particular, targets conserved nucleotide binding site-encoding sequences of resistance gene analogs (RGAs), and is widely used to identify molecular markers for disease resistance (R) genes. Results In this study, we used NBS profiling to identify genome-wide locations of RGA clusters in the genome of potato clone RH. Positions of RGAs in the potato RH and DM genomes that were generated using profiling and genome sequencing, respectively, were compared. Largely overlapping results, but also interesting discrepancies, were found. Due to the clustering of RGAs, several parts of the genome are overexposed while others remain underexposed using NBS profiling. It is shown how the profiling of other gene families, i.e. protein kinases and different protein domain-coding sequences (i.e., TIR), can be used to achieve a better marker distribution. The power of profiling techniques is further illustrated using RGA cluster-directed profiling in a population of Solanum berthaultii. Multiple different paralogous RGAs within the Rpi-ber cluster could be genetically distinguished. Finally, an adaptation of the profiling protocol was made that allowed the parallel sequencing of profiling fragments using next generation sequencing. The types of RGAs that were tagged in this next-generation profiling approach largely overlapped with classical gel-based profiling. As a potential application of next-generation profiling, we showed how the R gene family associated with late blight resistance in the SH*RH population could be identified using a bulked segregant approach. Conclusions In this study, we provide a comprehensive overview of previously described and novel profiling primers and their genomic targets in potato through genetic mapping and comparative genomics. Furthermore, it is shown how

  14. Epsilon glutathione transferases possess a unique class-conserved subunit interface motif that directly interacts with glutathione in the active site.

    PubMed

    Wongsantichon, Jantana; Robinson, Robert C; Ketterman, Albert J

    2015-01-01

    Epsilon class glutathione transferases (GSTs) have been shown to contribute significantly to insecticide resistance. We report a new Epsilon class protein crystal structure from Drosophila melanogaster for the glutathione transferase DmGSTE6. The structure reveals a novel Epsilon clasp motif that is conserved across hundreds of millions of years of evolution of the insect Diptera order. This histidine-serine motif lies in the subunit interface and appears to contribute to quaternary stability as well as directly connecting the two glutathiones in the active sites of this dimeric enzyme. PMID:26487708

  15. Epsilon glutathione transferases possess a unique class-conserved subunit interface motif that directly interacts with glutathione in the active site

    PubMed Central

    Wongsantichon, Jantana; Robinson, Robert C.; Ketterman, Albert J.

    2015-01-01

    Epsilon class glutathione transferases (GSTs) have been shown to contribute significantly to insecticide resistance. We report a new Epsilon class protein crystal structure from Drosophila melanogaster for the glutathione transferase DmGSTE6. The structure reveals a novel Epsilon clasp motif that is conserved across hundreds of millions of years of evolution of the insect Diptera order. This histidine-serine motif lies in the subunit interface and appears to contribute to quaternary stability as well as directly connecting the two glutathiones in the active sites of this dimeric enzyme. PMID:26487708

  16. Direct contacts between conserved motifs of different subunits provide major contribution to active site organization in human and mycobacterial dUTPases

    PubMed Central

    Takács, Enikő; Nagy, Gergely; Leveles, Ibolya; Harmat, Veronika; Lopata, Anna; Tóth, Judit; Vértessy, Beáta G.

    2010-01-01

    dUTPases are essential for genome integrity. Recent results allowed characterization of the role of conserved residues. Here we analyzed the Asp/Asn mutation within conserved Motif I of human and mycobacterial dUTPases, wherein the Asp residue was previously implicated in Mg2+-coordination. Our results on transient/steady-state kinetics, ligand-binding and a 1.80 Å-resolution structure of the mutant mycobacterial enzyme, in comparison with wild type and C-terminally truncated structures, argue that this residue has a major role in providing intra- and intersubunit contacts, but is not essential for Mg2+ accommodation. We conclude that in addition to the role of conserved motifs in substrate accommodation, direct subunit interactions between protein atoms of active site residues from different conserved motifs are crucial for enzyme function. PMID:20493855

  17. Direct contacts between conserved motifs of different subunits provide major contribution to active site organization in human and mycobacterial dUTPases.

    PubMed

    Takács, Enikő; Nagy, Gergely; Leveles, Ibolya; Harmat, Veronika; Lopata, Anna; Tóth, Judit; Vértessy, Beáta G

    2010-07-16

    dUTP pyrophosphatases (dUTPases) are essential for genome integrity. Recent results allowed characterization of the role of conserved residues. Here we analyzed the Asp/Asn mutation within conserved Motif I of human and mycobacterial dUTPases, wherein the Asp residue was previously implicated in Mg(2+)-coordination. Our results on transient/steady-state kinetics, ligand binding and a 1.80 Å resolution structure of the mutant mycobacterial enzyme, in comparison with wild type and C-terminally truncated structures, argue that this residue has a major role in providing intra- and intersubunit contacts, but is not essential for Mg(2+) accommodation. We conclude that in addition to the role of conserved motifs in substrate accommodation, direct subunit interactions between protein atoms of active site residues from different conserved motifs are crucial for enzyme function. PMID:20493855

  18. Identification of cis regulatory features in the embryonic zebrafish genome through large-scale profiling of H3K4me1 and H3K4me3 binding sites

    PubMed Central

    Aday, Aaron W.; Zhu, Lihua Julie; Lakshmanan, Abirami; Wang, Jie; Lawson, Nathan D.

    2011-01-01

    An organism’s genome sequence serves as a blueprint for the proteins and regulatory RNAs essential for cellular function. The genome also harbors cis-acting non-coding sequences that control gene expression and are essential to coordinate regulatory programs during embryonic development. However, the genome sequence is largely identical between cell types within a multi-cellular organism indicating that factors such as DNA accessibility and chromatin structure play a crucial role in governing cell-specific gene expression. Recent studies have identified particular chromatin modifications that define functionally distinct cis regulatory elements. Among these are forms of histone 3 that are mono- or tri-methylated at lysine 4 (H3K4me1 or H3K4me3, respectively), which bind preferentially to promoter and enhancer elements in the mammalian genome. In this work, we investigated whether these modified histones could similarly identify cis regulatory elements within the zebrafish genome. By applying chromatin immunoprecipitation followed by deep sequencing, we find that H3K4me1 and H3K4me3 are enriched at transcriptional start sites in the genome of the developing zebrafish embryo and that this association correlates with gene expression. We further find that these modifications associate with distal non-coding conserved elements, including known active enhancers. Finally, we demonstrate that it is possible to utilize H3K4me1 and H3K4me3 binding profiles in combination with available expression data to computationally identify relevant cis regulatory sequences flanking syn-expressed genes in the developing embryo. Taken together, our results indicate that H3K4me1 and H3K4me3 generally mark cis regulatory elements within the zebrafish genome and indicate that further characterization of the zebrafish using this approach will prove valuable in defining transcriptional networks in this model system. PMID:21435340

  19. Computation-Based Discovery of Related Transcriptional Regulatory Modules and Motifs Using an Experimentally Validated Combinatorial Model

    PubMed Central

    Halfon, Marc S.; Grad, Yonatan; Church, George M.; Michelson, Alan M.

    2002-01-01

    Gene expression is regulated by transcription factors that interact with cis-regulatory elements. Predicting these elements from sequence data has proven difficult. We describe here a successful computational search for elements that direct expression in a particular temporal-spatial pattern in the Drosophila embryo, based on a single well characterized enhancer model. The fly genome was searched to identify sequence elements containing the same combination of transcription factors as those found in the model. Experimental evaluation of the search results demonstrates that our method can correctly predict regulatory elements and highlights the importance of functional testing as a means of identifying false-positive results. We also show that the search results enable the identification of additional relevant sequence motifs whose functions can be empirically validated. This approach, combined with gene expression and phylogenetic sequence data, allows for genome-wide identification of related regulatory elements, an important step toward understanding the genetic regulatory networks involved in development. [Sequence data reported in this paper have been deposited in GenBank with accession nos. AF513981 (Eve MHE) and AF513982 (Hbr DME). Supplementary material is available online at http://www.genome.org. The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: R. Blackman] PMID:12097338

  20. Subtle Changes in Motif Positioning Cause Tissue-Specific Effects on Robustness of an Enhancer's Activity

    PubMed Central

    Erceg, Jelena; Saunders, Timothy E.; Girardot, Charles; Devos, Damien P.; Hufnagel, Lars; Furlong, Eileen E. M.

    2014-01-01

    Deciphering the specific contribution of individual motifs within cis-regulatory modules (CRMs) is crucial to understanding how gene expression is regulated and how this process is affected by sequence variation. But despite vast improvements in the ability to identify where transcription factors (TFs) bind throughout the genome, we are limited in our ability to relate information on motif occupancy to function from sequence alone. Here, we engineered 63 synthetic CRMs to systematically assess the relationship between variation in the content and spacing of motifs within CRMs to CRM activity during development using Drosophila transgenic embryos. In over half the cases, very simple elements containing only one or two types of TF binding motifs were capable of driving specific spatio-temporal patterns during development. Different motif organizations provide different degrees of robustness to enhancer activity, ranging from binary on-off responses to more subtle effects including embryo-to-embryo and within-embryo variation. By quantifying the effects of subtle changes in motif organization, we were able to model biophysical rules that explain CRM behavior and may contribute to the spatial positioning of CRM activity in vivo. For the same enhancer, the effects of small differences in motif positions varied in developmentally related tissues, suggesting that gene expression may be more susceptible to sequence variation in one tissue compared to another. This result has important implications for human eQTL studies in which many associated mutations are found in cis-regulatory regions, though the mechanism for how they affect tissue-specific gene expression is often not understood. PMID:24391522

  1. Identification of potential regulatory motifs in odorant receptor genes by analysis of promoter sequences

    PubMed Central

    Michaloski, Jussara S.; Galante, Pedro A.F.

    2006-01-01

    Mouse odorant receptors (ORs) are encoded by >1000 genes dispersed throughout the genome. Each olfactory neuron expresses one single OR gene, while the rest of the genes remain silent. The mechanisms underlying OR gene expression are poorly understood. Here, we investigated if OR genes share common cis-regulatory sequences in their promoter regions. We carried out a comprehensive analysis in which the upstream regions of a large number of OR genes were compared. First, using RLM-RACE, we generated cDNAs containing the complete 5′-untranslated regions (5′-UTRs) for a total number of 198 mouse OR genes. Then, we aligned these cDNA sequences to the mouse genome so that the 5′ structure and transcription start sites (TSSs) of the OR genes could be precisely determined. Sequences upstream of the TSSs were retrieved and browsed for common elements. We found DNA sequence motifs that are overrepresented in the promoter regions of the OR genes. Most motifs resemble O/E-like sites and are preferentially localized within 200 bp upstream of the TSSs. Finally, we show that these motifs specifically interact with proteins extracted from nuclei prepared from the olfactory epithelium, but not from brain or liver. Our results show that the OR genes share common promoter elements. The present strategy should provide information on the role played by cis-regulatory sequences in OR gene regulation. PMID:16902085

  2. WordSpy: identifying transcription factor binding motifs by building a dictionary and learning a grammar

    PubMed Central

    Wang, Guandong; Yu, Taotao; Zhang, Weixiong

    2005-01-01

    Transcription factor (TF) binding sites or motifs (TFBMs) are functional cis-regulatory DNA sequences that play an essential role in gene transcriptional regulation. Although many experimental and computational methods have been developed, finding TFBMs remains a challenging problem. We propose and develop a novel dictionary based motif finding algorithm, which we call WordSpy. One significant feature of WordSpy is the combination of a word counting method and a statistical model which consists of a dictionary of motifs and a grammar specifying their usage. The algorithm is suitable for genome-wide motif finding; it is capable of discovering hundreds of motifs from a large set of promoters in a single run. We further enhance WordSpy by applying gene expression information to separate true TFBMs from spurious ones, and by incorporating negative sequences to identify discriminative motifs. In addition, we also use randomly selected promoters from the genome to evaluate the significance of the discovered motifs. The output from WordSpy consists of an ordered list of putative motifs and a set of regulatory sequences with motif binding sites highlighted. The web server of WordSpy is available at . PMID:15980501

  3. A novel pairwise comparison method for in silico discovery of statistically significant cis-regulatory elements in eukaryotic promoter regions: application to Arabidopsis.

    PubMed

    Shamloo-Dashtpagerdi, Roohollah; Razi, Hooman; Aliakbari, Massumeh; Lindlöf, Angelica; Ebrahimi, Mahdi; Ebrahimie, Esmaeil

    2015-01-01

    Cis regulatory elements (CREs), located within promoter regions, play a significant role in the blueprint for transcriptional regulation of genes. There is a growing interest to study the combinatorial nature of CREs including presence or absence of CREs, the number of occurrences of each CRE, as well as of their order and location relative to their target genes. Comparative promoter analysis has been shown to be a reliable strategy to test the significance of each component of promoter architecture. However, it remains unclear what level of difference in the number of occurrences of each CRE is of statistical significance in order to explain different expression patterns of two genes. In this study, we present a novel statistical approach for pairwise comparison of promoters of Arabidopsis genes in the context of number of occurrences of each CRE within the promoters. First, using the sample of 1000 Arabidopsis promoters, the results of the goodness of fit test and non-parametric analysis revealed that the number of occurrences of CREs in a promoter sequence is Poisson distributed. As a promoter sequence contained functional and non-functional CREs, we addressed the issue of the statistical distribution of functional CREs by analyzing the ChIP-seq datasets. The results showed that the number of occurrences of functional CREs over the genomic regions was determined as being Poisson distributed. In accordance with the obtained distribution of CREs occurrences, we suggested the Audic and Claverie (AC) test to compare two promoters based on the number of occurrences for the CREs. Superiority of the AC test over Chi-square (2×2) and Fisher's exact tests was also shown, as the AC test was able to detect a higher number of significant CREs. The two case studies on the Arabidopsis genes were performed in order to biologically verify the pairwise test for promoter comparison. Consequently, a number of CREs with significantly different occurrences was identified between

  4. Conformational Flexibility and Dynamics of the Internal Loop and Helical Regions of the Kink-Turn Motif in the Glycine Riboswitch by Site-Directed Spin-Labeling.

    PubMed

    Esquiaqui, Jackie M; Sherman, Eileen M; Ye, Jing-Dong; Fanucci, Gail E

    2016-08-01

    Site-directed spin-labeling (SDSL) electron paramagnetic resonance (EPR) spectroscopy provides a means for a solution state description of site-specific dynamics and flexibility of large RNAs, facilitating our understanding of the effects of environmental conditions such as ligands and ions on RNA structure and dynamics. Here, the utility and capability of EPR line shape analysis and distance measurements to monitor and describe site-specific changes in the conformational dynamics of internal loop nucleobases as well as helix-helix interactions of the kink-turn motif in the Vibrio cholerae (VC) glycine riboswitch that occur upon sequential K(+)-, Mg(2+)-, and glycine-induced folding were explored. Spin-labels were incorporated into the 232-nucleotide sequence via splinted ligation strategies. Thiouridine nucleobase labeling within the internal loop reveals unambiguous differential dynamics for two successive sites labeled, with varied rates of motion reflective of base flipping and base stacking. EPR-based distance measurements for nitroxide spin-labels incorporated within the RNA backbone in the helical regions of the kink-turn motif are reflective of helical formation and tertiary interaction induced by ion stabilization. In both instances, results indicate that the structural formation of the kink-turn motif in the VC glycine riboswitch can be stabilized by 100 mM K(+) where the conformational flexibility of the kink-turn motif is not further tightened by subsequent addition of divalent ions. Although glycine binding is likely to induce structural and dynamic changes in other regions, SDSL indicates no impact of glycine binding on the local dynamics or structure of the kink-turn motif as investigated here. Overall, these results demonstrate the ability of SDSL to interrogate site-specific base dynamics and packing of helices in large RNAs and demonstrate ion-induced stability of the kink-turn fold of the VC riboswitch. PMID:27427937

  5. Cis-Regulatory Elements Determine Germline Specificity and Expression Level of an Isopentenyltransferase Gene in Sperm Cells of Arabidopsis

    PubMed Central

    Yuan, Tong; Duan, Xiaomeng; Wei, Xiaoping; Li, Jia

    2016-01-01

    Flowering plant sperm cells transcribe a divergent and complex complement of genes. To examine promoter function, we chose an isopentenyltransferase gene known as PzIPT1. This gene is highly selectively transcribed in one sperm cell morphotype of Plumbago zeylanica, which preferentially fuses with the central cell during fertilization and is thus a founding cell of the primary endosperm. In transgenic Arabidopsis (Arabidopsis thaliana), the PzIPT1 promoter displays activity in both sperm cells, and progressive promoter truncation from the 5′-end results in a progressive decrease in reporter production, consistent with the occurrence of multiple enhancer sites. Cytokinin-dependent protein binding motifs are identified in the promoter sequence, which respond with stimulation by cytokinin. Expression of the PzIPT1 promoter in sperm cells confers specificity independently of the previously reported Germline Restrictive Silencer Factor binding sequence. Instead, a cis-acting regulatory region consisting of two duplicated 6-bp Male Gamete Selective Activation (MGSA) motifs occurs near the site of transcription initiation. Disruption of this sequence-specific site inactivates expression of a GFP reporter gene in sperm cells. Multiple copies of the MGSA motif fused with the minimal CaMV35S promoter elements confer reporter gene expression in sperm cells. Similar duplicated MGSA motifs are also identified from promoter sequences of sperm cell-expressed genes in Arabidopsis, suggesting selective activation is possibly a common mechanism for regulation of gene expression in sperm cells of flowering plants. PMID:26739233

  6. Integration of bioinformatics and synthetic promoters leads to the discovery of novel elicitor-responsive cis-regulatory sequences in Arabidopsis.

    PubMed

    Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J; Hehl, Reinhard

    2012-09-01

    A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction. PMID:22744985

  7. Integration of Bioinformatics and Synthetic Promoters Leads to the Discovery of Novel Elicitor-Responsive cis-Regulatory Sequences in Arabidopsis

    PubMed Central

    Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J.; Hehl, Reinhard

    2012-01-01

    A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction. PMID:22744985

  8. The strain-specific cis-acting element of beet curly top geminivirus DNA replication maps to the directly repeated motif of the ori.

    PubMed

    Choi, I R; Stenger, D C

    1996-12-01

    Strains of beet curly top geminivirus (BCTV) possess distinct cis- and trans-acting replication specificity elements which are not separately interchangeable among strains. Analysis of the replication competency of chimeric BCTV genomes, in which portions of the origin of DNA replication (ori) were derived from heterologous BCTV strains, has permitted identification of an essential cis-acting element governing strain-specific replication in a subgroup II geminivirus. Our studies indicate that the cis-acting element responsible for strain-specific replication properties resides within the directly repeated motif of the BCTV ori. Transient replication assays conducted in leaf disks and complementation experiments conducted in whole plants indicated that the trans-acting replication specificity element, residing within the amino-terminal region of the C1 Rep protein, may recognize and replicate a chimeric BCTV genome containing a heterologous ori so long as all or portions of the core element of the directly repeated motif are derived from the same strain as the Rep protein. As Rep protein binding to the core element of the directly repeated motif has been demonstrated by others to be essential for replication of subgroup III geminiviruses, our results support the hypothesis that replication specificity of subgroup II viruses is governed by processes similar to those of subgroup III viruses. However, a second cis-acting element of the ori, which appears to contribute to subgroup III virus replication specificity, does not seem to be required for replication specificity among the subgroup II viruses examined. Nonetheless, a potential role for a second cis-acting element in the BCTV ori contributing to maximal replication cannot be excluded. PMID:8941329

  9. Defining a Conformational Consensus Motif in Cotransin-Sensitive Signal Sequences: A Proteomic and Site-Directed Mutagenesis Study

    PubMed Central

    Klein, Wolfgang; Westendorf, Carolin; Schmidt, Antje; Conill-Cortés, Mercè; Rutz, Claudia; Blohs, Marcus; Beyermann, Michael; Protze, Jonas; Krause, Gerd; Krause, Eberhard; Schülein, Ralf

    2015-01-01

    The cyclodepsipeptide cotransin was described to inhibit the biosynthesis of a small subset of proteins by a signal sequence-discriminatory mechanism at the Sec61 protein-conducting channel. However, it was not clear how selective cotransin is, i.e. how many proteins are sensitive. Moreover, a consensus motif in signal sequences mediating cotransin sensitivity has not yet been described. To address these questions, we performed a proteomic study using cotransin-treated human hepatocellular carcinoma cells and the stable isotope labelling by amino acids in cell culture technique in combination with quantitative mass spectrometry. We used a saturating concentration of cotransin (30 micromolar) to also identify less-sensitive proteins and to discriminate the latter from completely resistant proteins. We found that the biosynthesis of almost all secreted proteins was cotransin-sensitive under these conditions. In contrast, biosynthesis of the majority of the integral membrane proteins was cotransin-resistant. Cotransin sensitivity of signal sequences was neither related to their length nor to their hydrophobicity. Instead, in the case of signal anchor sequences, we identified for the first time a conformational consensus motif mediating cotransin sensitivity. PMID:25806945

  10. AthaMap web tools for database-assisted identification of combinatorial cis-regulatory elements and the display of highly conserved transcription factor binding sites in Arabidopsis thaliana.

    PubMed

    Steffens, Nils Ole; Galuschka, Claudia; Schindler, Martin; Bülow, Lorenz; Hehl, Reinhard

    2005-07-01

    The AthaMap database generates a map of cis-regulatory elements for the Arabidopsis thaliana genome. AthaMap contains more than 7.4 × 10^6 putative binding sites for 36 transcription factors (TFs) from 16 different TF families. A newly implemented functionality allows the display of subsets of higher conserved transcription factor binding sites (TFBSs). Furthermore, a web tool was developed that permits a user-defined search for co-localizing cis-regulatory elements. The user can specify individually the level of conservation for each TFBS and a spacer range between them. This web tool was employed for the identification of co-localizing sites of known interacting TFs and TFs containing two DNA-binding domains. More than 1.8 × 10^5 combinatorial elements were annotated in the AthaMap database. These elements can also be used to identify more complex co-localizing elements consisting of up to four TFBSs. The AthaMap database and the connected web tools are a valuable resource for the analysis and the prediction of gene expression regulation at http://www.athamap.de. PMID:15980498
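
    The co-localization search described in this record (two binding sites, each with its own conservation threshold, separated by a spacer within a user-defined range) can be illustrated with a minimal sketch. This is not AthaMap's implementation: the `Site` record, the `colocalized` function, the score scale, and the coordinates are all hypothetical names and values chosen for illustration.

```python
from typing import List, NamedTuple, Tuple


class Site(NamedTuple):
    """Hypothetical record for one predicted binding site."""
    factor: str   # transcription factor name
    start: int    # first base of the site (inclusive)
    end: int      # last base of the site (inclusive)
    score: float  # conservation score; higher means more conserved (assumed scale)


def colocalized(
    sites: List[Site],
    factor_a: str,
    factor_b: str,
    min_spacer: int,
    max_spacer: int,
    min_score_a: float = 0.0,
    min_score_b: float = 0.0,
) -> List[Tuple[Site, Site, int]]:
    """Return (site_a, site_b, spacer) for non-overlapping pairs whose
    spacer (bases between the two sites) lies in [min_spacer, max_spacer]."""
    a_sites = [s for s in sites if s.factor == factor_a and s.score >= min_score_a]
    b_sites = [s for s in sites if s.factor == factor_b and s.score >= min_score_b]
    pairs = []
    for a in a_sites:
        for b in b_sites:
            if b.start > a.end:        # b lies downstream of a
                spacer = b.start - a.end - 1
            elif a.start > b.end:      # b lies upstream of a
                spacer = a.start - b.end - 1
            else:                      # overlapping sites are skipped
                continue
            if min_spacer <= spacer <= max_spacer:
                pairs.append((a, b, spacer))
    return pairs


# Made-up example sites on a single strand.
example_sites = [
    Site("A", 10, 15, 0.9),
    Site("B", 20, 25, 0.8),
    Site("B", 100, 105, 0.95),
    Site("A", 30, 35, 0.5),
]
hits = colocalized(example_sites, "A", "B", min_spacer=0, max_spacer=10)
```

    The nested loop is quadratic in the number of sites; for genome-scale site lists a sorted sweep or an interval index would be the more realistic choice.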

  11. Functional characterization of transcription factor motifs using cross-species comparison across large evolutionary distances.

    PubMed

    Kim, Jaebum; Cunningham, Ryan; James, Brian; Wyder, Stefan; Gibson, Joshua D; Niehuis, Oliver; Zdobnov, Evgeny M; Robertson, Hugh M; Robinson, Gene E; Werren, John H; Sinha, Saurabh

    2010-01-01

    We address the problem of finding statistically significant associations between cis-regulatory motifs and functional gene sets, in order to understand the biological roles of transcription factors. We develop a computational framework for this task, whose features include a new statistical score for motif scanning, the use of different scores for predicting targets of different motifs, and new ways to deal with redundancies among significant motif-function associations. This framework is applied to the recently sequenced genome of the jewel wasp, Nasonia vitripennis, making use of the existing knowledge of motifs and gene annotations in another insect genome, that of the fruitfly. The framework uses cross-species comparison to improve the specificity of its predictions, and does so without relying upon non-coding sequence alignment. It is therefore well suited for comparative genomics across large evolutionary divergences, where existing alignment-based methods are not applicable. We also apply the framework to find motifs associated with socially regulated gene sets in the honeybee, Apis mellifera, using comparisons with Nasonia, a solitary species, to identify honeybee-specific associations. PMID:20126523

  12. A systematic approach to identify functional motifs within vertebrate developmental enhancers

    PubMed Central

    Li, Qiang; Ritter, Deborah; Yang, Nan; Dong, Zhiqiang; Li, Hao; Chuang, Jeffrey H.; Guo, Su

    2012-01-01

    Uncovering the cis-regulatory logic of developmental enhancers is critical to understanding the role of non-coding DNA in development. However, it is cumbersome to identify functional motifs within enhancers, and thus few vertebrate enhancers have their core functional motifs revealed. Here we report a combined experimental and computational approach for discovering regulatory motifs in developmental enhancers. Making use of the zebrafish gene expression database, we computationally identified conserved non-coding elements (CNEs) likely to have a desired tissue-specificity based on the expression of nearby genes. Through a high throughput and robust enhancer assay, we tested the activity of ~100 such CNEs and efficiently uncovered developmental enhancers with desired spatial and temporal expression patterns in the zebrafish brain. Application of de novo motif prediction algorithms on a group of forebrain enhancers identified five top-ranked motifs, all of which were experimentally validated as critical for forebrain enhancer activity. These results demonstrate a systematic approach to discover important regulatory motifs in vertebrate developmental enhancers. Moreover, this dataset provides a useful resource for further dissection of vertebrate brain development and function. PMID:19850031

  13. Aminopeptidase B, a glucagon-processing enzyme: site directed mutagenesis of the Zn2+-binding motif and molecular modelling

    PubMed Central

    Pham, Viet-Laï; Cadel, Marie-Sandrine; Gouzy-Darmon, Cécile; Hanquez, Chantal; Beinfeld, Margery C; Nicolas, Pierre; Etchebest, Catherine; Foulon, Thierry

    2007-01-01

    Background Aminopeptidase B (Ap-B; EC 3.4.11.6) catalyzes the cleavage of basic residues at the N-terminus of peptides and processes glucagon into miniglucagon. The enzyme exhibits, in vitro, a residual ability to hydrolyze leukotriene A4 into the pro-inflammatory lipid mediator leukotriene B4. The potential bi-functional nature of Ap-B is supported by close structural relationships with LTA4 hydrolase (LTA4H ; EC 3.3.2.6). A structure-function analysis is necessary for the detailed understanding of the enzymatic mechanisms of Ap-B and to design inhibitors, which could be used to determine the complete in vivo functions of the enzyme. Results The rat Ap-B cDNA was expressed in E. coli and the purified recombinant enzyme was characterized. 18 mutants of the H325EXXHX18E348 Zn2+-binding motif were constructed and expressed. All mutations were found to abolish the aminopeptidase activity. A multiple alignment of 500 sequences of the M1 family of aminopeptidases was performed to identify 3 sub-families of exopeptidases and to build a structural model of Ap-B using the x-ray structure of LTA4H as a template. Although the 3D structures of the two enzymes resemble each other, they differ in certain details. The role that a loop, delimiting the active center of Ap-B, plays in discriminating basic substrates, as well as the function of consensus motifs, such as RNP1 and Armadillo domain are discussed. Examination of electrostatic potentials and hydrophobic patches revealed important differences between Ap-B and LTA4H and suggests that Ap-B is involved in protein-protein interactions. Conclusion Alignment of the primary structures of the M1 family members clearly demonstrates the existence of different sub-families and highlights crucial residues in the enzymatic activity of the whole family. E. coli recombinant enzyme and Ap-B structural model constitute powerful tools for investigating the importance and possible roles of these conserved residues in Ap-B, LTA4H and M1

  14. O-xylosylation in a recombinant protein is directed at a common motif on glycine-serine linkers.

    PubMed

    Spencer, David; Novarra, Shabazz; Zhu, Liang; Mugabe, Sheila; Thisted, Thomas; Baca, Manuel; Depaz, Roberto; Barton, Christopher

    2013-11-01

    Glycine-serine (GS) linkers are commonly used in recombinant proteins to connect domains. Here, we report the posttranslational O-glycosylation of a GS linker in a novel fusion protein. The structure of the O-glycan moiety is a xylose-based core substituted with hexose and sulfated hexauronic acid residues. The total level of O-xylosylation was approximately 30% in the material expressed in HEK-293 cell lines. There was an approximate 10-fold reduction in O-xylosylation levels when the material was expressed in Chinese hamster ovary cell lines. Similar O-glycan structures have been reported for human urinary thrombomodulin and represent the initial building block for proteoglycans such as chondroitin sulfate and heparin. The sites of attachment, determined by electron transfer dissociation mass spectrometry, were localized to serine in the linker regions of the recombinant fusion protein. This attachment could be attributed, in part, to the inherent xylosyltransferase motif present in GS linkers. Elimination of the O-glycan moiety was achieved with modified linkers containing only glycine residues. The aggregation and fragmentation behavior of the GGG construct were comparable to the GSG-linked material during thermal stress. The O-xylosylation reported has implications for the manufacturing consistency of recombinant proteins containing GS linkers. PMID:24105735

  15. [Personal motif in art].

    PubMed

    Gerevich, József

    2015-01-01

    One of the basic questions of art psychology is whether a personal motif lies behind works of art and, if so, how openly or indirectly it appears in the work itself. Analysis of examples and documents from the fine arts and literature allows us to conclude that the personal motif, which the viewer can identify through symbols, at times easily and at others with more difficulty, gives an emotional plus to the artistic product. The personal motif may be found in traumatic experiences, in communication with the model or with other emotionally important persons (mourning, disappointment, revenge, hatred, rivalry, revolt etc.), in self-searching, or in self-analysis. The emotions are expressed in artistic activity either directly or indirectly. The intention nourished by the artist's identity (Kunstwollen) may stand in the way of spontaneous self-expression, channelling it into hidden paths. Under the influence of certain circumstances, the artist may arouse in the viewer, consciously or unconsciously, an illusory, misleading image of himself. An examination of the personal motif is one of the important research areas of art therapy. PMID:26202617

  16. A G-string positive cis-regulatory element in the LpS1 promoter binds two distinct nuclear factors distributed non-uniformly in Lytechinus pictus embryos.

    PubMed

    Xiang, M; Lu, S Y; Musso, M; Karsenty, G; Klein, W H

    1991-12-01

    The LpS1 alpha and beta genes of Lytechinus pictus are activated at the late cleavage stage of embryogenesis, with LpS1 mRNAs accumulating only in lineages contributing to aboral ectoderm. We had shown previously that 762 bp of 5' flanking DNA from the LpS1 beta gene was sufficient for proper temporal and aboral ectoderm specific expression. In the present study, we identified a strong positive cis-regulatory element at -70 bp to -75 bp in the LpS1 beta promoter with the sequence (G)6 and a similar, more distal cis-element at -721 bp to -726 bp. The proximal 'G-string' element interacted with two nuclear factors, one specific to ectoderm and one to endoderm/mesoderm nuclear extracts, whereas the distal G-string element interacted only with the ectoderm factor. The ectoderm and endoderm/mesoderm G-string factors were distinct based on their migratory behavior in electrophoretic mobility shift assays, binding site specificities, salt optima and EDTA sensitivity. The proximal G-string element shared homology with a binding site for the mammalian transcription factor IF1, a protein that binds to negative cis-regulatory elements in the mouse alpha 1(I) and alpha 2(I) collagen gene promoters. Competition experiments using wild-type and mutant oligonucleotides indicated that the ectoderm G-string factor and IF1 have similar recognition sites. Partially purified IF1 specifically bound to an oligonucleotide containing the proximal G-string of LpS1 beta. From our results, we suggest that the ectoderm G-string factor, a member of the G-rich DNA-binding protein family, activates the LpS1 gene in aboral ectoderm cells by binding to the LpS1 promoter at the proximal G-string site. PMID:1811948

  17. Fast and Efficient Cloning of Cis-Regulatory Sequences for High-Throughput Yeast One-Hybrid Analyses of Transcription Factors.

    PubMed

    Kelemen, Zsolt; Przybyla-Toscano, Jonathan; Tissot, Nicolas; Lepiniec, Loïc; Dubos, Christian

    2016-01-01

    Yeast one-hybrid (Y1H) assay has been proven to be a powerful technique to characterize in vivo the interaction between a given transcription factor (TF), or its DNA-binding domain (DBD), and target DNA sequences. Comprehensive characterization of TF/DBD and DNA interactions should allow designing synthetic promoters that would undoubtedly be valuable for biotechnological approaches. Here, we use the ligation-independent cloning system (LIC) in order to enhance the cloning efficiency of DNA motifs into the pHISi Y1H vector. LIC overcomes important limitations of traditional cloning technologies, since any DNA fragment can be cloned into LIC compatible vectors without using restriction endonucleases, ligation, or in vitro recombination. PMID:27557765

  18. Arabidopsis Flower and Embryo Developmental Genes are Repressed in Seedlings by Different Combinations of Polycomb Group Proteins in Association with Distinct Sets of Cis-regulatory Elements

    PubMed Central

    Liu, Jian; Zhang, Lei; He, Chongsheng; Shen, Wen-Hui; Jin, Hong; Xu, Lin; Zhang, Yijing

    2016-01-01

    Polycomb repressive complexes (PRCs) play crucial roles in transcriptional repression and developmental regulation in both plants and animals. In plants, depletion of different members of PRCs causes both overlapping and unique phenotypic defects. However, the underlying molecular mechanism determining the target specificity and functional diversity is not sufficiently characterized. Here, we quantitatively compared changes of tri-methylation at H3K27 in Arabidopsis mutants deprived of various key PRC components. We show that CURLY LEAF (CLF), a major catalytic subunit of PRC2, coordinates with different members of PRC1 in suppression of distinct plant developmental programs. We found that expression of flower development genes is repressed in seedlings preferentially via non-redundant role of CLF, which specifically associated with LIKE HETEROCHROMATIN PROTEIN1 (LHP1). In contrast, expression of embryo development genes is repressed by PRC1-catalytic core subunits AtBMI1 and AtRING1 in common with PRC2-catalytic enzymes CLF or SWINGER (SWN). This context-dependent role of CLF corresponds well with the change in H3K27me3 profiles, and is remarkably associated with differential co-occupancy of binding motifs of transcription factors (TFs), including MADS box and ABA-related factors. We propose that different combinations of PRC members distinctively regulate different developmental programs, and their target specificity is modulated by specific TFs. PMID:26760036

  19. Scavenger Chemokine (CXC Motif) Receptor 7 (CXCR7) Is a Direct Target Gene of HIC1 (Hypermethylated in Cancer 1)

    PubMed Central

    Van Rechem, Capucine; Rood, Brian R.; Touka, Majid; Pinte, Sébastien; Jenal, Mathias; Guérardel, Cateline; Ramsey, Keri; Monté, Didier; Bégue, Agnès; Tschan, Mario P.; Stephan, Dietrich A.; Leprince, Dominique

    2009-01-01

    The tumor suppressor gene HIC1 (Hypermethylated in Cancer 1) that is epigenetically silenced in many human tumors and is essential for mammalian development encodes a sequence-specific transcriptional repressor. The few genes that have been reported to be directly regulated by HIC1 include ATOH1, FGFBP1, SIRT1, and E2F1. HIC1 is thus involved in the complex regulatory loops modulating p53-dependent and E2F1-dependent cell survival and stress responses. We performed genome-wide expression profiling analyses to identify new HIC1 target genes, using HIC1-deficient U2OS human osteosarcoma cells infected with adenoviruses expressing either HIC1 or GFP as a negative control. These studies identified several putative direct target genes, including CXCR7, a G-protein-coupled receptor recently identified as a scavenger receptor for the chemokine SDF-1/CXCL12. CXCR7 is highly expressed in human breast, lung, and prostate cancers. Using quantitative reverse transcription-PCR analyses, we demonstrated that CXCR7 was repressed in U2OS cells overexpressing HIC1. Inversely, inactivation of endogenous HIC1 by RNA interference in normal human WI38 fibroblasts results in up-regulation of CXCR7 and SIRT1. In silico analyses followed by deletion studies and luciferase reporter assays identified a functional and phylogenetically conserved HIC1-responsive element in the human CXCR7 promoter. Moreover, chromatin immunoprecipitation (ChIP) and ChIP upon ChIP experiments demonstrated that endogenous HIC1 proteins are bound together with the C-terminal binding protein corepressor to the CXCR7 and SIRT1 promoters in WI38 cells. Taken together, our results implicate the tumor suppressor HIC1 in the transcriptional regulation of the chemokine receptor CXCR7, a key player in the promotion of tumorigenesis in a wide variety of cell types. PMID:19525223

  20. Are mutagenic non D-loop direct repeat motifs in mitochondrial DNA under a negative selection pressure?

    PubMed Central

    Lakshmanan, Lakshmi Narayanan; Gruber, Jan; Halliwell, Barry; Gunawan, Rudiyanto

    2015-01-01

    Non D-loop direct repeats (DRs) in mitochondrial DNA (mtDNA) have been commonly implicated in the mutagenesis of mtDNA deletions associated with neuromuscular disease and ageing. Further, these DRs have been hypothesized to put a constraint on the lifespan of mammals and are under a negative selection pressure. Using a compendium of 294 mammalian mtDNA, we re-examined the relationship between species lifespan and the mutagenicity of such DRs. Contradicting the prevailing hypotheses, we found no significant evidence that long-lived mammals possess fewer mutagenic DRs than short-lived mammals. By comparing DR counts in human mtDNA with those in selectively randomized sequences, we also showed that the number of DRs in human mtDNA is primarily determined by global mtDNA properties, such as the bias in synonymous codon usage (SCU) and nucleotide composition. We found that SCU bias in mtDNA positively correlates with DR counts, where repeated usage of a subset of codons leads to more frequent DR occurrences. While bias in SCU and nucleotide composition has been attributed to nucleotide mutational bias, mammalian mtDNA still exhibit higher SCU bias and DR counts than expected from such mutational bias, suggesting a lack of negative selection against non D-loop DRs. PMID:25855815
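
    The comparison of observed DR counts with counts in randomized sequences can be illustrated with a minimal sketch. It assumes a direct repeat is a pair of identical, non-overlapping k-mers on the same strand, and it uses a plain composition-preserving shuffle as the null model; that is a simpler stand-in for the "selectively randomized" sequences of the abstract, which also control for codon usage. The function names, the repeat length and the toy sequence are hypothetical.

```python
import random
from collections import defaultdict

def count_direct_repeat_pairs(seq, k):
    """Count pairs of identical, non-overlapping k-mers on the same strand."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    pairs = 0
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                # Require the two copies not to overlap.
                if starts[b] - starts[a] >= k:
                    pairs += 1
    return pairs

def shuffled_counts(seq, k, n_shuffles, seed=0):
    """DR counts in shuffles that keep only nucleotide composition (assumed null model)."""
    rng = random.Random(seed)
    letters = list(seq)
    counts = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        counts.append(count_direct_repeat_pairs("".join(letters), k))
    return counts

# Hypothetical toy sequence, not real mtDNA.
toy = "ACGTTTACGT"
observed = count_direct_repeat_pairs(toy, 4)
null_counts = shuffled_counts(toy, 4, n_shuffles=100)
```

    Comparing `observed` with the distribution in `null_counts` gives a rough sense of whether a sequence carries more or fewer repeats than its composition alone would suggest.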

  1. Overlapping CRE and E Box Motifs in the Enhancer Sequences of the Bovine Leukemia Virus 5′ Long Terminal Repeat Are Critical for Basal and Acetylation-Dependent Transcriptional Activity of the Viral Promoter: Implications for Viral Latency

    PubMed Central

    Calomme, Claire; Dekoninck, Ann; Nizet, Séverine; Adam, Emmanuelle; Nguyên, Thi Liên-Anh; Van Den Broeke, Anne; Willems, Luc; Kettmann, Richard; Burny, Arsène; Lint, Carine Van

    2004-01-01

    Bovine leukemia virus (BLV) infection is characterized by viral latency in a large proportion of cells containing an integrated provirus. In this study, we postulated that mechanisms directing the recruitment of deacetylases to the BLV 5′ long terminal repeat (LTR) could explain the transcriptional repression of viral expression in vivo. Accordingly, we showed that BLV promoter activity was induced by several deacetylase inhibitors (such as trichostatin A [TSA]) in the context of episomal LTR constructs and in the context of an integrated BLV provirus. Moreover, treatment of BLV-infected cells with TSA increased H4 acetylation at the viral promoter, showing a close correlation between the level of histone acetylation and transcriptional activation of the BLV LTR. Among the known cis-regulatory DNA elements located in the 5′ LTR, three E box motifs overlapping cyclic AMP responsive elements (CREs) in U3 were shown to be involved in transcriptional repression of BLV basal gene expression. Importantly, the combined mutations of these three E box motifs markedly reduced the inducibility of the BLV promoter by TSA. E boxes are susceptible to recognition by transcriptional repressors such as Max-Mad-mSin3 complexes that repress transcription by recruiting deacetylases. However, our in vitro binding studies failed to reveal the presence of Mad-Max proteins in the BLV LTR E box-specific complexes. Remarkably, TSA increased the occupancy of the CREs by CREB/ATF. Therefore, we postulated that the E box-specific complexes exerted their negative cooperative effect on BLV transcription by steric hindrance with the activators CREB/ATF and/or their transcriptional coactivators possessing acetyltransferase activities. Our results thus suggest that the overlapping CRE and E box elements in the BLV LTR were selected during evolution as a novel strategy for BLV to allow better silencing of viral transcription and to escape from the host immune response. PMID:15564493

  2. Directed evolution reveals the binding motif preference of the LC8/DYNLL hub protein and predicts large numbers of novel binders in the human proteome.

    PubMed

    Rapali, Péter; Radnai, László; Süveges, Dániel; Harmat, Veronika; Tölgyesi, Ferenc; Wahlgren, Weixiao Y; Katona, Gergely; Nyitray, László; Pál, Gábor

    2011-01-01

    LC8 dynein light chain (DYNLL) is a eukaryotic hub protein that is thought to function as a dimerization engine. Its interacting partners are involved in a wide range of cellular functions. In its dozens of hitherto identified binding partners DYNLL binds to a linear peptide segment. The known segments define a loosely characterized binding motif: [D/S](-4)K(-3)X(-2)[T/V/I](-1)Q(0)[T/V](1)[D/E](2). The motifs are localized in disordered segments of the DYNLL-binding proteins and are often flanked by coiled coil or other potential dimerization domains. Based on a directed evolution approach, here we provide the first quantitative characterization of the binding preference of the DYNLL binding site. We displayed on M13 phage a naïve peptide library with seven fully randomized positions around a fixed, naturally conserved glutamine. The peptides were presented in a bivalent manner fused to a leucine zipper mimicking the natural dimer to dimer binding stoichiometry of DYNLL-partner complexes. The phage-selected consensus sequence V(-5)S(-4)R(-3)G(-2)T(-1)Q(0)T(1)E(2) resembles the natural one, but is extended by an additional N-terminal valine, which increases the affinity of the monomeric peptide twentyfold. Leu-zipper dimerization increases the affinity into the subnanomolar range. By comparing crystal structures of an SRGTQTE-DYNLL and a dimeric VSRGTQTE-DYNLL complex we find that the affinity enhancing valine is accommodated in a binding pocket on DYNLL. Based on the in vitro evolved sequence pattern we predict a large number of novel DYNLL binding partners in the human proteome. Among these EML3, a microtubule-binding protein involved in mitosis contains an exact match of the phage-evolved consensus and binds to DYNLL with nanomolar affinity. These results significantly widen the scope of the human interactome around DYNLL and will certainly shed more light on the biological functions and organizing role of DYNLL in the human and other eukaryotic interactomes

  3. Directed Evolution Reveals the Binding Motif Preference of the LC8/DYNLL Hub Protein and Predicts Large Numbers of Novel Binders in the Human Proteome

    PubMed Central

    Rapali, Péter; Radnai, László; Süveges, Dániel; Harmat, Veronika; Tölgyesi, Ferenc; Wahlgren, Weixiao Y.; Katona, Gergely; Nyitray, László; Pál, Gábor

    2011-01-01

    LC8 dynein light chain (DYNLL) is a eukaryotic hub protein that is thought to function as a dimerization engine. Its interacting partners are involved in a wide range of cellular functions. In its dozens of hitherto identified binding partners DYNLL binds to a linear peptide segment. The known segments define a loosely characterized binding motif: [D/S](-4)K(-3)X(-2)[T/V/I](-1)Q(0)[T/V](1)[D/E](2). The motifs are localized in disordered segments of the DYNLL-binding proteins and are often flanked by coiled coil or other potential dimerization domains. Based on a directed evolution approach, here we provide the first quantitative characterization of the binding preference of the DYNLL binding site. We displayed on M13 phage a naïve peptide library with seven fully randomized positions around a fixed, naturally conserved glutamine. The peptides were presented in a bivalent manner fused to a leucine zipper mimicking the natural dimer to dimer binding stoichiometry of DYNLL-partner complexes. The phage-selected consensus sequence V(-5)S(-4)R(-3)G(-2)T(-1)Q(0)T(1)E(2) resembles the natural one, but is extended by an additional N-terminal valine, which increases the affinity of the monomeric peptide twentyfold. Leu-zipper dimerization increases the affinity into the subnanomolar range. By comparing crystal structures of an SRGTQTE-DYNLL and a dimeric VSRGTQTE-DYNLL complex we find that the affinity enhancing valine is accommodated in a binding pocket on DYNLL. Based on the in vitro evolved sequence pattern we predict a large number of novel DYNLL binding partners in the human proteome. Among these EML3, a microtubule-binding protein involved in mitosis contains an exact match of the phage-evolved consensus and binds to DYNLL with nanomolar affinity. These results significantly widen the scope of the human interactome around DYNLL and will certainly shed more light on the biological functions and organizing role of DYNLL in the human and other eukaryotic interactomes. PMID:21533121
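
    A proteome scan of the kind described can be sketched, under simplifying assumptions, as a regular-expression search. The two patterns below transcribe the natural motif and the phage-selected consensus quoted in the abstract; the helper name and the toy sequence are hypothetical, and the study's actual prediction procedure may well have been more elaborate than exact pattern matching.

```python
import re

# Loosely characterized natural DYNLL-binding motif from the abstract:
# [D/S](-4) K(-3) X(-2) [T/V/I](-1) Q(0) [T/V](1) [D/E](2)
NATURAL_MOTIF = "[DS]K.[TVI]Q[TV][DE]"
# Phage-selected consensus quoted in the abstract.
PHAGE_CONSENSUS = "VSRGTQTE"

def find_motif(sequence, pattern):
    """Return (0-based start, matched segment) for every hit, overlapping hits included."""
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence)]

# Hypothetical toy sequence, not a real protein.
toy = "MAADKATQTEGGVSRGTQTEPL"
natural_hits = find_motif(toy, NATURAL_MOTIF)
consensus_hits = find_motif(toy, PHAGE_CONSENSUS)
```

    Note that the phage-selected heptamer SRGTQTE carries arginine rather than lysine at position -3, so it does not satisfy the natural pattern as transcribed here; this is consistent with the abstract's wording that the consensus "resembles" the natural motif.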

  4. Mining Conditional Phosphorylation Motifs.

    PubMed

    Liu, Xiaoqing; Wu, Jun; Gong, Haipeng; Deng, Shengchun; He, Zengyou

    2014-01-01

    Phosphorylation motifs represent position-specific amino acid patterns around the phosphorylation sites in the set of phosphopeptides. Several algorithms have been proposed to uncover phosphorylation motifs, whereas the problem of efficiently discovering a set of significant motifs with sufficiently high coverage and non-redundancy still remains unsolved. Here we present a novel notion called conditional phosphorylation motifs. Through this new concept, the motifs whose over-expressiveness mainly benefits from its constituting parts can be filtered out effectively. To discover conditional phosphorylation motifs, we propose an algorithm called C-Motif for a non-redundant identification of significant phosphorylation motifs. C-Motif is implemented under the Apriori framework, and it tests the statistical significance together with the frequency of candidate motifs in a single stage. Experiments demonstrate that C-Motif outperforms some current algorithms such as MMFPh and Motif-All in terms of coverage and non-redundancy of the results and efficiency of the execution. The source code of C-Motif is available at: https://sourceforge.net/projects/cmotif/. PMID:26356863

  5. Analyses of fugu hoxa2 genes provide evidence for subfunctionalization of neural crest cell and rhombomere cis-regulatory modules during vertebrate evolution.

    PubMed

    McEllin, Jennifer A; Alexander, Tara B; Tümpel, Stefan; Wiedemann, Leanne M; Krumlauf, Robb

    2016-01-15

    Hoxa2 gene is a primary player in regulation of craniofacial programs of head development in vertebrates. Here we investigate the evolution of a Hoxa2 neural crest enhancer identified originally in mouse by comparing and contrasting the fugu hoxa2a and hoxa2b genes with their orthologous teleost and mammalian sequences. Using sequence analyses in combination with transgenic regulatory assays in zebrafish and mouse embryos we demonstrate subfunctionalization of regulatory activity for expression in hindbrain segments and neural crest cells between these two fugu co-orthologs. hoxa2a regulatory sequences have retained the ability to mediate expression in neural crest cells while those of hoxa2b include cis-elements that direct expression in rhombomeres. Functional dissection of the neural crest regulatory potential of the fugu hoxa2a and hoxa2b genes identify the previously unknown cis-element NC5, which is implicated in generating the differential activity of the enhancers from these genes. The NC5 region plays a similar role in the ability of this enhancer to mediate reporter expression in mice, suggesting it is a conserved component involved in control of neural crest expression of Hoxa2 in vertebrate craniofacial development. PMID:26632170

  6. Distance preferences in the arrangement of binding motifs and hierarchical levels in organization of transcription regulatory information

    PubMed Central

    Makeev, Vsevolod J.; Lifanov, Alexander P.; Nazina, Anna G.; Papatsenko, Dmitri A.

    2003-01-01

    We explored distance preferences in the arrangement of binding motifs for five transcription factors (Bicoid, Krüppel, Hunchback, Knirps and Caudal) in a large set of Drosophila cis-regulatory modules (CRMs). Analysis of non-overlapping binding motifs revealed the presence of periodic signals specific to particular combinations of binding motifs. The most striking periodic signals (10 bp for Bicoid and 11 bp for Hunchback) suggest preferential positioning of some binding site combinations on the same side of the DNA helix. We also analyzed distance preferences in arrangements of highly correlated overlapping binding motifs, such as Bicoid and Krüppel. Based on the distance analysis, we extracted preferential binding site arrangements and proposed models for potential composite elements (CEs) and antagonistic motif pairs involved in the function of developmental CRMs. Our results suggest that there are distinct hierarchical levels in the organization of transcription regulatory information. We discuss the role of the hierarchy in understanding transcriptional regulation and in detection of transcription regulatory regions in genomes. PMID:14530449
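
    The distance analysis can be pictured with a minimal sketch: tabulate the spacing between every pair of sites for two motifs, then fold the spacings onto a candidate period to see whether one phase dominates. The function names, the site coordinates and the choice of a 10 bp period in the toy example are illustrative assumptions, not the authors' procedure or data.

```python
from collections import Counter

def distance_histogram(sites_a, sites_b, max_distance):
    """Count absolute spacings between every site in A and every site in B."""
    hist = Counter()
    for a in sites_a:
        for b in sites_b:
            d = abs(a - b)
            if 0 < d <= max_distance:
                hist[d] += 1
    return hist

def phase_counts(hist, period):
    """Fold spacings onto a period; a dominant phase hints at helical positioning."""
    phases = Counter()
    for d, n in hist.items():
        phases[d % period] += n
    return phases

# Hypothetical site coordinates (bp) within one module.
sites_a = [0, 10, 20]
sites_b = [5, 15]
hist = distance_histogram(sites_a, sites_b, max_distance=50)
phases = phase_counts(hist, period=10)
```

    In this toy case every spacing falls at phase 5 of a 10 bp period; real data would need a background model to judge whether such a peak is significant.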

  7. The ESEV PDZ-Binding Motif of the Avian Influenza A Virus NS1 Protein Protects Infected Cells from Apoptosis by Directly Targeting Scribble

    PubMed Central

    Liu, Hongbing; Golebiewski, Lisa; Dow, Eugene C.; Krug, Robert M.; Javier, Ronald T.; Rice, Andrew P.

    2010-01-01

    The NS1 protein from influenza A viruses contains a four-amino-acid sequence at its carboxyl terminus that is termed the PDZ-binding motif (PBM). The NS1 PBM is predicted to bind to cellular PDZ proteins and functions as a virulence determinant in infected mice. ESEV is the consensus PBM sequence of avian influenza viruses, while RSKV is the consensus sequence of human viruses. Currently circulating highly pathogenic H5N1 influenza viruses encode an NS1 protein with the ESEV PBM. We identified cellular targets of the avian ESEV PBM and identified molecular mechanisms involved in its function. Using glutathione S-transferase (GST) pull-down assays, we found that the ESEV PBM enables NS1 to associate with the PDZ proteins Scribble, Dlg1, MAGI-1, MAGI-2, and MAGI-3. Because Scribble possesses a proapoptotic activity, we investigated the interaction between NS1 and Scribble. The association between NS1 and Scribble is direct and requires the ESEV PBM and two Scribble PDZ domains. We constructed recombinant H3N2 viruses that encode an H6N6 avian virus NS1 protein with either an ESEV or mutant ESEA PBM, allowing an analysis of the ESEV PBM in infections in mammalian cells. The ESEV PBM enhanced viral replication up to 4-fold. In infected cells, NS1 with the ESEV PBM relocalized Scribble into cytoplasmic puncta concentrated in perinuclear regions and also protected cells from apoptosis. In addition, the latter effect was eliminated by small interfering RNA (siRNA)-mediated Scribble depletion. This study shows that one function of the avian ESEV PBM is to reduce apoptosis during infection through disruption of Scribble's proapoptotic function. PMID:20702615

  8. Tissue- and stage-specific Wnt target gene expression is controlled subsequent to β-catenin recruitment to cis-regulatory modules.

    PubMed

    Nakamura, Yukio; de Paiva Alves, Eduardo; Veenstra, Gert Jan C; Hoppler, Stefan

    2016-06-01

    Key signalling pathways, such as canonical Wnt/β-catenin signalling, operate repeatedly to regulate tissue- and stage-specific transcriptional responses during development. Although recruitment of nuclear β-catenin to target genomic loci serves as the hallmark of canonical Wnt signalling, mechanisms controlling stage- or tissue-specific transcriptional responses remain elusive. Here, a direct comparison of genome-wide occupancy of β-catenin with a stage-matched Wnt-regulated transcriptome reveals that only a subset of β-catenin-bound genomic loci are transcriptionally regulated by Wnt signalling. We demonstrate that Wnt signalling regulates β-catenin binding to Wnt target genes not only when they are transcriptionally regulated, but also in contexts in which their transcription remains unaffected. The transcriptional response to Wnt signalling depends on additional mechanisms, such as BMP or FGF signalling for the particular genes we investigated, which do not influence β-catenin recruitment. Our findings suggest a more general paradigm for Wnt-regulated transcriptional mechanisms, which is relevant for tissue-specific functions of Wnt/β-catenin signalling in embryonic development but also for stem cell-mediated homeostasis and cancer. Chromatin association of β-catenin, even to functional Wnt-response elements, can no longer be considered a proxy for identifying transcriptionally Wnt-regulated genes. Context-dependent mechanisms are crucial for transcriptional activation of Wnt/β-catenin target genes subsequent to β-catenin recruitment. Our conclusions therefore also imply that Wnt-regulated β-catenin binding in one context can mark Wnt-regulated transcriptional target genes for different contexts. PMID:27068107

  9. Tissue- and stage-specific Wnt target gene expression is controlled subsequent to β-catenin recruitment to cis-regulatory modules

    PubMed Central

    Nakamura, Yukio; de Paiva Alves, Eduardo; Veenstra, Gert Jan C.; Hoppler, Stefan

    2016-01-01

    Key signalling pathways, such as canonical Wnt/β-catenin signalling, operate repeatedly to regulate tissue- and stage-specific transcriptional responses during development. Although recruitment of nuclear β-catenin to target genomic loci serves as the hallmark of canonical Wnt signalling, mechanisms controlling stage- or tissue-specific transcriptional responses remain elusive. Here, a direct comparison of genome-wide occupancy of β-catenin with a stage-matched Wnt-regulated transcriptome reveals that only a subset of β-catenin-bound genomic loci are transcriptionally regulated by Wnt signalling. We demonstrate that Wnt signalling regulates β-catenin binding to Wnt target genes not only when they are transcriptionally regulated, but also in contexts in which their transcription remains unaffected. The transcriptional response to Wnt signalling depends on additional mechanisms, such as BMP or FGF signalling for the particular genes we investigated, which do not influence β-catenin recruitment. Our findings suggest a more general paradigm for Wnt-regulated transcriptional mechanisms, which is relevant for tissue-specific functions of Wnt/β-catenin signalling in embryonic development but also for stem cell-mediated homeostasis and cancer. Chromatin association of β-catenin, even to functional Wnt-response elements, can no longer be considered a proxy for identifying transcriptionally Wnt-regulated genes. Context-dependent mechanisms are crucial for transcriptional activation of Wnt/β-catenin target genes subsequent to β-catenin recruitment. Our conclusions therefore also imply that Wnt-regulated β-catenin binding in one context can mark Wnt-regulated transcriptional target genes for different contexts. PMID:27068107

  10. Redox active motifs in selenoproteins.

    PubMed

    Li, Fei; Lutz, Patricia B; Pepelyayeva, Yuliya; Arnér, Elias S J; Bayse, Craig A; Rozovsky, Sharon

    2014-05-13

    Selenoproteins use the rare amino acid selenocysteine (Sec) to act as the first line of defense against oxidants, which are linked to aging, cancer, and neurodegenerative diseases. Many selenoproteins are oxidoreductases in which the reactive Sec is connected to a neighboring Cys and able to form a ring. These Sec-containing redox motifs govern much of the reactivity of selenoproteins. To study their fundamental properties, we have used (77)Se NMR spectroscopy in concert with theoretical calculations to determine the conformational preferences and mobility of representative motifs. This use of (77)Se as a probe enables the direct recording of the properties of Sec as its environment is systematically changed. We find that all motifs have several ring conformations in their oxidized state. These ring structures are most likely stabilized by weak, nonbonding interactions between the selenium and the amide carbon. To examine how the presence of selenium and ring geometric strain governs the motifs' reactivity, we measured the redox potentials of Sec-containing motifs and their corresponding Cys-only variants. The comparisons reveal that for C-terminal motifs the redox potentials increased between 20-25 mV when the selenenylsulfide bond was changed to a disulfide bond. Changes of similar magnitude arose when we varied ring size or the motifs' flanking residues. This suggests that the presence of Sec is not tied to unusually low redox potentials. The unique roles of selenoproteins in human health and their chemical reactivities may therefore not necessarily be explained by lower redox potentials, as has often been claimed. PMID:24769567

  11. Characterization of the human lipoprotein lipase (LPL) promoter: Evidence of two cis-regulatory regions, LP-[alpha] and LP-[beta] of importance for the differentiation-linked induction of the LPL gene during adipogenesis

    SciTech Connect

    Enerbaeck, S.; Ohlsson, B.G.; Samuelsson, L.; Bjursell, G.

    1992-10-01

    When preadipocytes differentiate into adipocytes, several differentiation-linked genes are activated. Lipoprotein lipase (LPL) is one of the first genes induced during this process. To investigate early events in adipocyte development, we have focused on the transcriptional activation of the LPL gene. For this purpose, we have cloned and fused different parts of intragenic and flanking sequences with a chloramphenicol acetyltransferase reporter gene. Transient transfection experiments and DNase I hypersensitivity assays indicate that several positive as well as negative elements contribute to transcriptional regulation of the LPL gene. When reporter gene constructs were stably introduced into preadipocytes, we were able to monitor and compare the activation patterns of different promoter deletion mutants at selected time points representing the process of adipocyte development. We could delimit two cis-regulatory elements important for gradual activation of the LPL gene during adipocyte development in vitro. These elements, LP-[alpha] (-702 to -666) and LP-[beta] (-468 to -430), contain a striking similarity to a consensus sequence known to bind the transcription factors HNF-3 and fork head. Results of gel mobility shift assays and DNase I and exonuclease III in vitro protection assays indicate that factors with DNA-binding properties similar to those of the HNF-3/fork head family of transcription factors are present in adipocytes and interact with LP-[alpha] and LP-[beta]. We also demonstrate that LP-[alpha] and LP-[beta] were both capable of conferring a differentiation-linked expression pattern to a heterologous promoter, thus mimicking the expression of the endogenous LPL gene during adipocyte differentiation. These findings indicate that interactions with LP-[alpha] and LP-[beta] could be a part of a differentiation switch governing induction of the LPL gene during adipocyte differentiation. 48 refs., 11 figs.

  12. Motif module map reveals enforcement of aging by continual NF-κB activity

    PubMed Central

    Adler, Adam S.; Sinha, Saurabh; Kawahara, Tiara L.A.; Zhang, Jennifer Y.; Segal, Eran; Chang, Howard Y.

    2007-01-01

    Aging is characterized by specific alterations in gene expression, but their underlying mechanisms and functional consequences are not well understood. Here we develop a systematic approach to identify combinatorial cis-regulatory motifs that drive age-dependent gene expression across different tissues and organisms. Integrated analysis of 365 microarrays spanning nine tissue types predicted fourteen motifs as major regulators of age-dependent gene expression in human and mouse. The motif most strongly associated with aging was that of the transcription factor NF-κB. Inducible genetic blockade of NF-κB for 2 wk in the epidermis of chronologically aged mice reverted the tissue characteristics and global gene expression programs to those of young mice. Age-specific NF-κB blockade and orthogonal cell cycle interventions revealed that NF-κB controls cell cycle exit and gene expression signature of aging in parallel but not sequential pathways. These results identify a conserved network of regulatory pathways underlying mammalian aging and show that NF-κB is continually required to enforce many features of aging in a tissue-specific manner. PMID:18055696

  13. Temporal motifs in time-dependent networks

    NASA Astrophysics Data System (ADS)

    Kovanen, Lauri; Karsai, Márton; Kaski, Kimmo; Kertész, János; Saramäki, Jari

    2011-11-01

    Temporal networks are commonly used to represent systems where connections between elements are active only for restricted periods of time, such as telecommunication, neural signal processing, biochemical reaction and human social interaction networks. We introduce the framework of temporal motifs to study the mesoscale topological-temporal structure of temporal networks in which the events of nodes do not overlap in time. Temporal motifs are classes of similar event sequences, where the similarity refers not only to topology but also to the temporal order of the events. We provide a mapping from event sequences to coloured directed graphs that enables an efficient algorithm for identifying temporal motifs. We discuss some aspects of temporal motifs, including causality and null models, and present basic statistics of temporal motifs in a large mobile call network.

  14. Fast approximate motif statistics.

    PubMed

    Nicodème, P

    2001-01-01

    We present in this article a fast approximate method for computing the statistics of a number of non-self-overlapping matches of motifs in a random text in the nonuniform Bernoulli model. This method is well suited for protein motifs where the probability of self-overlap of motifs is small. For 96% of the PROSITE motifs, the expectations of occurrences of the motifs in a 7-million-amino-acids random database are computed by the approximate method with less than 1% error when compared with the exact method. Processing of the whole PROSITE takes about 30 seconds with the approximate method. We apply this new method to a comparison of the C. elegans and S. cerevisiae proteomes. PMID:11535175
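
    As a point of reference for the quantity being approximated, the sketch below computes the expected number of match positions of a motif in a random text under a nonuniform Bernoulli (i.i.d. letter) model, using only linearity of expectation. It is not the method of the article, which concerns the statistics of non-self-overlapping matches; the function name, the motif encoding (one string of allowed letters per position), and the frequencies are illustrative assumptions.

```python
from math import prod


def expected_occurrences(motif, freqs, text_length):
    """Expected number of start positions where `motif` matches a random text.

    motif: list of strings, one per position, each listing the allowed letters
           (hypothetical encoding, loosely PROSITE-like).
    freqs: dict mapping each letter to its probability (nonuniform Bernoulli model).
    Overlapping matches are all counted; this is plain linearity of expectation.
    """
    m = len(motif)
    if text_length < m:
        return 0.0
    # Probability that the motif matches at one fixed start position.
    p_match = prod(sum(freqs[a] for a in allowed) for allowed in motif)
    # There are text_length - m + 1 possible start positions.
    return (text_length - m + 1) * p_match


# Toy alphabet and frequencies, chosen only for illustration.
toy_freqs = {"A": 0.5, "C": 0.25, "G": 0.125, "H": 0.125}
toy_expectation = expected_occurrences(["C", "AG", "H"], toy_freqs, 10)
```

    When self-overlap is unlikely, as the abstract notes for most protein motifs, this simple expectation and the count of non-overlapping matches are expected to be close.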

  15. Redox active motifs in selenoproteins

    PubMed Central

    Li, Fei; Lutz, Patricia B.; Pepelyayeva, Yuliya; Arnér, Elias S. J.; Bayse, Craig A.; Rozovsky, Sharon

    2014-01-01

    Selenoproteins use the rare amino acid selenocysteine (Sec) to act as the first line of defense against oxidants, which are linked to aging, cancer, and neurodegenerative diseases. Many selenoproteins are oxidoreductases in which the reactive Sec is connected to a neighboring Cys and able to form a ring. These Sec-containing redox motifs govern much of the reactivity of selenoproteins. To study their fundamental properties, we have used 77Se NMR spectroscopy in concert with theoretical calculations to determine the conformational preferences and mobility of representative motifs. This use of 77Se as a probe enables the direct recording of the properties of Sec as its environment is systematically changed. We find that all motifs have several ring conformations in their oxidized state. These ring structures are most likely stabilized by weak, nonbonding interactions between the selenium and the amide carbon. To examine how the presence of selenium and ring geometric strain governs the motifs’ reactivity, we measured the redox potentials of Sec-containing motifs and their corresponding Cys-only variants. The comparisons reveal that for C-terminal motifs the redox potentials increased between 20–25 mV when the selenenylsulfide bond was changed to a disulfide bond. Changes of similar magnitude arose when we varied ring size or the motifs’ flanking residues. This suggests that the presence of Sec is not tied to unusually low redox potentials. The unique roles of selenoproteins in human health and their chemical reactivities may therefore not necessarily be explained by lower redox potentials, as has often been claimed. PMID:24769567

  16. A short conserved motif in ALYREF directs cap- and EJC-dependent assembly of export complexes on spliced mRNAs

    PubMed Central

    Gromadzka, Agnieszka M.; Steckelberg, Anna-Lena; Singh, Kusum K.; Hofmann, Kay; Gehring, Niels H.

    2016-01-01

    The export of messenger RNAs (mRNAs) is the final of several nuclear posttranscriptional steps of gene expression. The formation of export-competent mRNPs involves the recruitment of export factors that are assumed to facilitate transport of the mature mRNAs. Using in vitro splicing assays, we show that a core set of export factors, including ALYREF, UAP56 and DDX39, readily associate with the spliced RNAs in an EJC (exon junction complex)- and cap-dependent manner. In order to elucidate how ALYREF and other export adaptors mediate mRNA export, we conducted a computational analysis and discovered four short, conserved, linear motifs present in RNA-binding proteins. We show that mutation in one of the new motifs (WxHD) in an unstructured region of ALYREF reduced RNA binding and abolished the interaction with eIF4A3 and CBP80. Additionally, the mutation impaired proper localization to nuclear speckles and export of a spliced reporter mRNA. Our results reveal important details of the orchestrated recruitment of export factors during the formation of export competent mRNPs. PMID:26773052

  17. Stochastic motif extraction using hidden Markov model

    SciTech Connect

    Fujiwara, Yukiko; Asogawa, Minoru; Konagaya, Akihiko

    1994-12-31

    In this paper, we study the application of an HMM (hidden Markov model) to the problem of representing protein sequences by a stochastic motif. A stochastic protein motif represents the small segments of protein sequences that have a certain function or structure. The stochastic motif, represented by an HMM, has conditional probabilities to deal with the stochastic nature of the motif. This HMM directly reflects the characteristics of the motif, such as a protein periodical structure or grouping. In order to obtain the optimal HMM, we developed the "iterative duplication method" for HMM topology learning. It starts from a small fully-connected network and iterates the network generation and parameter optimization until it achieves sufficient discrimination accuracy. Using this method, we obtained an HMM for a leucine zipper motif. Compared to a symbolic pattern representation, which reached an accuracy of 14.8 percent, the HMM achieved 79.3 percent in prediction. Additionally, the method can obtain an HMM for various types of zinc finger motifs, and it might separate the mixed data. We demonstrated that this approach is applicable to the validation of protein databases; a constructed HMM has indicated that one protein sequence annotated as "leucine-zipper like sequence" in the database is quite different from other leucine-zipper sequences in terms of likelihood, and we found this discrimination is plausible.

  18. Lysine residues direct the chlorination of tyrosines in YXXK motifs of apolipoprotein A-I when hypochlorous acid oxidizes high density lipoprotein.

    PubMed

    Bergt, Constanze; Fu, Xiaoyun; Huq, Nabiha P; Kao, Jeff; Heinecke, Jay W

    2004-02-27

    Oxidized lipoproteins may play an important role in the pathogenesis of atherosclerosis. Elevated levels of 3-chlorotyrosine, a specific end product of the reaction between hypochlorous acid (HOCl) and tyrosine residues of proteins, have been detected in atherosclerotic tissue. Thus, HOCl generated by the phagocyte enzyme myeloperoxidase represents one pathway for protein oxidation in humans. One important target of the myeloperoxidase pathway may be high density lipoprotein (HDL), which mobilizes cholesterol from artery wall cells. To determine whether activated phagocytes preferentially chlorinate specific sites in HDL, we used tandem mass spectrometry (MS/MS) to analyze apolipoprotein A-I that had been oxidized by HOCl. The major site of chlorination was a single tyrosine residue located in one of the protein's YXXK motifs (where X represents a nonreactive amino acid). To investigate the mechanism of chlorination, we exposed synthetic peptides to HOCl. The peptides encompassed the amino acid sequences YKXXY, YXXKY, or YXXXY. MS/MS analysis demonstrated that chlorination of tyrosine in the peptides that contained lysine was regioselective and occurred in high yield if the substrate was KXXY or YXXK. NMR and MS analyses revealed that the N(epsilon) amino group of lysine was initially chlorinated, which suggests that chloramine formation is the first step in tyrosine chlorination. Molecular modeling of the YXXK motif in apolipoprotein A-I demonstrated that these tyrosine and lysine residues are adjacent on the same face of an amphipathic alpha-helix. Our observations suggest that HOCl selectively targets tyrosine residues that are suitably juxtaposed to primary amino groups in proteins. This mechanism might enable phagocytes to efficiently damage proteins when they destroy microbial proteins during infection or damage host tissue during inflammation. PMID:14660678

  19. Efficient exact motif discovery

    PubMed Central

    Marschall, Tobias; Rahmann, Sven

    2009-01-01

    Motivation: The motif discovery problem consists of finding over-represented patterns in a collection of biosequences. It is one of the classical sequence analysis problems, but still has not been satisfactorily solved in an exact and efficient manner. This is partly due to the large number of possibilities of defining the motif search space and the notion of over-representation. Even for well-defined formalizations, the problem is frequently solved in an ad hoc manner with heuristics that do not guarantee to find the best motif. Results: We show how to solve the motif discovery problem (almost) exactly on a practically relevant space of IUPAC generalized string patterns, using the p-value with respect to an i.i.d. model or a Markov model as the measure of over-representation. In particular, (i) we use a highly accurate compound Poisson approximation for the null distribution of the number of motif occurrences. We show how to compute the exact clump size distribution using a recently introduced device called probabilistic arithmetic automaton (PAA). (ii) We define two p-value scores for over-representation, the first one based on the total number of motif occurrences, the second one based on the number of sequences in a collection with at least one occurrence. (iii) We describe an algorithm to discover the optimal pattern with respect to either of the scores. The method exploits monotonicity properties of the compound Poisson approximation and is by orders of magnitude faster than exhaustive enumeration of IUPAC strings (11.8 h compared with an extrapolated runtime of 4.8 years). (iv) We justify the use of the proposed scores for motif discovery by showing our method to outperform other motif discovery algorithms (e.g. MEME, Weeder) on benchmark datasets. We also propose new motifs on Mycobacterium tuberculosis. Availability and Implementation: The method has been implemented in Java. It can be obtained from http://ls11-www
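
    The two occurrence statistics named in (ii) can be illustrated with a short sketch: for an IUPAC generalized string pattern, count the total number of occurrences in a collection and the number of sequences with at least one occurrence. This only computes the raw counts; the p-values, the compound Poisson approximation, and the search algorithm of the paper are not reproduced, and the function names are illustrative assumptions.

```python
# Standard IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_at(pattern, seq, i):
    """True if the IUPAC pattern matches seq starting at index i."""
    return all(seq[i + j] in IUPAC[c] for j, c in enumerate(pattern))


def occurrence_counts(pattern, sequences):
    """Return (total occurrences, number of sequences with >= 1 occurrence).

    Overlapping occurrences are counted separately.
    """
    total = 0
    sequences_hit = 0
    for seq in sequences:
        n = sum(
            matches_at(pattern, seq, i)
            for i in range(len(seq) - len(pattern) + 1)
        )
        total += n
        sequences_hit += n > 0
    return total, sequences_hit


# Toy collection, for illustration only.
toy_counts = occurrence_counts("TRA", ["TAATGA", "CCCC", "TGA"])
```

    A discovery method would then score each candidate pattern by how surprising these counts are under the chosen null model.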

  20. Cross-Disciplinary Detection and Analysis of Network Motifs

    PubMed Central

    Tran, Ngoc Tam L; DeLuccia, Luke; McDonald, Aidan F; Huang, Chun-Hsi

    2015-01-01

    The detection of network motifs has recently become an important part of network analysis across all disciplines. In this work, we detected and analyzed network motifs from undirected and directed networks of several different disciplines, including biological network, social network, ecological network, as well as other networks such as airlines, power grid, and co-purchase of political books networks. Our analysis revealed that undirected networks are similar at the basic three and four nodes, while the analysis of directed networks revealed the distinction between networks of different disciplines. The study showed that larger motifs contained the three-node motif as a subgraph. Topological analysis revealed that similar networks have similar small motifs, but as the motif size increases, differences arise. Pearson correlation coefficient showed strong positive relationship between some undirected networks but inverse relationship between some directed networks. The study suggests that the three-node motif is a building block of larger motifs. It also suggests that undirected networks share similar low-level structures. Moreover, similar networks share similar small motifs, but larger motifs define the unique structure of individuals. Pearson correlation coefficient suggests that protein structure networks, dolphin social network, and co-authorships in network science belong to a superfamily. In addition, yeast protein–protein interaction network, primary school contact network, Zachary’s karate club network, and co-purchase of political books network can be classified into a superfamily. PMID:25983553
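
    For readers unfamiliar with three-node subgraphs in undirected networks, the sketch below counts the two connected possibilities, the triangle and the open triad (a path of two edges). It is a minimal illustration, not the tool used in the study; full motif detection also compares such counts against randomized networks, which is omitted here. All names are illustrative assumptions.

```python
from itertools import combinations


def three_node_motifs(edges):
    """Count triangles and open triads in an undirected simple graph.

    edges: iterable of (u, v) pairs; self-loops are ignored and repeated
    edges are collapsed.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    triangle_corners = 0
    open_triads = 0
    for center, neighbours in adj.items():
        # Every pair of neighbours of `center` forms a connected triple.
        for a, b in combinations(neighbours, 2):
            if b in adj[a]:
                triangle_corners += 1  # each triangle is seen from 3 corners
            else:
                open_triads += 1       # each open triad has a unique center
    return {"triangle": triangle_corners // 3, "open_triad": open_triads}


# A square 1-2-3-4 with one diagonal (1-3), for illustration only.
toy_motifs = three_node_motifs([(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)])
```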

  1. Cross-disciplinary detection and analysis of network motifs.

    PubMed

    Tran, Ngoc Tam L; DeLuccia, Luke; McDonald, Aidan F; Huang, Chun-Hsi

    2015-01-01

    The detection of network motifs has recently become an important part of network analysis across all disciplines. In this work, we detected and analyzed network motifs from undirected and directed networks of several different disciplines, including biological network, social network, ecological network, as well as other networks such as airlines, power grid, and co-purchase of political books networks. Our analysis revealed that undirected networks are similar at the basic three and four nodes, while the analysis of directed networks revealed the distinction between networks of different disciplines. The study showed that larger motifs contained the three-node motif as a subgraph. Topological analysis revealed that similar networks have similar small motifs, but as the motif size increases, differences arise. Pearson correlation coefficient showed strong positive relationship between some undirected networks but inverse relationship between some directed networks. The study suggests that the three-node motif is a building block of larger motifs. It also suggests that undirected networks share similar low-level structures. Moreover, similar networks share similar small motifs, but larger motifs define the unique structure of individuals. Pearson correlation coefficient suggests that protein structure networks, dolphin social network, and co-authorships in network science belong to a superfamily. In addition, yeast protein-protein interaction network, primary school contact network, Zachary's karate club network, and co-purchase of political books network can be classified into a superfamily. PMID:25983553

  2. No tradeoff between versatility and robustness in gene circuit motifs

    NASA Astrophysics Data System (ADS)

    Payne, Joshua L.

    2016-05-01

    Circuit motifs are small directed subgraphs that appear in real-world networks significantly more often than in randomized networks. In the Boolean model of gene circuits, most motifs are realized by multiple circuit genotypes. Each of a motif's constituent circuit genotypes may have one or more functions, which are embodied in the expression patterns the circuit forms in response to specific initial conditions. Recent enumeration of a space of nearly 17 million three-gene circuit genotypes revealed that all circuit motifs have more than one function, with the number of functions per motif ranging from 12 to nearly 30,000. This indicates that some motifs are more functionally versatile than others. However, the individual circuit genotypes that constitute each motif are less robust to mutation if they have many functions, hinting that functionally versatile motifs may be less robust to mutation than motifs with few functions. Here, I explore the relationship between versatility and robustness in circuit motifs, demonstrating that functionally versatile motifs are robust to mutation despite the inherent tradeoff between versatility and robustness at the level of an individual circuit genotype.

  3. Mining protein sequences for motifs.

    PubMed

    Narasimhan, Giri; Bu, Changsong; Gao, Yuan; Wang, Xuning; Xu, Ning; Mathee, Kalai

    2002-01-01

    We use methods from Data Mining and Knowledge Discovery to design an algorithm for detecting motifs in protein sequences. The algorithm assumes that a motif is constituted by the presence of a "good" combination of residues in appropriate locations of the motif. The algorithm attempts to compile such good combinations into a "pattern dictionary" by processing an aligned training set of protein sequences. The dictionary is subsequently used to detect motifs in new protein sequences. Statistical significance of the detection results is ensured by statistically determining the various parameters of the algorithm. Based on this approach, we have implemented a program called GYM. The Helix-Turn-Helix motif was used as a model system on which to test our program. The program was also extended to detect Homeodomain motifs. The detection results for the two motifs compare favorably with existing programs. In addition, the GYM program provides a lot of useful information about a given protein sequence. PMID:12487759

  4. The Transcriptional Complex Between the BCL2 i-Motif and hnRNP LL Is a Molecular Switch for Control of Gene Expression That Can Be Modulated by Small Molecules

    PubMed Central

    2015-01-01

    In a companion paper (DOI: 10.1021/ja410934b) we demonstrate that the C-rich strand of the cis-regulatory element in the BCL2 promoter element is highly dynamic in nature and can form either an i-motif or a flexible hairpin. Under physiological conditions these two secondary DNA structures are found in an equilibrium mixture, which can be shifted by the addition of small molecules that trap out either the i-motif (IMC-48) or the flexible hairpin (IMC-76). In cellular experiments we demonstrate that the addition of these molecules has opposite effects on BCL2 gene expression and furthermore that these effects are antagonistic. In this contribution we have identified a transcriptional factor that recognizes and binds to the BCL2 i-motif to activate transcription. The molecular basis for the recognition of the i-motif by hnRNP LL is determined, and we demonstrate that the protein unfolds the i-motif structure to form a stable single-stranded complex. In subsequent experiments we show that IMC-48 and IMC-76 have opposite, antagonistic effects on the formation of the hnRNP LL–i-motif complex as well as on the transcription factor occupancy at the BCL2 promoter. For the first time we propose that the i-motif acts as a molecular switch that controls gene expression and that small molecules that target the dynamic equilibrium of the i-motif and the flexible hairpin can differentially modulate gene expression. PMID:24559432

  5. Structural alphabet motif discovery and a structural motif database.

    PubMed

    Ku, Shih-Yen; Hu, Yuh-Jyh

    2012-01-01

    This study proposes a general framework for structural motif discovery. The framework is based on a modular design in which the system components can be modified or replaced independently to increase its applicability to various studies. It is a two-stage approach that first converts protein 3D structures into structural alphabet sequences, and then applies a sequence motif-finding tool to these sequences to detect conserved motifs. We named the structural motif database we built the SA-Motifbase, which provides the structural information conserved at different hierarchical levels in SCOP. For each motif, SA-Motifbase presents its 3D view; alphabet letter preference; alphabet letter frequency distribution; and the significance. SA-Motifbase is available at http://bioinfo.cis.nctu.edu.tw/samotifbase/. PMID:22099701

  6. Structural and Mechanistic Analysis of Trichodiene Synthase Using Site-Directed Mutagenesis: Probing the Catalytic Function of Tyrosine-295 and the Asparagine-225/Serine-229/Glutamate-233-Mg2+ B Motif

    SciTech Connect

    Vedula, L.; Jiang, J.; Zakharian, T.; Cane, D.; Christianson, D.

    2008-01-01

    Trichodiene synthase from Fusarium sporotrichioides contains two metal ion-binding motifs required for the cyclization of farnesyl diphosphate: the 'aspartate-rich' motif D100DXX(D/E) that coordinates to Mg2+ A and Mg2+ C, and the 'NSE/DTE' motif N225DXXSXXXE that chelates Mg2+ B (boldface indicates metal ion ligands). Here, we report steady-state kinetic parameters, product array analyses, and X-ray crystal structures of trichodiene synthase mutants in which the fungal NSE motif is progressively converted into a plant-like DDXXTXXXE motif, resulting in a degradation in both steady-state kinetic parameters and product specificity. Each catalytically active mutant generates a different distribution of sesquiterpene products, and three newly detected sesquiterpenes are identified. In addition, the kinetic and structural properties of the Y295F mutant of trichodiene synthase were found to be similar to those of the wild-type enzyme, thereby ruling out a proposed role for Y295 in catalysis.

  7. A G-Box-Like Motif Is Necessary for Transcriptional Regulation by Circadian Pseudo-Response Regulators in Arabidopsis

    PubMed Central

    Newton, Linsey; Liu, Ming-Jung

    2016-01-01

    PSEUDO-RESPONSE REGULATORs (PRRs) play overlapping and distinct roles in maintaining circadian rhythms and regulating diverse biological processes, including the photoperiodic control of flowering, growth, and abiotic stress responses. PRRs act as transcriptional repressors and associate with chromatin via their conserved C-terminal CCT (CONSTANS, CONSTANS-like, and TIMING OF CAB EXPRESSION 1 [TOC1/PRR1]) domains by a still-poorly understood mechanism. Here, we identified genome-wide targets of PRR9 using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) and compared them with PRR7, PRR5, and TOC1/PRR1 ChIP-seq data. We found that PRR binding sites are located within genomic regions of low nucleosome occupancy and high DNase I hypersensitivity. Moreover, conserved noncoding regions among Brassicaceae species are enriched around PRR binding sites, indicating that PRRs associate with functionally relevant cis-regulatory regions. The PRRs shared a significant number of binding regions, and our results indicate that they coordinately restrict the expression of target genes to around dawn. A G-box-like motif was overrepresented at PRR binding regions, and we showed that this motif is necessary for mediating transcriptional regulation of CIRCADIAN CLOCK ASSOCIATED 1 and PRR9 by the PRRs. Our results further our understanding of how PRRs target specific promoters and provide an extensive resource for studying circadian regulatory networks in plants. PMID:26586835

  8. Triadic motifs in the dependence networks of virtual societies.

    PubMed

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-01-01

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other closed motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs. PMID:24912755

  9. The Annotation of RNA Motifs

    PubMed Central

    2002-01-01

    The recent deluge of new RNA structures, including complete atomic-resolution views of both subunits of the ribosome, has on the one hand literally overwhelmed our individual abilities to comprehend the diversity of RNA structure, and on the other hand presented us with new opportunities for comprehensive use of RNA sequences for comparative genetic, evolutionary and phylogenetic studies. Two concepts are key to understanding RNA structure: hierarchical organization of global structure and isostericity of local interactions. Global structure changes extremely slowly, as it relies on conserved long-range tertiary interactions. Tertiary RNA–RNA and quaternary RNA–protein interactions are mediated by RNA motifs, defined as recurrent and ordered arrays of non-Watson–Crick base-pairs. A single RNA motif comprises a family of sequences, all of which can fold into the same three-dimensional structure and can mediate the same interaction(s). The chemistry and geometry of base pairing constrain the evolution of motifs in such a way that random mutations that occur within motifs are accepted or rejected insofar as they can mediate a similar ordered array of interactions. The steps involved in the analysis and annotation of RNA motifs in 3D structures are: (a) decomposition of each motif into non-Watson–Crick base-pairs; (b) geometric classification of each basepair; (c) identification of isosteric substitutions for each basepair by comparison to isostericity matrices; (d) alignment of homologous sequences using the isostericity matrices to identify corresponding positions in the crystal structure; (e) acceptance or rejection of the null hypothesis that the motif is conserved. PMID:18629252

  10. Network motifs emerge from interconnections that favour stability

    NASA Astrophysics Data System (ADS)

    Angulo, Marco Tulio; Liu, Yang-Yu; Slotine, Jean-Jacques

    2015-10-01

    The microscopic principles organizing dynamic units in complex networks, from proteins to power generators, can be understood in terms of network 'motifs': small interconnection patterns that appear much more frequently in real networks than expected in random networks. When considered as small subgraphs isolated from a large network, these motifs are more robust to parameter variations, easier to synchronize than other possible subgraphs, and can provide specific functionalities. But one can isolate these subgraphs only by assuming, for example, a significant separation of timescales, and the origin of network motifs and their functionalities when embedded in larger networks remain unclear. Here we show that most motifs emerge from interconnection patterns that best exploit the intrinsic stability characteristics at different scales of interconnection, from simple nodes to whole modules. This functionality suggests an efficient mechanism to stably build complex systems by recursively interconnecting nodes and modules as motifs. We present direct evidence of this mechanism in several biological networks.

  11. [Prediction of Promoter Motifs in Virophages].

    PubMed

    Gong, Chaowen; Zhou, Xuewen; Pan, Yingjie; Wang, Yongjie

    2015-07-01

    Virophages have crucial roles in ecosystems and are the transport vectors of genetic materials. To shed light on regulation and control mechanisms in virophage-host systems as well as evolution between virophages and their hosts, the promoter motifs of virophages were predicted on the upstream regions of start codons using an analytical tool for prediction of promoter motifs: Multiple EM for Motif Elicitation. Seventeen potential promoter motifs were identified based on the E-value, location, number and length of promoters in genomes. Sputnik and zamilon motif 2 with AT-rich regions were distributed widely on genomes, suggesting that these motifs may be associated with regulation of the expression of various genes. Motifs containing the TCTA box were predicted to be late promoter motifs in mavirus; motifs containing the ATCT box were the potential late promoter motifs in the Ace Lake mavirus. AT-rich regions were identified on motif 2 in the Organic Lake virophage, motif 3 in Yellowstone Lake virophage (YSLV)1 and 2, motif 1 in YSLV3, and motif 1 and 2 in YSLV4, respectively. AT-rich regions were distributed widely on the genomes of virophages. All of these motifs may be promoter motifs of virophages. Our results provide insights into further exploration of temporal expression of genes in virophages as well as associations between virophages and giant viruses. PMID:26524912

  12. Sequential visibility-graph motifs

    NASA Astrophysics Data System (ADS)

    Iacovacci, Jacopo; Lacasa, Lucas

    2016-04-01

    Visibility algorithms transform time series into graphs and encode dynamical information in their topology, paving the way for graph-theoretical time series analysis as well as building a bridge between nonlinear dynamics and network science. In this work we introduce and study the concept of sequential visibility-graph motifs, smaller substructures of n consecutive nodes that appear with characteristic frequencies. We develop a theory to compute in an exact way the motif profiles associated with general classes of deterministic and stochastic dynamics. We find that this simple property is indeed a highly informative and computationally efficient feature capable of distinguishing among different dynamics and robust against noise contamination. We finally confirm that it can be used in practice to perform unsupervised learning, by extracting motif profiles from experimental heart-rate series and being able, accordingly, to disentangle meditative from other relaxation states. Applications of this general theory include the automatic classification and description of physical, biological, and financial time series.
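
    The idea of a sequential motif profile can be sketched as follows. The code builds a horizontal visibility graph, in which two samples are linked when every sample between them is strictly lower than both, and then records which edge pattern appears in each window of consecutive nodes. The choice of the horizontal visibility criterion, the window size, and all function names are assumptions made for illustration; the abstract does not fix them, and the exact theory it mentions is not reproduced.

```python
from collections import Counter


def horizontal_visibility_edges(series):
    """Edges (i, j), i < j, of the horizontal visibility graph of `series`."""
    edges = set()
    n = len(series)
    for i in range(n - 1):
        highest_between = float("-inf")
        for j in range(i + 1, n):
            # i and j see each other if everything between is strictly lower.
            if highest_between < min(series[i], series[j]):
                edges.add((i, j))
            highest_between = max(highest_between, series[j])
            if highest_between >= series[i]:
                break  # nothing further to the right is visible from i
    return edges


def motif_profile(series, size=4):
    """Relative frequency of each edge pattern over windows of `size` nodes."""
    edges = horizontal_visibility_edges(series)
    counts = Counter()
    for s in range(len(series) - size + 1):
        # Keep edges inside the window and shift them to window coordinates.
        pattern = tuple(sorted(
            (a - s, b - s) for a, b in edges if s <= a and b < s + size
        ))
        counts[pattern] += 1
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()}


# Toy series, for illustration only.
toy_edges = horizontal_visibility_edges([1, 3, 2, 4])
toy_profile = motif_profile([1, 3, 2, 4], size=4)
```

    Comparing such profiles across time series is the kind of feature the abstract describes as distinguishing different dynamics.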

  13. From Cis-Regulatory Elements to Complex RNPs and Back

    PubMed Central

    Gebauer, Fátima; Preiss, Thomas; Hentze, Matthias W.

    2012-01-01

    Messenger RNAs (mRNAs), the templates for translation, have evolved to harbor abundant cis-acting sequences that affect their posttranscriptional fates. These elements are frequently located in the untranslated regions and serve as binding sites for trans-acting factors, RNA-binding proteins, and/or small non-coding RNAs. This article provides a systematic synopsis of cis-acting elements, trans-acting factors, and the mechanisms by which they affect translation. It also highlights recent technical advances that have ushered in the era of transcriptome-wide studies of the ribonucleoprotein complexes formed by mRNAs and their trans-acting factors. PMID:22751153

  14. Combinatorial Information Theoretical Measurement of the Semantic Significance of Semantic Graph Motifs

    SciTech Connect

    Joslyn, Cliff A.; al-Saffar, Sinan; Haglin, David J.; Holder, Larry

    2011-06-14

    Given an arbitrary semantic graph data set, perhaps one lacking in explicit ontological information, we wish to first identify its significant semantic structures, and then measure the extent of their significance. Casting a semantic graph dataset as an edge-labeled, directed graph, this task can be built on the ability to mine frequent labeled subgraphs in edge-labeled, directed graphs. We begin by considering the fundamentals of the enumerative combinatorics of subgraph motif structures in edge-labeled directed graphs. We identify its frequent labeled, directed subgraph motif patterns, and measure the significance of the resulting motifs by the information gain relative to the expected value of the motif based on the empirical frequency distribution of the link types which compose them, assuming independence. We illustrate the method on a small test graph, and discuss results obtained for small linear motifs (link type bigrams and trigrams) in a larger graph structure.
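
    One simple reading of the bigram case can be sketched as follows. The score used here, log2(observed / expected) with the expected frequency taken as the product of the two link-type frequencies, is an assumption chosen for illustration and may differ from the measure the authors define; the function name and the triple format (source, label, target) are likewise hypothetical.

```python
import math
from collections import Counter, defaultdict


def bigram_scores(edges):
    """Score each link-type bigram (a, b) by log2(observed / expected).

    edges is a list of (source, label, target) triples. A bigram is a
    directed two-step path u -a-> v -b-> w. The expected frequency
    assumes the two labels occur independently.
    """
    label_freq = Counter(label for _, label, _ in edges)
    n_edges = len(edges)

    # Labels of the edges leaving each node.
    out_labels = defaultdict(list)
    for src, label, _ in edges:
        out_labels[src].append(label)

    # Count every two-step path by its pair of labels.
    bigrams = Counter()
    for _, first, mid in edges:
        for second in out_labels[mid]:
            bigrams[(first, second)] += 1
    n_bigrams = sum(bigrams.values())

    scores = {}
    for (a, b), count in bigrams.items():
        observed = count / n_bigrams
        expected = (label_freq[a] / n_edges) * (label_freq[b] / n_edges)
        scores[(a, b)] = math.log2(observed / expected)
    return scores
```

    A positive score marks a label pair that chains together more often than its individual link-type frequencies would suggest.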

  15. A comprehensive analysis of the La-motif protein superfamily

    PubMed Central

    Bousquet-Antonelli, Cécile; Deragon, Jean-Marc

    2009-01-01

    The extremely well-conserved La motif (LAM), in synergy with the immediately following RNA recognition motif (RRM), allows direct binding of the (genuine) La autoantigen to RNA polymerase III primary transcripts. This motif is not only found on La homologs, but also on La-related proteins (LARPs) of unrelated function. LARPs are widely found amongst eukaryotes and, although poorly characterized, appear to be RNA-binding proteins fulfilling crucial cellular functions. We searched the fully sequenced genomes of 83 eukaryotic species scattered along the tree of life for the presence of LAM-containing proteins. We observed that these proteins are absent from archaea and present in all eukaryotes (except protists from the Plasmodium genus), strongly suggesting that the LAM is an ancestral motif that emerged early after the archaea-eukarya radiation. A complete evolutionary and structural analysis of these proteins resulted in their classification into five families: the genuine La homologs and four LARP families. Unexpectedly, in each family a conserved domain representing either a classical RRM or an RRM-like motif immediately follows the LAM of most proteins. An evolutionary analysis of the LAM-RRM/RRM-L regions shows that these motifs co-evolved and should be used as a single entity to define the functional region of interaction of LARPs with their substrates. We also found two extremely well conserved motifs, named LSA and DM15, shared by LARP6 and LARP1 family members, respectively. We suggest that members of the same family are functional homologs and/or share a common molecular mode of action on different RNA baits. PMID:19299548

  16. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, P.; Ciszak, E.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits and two catalytic centers. Each catalytic center (PP:PYR) is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core (PP:PYR)(sub 2) within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXPhiX(sub 4)(G)PhiXXGQ and GDGX(sub 25-30)NN in the PP-domain, and the EX(sub 4)(G)PhiXXGPhi in the PYR-domain, where Phi corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.
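
    The GDGX(sub 25-30)NN pattern quoted above translates directly into a regular expression, which gives a quick way to scan a protein sequence for candidate sites. The sketch below is only a pattern search under that literal reading; the function name is hypothetical, and the patterns containing Phi are left out because the abstract does not list which residues count as hydrophobic.

```python
import re

# GDGX(25-30)NN: G, D, G, then 25 to 30 residues of any kind, then N, N.
# "." matches any character except a newline, so pass the sequence as a
# single line of one-letter amino acid codes.
GDG_NN = re.compile(r"GDG.{25,30}NN")


def find_gdg_nn(protein):
    """Return (start, matched_text) for each non-overlapping pattern hit."""
    return [(m.start(), m.group()) for m in GDG_NN.finditer(protein)]
```

    A hit only flags a candidate; whether a sequence belongs to a TPP-dependent enzyme still rests on the structural criteria the abstract describes.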

  17. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, Paulina M.; Ciszak, Ewa M.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits, two catalytic centers, common amino acid sequence, and specific contacts to provide a flip-flop, or alternate site, mechanism of action. Each catalytic center [PP:PYR] is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core [PP:PYR](sub 2) within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXPhiX(sub 4)(G)PhiXXGQ and GDGX(sub 25-30) within the PP-domain, and the EX(sub 4)(G)PhiXXGPhi within the PYR-domain, where Phi corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  18. Comprehensive discovery of DNA motifs in 349 human cells and tissues reveals new features of motifs

    PubMed Central

    Zheng, Yiyu; Li, Xiaoman; Hu, Haiyan

    2015-01-01

    Comprehensive motif discovery under experimental conditions is critical for the global understanding of gene regulation. To generate a nearly complete list of human DNA motifs under given conditions, we employed a novel approach to de novo discover significant co-occurring DNA motifs in 349 human DNase I hypersensitive site datasets. We predicted 845 to 1325 motifs in each dataset, for a total of 2684 non-redundant motifs. These 2684 motifs contained 54.02 to 75.95% of the known motifs in seven large collections including TRANSFAC. In each dataset, we also discovered 43 663 to 2 013 288 motif modules, groups of motifs with their binding sites co-occurring in a significant number of short DNA regions. Compared with known interacting transcription factors in eight resources, the predicted motif modules on average included 84.23% of known interacting motifs. We further showed new features of the predicted motifs, such as motifs enriched in proximal regions rarely overlapped with motifs enriched in distal regions, motifs enriched in 5′ distal regions were often enriched in 3′ distal regions, etc. Finally, we observed that the 2684 predicted motifs classified the cell or tissue types of the datasets with an accuracy of 81.29%. The resources generated in this study are available at http://server.cs.ucf.edu/predrem/. PMID:25505144

  19. Detecting correlations among functional-sequence motifs

    NASA Astrophysics Data System (ADS)

    Pirino, Davide; Rigosa, Jacopo; Ledda, Alice; Ferretti, Luca

    2012-06-01

    Sequence motifs are words of nucleotides in DNA with biological functions, e.g., gene regulation. Identification of such words proceeds through rejection of Markov models on the expected motif frequency along the genome. Additional biological information can be extracted from the correlation structure among patterns of motif occurrences. In this paper a log-linear multivariate intensity Poisson model is estimated via expectation maximization on a set of motifs along the genome of E. coli K12. The proposed approach allows for excitatory as well as inhibitory interactions among motifs and between motifs and other genomic features like gene occurrences. Our findings confirm previous stylized facts about such types of interactions and shed new light on genome-maintenance functions of some particular motifs. We expect these methods to be applicable to a wider set of genomic features.

  20. Detecting correlations among functional-sequence motifs.

    PubMed

    Pirino, Davide; Rigosa, Jacopo; Ledda, Alice; Ferretti, Luca

    2012-06-01

    Sequence motifs are words of nucleotides in DNA with biological functions, e.g., gene regulation. Identification of such words proceeds through rejection of Markov models on the expected motif frequency along the genome. Additional biological information can be extracted from the correlation structure among patterns of motif occurrences. In this paper a log-linear multivariate intensity Poisson model is estimated via expectation maximization on a set of motifs along the genome of E. coli K12. The proposed approach allows for excitatory as well as inhibitory interactions among motifs and between motifs and other genomic features like gene occurrences. Our findings confirm previous stylized facts about such types of interactions and shed new light on genome-maintenance functions of some particular motifs. We expect these methods to be applicable to a wider set of genomic features. PMID:23005179

  1. A survey of DNA motif finding algorithms

    PubMed Central

    Das, Modan K; Dai, Ho-Kwok

    2007-01-01

    Background Unraveling the mechanisms that regulate gene expression is a major challenge in biology. An important task in this challenge is to identify regulatory elements, especially the binding sites in deoxyribonucleic acid (DNA) for transcription factors. These binding sites are short DNA segments that are called motifs. Recent advances in genome sequence availability and in high-throughput gene expression analysis technologies have allowed for the development of computational methods for motif finding. As a result, a large number of motif finding algorithms have been implemented and applied to various motif models over the past decade. This survey reviews the latest developments in DNA motif finding algorithms. Results Earlier algorithms use promoter sequences of coregulated genes from single genome and search for statistically overrepresented motifs. Recent algorithms are designed to use phylogenetic footprinting or orthologous sequences and also an integrated approach where promoter sequences of coregulated genes and phylogenetic footprinting are used. All the algorithms studied have been reported to correctly detect the motifs that have been previously detected by laboratory experimental approaches, and some algorithms were able to find novel motifs. However, most of these motif finding algorithms have been shown to work successfully in yeast and other lower organisms, but perform significantly worse in higher organisms. Conclusion Despite considerable efforts to date, DNA motif finding remains a complex challenge for biologists and computer scientists. Researchers have taken many different approaches in developing motif discovery tools and the progress made in this area of research is very encouraging. Performance comparison of different motif finding tools and identification of the best tools have proven to be a difficult task because tools are designed based on algorithms and motif models that are diverse and complex and our incomplete understanding of

  2. D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs

    PubMed Central

    Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

    2009-01-01

    Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Current genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined aligned motif sequence set and motif width. To retrieve known motif sequences, users can access commonly used databases such as TFD, RegulonDB, DBTBS, and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the matrix can be converted into different file formats according to user requirements. It provides the possibility to identify conserved motifs in co-regulated genes or the whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. Availability: http://203.190.147.116/dmatrix/ PMID:19759861
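
    For readers unfamiliar with weight matrices, the sketch below shows one common construction from a set of aligned sites: per-position base counts with a pseudocount, converted to log2-odds against a background frequency. It is a generic illustration, not the D-MATRIX implementation (which the abstract describes only as a simple statistical approach); the function names, the pseudocount of 1 and the uniform background of 0.25 are all assumptions.

```python
import math


def weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Log2-odds weight matrix from equal-length aligned DNA sites.

    Returns one dict per alignment column, mapping each base to
    log2(smoothed frequency / background).
    """
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")

    total = len(sites) + 4 * pseudocount
    matrix = []
    for pos in range(width):
        column = [s[pos].upper() for s in sites]
        matrix.append({
            base: math.log2(
                ((column.count(base) + pseudocount) / total) / background
            )
            for base in "ACGT"
        })
    return matrix


def score(matrix, candidate):
    """Sum the column weights for a candidate site of the same width."""
    return sum(col[base] for col, base in zip(matrix, candidate.upper()))
```

    Sliding score over a genome and keeping windows above a chosen threshold is the usual way such a matrix is applied to genome-level prediction.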

  3. Comparative genomic analysis of upstream miRNA regulatory motifs in Caenorhabditis.

    PubMed

    Jovelin, Richard; Krizus, Aldis; Taghizada, Bakhtiyar; Gray, Jeremy C; Phillips, Patrick C; Claycomb, Julie M; Cutter, Asher D

    2016-07-01

    MicroRNAs (miRNAs) comprise a class of short noncoding RNA molecules that play diverse developmental and physiological roles by controlling mRNA abundance and protein output of the vast majority of transcripts. Despite the importance of miRNAs in regulating gene function, we still lack a complete understanding of how miRNAs themselves are transcriptionally regulated. To fill this gap, we predicted regulatory sequences by searching for abundant short motifs located upstream of miRNAs in eight species of Caenorhabditis nematodes. We identified three conserved motifs across the Caenorhabditis phylogeny that show clear signatures of purifying selection from comparative genomics, patterns of nucleotide changes in motifs of orthologous miRNAs, and correlation between motif incidence and miRNA expression. We then validated our predictions with transgenic green fluorescent protein reporters and site-directed mutagenesis for a subset of motifs located in an enhancer region upstream of let-7. We demonstrate that a CT-dinucleotide motif is sufficient for proper expression of GFP in the seam cells of adult C. elegans, and that two other motifs play incremental roles in combination with the CT-rich motif. Thus, functional tests of sequence motifs identified through analysis of molecular evolutionary signatures provide a powerful path for efficiently characterizing the transcriptional regulation of miRNA genes. PMID:27140965

  4. The Thiamine-Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Ciszak, Ewa; Dominiak, Paulina

    2004-01-01

    Thiamin pyrophosphate (TPP), a derivative of vitamin B1, is a cofactor for enzymes performing catalysis in pathways of energy production including the well known decarboxylation of α-keto acid dehydrogenases followed by transketolation. TPP-dependent enzymes constitute a structurally and functionally diverse group exhibiting multimeric subunit organization, multiple domains and two chemically equivalent catalytic centers. Annotation of functional TPP-dependent enzymes, therefore, has not been trivial due to low sequence similarity related to this complex organization. Our approach to analysis of structures of known TPP-dependent enzymes reveals for the first time features common to this group, which we have termed the TPP-motif. The TPP-motif consists of specific spatial arrangements of structural elements and their specific contacts to provide for a flip-flop, or alternate site, enzymatic mechanism of action. Analysis of structural elements entrained in the flip-flop action displayed by TPP-dependent enzymes reveals a novel definition of the common amino acid sequences. These sequences allow for annotation of TPP-dependent enzymes, thus advancing functional proteomics. Further details of three-dimensional structures of TPP-dependent enzymes will be discussed.

  5. Synthetic biology with RNA motifs.

    PubMed

    Saito, Hirohide; Inoue, Tan

    2009-02-01

    Structural motifs in naturally occurring RNAs and RNPs can be employed as new molecular parts for synthetic biology to facilitate the development of novel devices and systems that modulate cellular functions. In this review, we focus on the following: (i) experimental evolution techniques of RNA molecules in vitro and (ii) their applications for regulating gene expression systems in vivo. For experimental evolution, new artificial RNA aptamers and RNA enzymes (ribozymes) have been selected in vitro. These functional RNA molecules are likely to be applicable in the reprogramming of existing gene regulatory systems. Furthermore, they may be used for designing hypothetical RNA-based living systems in the so-called RNA world. For the regulation of gene expressions in living cells, the development of new riboswitches allows us to modulate the target gene expression in a tailor-made manner. Moreover, recently RNA-based synthetic genetic circuits have been reported by employing functional RNA molecules, expanding the repertory of synthetic biology with RNA motifs. PMID:18775792

  6. DILIMOT: discovery of linear motifs in proteins.

    PubMed

    Neduva, Victor; Russell, Robert B

    2006-07-01

    Discovery of protein functional motifs is critical in modern biology. Small segments of 3-10 residues play critical roles in protein interactions, post-translational modifications and trafficking. DILIMOT (DIscovery of LInear MOTifs) is a server for the prediction of these short linear motifs within a set of proteins. Given a set of sequences sharing a common functional feature (e.g. interaction partner or localization) the method finds statistically over-represented motifs likely to be responsible for it. The input sequences are first passed through a set of filters to remove regions unlikely to contain instances of linear motifs. Motifs are then found in the remaining sequence and ranked according to a statistic that measures over-representation and conservation across homologues in related species. The results are displayed via a visual interface for easy perusal. The server is available at http://dilimot.embl.de. PMID:16845024

  7. Bridge and brick motifs in complex networks

    NASA Astrophysics Data System (ADS)

    Huang, Chung-Yuan; Sun, Chuen-Tsai; Cheng, Chia-Ying; Hsieh, Ji-Lung

    2007-04-01

    Acknowledging the expanding role of complex networks in numerous scientific contexts, we examine significant functional and topological differences between bridge and brick motifs for predicting network behaviors and functions. After observing similarities between social networks and their genetic, ecological, and engineering counterparts, we identify a larger number of brick motifs in social networks and bridge motifs in the other three types. We conclude that bridge and brick motif content analysis can assist researchers in understanding the small-world and clustering properties of network structures when investigating network functions and behaviors.

  8. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data.

    PubMed

    Tran, Ngoc Tam L; Huang, Chun-Hsi

    2014-01-01

    ChIP-Seq (chromatin immunoprecipitation sequencing) has proven advantageous for finding motifs because ChIP-Seq experiments narrow the motif search down to binding site locations. Recent motif finding tools facilitate motif detection by providing user-friendly Web interfaces. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommend that users employ multiple motif finding Web tools that implement different algorithms to obtain significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provide suggestions for the future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data. PMID:24555784

  9. Sampling Motif-Constrained Ensembles of Networks

    NASA Astrophysics Data System (ADS)

    Fischer, Rico; Leitão, Jorge C.; Peixoto, Tiago P.; Altmann, Eduardo G.

    2015-10-01

    The statistical significance of network properties is conditioned on null models which satisfy specified properties but that are otherwise random. Exponential random graph models are a principled theoretical framework to generate such constrained ensembles, but they often fail in practice, either due to model inconsistency or due to the impossibility of sampling networks from them. These problems affect the important case of networks with prescribed clustering coefficient or number of small connected subgraphs (motifs). In this Letter we use the Wang-Landau method to obtain a multicanonical sampling that overcomes both these problems. We sample, in polynomial time, networks with arbitrary degree sequences from ensembles with imposed motif counts. Applying this method to social networks, we investigate the relation between transitivity and homophily, and we quantify the correlation between different types of motifs, finding that single motifs can explain up to 60% of the variation of motif profiles.

  10. Form and function in gene regulatory networks: the structure of network motifs determines fundamental properties of their dynamical state space

    PubMed Central

    Ahnert, S. E.; Fink, T. M. A.

    2016-01-01

    Network motifs have been studied extensively over the past decade, and certain motifs, such as the feed-forward loop, play an important role in regulatory networks. Recent studies have used Boolean network motifs to explore the link between form and function in gene regulatory networks and have found that the structure of a motif does not strongly determine its function, if this is defined in terms of the gene expression patterns the motif can produce. Here, we offer a different, higher-level definition of the ‘function’ of a motif, in terms of two fundamental properties of its dynamical state space as a Boolean network. One is the basin entropy, which is a complexity measure of the dynamics of Boolean networks. The other is the diversity of cyclic attractor lengths that a given motif can produce. Using these two measures, we examine all 104 topologically distinct three-node motifs and show that the structural properties of a motif, such as the presence of feedback loops and feed-forward loops, predict fundamental characteristics of its dynamical state space, which in turn determine aspects of its functional versatility. We also show that these higher-level properties have a direct bearing on real regulatory networks, as both basin entropy and cycle length diversity show a close correspondence with the prevalence, in neural and genetic regulatory networks, of the 13 connected motifs without self-interactions that have been studied extensively in the literature. PMID:27440255
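
    The two state-space measures named above can be computed by brute force for a small Boolean network. The sketch below assumes synchronous updates and takes basin entropy to be the Shannon entropy (in bits) of the basin-size distribution; the three-node negative feedback ring used as the update rule is a hypothetical example chosen for illustration, not one of the motifs analysed in the paper.

```python
import math
from itertools import product


def attractors_and_basins(update, n):
    """Map each attractor (tuple of states in cycle order) to its basin size.

    update takes a state (tuple of n bits) and returns the next state
    under synchronous updating. The basin size includes the attractor's
    own states.
    """
    basin = {}
    attractor_of = {}
    for state in product((0, 1), repeat=n):
        path, seen, current = [], {}, state
        # Follow the trajectory until it revisits itself or reaches a
        # state whose attractor is already known.
        while current not in seen and current not in attractor_of:
            seen[current] = len(path)
            path.append(current)
            current = update(current)
        if current in attractor_of:
            key = attractor_of[current]
        else:
            cycle = path[seen[current]:]
            start = cycle.index(min(cycle))  # canonical rotation
            key = tuple(cycle[start:] + cycle[:start])
            basin[key] = 0
        for s in path:
            attractor_of[s] = key
        basin[key] += len(path)
    return basin


def basin_entropy(basin):
    """Shannon entropy, in bits, of the distribution of basin sizes."""
    total = sum(basin.values())
    return -sum((b / total) * math.log2(b / total) for b in basin.values())


def negative_ring(state):
    """Hypothetical rule: node 0 is repressed by node 2; 1 and 2 copy upstream."""
    a, b, c = state
    return (1 - c, a, b)
```

    Calling attractors_and_basins(negative_ring, 3) gives the attractors as dictionary keys, so the set of cycle lengths is {len(k) for k in result}, and basin_entropy(result) gives the complexity measure.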

  11. Ordered cyclic motifs contribute to dynamic stability in biological and engineered networks

    PubMed Central

    Ma'ayan, Avi; Cecchi, Guillermo A.; Wagner, John; Rao, A. Ravi; Iyengar, Ravi; Stolovitzky, Gustavo

    2008-01-01

    Representation and analysis of complex biological and engineered systems as directed networks is useful for understanding their global structure/function organization. Enrichment of network motifs, which are over-represented subgraphs in real networks, can be used for topological analysis. Because counting network motifs is computationally expensive, only characterization of 3- to 5-node motifs has been previously reported. In this study we used a supercomputer to analyze cyclic motifs made of 3–20 nodes for 6 biological and 3 technological networks. Using tools from statistical physics, we developed a theoretical framework for characterizing the ensemble of cyclic motifs in real networks. We have identified a generic property of real complex networks, antiferromagnetic organization, which is characterized by minimal directional coherence of edges along cyclic subgraphs, such that consecutive links tend to have opposing direction. As a consequence, we find that the lack of directional coherence in cyclic motifs leads to depletion in feedback loops, where the number of nodes affected by feedback loops appears to be at a local minimum compared with surrogate shuffled networks. This topology provides more dynamic stability in large networks. PMID:19033453

  12. Automated Motif Discovery from Glycan Array Data

    PubMed Central

    Cholleti, Sharath R.; Agravat, Sanjay; Morris, Tim; Saltz, Joel H.; Song, Xuezheng

    2012-01-01

    Assessing interactions of a glycan-binding protein (GBP) or lectin with glycans on a microarray generates large datasets, making it difficult to identify a glycan structural motif or determinant associated with the highest apparent binding strength of the GBP. We have developed a computational method, termed GlycanMotifMiner, that uses the relative binding of a GBP with glycans within a glycan microarray to automatically reveal the glycan structural motifs recognized by a GBP. We implemented the software with a web-based graphical interface for users to explore and visualize the discovered motifs. The utility of GlycanMotifMiner was determined using five plant lectins, SNA, HPA, PNA, Con A, and UEA-I. Data from the analyses of the lectins at different protein concentrations were processed to rank the glycans based on their relative binding strengths. The motifs, defined as glycan substructures that exist in a large number of the bound glycans and few non-bound glycans, were then discovered by our algorithm and displayed in a web-based graphical user interface (http://glycanmotifminer.emory.edu). The information is used in defining the glycan-binding specificity of GBPs. The results were compared to the known glycan specificities of these lectins generated by manual methods. A more complex analysis was also carried out using glycan microarray data obtained for a recombinant form of human galectin-8. Results for all of these lectins show that GlycanMotifMiner identified the major motifs known in the literature along with some unexpected novel binding motifs. PMID:22877213

  13. Automated motif discovery from glycan array data.

    PubMed

    Cholleti, Sharath R; Agravat, Sanjay; Morris, Tim; Saltz, Joel H; Song, Xuezheng; Cummings, Richard D; Smith, David F

    2012-10-01

    Assessing interactions of a glycan-binding protein (GBP) or lectin with glycans on a microarray generates large datasets, making it difficult to identify a glycan structural motif or determinant associated with the highest apparent binding strength of the GBP. We have developed a computational method, termed GlycanMotifMiner, that uses the relative binding of a GBP with glycans within a glycan microarray to automatically reveal the glycan structural motifs recognized by a GBP. We implemented the software with a web-based graphical interface for users to explore and visualize the discovered motifs. The utility of GlycanMotifMiner was determined using five plant lectins, SNA, HPA, PNA, Con A, and UEA-I. Data from the analyses of the lectins at different protein concentrations were processed to rank the glycans based on their relative binding strengths. The motifs, defined as glycan substructures that exist in a large number of the bound glycans and few non-bound glycans, were then discovered by our algorithm and displayed in a web-based graphical user interface ( http://glycanmotifminer.emory.edu ). The information is used in defining the glycan-binding specificity of GBPs. The results were compared to the known glycan specificities of these lectins generated by manual methods. A more complex analysis was also carried out using glycan microarray data obtained for a recombinant form of human galectin-8. Results for all of these lectins show that GlycanMotifMiner identified the major motifs known in the literature along with some unexpected novel binding motifs. PMID:22877213

  14. RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets.

    PubMed

    Thomas-Chollier, Morgane; Herrmann, Carl; Defrance, Matthieu; Sand, Olivier; Thieffry, Denis; van Helden, Jacques

    2012-02-01

    ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restrictions. We present peak-motifs, a computational pipeline that discovers motifs in peak sequences, compares them with databases, exports putative binding sites for visualization in the UCSC genome browser and generates an extensive report suited for both naive and expert users. It relies on time- and memory-efficient algorithms enabling the treatment of several thousand peaks within minutes. Regarding time efficiency, peak-motifs outperforms all comparable tools by several orders of magnitude. We demonstrate its accuracy by analyzing data sets ranging from 4000 to 128,000 peaks for 12 embryonic stem cell-specific transcription factors. In all cases, the program finds the expected motifs and returns additional motifs potentially bound by cofactors. We further apply peak-motifs to discover tissue-specific motifs in peak collections for the p300 transcriptional co-activator. To our knowledge, peak-motifs is the only tool that performs a complete motif analysis and offers a user-friendly web interface without any restriction on sequence size or number of peaks. PMID:22156162

  15. Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas

    PubMed Central

    Petrov, Anton I.; Zirbel, Craig L.; Leontis, Neocles B.

    2013-01-01

    The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access. PMID:23970545

  16. CodingMotif: exact determination of overrepresented nucleotide motifs in coding sequences

    PubMed Central

    2012-01-01

    Background It has been increasingly appreciated that coding sequences harbor regulatory sequence motifs in addition to encoding for protein. These sequence motifs are expected to be overrepresented in nucleotide sequences bound by a common protein or small RNA. However, detecting overrepresented motifs has been difficult because of interference by constraints at the protein level. Sampling-based approaches to solve this problem based on codon-shuffling have been limited to exploring only an infinitesimal fraction of the sequence space and by their use of parametric approximations. Results We present a novel O(N (log N)^2)-time algorithm, CodingMotif, to identify nucleotide-level motifs of unusual copy number in protein-coding regions. Using a new dynamic programming algorithm we are able to exhaustively calculate the distribution of the number of occurrences of a motif over all possible coding sequences that encode the same amino acid sequence, given a background model for codon usage and dinucleotide biases. Our method takes advantage of the sparseness of loci where a given motif can occur, greatly speeding up the required convolution calculations. Knowledge of the distribution allows one to assess the exact non-parametric p-value of whether a given motif is over- or under-represented. We demonstrate that our method identifies known functional motifs more accurately than sampling and parametric-based approaches in a variety of coding datasets of various size, including ChIP-seq data for the transcription factors NRSF and GABP. Conclusions CodingMotif provides a theoretically and empirically-demonstrated advance for the detection of motifs overrepresented in coding sequences. We expect CodingMotif to be useful for identifying motifs in functional genomic datasets such as DNA-protein binding, RNA-protein binding, or microRNA-RNA binding within coding regions. A software implementation is available at http://bioinformatics.bc.edu/chuanglab/codingmotif.tar PMID

  17. Identification of a putative nuclear export signal motif in human NANOG homeobox domain

    SciTech Connect

    Park, Sung-Won; Do, Hyun-Jin; Huh, Sun-Hyung; Sung, Boreum; Uhm, Sang-Jun; Song, Hyuk; Kim, Nam-Hyung; Kim, Jae-Hwan

    2012-05-11

    Highlights: We found the putative nuclear export signal motif within human NANOG homeodomain. Leucine-rich residues are important for human NANOG homeodomain nuclear export. CRM1-specific inhibitor LMB blocked the potent human NANOG NES-mediated nuclear export. Abstract: NANOG is a homeobox-containing transcription factor that plays an important role in pluripotent stem cells and tumorigenic cells. To understand how nuclear localization of human NANOG is regulated, the NANOG sequence was examined and a leucine-rich nuclear export signal (NES) motif (125-MQELSNILNL-134) was found in the homeodomain (HD). To functionally validate the putative NES motif, deletion and site-directed mutants were fused to an EGFP expression vector and transfected into COS-7 cells, and the localization of the proteins was examined. While hNANOG HD exclusively localized to the nucleus, a mutant with both NLSs deleted and only the putative NES motif contained (hNANOG HD-ΔNLSs) was predominantly cytoplasmic, as observed by nucleo/cytoplasmic fractionation and Western blot analysis as well as confocal microscopy. Furthermore, site-directed mutagenesis of the putative NES motif in a partial hNANOG HD only containing either one of the two NLS motifs led to localization in the nucleus, suggesting that the NES motif may play a functional role in nuclear export. Furthermore, CRM1-specific nuclear export inhibitor LMB blocked the hNANOG potent NES-mediated export, suggesting that the leucine-rich motif may function in CRM1-mediated nuclear export of hNANOG. Collectively, a NES motif is present in the hNANOG HD and may be functionally involved in CRM1-mediated nuclear export pathway.

  18. Miz-1 Activates Gene Expression via a Novel Consensus DNA Binding Motif

    PubMed Central

    Barrilleaux, Bonnie L.; Burow, Dana; Lockwood, Sarah H.; Yu, Abigail; Segal, David J.; Knoepfler, Paul S.

    2014-01-01

    The transcription factor Miz-1 can either activate or repress gene expression in concert with binding partners including the Myc oncoprotein. The genomic binding of Miz-1 includes both core promoters and more distal sites, but the preferred DNA binding motif of Miz-1 has been unclear. We used a high-throughput in vitro technique, Bind-n-Seq, to identify two Miz-1 consensus DNA binding motif sequences—ATCGGTAATC and ATCGAT (Mizm1 and Mizm2)—bound by full-length Miz-1 and its zinc finger domain, respectively. We validated these sequences directly as high affinity Miz-1 binding motifs. Competition assays using mutant probes indicated that the binding affinity of Miz-1 for Mizm1 and Mizm2 is highly sequence-specific. Miz-1 strongly activates gene expression through the motifs in a Myc-independent manner. MEME-ChIP analysis of Miz-1 ChIP-seq data in two different cell types reveals a long motif with a central core sequence highly similar to the Mizm1 motif identified by Bind-n-Seq, validating the in vivo relevance of the findings. Miz-1 ChIP-seq peaks containing the long motif are predominantly located outside of proximal promoter regions, in contrast to peaks without the motif, which are highly concentrated within 1.5 kb of the nearest transcription start site. Overall, our results indicate that Miz-1 may be directed in vivo to the novel motif sequences we have identified, where it can recruit its specific binding partners to control gene expression and ultimately regulate cell fate. PMID:24983942

  19. MotifMiner: A Table Driven Greedy Algorithm for DNA Motif Mining

    NASA Astrophysics Data System (ADS)

    Seeja, K. R.; Alam, M. A.; Jain, S. K.

    DNA motif discovery is a much explored problem in functional genomics. This paper describes a table driven greedy algorithm for discovering regulatory motifs in the promoter sequences of co-expressed genes. The proposed algorithm searches both DNA strands for the common patterns or motifs. The inputs to the algorithm are a set of promoter sequences, the motif length and the minimum Information Content. The algorithm generates subsequences of the given length from the shortest input promoter sequence. It stores these subsequences and their reverse complements in a table. Then it searches the remaining sequences for good matches of these subsequences. The Information Content score is used to measure the goodness of the motifs. The algorithm has been tested with synthetic data and real data, and the results are promising. The algorithm could discover meaningful motifs from the muscle specific regulatory sequences.
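
    A minimal sketch of a greedy search in the spirit of this abstract is given below. It is not the authors' MotifMiner code: the matching rule (smallest Hamming distance), the uniform background in the Information Content score, and every function name are assumptions made for illustration. Both strands are covered by also scanning the reverse complement of each remaining sequence.

```python
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def kmers(seq, k):
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def information_content(sites):
    """Total information content in bits, assuming a uniform background."""
    total = 0.0
    for column in zip(*sites):
        ic = 2.0  # log2(4) for DNA
        for base in "ACGT":
            p = column.count(base) / len(column)
            if p > 0:
                ic += p * math.log2(p)
        total += ic
    return total

def greedy_motif(sequences, k, min_ic=0.0):
    """Seed from the shortest sequence, then pick the closest k-mer elsewhere."""
    seed_index = min(range(len(sequences)), key=lambda i: len(sequences[i]))
    others = [s for i, s in enumerate(sequences) if i != seed_index]
    best = None
    for seed in kmers(sequences[seed_index], k):
        sites = [seed]
        for seq in others:
            # Assumption: the best match is the k-mer with the fewest mismatches.
            candidates = kmers(seq, k) + kmers(reverse_complement(seq), k)
            sites.append(min(candidates, key=lambda c: hamming(seed, c)))
        ic = information_content(sites)
        if ic >= min_ic and (best is None or ic > best[0]):
            best = (ic, seed, sites)
    return best

promoters = ["CCTGACTCAGG", "TTTTGACTCATTTT", "GGGGTGAGTCAGGGG"]
print(greedy_motif(promoters, 7))
```

    In this toy input the third sequence carries the motif on the opposite strand, which is why the reverse-complement scan is needed to recover it.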

  20. DNA Motif Databases and Their Uses.

    PubMed

    Stormo, Gary D

    2015-01-01

    Transcription factors (TFs) recognize and bind to specific DNA sequences. The specificity of a TF is usually represented as a position weight matrix (PWM). Several databases of DNA motifs exist and are used in biological research to address important biological questions. This overview describes PWMs and some of the most commonly used motif databases, as well as a few of their common applications. PMID:26334922
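
    To make the PWM representation concrete, the sketch below builds a log-odds matrix from a handful of aligned sites and uses it to scan a sequence. The example sites, the pseudocount of 1, and the uniform background of 0.25 are illustrative assumptions, not values taken from any particular motif database.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds position weight matrix from equal-length aligned sites."""
    pwm = []
    total = len(sites) + pseudocount * len(BASES)
    for column in zip(*sites):
        scores = {}
        for base in BASES:
            freq = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(freq / background)
        pwm.append(scores)
    return pwm

def score(pwm, window):
    """Sum of per-position log-odds scores for one window."""
    return sum(col[base] for col, base in zip(pwm, window))

def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )

# Hypothetical aligned binding sites, for illustration only.
sites = ["TGACA", "TGACA", "TGACT", "TGTCA"]
pwm = build_pwm(sites)
print(best_hit(pwm, "CCCTGACAGG"))
```

    Windows that match the consensus at every position receive the highest score, and the pseudocount keeps unobserved bases from producing a score of minus infinity.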

  1. Two Di-Leucine Motifs Regulate Trafficking of Mucolipin-1 to Lysosomes

    PubMed Central

    Vergarajauregui, Silvia; Puertollano, Rosa

    2006-01-01

    Mutations in the mucolipin-1 gene have been linked to mucolipidosis type IV, a lysosomal storage disorder characterized by severe neurological and ophthalmologic abnormalities. Mucolipin-1 is a membrane protein containing six putative transmembrane domains with both its N- and C-termini localized facing the cytosol. To gain information on the sorting motifs that mediate the trafficking of this protein to lysosomes, we have generated chimeras in which the N- and C- terminal tail portions of mucolipin-1 were fused to a reporter gene. In this article, we report the identification of two separate di-leucine-type motifs that co-operate to regulate the transport of mucolipin-1 to lysosomes. One di-leucine motif is positioned at the N-terminal cytosolic tail and mediates direct transport to lysosomes, whereas the other di-leucine motif is found at the C-terminal tail and functions as an adaptor protein 2-dependent internalization motif. We have also found that the C-terminal tail of mucolipin-1 is palmitoylated and that this modification might regulate the efficiency of endocytosis. Finally, the mutagenesis of both di-leucine motifs abrogated lysosomal accumulation and resulted in cell-surface redistribution of mucolipin-1. Taken together, these results reveal novel information regarding the motifs that regulate mucolipin-1 trafficking and suggest a role for palmitoylation in protein sorting. PMID:16497227

  2. Basic OSF/Motif programming and applications

    SciTech Connect

    Brooks, D.; Novak, B.

    1992-09-15

    When users refer to Motif, they are usually talking about mwm, the window manager. However, when programmers mention Motif they are usually discussing the programming toolkit. This toolkit is used to develop new or modify existing applications. In this presentation, the term Motif will refer to the toolkit. Motif comes with a number of features that help users effectively use the applications built with it. The term look and feel may be overused; nonetheless, a consistent and well designed look and feel assists the user in learning and using new applications. The term point and click generally refers to using a mouse to select program commands. While Motif supports point and click, the toolkit also supports using the keyboard as a substitute for many operations. This gives a good typist a distinct advantage when using a familiar application. We will give an overview of the toolkit, touching on the user interface features and general programming considerations. Since the source code for many useful Motif programs is readily available, we will explain how to get these sources and touch on derived benefits. We will also point to other sources of on-line help and documentation. Finally, we will present some practical experiences developing applications.

  3. Detecting seeded motifs in DNA sequences.

    PubMed

    Pizzi, Cinzia; Bortoluzzi, Stefania; Bisognin, Andrea; Coppe, Alessandro; Danieli, Gian Antonio

    2005-01-01

    The problem of detecting DNA motifs with functional relevance in real biological sequences is difficult due to a number of biological, statistical and computational issues and also because of the lack of knowledge about the structure of searched patterns. Many algorithms are implemented in fully automated processes, which are often based upon a guess of input parameters from the user at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, composed by regions with a different extent of variability. The method is based on a multi-step approach, which was implemented in a motif searching web tool (MOST). Overrepresented exact patterns are extracted from input sequences and clustered to produce motifs core regions, which are then extended and scored to generate seeded motifs. The combination of automated pattern discovery algorithms and different display tools for the evaluation and selection of results at several analysis steps can potentially lead to much more meaningful results than complete automation can produce. Experimental results on different yeast and human real datasets proved the methodology to be a promising solution for finding seeded motifs. MOST web tool is freely available at http://telethon.bio.unipd.it/bioinfo/MOST. PMID:16141193

  4. Detecting seeded motifs in DNA sequences

    PubMed Central

    Pizzi, Cinzia; Bortoluzzi, Stefania; Bisognin, Andrea; Coppe, Alessandro; Danieli, Gian Antonio

    2005-01-01

    The problem of detecting DNA motifs with functional relevance in real biological sequences is difficult due to a number of biological, statistical and computational issues and also because of the lack of knowledge about the structure of searched patterns. Many algorithms are implemented in fully automated processes, which are often based upon a guess of input parameters from the user at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, composed by regions with a different extent of variability. The method is based on a multi-step approach, which was implemented in a motif searching web tool (MOST). Overrepresented exact patterns are extracted from input sequences and clustered to produce motifs core regions, which are then extended and scored to generate seeded motifs. The combination of automated pattern discovery algorithms and different display tools for the evaluation and selection of results at several analysis steps can potentially lead to much more meaningful results than complete automation can produce. Experimental results on different yeast and human real datasets proved the methodology to be a promising solution for finding seeded motifs. MOST web tool is freely available. PMID:16141193

  5. New type of starch-binding domain: the direct repeat motif in the C-terminal region of Bacillus sp. no. 195 alpha-amylase contributes to starch binding and raw starch degrading.

    PubMed

    Sumitani, J; Tottori, T; Kawaguchi, T; Arai, M

    2000-09-01

    The alpha-amylase from Bacillus sp. no. 195 (BAA) consists of two domains: one is the catalytic domain similar to alpha-amylases from animals and Streptomyces in the N-terminal region; the other is the functionally unknown domain composed of an approx. 90-residue direct repeat in the C-terminal region. The gene coding for BAA was expressed in Streptomyces lividans TK24. Three active forms of the gene products were found. The pH and thermal profiles of BAAs, and their catalytic activities for p-nitrophenyl maltopentaoside and soluble starch, showed almost the same behaviours. The largest, 69 kDa, form (BAA-alpha) was of the same molecular mass as that of the mature protein estimated from the nucleotide sequence, and had raw-starch-binding and -degrading abilities. The second largest, 60 kDa, form (BAA-beta), whose molecular mass was the same as that of the natural enzyme from Bacillus sp. no. 195, was generated by proteolytic processing between the two repeat sequences in the C-terminal region, and had lower activities for raw starch binding and degrading than those of BAA-alpha. The smallest, 50 kDa, form (BAA-gamma) contained only the N-terminal catalytic domain as a result of removal of the C-terminal repeat sequence, which led to loss of binding and degradation of insoluble starches. Thus the starch adsorption capacity and raw-starch-degrading activity of BAAs depends on the existence of the repeat sequence in the C-terminal region. BAA-alpha was specifically adsorbed on starch or dextran (alpha-1,4 or alpha-1,6 glucan), and specifically desorbed with maltose or beta-cyclodextrin. These observations indicated that the repeat sequence of the enzyme was functional in the starch-binding domain (SBD). We propose the designation of the homologues to the SBD of glucoamylase from Aspergillus niger as family I SBDs, the homologues to that of glucoamylase from Rhizopus oryzae as family II, and the homologues of this repeat sequence of BAA as family III. PMID:10947962

  6. Space-related pharma-motifs for fast search of protein binding motifs and polypharmacological targets

    PubMed Central

    2012-01-01

    Background Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for treating complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) is an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. Results We have developed a Space-Related Pharmamotifs method (called SRPmotif) to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. Conclusions SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery

  7. Six and Eya promote apoptosis through direct transcriptional activation of the proapoptotic BH3-only gene egl-1 in Caenorhabditis elegans.

    PubMed

    Hirose, Takashi; Galvin, Brendan D; Horvitz, H Robert

    2010-08-31

    The decision of a cell to undergo programmed cell death is tightly regulated during animal development and tissue homeostasis. Here, we show that the Caenorhabditis elegans Six family homeodomain protein C. elegans homeobox (CEH-34) and the Eyes absent ortholog EYA-1 promote the programmed cell death of a specific pharyngeal neuron, the sister of the M4 motor neuron. Loss of either ceh-34 or eya-1 function causes survival of the M4 sister cell, which normally undergoes programmed cell death. CEH-34 physically interacts with the conserved EYA domain of EYA-1 in vitro. We identify an egl-1 5' cis-regulatory element that controls the programmed cell death of the M4 sister cell and show that CEH-34 binds directly to this site. Expression of the proapoptotic gene egl-1 in the M4 sister cell requires ceh-34 and eya-1 function. We conclude that an evolutionarily conserved complex that includes CEH-34 and EYA-1 directly activates egl-1 expression through a 5' cis-regulatory element to promote the programmed cell death of the M4 sister cell. We suggest that the regulation of apoptosis by Six and Eya family members is conserved in mammals and involved in human diseases caused by mutations in Six and Eya. PMID:20713707

  8. SVM2Motif--Reconstructing Overlapping DNA Sequence Motifs by Mimicking an SVM Predictor.

    PubMed

    Vidovic, Marina M-C; Görnitz, Nico; Müller, Klaus-Robert; Rätsch, Gunnar; Kloft, Marius

    2015-01-01

    Identifying discriminative motifs underlying the functionality and evolution of organisms is a major challenge in computational biology. Machine learning approaches such as support vector machines (SVMs) achieve state-of-the-art performances in genomic discrimination tasks, but--due to its black-box character--motifs underlying its decision function are largely unknown. As a remedy, positional oligomer importance matrices (POIMs) allow us to visualize the significance of position-specific subsequences. Although being a major step towards the explanation of trained SVM models, they suffer from the fact that their size grows exponentially in the length of the motif, which renders their manual inspection feasible only for comparably small motif sizes, typically k ≤ 5. In this work, we extend the work on positional oligomer importance matrices, by presenting a new machine-learning methodology, entitled motifPOIM, to extract the truly relevant motifs--regardless of their length and complexity--underlying the predictions of a trained SVM model. Our framework thereby considers the motifs as free parameters in a probabilistic model, a task which can be phrased as a non-convex optimization problem. The exponential dependence of the POIM size on the oligomer length poses a major numerical challenge, which we address by an efficient optimization framework that allows us to find possibly overlapping motifs consisting of up to hundreds of nucleotides. We demonstrate the efficacy of our approach on a synthetic data set as well as a real-world human splice site data set. PMID:26690911

  9. The Verrucomicrobia LexA-Binding Motif: Insights into the Evolutionary Dynamics of the SOS Response

    PubMed Central

    Erill, Ivan; Campoy, Susana; Kılıç, Sefa; Barbé, Jordi

    2016-01-01

    The SOS response is the primary bacterial mechanism to address DNA damage, coordinating multiple cellular processes that include DNA repair, cell division, and translesion synthesis. In contrast to other regulatory systems, the composition of the SOS genetic network and the binding motif of its transcriptional repressor, LexA, have been shown to vary greatly across bacterial clades, making it an ideal system to study the co-evolution of transcription factors and their regulons. Leveraging comparative genomics approaches and prior knowledge on the core SOS regulon, here we define the binding motif of the Verrucomicrobia, a recently described phylum of emerging interest due to its association with eukaryotic hosts. Site directed mutagenesis of the Verrucomicrobium spinosum recA promoter confirms that LexA binds a 14 bp palindromic motif with consensus sequence TGTTC-N4-GAACA. Computational analyses suggest that recognition of this novel motif is determined primarily by changes in base-contacting residues of the third alpha helix of the LexA helix-turn-helix DNA binding motif. In conjunction with comparative genomics analysis of the LexA regulon in the Verrucomicrobia phylum, electrophoretic shift assays reveal that LexA binds to operators in the promoter region of DNA repair genes and a mutagenesis cassette in this organism, and identify previously unreported components of the SOS response. The identification of tandem LexA-binding sites generating instances of other LexA-binding motifs in the lexA gene promoter of Verrucomicrobia species leads us to postulate a novel mechanism for LexA-binding motif evolution. This model, based on gene duplication, successfully addresses outstanding questions in the intricate co-evolution of the LexA protein, its binding motif and the regulatory network it controls. PMID:27489856
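
    The consensus reported here, TGTTC-N4-GAACA, can be turned into a simple exact-match scanner. The sketch below is only an illustration of searching for that consensus string; real sites tolerate mismatches, so a scoring matrix would normally be used instead. The function names and the test sequence are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

LEFT_ARM = "TGTTC"
RIGHT_ARM = "GAACA"

# Lookahead so that overlapping sites are also reported.
# Assumption: exact match to the consensus arms, any base in the 4-bp spacer.
SITE = re.compile(r"(?=(" + LEFT_ARM + r"[ACGT]{4}" + RIGHT_ARM + r"))")

def find_sites(sequence):
    """Return (start, site) pairs for every consensus match on the given strand."""
    return [(m.start(), m.group(1)) for m in SITE.finditer(sequence.upper())]

# The arms are reverse complements of each other, which makes the motif palindromic.
print(reverse_complement(LEFT_ARM) == RIGHT_ARM)
print(find_sites("aaTGTTCacgtGAACAtt"))
```

    Because the two arms are reverse complements of each other, any site that matches the consensus on one strand also matches it on the other, so a single-strand scan suffices for exact matches.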

  10. The Verrucomicrobia LexA-Binding Motif: Insights into the Evolutionary Dynamics of the SOS Response.

    PubMed

    Erill, Ivan; Campoy, Susana; Kılıç, Sefa; Barbé, Jordi

    2016-01-01

    The SOS response is the primary bacterial mechanism to address DNA damage, coordinating multiple cellular processes that include DNA repair, cell division, and translesion synthesis. In contrast to other regulatory systems, the composition of the SOS genetic network and the binding motif of its transcriptional repressor, LexA, have been shown to vary greatly across bacterial clades, making it an ideal system to study the co-evolution of transcription factors and their regulons. Leveraging comparative genomics approaches and prior knowledge on the core SOS regulon, here we define the binding motif of the Verrucomicrobia, a recently described phylum of emerging interest due to its association with eukaryotic hosts. Site directed mutagenesis of the Verrucomicrobium spinosum recA promoter confirms that LexA binds a 14 bp palindromic motif with consensus sequence TGTTC-N4-GAACA. Computational analyses suggest that recognition of this novel motif is determined primarily by changes in base-contacting residues of the third alpha helix of the LexA helix-turn-helix DNA binding motif. In conjunction with comparative genomics analysis of the LexA regulon in the Verrucomicrobia phylum, electrophoretic shift assays reveal that LexA binds to operators in the promoter region of DNA repair genes and a mutagenesis cassette in this organism, and identify previously unreported components of the SOS response. The identification of tandem LexA-binding sites generating instances of other LexA-binding motifs in the lexA gene promoter of Verrucomicrobia species leads us to postulate a novel mechanism for LexA-binding motif evolution. This model, based on gene duplication, successfully addresses outstanding questions in the intricate co-evolution of the LexA protein, its binding motif and the regulatory network it controls. PMID:27489856

  11. MEME Suite: tools for motif discovery and searching

    PubMed Central

    Bailey, Timothy L.; Boden, Mikael; Buske, Fabian A.; Frith, Martin; Grant, Charles E.; Clementi, Luca; Ren, Jingyuan; Li, Wilfred W.; Noble, William S.

    2009-01-01

    The MEME Suite web server provides a unified portal for online discovery and analysis of sequence motifs representing features such as DNA binding sites and protein interaction domains. The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps. Three sequence scanning algorithms—MAST, FIMO and GLAM2SCAN—allow scanning numerous DNA and protein sequence databases for motifs discovered by MEME and GLAM2. Transcription factor motifs (including those discovered using MEME) can be compared with motifs in many popular motif databases using the motif database scanning algorithm Tomtom. Transcription factor motifs can be further analyzed for putative function by association with Gene Ontology (GO) terms using the motif-GO term association tool GOMO. MEME output now contains sequence LOGOS for each discovered motif, as well as buttons to allow motifs to be conveniently submitted to the sequence and motif database scanning algorithms (MAST, FIMO and Tomtom), or to GOMO, for further analysis. GLAM2 output similarly contains buttons for further analysis using GLAM2SCAN and for rerunning GLAM2 with different parameters. All of the motif-based tools are now implemented as web services via Opal. Source code, binaries and a web server are freely available for noncommercial use at http://meme.nbcr.net. PMID:19458158

  12. Calendar motifs on Getashen hydria

    NASA Astrophysics Data System (ADS)

    Vrtanesyan, Garegin

    2015-07-01

    The Getashen hydria was found in Middle Bronze Age tombs (the first third of the second millennium B.C.) in Armenia (Lake Sevan). It shows a scene consisting of three friezes. The lower frieze depicts six zoomorphic figures, the middle frieze six waterfowl, and the top frieze graphic signs. The calendar motifs of this composition have a numeric expression: six zoomorphic figures on each of the lower and middle friezes. Division of the annual cycle into two parts is known from the calendars of the ancient Indo-Iranians ("the great summer" and "the great winter"). The animals on the lower frieze mark the second, "winter" road of the Sun, because the events most important for the reproduction of the society's economy occur in this period: the rut of wild (deer) and domestic (goats) ungulates. Moreover, the rut of goats ends in December, almost coinciding with the onset of the winter solstice. A pair of dogs on the lower frieze marks a version of the myth of the hero imprisoned in the rock, the Sun (Mihr - Artavazd), whose dogs have to chew through his chains, anticipating his exit at the winter solstice. This is indicated by the direction of their movement: the Sun moves from left to right for an observer only when it is in the southern part of the sky (i.e., beginning with the autumnal equinox). The most important event of the "summer road" of the Sun is the vernal equinox, which coincides with the arrival of waterfowl (ducks, geese). Their direction on the second frieze (left to right) corresponds to the position of an observer facing North.

  13. The Motif of Meeting in Digital Education

    ERIC Educational Resources Information Center

    Sheail, Philippa

    2015-01-01

    This article draws on theoretical work which considers the composition of meetings, in order to think about the form of the meeting in digital environments for higher education. To explore the motif of meeting, I undertake a "compositional interpretation" (Rose, 2012) of the default interface offered by "Collaborate", an…

  14. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data

    PubMed Central

    2014-01-01

    Abstract ChIP-Seq (chromatin immunoprecipitation sequencing) has provided the advantage for finding motifs as ChIP-Seq experiments narrow down the motif finding to binding site locations. Recent motif finding tools facilitate the motif detection by providing user-friendly Web interface. In this work, we reviewed nine motif finding Web tools that are capable for detecting binding site motifs in ChIP-Seq data. We showed each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommended the users to use multiple motif finding Web tools that implement different algorithms for obtaining significant motifs, overlapping resemble motifs, and non-overlapping motifs. Finally, we provided our suggestions for future development of motif finding Web tool that better assists researchers for finding motifs in ChIP-Seq data. Reviewers This article was reviewed by Prof. Sandor Pongor, Dr. Yuriy Gusev, and Dr. Shyam Prabhakar (nominated by Prof. Limsoon Wong). PMID:24555784

  15. The Molecular Evolution of the Qo Motif

    PubMed Central

    Kao, Wei-Chun; Hunte, Carola

    2014-01-01

    Quinol oxidation in the catalytic quinol oxidation site (Qo site) of cytochrome (cyt) bc1 complexes is the key step of the Q cycle mechanism, which laid the ground for Mitchell’s chemiosmotic theory of energy conversion. Bifurcated electron transfer upon quinol oxidation enables proton uptake and release on opposite membrane sides, thus generating a proton gradient that fuels ATP synthesis in cellular respiration and photosynthesis. The Qo site architecture formed by cyt b and Rieske iron–sulfur protein (ISP) impedes harmful bypass reactions. Catalytic importance is assigned to four residues of cyt b formerly described as PEWY motif in the context of mitochondrial complexes, which we now denominate Qo motif as comprehensive evolutionary sequence analysis of cyt b shows substantial natural variance of the motif with phylogenetically specific patterns. In particular, the Qo motif is identified as PEWY in mitochondria, α- and ε-Proteobacteria, Aquificae, Chlorobi, Cyanobacteria, and chloroplasts. PDWY is present in Gram-positive bacteria, Deinococcus–Thermus and haloarchaea, and PVWY in β- and γ-Proteobacteria. PPWF only exists in Archaea. Distinct patterns for acidophilic organisms indicate environment-specific adaptations. Importantly, the presence of PDWY and PEWY is correlated with the redox potential of Rieske ISP and quinone species. We propose that during evolution from low to high potential electron-transfer systems in the emerging oxygenic atmosphere, cyt bc1 complexes with PEWY as Qo motif prevailed to efficiently use high potential ubiquinone as substrate, whereas cyt b with PDWY operate best with low potential Rieske ISP and menaquinone, with the latter being the likely composition of the ancestral cyt bc1 complex. PMID:25115012

  16. DNA motif elucidation using belief propagation.

    PubMed

    Wong, Ka-Chun; Chan, Tak-Ming; Peng, Chengbin; Li, Yue; Zhang, Zhaolei

    2013-09-01

    Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k=8∼10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. In particular, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source codes are available at the authors' websites: e.g. http://www.cs.toronto.edu/~wkc/kmerHMM. PMID:23814189
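
    The preprocessing step mentioned in this abstract, reducing probe-level signals to a median intensity per k-mer, might look roughly like the sketch below. This is not the kmerHMM code: the probe data are made up, and merging each k-mer with its reverse complement is an assumption that is common in PBM analysis but is not stated in the abstract.

```python
from collections import defaultdict
from statistics import median

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def canonical(kmer):
    """Assumption: a k-mer and its reverse complement share one entry."""
    return min(kmer, reverse_complement(kmer))

def kmer_medians(probes, k):
    """Median intensity of all probes that contain each (canonical) k-mer."""
    by_kmer = defaultdict(list)
    for seq, intensity in probes:
        # Use a set so a probe contributes once per k-mer it contains.
        seen = {canonical(seq[i:i + k]) for i in range(len(seq) - k + 1)}
        for kmer in seen:
            by_kmer[kmer].append(intensity)
    return {kmer: median(values) for kmer, values in by_kmer.items()}

# Hypothetical (probe sequence, signal intensity) pairs.
probes = [("ACGTA", 10.0), ("CGTAC", 30.0), ("TTACG", 20.0)]
ranked = sorted(kmer_medians(probes, 4).items(), key=lambda kv: -kv[1])
print(ranked)
```

    Ranking the k-mers by median intensity, as in the last lines, gives the ordered list that a downstream motif model could then be trained on.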

  17. CombiMotif: A new algorithm for network motifs discovery in protein-protein interaction networks

    NASA Astrophysics Data System (ADS)

    Luo, Jiawei; Li, Guanghui; Song, Dan; Liang, Cheng

    2014-12-01

    Discovering motifs in protein-protein interaction networks is becoming a current major challenge in computational biology, since the distribution of the number of network motifs can reveal significant systemic differences among species. However, this task can be computationally expensive because of the involvement of graph isomorphic detection. In this paper, we present a new algorithm (CombiMotif) that incorporates combinatorial techniques to count non-induced occurrences of subgraph topologies in the form of trees. The efficiency of our algorithm is demonstrated by comparing the obtained results with the current state-of-the-art subgraph counting algorithms. We also show major differences between unicellular and multicellular organisms. The datasets and source code of CombiMotif are freely available upon request.
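
    As a small illustration of counting tree-shaped subgraphs combinatorially rather than by enumeration, the sketch below counts non-induced star subgraphs (a center with a fixed number of leaves) from node degrees alone, and checks the result against explicit enumeration. It is not the CombiMotif algorithm, which is not described in detail here; the toy graph and function names are assumptions, and the graph is assumed to be simple and undirected.

```python
from itertools import combinations
from math import comb

def adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def count_stars(edges, leaves):
    """Non-induced stars with `leaves` >= 2 leaves: sum of C(degree, leaves)."""
    return sum(comb(len(nbrs), leaves) for nbrs in adjacency(edges).values())

def count_stars_brute(edges, leaves):
    """Same count by listing every choice of leaves around every center."""
    adj = adjacency(edges)
    return sum(1 for center in adj for _ in combinations(sorted(adj[center]), leaves))

# Toy network: a triangle (1, 2, 3) with a pendant node 4 attached to node 3.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(count_stars(edges, 2), count_stars_brute(edges, 2))
```

    With two leaves the star is the three-node path, so the closed form counts every path through a central node without checking whether its end nodes are also connected, which is what "non-induced" means in this setting.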

  18. The TCT motif, a key component of an RNA polymerase II transcription system for the translational machinery

    PubMed Central

    Parry, Trevor J.; Theisen, Joshua W.M.; Hsu, Jer-Yuan; Wang, Yuan-Liang; Corcoran, David L.; Eustice, Moriah; Ohler, Uwe; Kadonaga, James T.

    2010-01-01

    The TCT motif (polypyrimidine initiator) encompasses the transcription start site of nearly all ribosomal protein genes in Drosophila and mammals. The TCT motif is required for transcription of ribosomal protein gene promoters. The TCT element resembles the Inr (initiator), but is not recognized by TFIID and cannot function in lieu of an Inr. However, a single T-to-A substitution converts the TCT element into a functionally active Inr. Thus, the TCT motif is a novel transcriptional element that is distinct from the Inr. These findings reveal a specialized TCT-based transcription system that is directed toward the synthesis of ribosomal proteins. PMID:20801935

  19. Functional Motifs in Biochemical Reaction Networks

    PubMed Central

    Tyson, John J.; Novák, Béla

    2013-01-01

    The signal-response characteristics of a living cell are determined by complex networks of interacting genes, proteins, and metabolites. Understanding how cells respond to specific challenges, how these responses are contravened in diseased cells, and how to intervene pharmacologically in the decision-making processes of cells requires an accurate theory of the information-processing capabilities of macromolecular regulatory networks. Adopting an engineer’s approach to control systems, we ask whether realistic cellular control networks can be decomposed into simple regulatory motifs that carry out specific functions in a cell. We show that such functional motifs exist and review the experimental evidence that they control cellular responses as expected. PMID:20055671

  20. A Basic Set of Homeostatic Controller Motifs

    PubMed Central

    Drengstig, T.; Jolma, I.W.; Ni, X.Y.; Thorsen, K.; Xu, X.M.; Ruoff, P.

    2012-01-01

    Adaptation and homeostasis are essential properties of all living systems. However, our knowledge about the reaction kinetic mechanisms leading to robust homeostatic behavior in the presence of environmental perturbations is still poor. Here, we describe, and provide physiological examples of, a set of two-component controller motifs that show robust homeostasis. This basic set of controller motifs, which can be considered as complete, divides into two operational work modes, termed as inflow and outflow control. We show how controller combinations within a cell can integrate uptake and metabolization of a homeostatic controlled species and how pathways can be activated and lead to the formation of alternative products, as observed, for example, in the change of fermentation products by microorganisms when the supply of the carbon source is altered. The antagonistic character of hormonal control systems can be understood by a combination of inflow and outflow controllers. PMID:23199928

  1. Anticipated synchronization in neuronal network motifs

    NASA Astrophysics Data System (ADS)

    Matias, F. S.; Gollo, L. L.; Carelli, P. V.; Copelli, M.; Mirasso, C. R.

    2013-01-01

    Two identical dynamical systems coupled unidirectionally (in a so-called master-slave configuration) exhibit anticipated synchronization (AS) if the one which receives the coupling (the slave) also receives a negative delayed self-feedback. In oscillatory neuronal systems AS is characterized by a phase-locking with negative time delay τ between the spikes of the master and of the slave (slave fires before the master), while in the usual delayed synchronization (DS) regime τ is positive (slave fires after the master). A 3-neuron motif in which the slave self-feedback is replaced by a feedback loop mediated by an interneuron can exhibit both AS and DS regimes. Here we show that AS is robust in the presence of noise in a 3 Hodgkin-Huxley type neuronal motif. We also show that AS is stable for large values of τ in a chain of connected slaves-interneurons.

  2. Motifs, modules and games in bacteria

    SciTech Connect

    Wolf, Denise M.; Arkin, Adam P.

    2003-04-01

    Global explorations of regulatory network dynamics, organization and evolution have become tractable thanks to high-throughput sequencing and molecular measurement of bacterial physiology. From these, a nascent conceptual framework is developing that views the principles of regulation in terms of motifs, modules and games. Motifs are small, repeated, and conserved biological units ranging from molecular domains to small reaction networks. They are arranged into functional modules, genetically dissectible cellular functions such as the cell cycle, or different stress responses. The dynamical functioning of modules defines the organism's strategy to survive in a game, pitting cell against cell, and cell against environment. Placing pathway structure and dynamics into an evolutionary context begins to allow discrimination between those physical and molecular features that particularize a species to its surroundings, and those that provide core physiological function. This approach promises to generate a higher level understanding of cellular design, pathway evolution and cellular bioengineering.

  3. Analyzing network reliability using structural motifs

    NASA Astrophysics Data System (ADS)

    Khorramzadeh, Yasamin; Youssef, Mina; Eubank, Stephen; Mowlaei, Shahir

    2015-04-01

    This paper uses the reliability polynomial, introduced by Moore and Shannon in 1956, to analyze the effect of network structure on diffusive dynamics such as the spread of infectious disease. We exhibit a representation for the reliability polynomial in terms of what we call structural motifs that is well suited for reasoning about the effect of a network's structural properties on diffusion across the network. We illustrate by deriving several general results relating graph structure to dynamical phenomena.
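
    For readers unfamiliar with the reliability polynomial, the sketch below evaluates it by brute force for a tiny graph: each edge is kept independently with probability p, and R(p) = sum over k of N_k p^k (1-p)^(m-k), where N_k is the number of k-edge subgraphs that satisfy the chosen criterion. Taking "all nodes stay connected" as the criterion is an assumption made here for illustration; the paper's structural-motif representation is not reproduced.

```python
from itertools import combinations


def connected(nodes, edges):
    """True if the undirected edges link all nodes into one component."""
    nodes = list(nodes)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        u = stack.pop()
        for a, b in edges:
            for v, w in ((a, b), (b, a)):
                if v == u and w not in seen:
                    seen.add(w)
                    stack.append(w)
    return len(seen) == len(nodes)


def reliability_coefficients(nodes, edges):
    """N[k] = number of k-edge subgraphs that keep every node connected."""
    return [
        sum(connected(nodes, sub) for sub in combinations(edges, k))
        for k in range(len(edges) + 1)
    ]


def reliability(p, coeffs):
    """Evaluate R(p) = sum_k N[k] * p**k * (1 - p)**(m - k)."""
    m = len(coeffs) - 1
    return sum(n * p**k * (1 - p) ** (m - k) for k, n in enumerate(coeffs))


# Toy example: a triangle. Any two edges keep it connected, so N = [0, 0, 3, 1].
triangle_nodes = [0, 1, 2]
triangle_edges = [(0, 1), (1, 2), (0, 2)]
coeffs = reliability_coefficients(triangle_nodes, triangle_edges)
```

    The enumeration is exponential in the number of edges, so it is only meant to make the definition concrete on small graphs.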

  4. Bioinformatics Approaches for Predicting Disordered Protein Motifs.

    PubMed

    Bhowmick, Pallab; Guharoy, Mainak; Tompa, Peter

    2015-01-01

    Short, linear motifs (SLiMs) in proteins are functional microdomains consisting of contiguous residue segments along the protein sequence, typically not more than 10 consecutive amino acids in length with less than 5 defined positions. Many positions are 'degenerate', thus offering flexibility in terms of the amino acid types allowed at those positions. Their short length and degenerate nature confer evolutionary plasticity, meaning that SLiMs often evolve convergently. Further, SLiMs have a propensity to occur within intrinsically unstructured protein segments and this confers versatile functionality to unstructured regions of the proteome. SLiMs mediate multiple types of protein interactions based on domain-peptide recognition and guide functions including posttranslational modifications, subcellular localization of proteins, and ligand binding. SLiMs thus behave as modular interaction units that confer versatility to protein function and SLiM-mediated interactions are increasingly being recognized as therapeutic targets. In this chapter we start with a brief description about the properties of SLiMs and their interactions and then move on to discuss algorithms and tools including several web-based methods that enable the discovery of novel SLiMs (de novo motif discovery) as well as the prediction of novel occurrences of known SLiMs. Both individual amino acid sequences as well as sets of protein sequences can be scanned using these methods to obtain statistically overrepresented sequence patterns. Lists of putatively functional SLiMs are then assembled based on parameters such as evolutionary sequence conservation, disorder scores, structural data, gene ontology terms and other contextual information that helps to assess the functional credibility or significance of these motifs. These bioinformatics methods should certainly guide experiments aimed at motif discovery. PMID:26387106

  5. [Conserved motifs in voltage sensing proteins].

    PubMed

    Wang, Chang-He; Xie, Zhen-Li; Lv, Jian-Wei; Yu, Zhi-Dan; Shao, Shu-Li

    2012-08-25

    This paper aimed to study conserved motifs of voltage sensing proteins (VSPs) and establish a voltage sensing model. All VSPs were collected from the Uniprot database using a comprehensive keyword search followed by manual curation, and the results indicated that there are only two types of known VSPs, voltage gated ion channels and voltage dependent phosphatases. All the VSPs have a common domain of four helical transmembrane segments (TMS, S1-S4), which constitute the voltage sensing module of the VSPs. The S1 segment was shown to be responsible for membrane targeting and insertion of these proteins, while S2-S4 segments, which can sense membrane potential, for protein properties. Conserved motifs/residues and their functional significance of each TMS were identified using profile-to-profile sequence alignments. Conserved motifs in these four segments are strikingly similar for all VSPs; in particular, the conserved motif [RK]-X(2)-R-X(2)-R-X(2)-[RK] was present in all the S4 segments, with positively charged arginine (R) alternating with two hydrophobic or uncharged residues. Movement of these arginines across the membrane electric field is the core mechanism by which the VSPs detect changes in membrane potential. The negatively charged aspartate (D) in the S3 segment is universally conserved in all the VSPs, suggesting that the aspartate residue may be involved in voltage sensing properties of VSPs as well as the electrostatic interactions with the positively charged residues in the S4 segment, which may enhance the thermodynamic stability of the S4 segments in plasma membrane. PMID:22907298
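
    The PROSITE-style pattern [RK]-X(2)-R-X(2)-R-X(2)-[RK] quoted in this abstract translates directly into a regular expression, as the sketch below shows. The helper name and the toy sequence are hypothetical; the sequence is not a real S4 segment, and this is not the authors' profile-to-profile method.

```python
import re

# [RK]-X(2)-R-X(2)-R-X(2)-[RK]: R or K, two arbitrary residues, R, two
# arbitrary residues, R, two arbitrary residues, then R or K.
# Wrapping the pattern in a lookahead lets overlapping matches be reported.
S4_MOTIF = re.compile(r"(?=([RK][A-Z]{2}R[A-Z]{2}R[A-Z]{2}[RK]))")


def find_s4_motif(sequence):
    """Return (0-based start, matched segment) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in S4_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence with R/K at every third position.
toy = "MLAARVIRLVRVFRIFKLSRHS"
hits = find_s4_motif(toy)
```

    Because charged residues recur every third position, occurrences can overlap, which is why the lookahead form is used instead of a plain match.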

  6. The Geometry of Plasticity-Induced Sensitization in Isoinhibitory Rate Motifs.

    PubMed

    Kumar, Gautam; Ching, ShiNung

    2016-09-01

    A well-known phenomenon in sensory perception is desensitization, wherein behavioral responses to persistent stimuli become attenuated over time. In this letter, our focus is on studying mechanisms through which desensitization may be mediated at the network level and, specifically, how sensitivity changes arise as a function of long-term plasticity. Our principal object of study is a generic isoinhibitory motif: a small excitatory-inhibitory network with recurrent inhibition. Such a motif is of interest due to its overrepresentation in laminar sensory network architectures. Here, we introduce a sensitivity analysis derived from control theory in which we characterize the fixed-energy reachable set of the motif. This set describes the regions of the phase-space that are more easily (in terms of stimulus energy) accessed, thus providing a holistic assessment of sensitivity. We specifically focus on how the geometry of this set changes due to repetitive application of a persistent stimulus. We find that for certain motif dynamics, this geometry contracts along the stimulus orientation while expanding in orthogonal directions. In other words, the motif not only desensitizes to the persistent input, but heightens its responsiveness (sensitizes) to those that are orthogonal. We develop a perturbation analysis that links this sensitization to both plasticity-induced changes in synaptic weights and the intrinsic dynamics of the network, highlighting that the effect is not purely due to weight-dependent disinhibition. Instead, this effect depends on the relative neuronal time constants and the consequent stimulus-induced drift that arises in the motif phase-space. For tightly distributed (but random) parameter ranges, sensitization is quite generic and manifests in larger recurrent E-I networks within which the motif is embedded. PMID:27391684

  7. Dynamic motifs in socio-economic networks

    NASA Astrophysics Data System (ADS)

    Zhang, Xin; Shao, Shuai; Stanley, H. Eugene; Havlin, Shlomo

    2014-12-01

    Socio-economic networks are of central importance in economic life. We develop a method of identifying and studying motifs in socio-economic networks by focusing on “dynamic motifs,” i.e., evolutionary connection patterns that, because of “node acquaintances” in the network, occur much more frequently than random patterns. We examine two evolving bipartite networks: i) the world-wide commercial ship chartering market and ii) the ship build-to-order market. We find similar dynamic motifs in both bipartite networks, even though they describe different economic activities. We also find that “influence” and “persistence” are strong factors in the interaction behavior of organizations. When two companies are doing business with the same customer, it is highly probable that another customer who currently has a business relationship with only one of these two companies will become a customer of the second in the future. This is the effect of influence. Persistence means that companies with close business ties to customers tend to maintain their relationships over a long period of time.

  8. Three step synthesis of single diastereoisomers of the vicinal trifluoro motif

    PubMed Central

    Brunet, Vincent A; Slawin, Alexandra M Z

    2009-01-01

    Summary A three step route to single diastereoisomers of the vicinal trifluoro motif is described. The route starts from either syn- or anti-α,β-epoxy alcohols and takes a direct approach in that each of the three steps introduces a fluorine atom in a regio- and stereo-specific manner. Starting from either the syn- or the anti-α,β-epoxy alcohol, stereospecific reactions generate two separate diastereoisomeric series of this motif. The route is a significant improvement on an earlier six step strategy. PMID:20300509

  9. A motif for reversible nitric oxide interactions in metalloenzymes.

    PubMed

    Zhang, Shiyu; Melzer, Marie M; Sen, S Nermin; Çelebi-Ölçüm, Nihan; Warren, Timothy H

    2016-07-01

    Nitric oxide (NO) participates in numerous biological processes, such as signalling in the respiratory system and vasodilation in the cardiovascular system. Many metal-mediated processes involve direct reaction of NO to form a metal-nitrosyl (M-NO), as occurs at the Fe(2+) centres of soluble guanylate cyclase or cytochrome c oxidase. However, some copper electron-transfer proteins that bear a type 1 Cu site (His2Cu-Cys) reversibly bind NO by an unknown motif. Here, we use model complexes of type 1 Cu sites based on tris(pyrazolyl)borate copper thiolates [Cu(II)]-SR to unravel the factors involved in NO reactivity. Addition of NO provides the fully characterized S-nitrosothiol adduct [Cu(I)](κ(1)-N(O)SR), which reversibly loses NO on purging with an inert gas. Computational analysis outlines a low-barrier pathway for the capture and release of NO. These findings suggest a new motif for reversible binding of NO at bioinorganic metal centres that can interconvert NO and RSNO molecular signals at copper sites. PMID:27325092

  10. Occurrence probability of structured motifs in random sequences.

    PubMed

    Robin, S; Daudin, J-J; Richard, H; Sagot, M-F; Schbath, S

    2002-01-01

    The problem of extracting from a set of nucleic acid sequences motifs which may have biological function is more and more important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrences of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations. PMID:12614545

  11. ET-Motif: Solving the Exact (l, d)-Planted Motif Problem Using Error Tree Structure.

    PubMed

    Al-Okaily, Anas; Huang, Chun-Hsi

    2016-07-01

    Motif finding is an important and challenging problem in many biological applications such as discovering promoters, enhancers, locus control regions, transcription factors, and more. The (l, d)-planted motif search, PMS, is one of several variations of the problem. In this problem, there are n given sequences over alphabets of size [Formula: see text], each of length m, and two given integers l and d. The problem is to find a motif m of length l, where in each sequence there is at least one l-mer at a Hamming distance of [Formula: see text] of m. In this article, we propose ET-Motif, an algorithm that can solve the PMS problem in [Formula: see text] time and [Formula: see text] space. The time bound can be further reduced by a factor of m with [Formula: see text] space. In case the suffix tree that is built for the input sequences is balanced, the problem can be solved in [Formula: see text] time and [Formula: see text] space. Similarly, the time bound can be reduced by a factor of m using [Formula: see text] space. Moreover, the variations of the problem, namely the edit distance PMS and edited PMS (Quorum), can be solved using ET-Motif with simple modifications but with larger upper bounds on space and time. For edit distance PMS, the time and space bounds will be increased by [Formula: see text], while for edited PMS the increase will be of [Formula: see text] in the time bound. PMID:27152692
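
    As a point of reference for the problem statement, here is a naive exhaustive solver for the (l, d)-planted motif problem. It assumes the standard formulation in which each sequence must contain an l-mer within Hamming distance at most d of the motif (the listing elides the exact condition), and it assumes a DNA alphabet. It tries every possible l-mer, so it is exponential in l and is not the ET-Motif algorithm.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, seq, d):
    """True if some l-mer of seq is within Hamming distance d of motif."""
    l = len(motif)
    return any(
        hamming(motif, seq[i:i + l]) <= d for i in range(len(seq) - l + 1)
    )


def planted_motifs(seqs, l, d, alphabet="ACGT"):
    """Return every length-l string that occurs (up to d mismatches) in all seqs."""
    candidates = ("".join(c) for c in product(alphabet, repeat=l))
    return [m for m in candidates if all(occurs_within(m, s, d) for s in seqs)]


# Hypothetical toy input: "ACG" is the only 3-mer shared exactly by all three.
seqs = ["ACGTT", "TACGA", "GGACG"]
exact = planted_motifs(seqs, l=3, d=0)
relaxed = planted_motifs(seqs, l=3, d=1)
```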

  12. Mechano-chemical selections of two competitive unfolding pathways of a single DNA i-motif

    NASA Astrophysics Data System (ADS)

    Xu, Yue; Chen, Hu; Qu, Yu-Jie; Artem, K. Efremov; Li, Ming; Ouyang, Zhong-Can; Liu, Dong-Sheng; Yan, Jie

    2014-06-01

    The DNA i-motif is a quadruplex structure formed in tandem cytosine-rich sequences in slightly acidic conditions. Besides being considered as a building block of DNA nano-devices, it may also play potential roles in regulating chromosome stability and gene transcription. The stability of the i-motif is crucial for these functions. In this work, we investigated the mechanical stability of a single i-motif formed in the human telomeric sequence 5'-(CCCTAA)3CCC, which revealed a novel pH and loading rate-dependent bimodal unfolding force distribution. Although the cause of the bimodal unfolding force species is not clear, we proposed a phenomenological model involving a direct unfolding favored at lower loading rate or higher pH value, which is subject to competition with another unfolding pathway through a mechanically stable intermediate state whose nature is yet to be determined. Overall, the unique mechano-chemical responses of the i-motif provide a new perspective on its stability, which may be useful to guide designing new i-motif-based DNA mechanical nano-devices.

  13. CLIMP: Clustering Motifs via Maximal Cliques with Parallel Computing Design

    PubMed Central

    Chen, Yong

    2016-01-01

    A set of conserved binding sites recognized by a transcription factor is called a motif, which can be found by many applications of comparative genomics for identifying over-represented segments. Moreover, when numerous putative motifs are predicted from a collection of genome-wide data, their similarity data can be represented as a large graph, where these motifs are connected to one another. However, an efficient clustering algorithm is desired for clustering the motifs that belong to the same groups and separating the motifs that belong to different groups, or even deleting an amount of spurious ones. In this work, a new motif clustering algorithm, CLIMP, is proposed by using maximal cliques and sped up by parallelizing its program. When a synthetic motif dataset from the database JASPAR, a set of putative motifs from a phylogenetic foot-printing dataset, and a set of putative motifs from a ChIP dataset are used to compare the performances of CLIMP and two other high-performance algorithms, the results demonstrate that CLIMP mostly outperforms the two algorithms on the three datasets for motif clustering, so that it can be a useful complement of the clustering procedures in some genome-wide motif prediction pipelines. CLIMP is available at http://sqzhang.cn/climp.html. PMID:27487245
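
    The idea of grouping motifs through maximal cliques of a similarity graph can be sketched with the basic Bron-Kerbosch enumeration shown below. This is a generic illustration, not CLIMP's actual procedure; the motif names, similarity scores, and 0.8 threshold are made up.

```python
def maximal_cliques(adj):
    """Yield the maximal cliques of an undirected graph {node: set(neighbours)}."""

    def expand(clique, candidates, excluded):
        # A clique is maximal when nothing can extend it (no candidates) and
        # it was not already covered by an earlier branch (nothing excluded).
        if not candidates and not excluded:
            yield sorted(clique)
            return
        for v in list(candidates):
            yield from expand(clique | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    yield from expand(set(), set(adj), set())


# Hypothetical pairwise motif similarities; pairs scoring >= 0.8 become edges.
similarity = {
    ("m1", "m2"): 0.9, ("m1", "m3"): 0.85, ("m2", "m3"): 0.95,
    ("m3", "m4"): 0.8, ("m1", "m4"): 0.2, ("m4", "m5"): 0.1,
}
motifs = ["m1", "m2", "m3", "m4", "m5"]
adj = {m: set() for m in motifs}
for (a, b), score in similarity.items():
    if score >= 0.8:
        adj[a].add(b)
        adj[b].add(a)

clusters = sorted(maximal_cliques(adj))
```

    Cliques may overlap (m3 appears in two of them here), so a real pipeline needs a further rule for merging or assigning shared motifs.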

  14. CLIMP: Clustering Motifs via Maximal Cliques with Parallel Computing Design.

    PubMed

    Zhang, Shaoqiang; Chen, Yong

    2016-01-01

    A set of conserved binding sites recognized by a transcription factor is called a motif, which can be found by many applications of comparative genomics for identifying over-represented segments. Moreover, when numerous putative motifs are predicted from a collection of genome-wide data, their similarity data can be represented as a large graph, where these motifs are connected to one another. However, an efficient clustering algorithm is desired for clustering the motifs that belong to the same groups and separating the motifs that belong to different groups, or even deleting an amount of spurious ones. In this work, a new motif clustering algorithm, CLIMP, is proposed by using maximal cliques and sped up by parallelizing its program. When a synthetic motif dataset from the database JASPAR, a set of putative motifs from a phylogenetic foot-printing dataset, and a set of putative motifs from a ChIP dataset are used to compare the performances of CLIMP and two other high-performance algorithms, the results demonstrate that CLIMP mostly outperforms the two algorithms on the three datasets for motif clustering, so that it can be a useful complement of the clustering procedures in some genome-wide motif prediction pipelines. CLIMP is available at http://sqzhang.cn/climp.html. PMID:27487245

  15. Functional Analysis of Semi-conserved Transit Peptide Motifs and Mechanistic Implications in Precursor Targeting and Recognition.

    PubMed

    Holbrook, Kristen; Subramanian, Chitra; Chotewutmontri, Prakitchai; Reddick, L Evan; Wright, Sarah; Zhang, Huixia; Moncrief, Lily; Bruce, Barry D

    2016-09-01

    Over 95% of plastid proteins are nuclear-encoded as their precursors containing an N-terminal extension known as the transit peptide (TP). Although highly variable, TPs direct the precursors through a conserved, posttranslational mechanism involving translocons in the outer (TOC) and inner envelope (TIC). The organelle import specificity is mediated by one or more components of the Toc complex. However, the high TP diversity creates a paradox on how the sequences can be specifically recognized. An emerging model of TP design is that they contain multiple loosely conserved motifs that are recognized at different steps in the targeting and transport process. Bioinformatics has demonstrated that many TPs contain semi-conserved physicochemical motifs, termed FGLK. In order to characterize FGLK motifs in TP recognition and import, we have analyzed two well-studied TPs from the precursor of RuBisCO small subunit (SStp) and ferredoxin (Fdtp). Both SStp and Fdtp contain two FGLK motifs. Analysis of a large set of mutations (∼85) in these two motifs using in vitro, in organello, and in vivo approaches supports a model in which the FGLK domains mediate interaction with TOC34 and possibly other TOC components. In vivo import analysis suggests that multiple FGLK motifs are functionally redundant. Furthermore, we discuss how FGLK motifs are required for efficient precursor protein import and how these elements may permit a convergent function of this highly variable class of targeting sequences. PMID:27378725

  16. Signature motifs of GDP polyribonucleotidyltransferase, a non-segmented negative strand RNA viral mRNA capping enzyme, domain in the L protein are required for covalent enzyme–pRNA intermediate formation

    PubMed Central

    Neubauer, Julie; Ogino, Minako; Green, Todd J.; Ogino, Tomoaki

    2016-01-01

    The unconventional mRNA capping enzyme (GDP polyribonucleotidyltransferase, PRNTase; block V) domain in RNA polymerase L proteins of non-segmented negative strand (NNS) RNA viruses (e.g. rabies, measles, Ebola) contains five collinear sequence elements, Rx(3)Wx(3–8)ΦxGxζx(P/A) (motif A; Φ, hydrophobic; ζ, hydrophilic), (Y/W)ΦGSxT (motif B), W (motif C), HR (motif D) and ζxxΦx(F/Y)QxxΦ (motif E). We performed site-directed mutagenesis of the L protein of vesicular stomatitis virus (VSV, a prototypic NNS RNA virus) to examine participation of these motifs in mRNA capping. Similar to the catalytic residues in motif D, G1100 in motif A, T1157 in motif B, W1188 in motif C, and F1269 and Q1270 in motif E were found to be essential or important for the PRNTase activity in the step of the covalent L-pRNA intermediate formation, but not for the GTPase activity that generates GDP (pRNA acceptor). Cap defective mutations in these residues induced termination of mRNA synthesis at position +40 followed by aberrant stop–start transcription, and abolished virus gene expression in host cells. These results suggest that the conserved motifs constitute the active site of the PRNTase domain and the L-pRNA intermediate formation followed by the cap formation is essential for successful synthesis of full-length mRNAs. PMID:26602696

  17. Signature motifs of GDP polyribonucleotidyltransferase, a non-segmented negative strand RNA viral mRNA capping enzyme, domain in the L protein are required for covalent enzyme-pRNA intermediate formation.

    PubMed

    Neubauer, Julie; Ogino, Minako; Green, Todd J; Ogino, Tomoaki

    2016-01-01

    The unconventional mRNA capping enzyme (GDP polyribonucleotidyltransferase, PRNTase; block V) domain in RNA polymerase L proteins of non-segmented negative strand (NNS) RNA viruses (e.g. rabies, measles, Ebola) contains five collinear sequence elements, Rx(3)Wx(3-8)ΦxGxζx(P/A) (motif A; Φ, hydrophobic; ζ, hydrophilic), (Y/W)ΦGSxT (motif B), W (motif C), HR (motif D) and ζxxΦx(F/Y)QxxΦ (motif E). We performed site-directed mutagenesis of the L protein of vesicular stomatitis virus (VSV, a prototypic NNS RNA virus) to examine participation of these motifs in mRNA capping. Similar to the catalytic residues in motif D, G1100 in motif A, T1157 in motif B, W1188 in motif C, and F1269 and Q1270 in motif E were found to be essential or important for the PRNTase activity in the step of the covalent L-pRNA intermediate formation, but not for the GTPase activity that generates GDP (pRNA acceptor). Cap defective mutations in these residues induced termination of mRNA synthesis at position +40 followed by aberrant stop-start transcription, and abolished virus gene expression in host cells. These results suggest that the conserved motifs constitute the active site of the PRNTase domain and the L-pRNA intermediate formation followed by the cap formation is essential for successful synthesis of full-length mRNAs. PMID:26602696

  18. The RNA 3D Motif Atlas: Computational methods for extraction, organization and evaluation of RNA motifs.

    PubMed

    Parlea, Lorena G; Sweeney, Blake A; Hosseini-Asanjan, Maryam; Zirbel, Craig L; Leontis, Neocles B

    2016-07-01

    RNA 3D motifs occupy places in structured RNA molecules that correspond to the hairpin, internal and multi-helix junction "loops" of their secondary structure representations. As many as 40% of the nucleotides of an RNA molecule can belong to these structural elements, which are distinct from the regular double helical regions formed by contiguous AU, GC, and GU Watson-Crick basepairs. With the large number of atomic- or near atomic-resolution 3D structures appearing in a steady stream in the PDB/NDB structure databases, the automated identification, extraction, comparison, clustering and visualization of these structural elements presents an opportunity to enhance RNA science. Three broad applications are: (1) identification of modular, autonomous structural units for RNA nanotechnology, nanobiology and synthetic biology applications; (2) bioinformatic analysis to improve RNA 3D structure prediction from sequence; and (3) creation of searchable databases for exploring the binding specificities, structural flexibility, and dynamics of these RNA elements. In this contribution, we review methods developed for computational extraction of hairpin and internal loop motifs from a non-redundant set of high-quality RNA 3D structures. We provide a statistical summary of the extracted hairpin and internal loop motifs in the most recent version of the RNA 3D Motif Atlas. We also explore the reliability and accuracy of the extraction process by examining its performance in clustering recurrent motifs from homologous ribosomal RNA (rRNA) structures. We conclude with a summary of remaining challenges, especially with regard to extraction of multi-helix junction motifs. PMID:27125735

  19. Transcription factor motif quality assessment requires systematic comparative analysis

    PubMed Central

    Kibet, Caleb Kipkurui; Machanick, Philip

    2016-01-01

    Transcription factor (TF) binding site prediction remains a challenge in gene regulatory research due to degeneracy and potential variability in binding sites in the genome. Dozens of algorithms designed to learn binding models (motifs) have generated many motifs available in research papers with a subset making it to databases like JASPAR, UniPROBE and Transfac. The presence of many versions of motifs from the various databases for a single TF and the lack of a standardized assessment technique makes it difficult for biologists to make an appropriate choice of binding model and for algorithm developers to benchmark, test and improve on their models. In this study, we review and evaluate the approaches in use, highlight differences and demonstrate the difficulty of defining a standardized motif assessment approach. We review scoring functions, motif length, test data and the type of performance metrics used in prior studies as some of the factors that influence the outcome of a motif assessment. We show that the scoring functions and statistics used in motif assessment influence ranking of motifs in a TF-specific manner. We also show that TF binding specificity can vary by source of genomic binding data. We also demonstrate that information content of a motif is not in isolation a measure of motif quality but is influenced by TF binding behaviour. We conclude that there is a need for an easy-to-use tool that presents all available evidence for a comparative analysis. PMID:27092243

  20. RMOD: a tool for regulatory motif detection in signaling network.

    PubMed

    Kim, Jinki; Yi, Gwan-Su

    2013-01-01

    Regulatory motifs are patterns of activation and inhibition that appear repeatedly in various signaling networks and that show specific regulatory properties. However, the network structures of regulatory motifs are highly diverse and complex, rendering their identification difficult. Here, we present RMOD, a web-based system for the identification of regulatory motifs and their properties in signaling networks. RMOD finds various network structures of regulatory motifs by compressing the signaling network and detecting the compressed forms of regulatory motifs. To apply it to a large-scale signaling network, it adopts a new subgraph search algorithm using a novel data structure called path-tree, which is a tree structure composed of isomorphic graphs of query regulatory motifs. This algorithm was evaluated using various sizes of signaling networks generated from the integration of various human signaling pathways and it showed that the speed and scalability of this algorithm outperform those of other algorithms. RMOD includes interactive analysis and auxiliary tools that make it possible to manipulate the whole process from building signaling network and query regulatory motifs to analyzing regulatory motifs with graphical illustration and summarized descriptions. As a result, RMOD provides an integrated view of the regulatory motifs and mechanism underlying their regulatory motif activities within the signaling network. RMOD is freely accessible online at the following URL: http://pks.kaist.ac.kr/rmod. PMID:23874612

  1. Structural motifs and the stability of fullerenes

    SciTech Connect

    Austin, S.J.; Fowler, P.W.; Manolopoulos, D.E.; Orlandi, G.; Zerbetto, F.

    1995-05-18

    Full geometry optimization has been performed within the semiempirical QCFF/PI model for the 1812 fullerene structural isomers of C₆₀ formed by 12 pentagons and 20 hexagons. All are local minima on the potential energy hypersurface. Correlations of total energy with many structural motifs yield highly scattered diagrams, but some exhibit linear trends. Penalty and merit functions can be assigned to certain motifs: inclusion of a fused pentagon pair entails an average penalty of 111 kJ mol⁻¹; a generic hexagon triple costs 23 kJ mol⁻¹; a triple (open or fused) comprising a pentagon between two hexagonal neighbors gives a stabilization of 19 kJ mol⁻¹. These results can be understood in terms of the curved nature of fullerene molecules: pentagons should be isolated to avoid sharp local curvature, hexagon triples are costly because they enforce local planarity and hence imply high curvature in another part of the fullerene surface, but hexagon-pentagon-hexagon triples allow the surface to distribute steric strain by warping. The best linear fit is found for H, the second moment of the hexagon-neighbor-index signature, which fits the total energies with a standard deviation of only 53 kJ mol⁻¹ and must be minimized for stability; this index too can be interpreted in terms of curvature. 26 refs., 5 figs.

  2. Structural motifs of pre-nucleation clusters.

    PubMed

    Zhang, Y; Türkmen, I R; Wassermann, B; Erko, A; Rühl, E

    2013-10-01

    Structural motifs of pre-nucleation clusters prepared in single, optically levitated supersaturated aqueous aerosol microparticles containing CaBr2 as a model system are reported. Cluster formation is identified by means of X-ray absorption in the Br K-edge regime. The salt concentration beyond the saturation point is varied by controlling the humidity in the ambient atmosphere surrounding the 15-30 μm microdroplets. This leads to the formation of metastable supersaturated liquid particles. Distinct spectral shifts in near-edge spectra as a function of salt concentration are observed, in which the energy position of the Br K-edge is red-shifted by up to 7.1 ± 0.4 eV if the dilute solution is compared to the solid. The K-edge positions of supersaturated solutions are found between these limits. The changes in electronic structure are rationalized in terms of the formation of pre-nucleation clusters. This assumption is verified by spectral simulations using first-principle density functional theory and molecular dynamics calculations, in which structural motifs are considered, explaining the experimental results. These consist of solvated CaBr2 moieties, rather than building blocks forming calcium bromide hexahydrates, the crystal system that is formed by drying aqueous CaBr2 solutions. PMID:24116574

  3. Network Motifs: Simple Building Blocks of Complex Networks

    NASA Astrophysics Data System (ADS)

    Milo, R.; Shen-Orr, S.; Itzkovitz, S.; Kashtan, N.; Chklovskii, D.; Alon, U.

    2002-10-01

    Complex networks are studied across many fields of science. To uncover their structural design principles, we defined "network motifs," patterns of interconnections occurring in complex networks at numbers that are significantly higher than those in randomized networks. We found such motifs in networks from biochemistry, neurobiology, ecology, and engineering. The motifs shared by ecological food webs were distinct from the motifs shared by the genetic networks of Escherichia coli and Saccharomyces cerevisiae or from those found in the World Wide Web. Similar motifs were found in networks that perform information processing, even though they describe elements as different as biomolecules within a cell and synaptic connections between neurons in Caenorhabditis elegans. Motifs may thus define universal classes of networks. This approach may uncover the basic building blocks of most networks.
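
    The comparison against randomized networks described in this abstract can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: it counts only one three-node pattern (the feed-forward loop), and it randomizes by repeated edge swaps that keep every node's in-degree and out-degree fixed. All function names and the toy network are hypothetical.

```python
import random

def count_feed_forward_loops(edges):
    """Count ordered triples (a, b, c) of distinct nodes with
    edges a->b, b->c and a->c."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)
    count = 0
    for a, targets in successors.items():
        for b in targets:
            if b == a:
                continue  # ignore self-loops
            for c in successors.get(b, ()):
                if c != a and c != b and c in targets:
                    count += 1
    return count

def randomize_preserving_degrees(edges, swaps, rng):
    """Attempt `swaps` edge swaps (a->b, c->d) => (a->d, c->b), skipping
    any swap that would create a self-loop or a duplicate edge."""
    edges = list(edges)
    present = set(edges)
    for _ in range(swaps):
        i = rng.randrange(len(edges))
        j = rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def compare_to_random(edges, samples=200, swaps=100, seed=0):
    """Return the real count plus the mean and standard deviation of the
    count over an ensemble of randomized networks."""
    rng = random.Random(seed)
    real = count_feed_forward_loops(edges)
    counts = [
        count_feed_forward_loops(randomize_preserving_degrees(edges, swaps, rng))
        for _ in range(samples)
    ]
    mean = sum(counts) / samples
    std = (sum((x - mean) ** 2 for x in counts) / samples) ** 0.5
    return real, mean, std

# Hypothetical toy network; real analyses use far larger graphs.
toy = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "a"), ("d", "e")]
print(compare_to_random(toy))
```

    A pattern would be called a motif only when its real count sits well above the randomized mean relative to the spread; the threshold for "well above" is a choice left open here.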

  4. A Gibbs sampler for motif detection in phylogenetically close sequences

    NASA Astrophysics Data System (ADS)

    Siddharthan, Rahul; van Nimwegen, Erik; Siggia, Eric

    2004-03-01

    Genes are regulated by transcription factors that bind to DNA upstream of genes and recognize short conserved "motifs" in a random intergenic "background". Motif-finders such as the Gibbs sampler compare the probability of these short sequences being represented by "weight matrices" to the probability of their arising from the background "null model", and explore this space (analogous to a free-energy landscape). But closely related species may show conservation not because of functional sites but simply because they have not had sufficient time to diverge, so conventional methods will fail. We introduce a new Gibbs sampler algorithm that accounts for common ancestry when searching for motifs, while requiring minimal "prior" assumptions on the number and types of motifs, assessing the significance of detected motifs by "tracking" clusters that stay together. We apply this scheme to motif detection in sporulation-cycle genes in the yeast S. cerevisiae, using recent sequences of other closely-related Saccharomyces species.
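
    The comparison between a weight matrix and the background null model that this abstract refers to is, at its simplest, a log-likelihood ratio evaluated at each window of a sequence. The sketch below shows only that scoring step, under the assumption of independent positions and a fixed background; it does not include the sampling loop or the phylogenetic correction the authors introduce. The matrix values and names are hypothetical.

```python
import math

def log_odds(site, weight_matrix, background):
    """Log2 ratio of P(site | weight matrix) to P(site | background),
    assuming positions are independent."""
    score = 0.0
    for position, base in enumerate(site):
        score += math.log2(weight_matrix[position][base] / background[base])
    return score

def best_window(sequence, weight_matrix, background):
    """Slide the matrix along the sequence and return (start, score)
    for the highest-scoring window."""
    width = len(weight_matrix)
    scored = [
        (log_odds(sequence[i:i + width], weight_matrix, background), i)
        for i in range(len(sequence) - width + 1)
    ]
    score, start = max(scored)
    return start, score

# Hypothetical 3-bp weight matrix favouring "ACG"; no zero entries,
# so every window has a finite score.
matrix = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

print(best_window("TTTACGTTT", matrix, uniform))
```

    A positive score means the window is better explained by the motif than by the background; a Gibbs sampler would use such ratios as sampling weights rather than simply taking the maximum.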

  5. Detecting DNA regulatory motifs by incorporating positional trends in information content

    SciTech Connect

    Kechris, Katherina J.; van Zwet, Erik; Bickel, Peter J.; Eisen, Michael B.

    2004-05-04

    On the basis of the observation that conserved positions in transcription factor binding sites are often clustered together, we propose a simple extension to the model-based motif discovery methods. We assign position-specific prior distributions to the frequency parameters of the model, penalizing deviations from a specified conservation profile. Examples with both simulated and real data show that this extension helps discover motifs as the data become noisier or when there is a competing false motif.

  6. Ballast: A Ball-based Algorithm for Structural Motifs

    PubMed Central

    He, Lu; Vandin, Fabio; Pandurangan, Gopal

    2013-01-01

    Structural motifs encapsulate local sequence-structure-function relationships characteristic of related proteins, enabling the prediction of functional characteristics of new proteins, providing molecular-level insights into how those functions are performed, and supporting the development of variants specifically maintaining or perturbing function in concert with other properties. Numerous computational methods have been developed to search through databases of structures for instances of specified motifs. However, it remains an open problem how best to leverage the local geometric and chemical constraints underlying structural motifs in order to develop motif-finding algorithms that are both theoretically and practically efficient. We present a simple, general, efficient approach, called Ballast (ball-based algorithm for structural motifs), to match given structural motifs to given structures. Ballast combines the best properties of previously developed methods, exploiting the composition and local geometry of a structural motif and its possible instances in order to effectively filter candidate matches. We show that on a wide range of motif-matching problems, Ballast efficiently and effectively finds good matches, and we provide theoretical insights into why it works well. By supporting generic measures of compositional and geometric similarity, Ballast provides a powerful substrate for the development of motif-matching algorithms. PMID:23383999

  7. Gibbs motif sampling: detection of bacterial outer membrane protein repeats.

    PubMed Central

    Neuwald, A. F.; Liu, J. S.; Lawrence, C. E.

    1995-01-01

    The detection and alignment of locally conserved regions (motifs) in multiple sequences can provide insight into protein structure, function, and evolution. A new Gibbs sampling algorithm is described that detects motif-encoding regions in sequences and optimally partitions them into distinct motif models; this is illustrated using a set of immunoglobulin fold proteins. When applied to sequences sharing a single motif, the sampler can be used to classify motif regions into related submodels, as is illustrated using helix-turn-helix DNA-binding proteins. Other statistically based procedures are described for searching a database for sequences matching motifs found by the sampler. When applied to a set of 32 very distantly related bacterial integral outer membrane proteins, the sampler revealed that they share a subtle, repetitive motif. Although BLAST (Altschul SF et al., 1990, J Mol Biol 215:403-410) fails to detect significant pairwise similarity between any of the sequences, the repeats present in these outer membrane proteins, taken as a whole, are highly significant (based on a generally applicable statistical test for motifs described here). Analysis of bacterial porins with known trimeric beta-barrel structure and related proteins reveals a similar repetitive motif corresponding to alternating membrane-spanning beta-strands. These beta-strands occur on the membrane interface (as opposed to the trimeric interface) of the beta-barrel. The broad conservation and structural location of these repeats suggests that they play important functional roles. PMID:8520488

  8. Computational study enlightens the structural role of the alcohol acyltransferase DFGWG motif.

    PubMed

    Morales-Quintana, Luis; Moya-León, María Alejandra; Herrera, Raúl

    2015-08-01

    Alcohol acyltransferases (AAT) catalyze the esterification reaction of alcohols and acyl-CoA into esters in fruits and flowers. Despite the high divergence between AAT enzymes, two important and conserved motifs are shared: the catalytic HxxxD motif, and the DFGWG motif. The latter is proposed to play a structural role; however, its function remains unclear. The DFGWG motif is located in loop 21 and stabilized by a hydrogen bond between residues Y52 and D381. Also, this motif is distant from the HxxxD motif, and most probably without a direct role in the substrate interaction. To evaluate the role of the DFGWG motif, in silico analysis was performed in the VpAAT1 protein. Three mutants (Y52F, D381A and D381E) were evaluated. Major changes (size and shape) in the solvent channels were found, although no differences were revealed in the entire 3D structure. Molecular dynamics simulations and docking studies described unfavorable energies for interaction of the mutant proteins with different substrates, as well as unfavored ligand orientations in the solvent channel. Additionally, we examined the contribution of different energetic parameters to the total free energy of protein-ligand complexes by the MM-GBSA method. The complexes differed mainly in their van der Waals contributions and have unfavorable electrostatic interactions. VpAAT1, Y52F and D381A mutants showed a dramatic reduction in the binding capacity to several substrates, which is related to differences in electrostatic potential on the protein surfaces, suggesting that D381 from the DFGWG motif and residue Y52 play a crucial role in maintenance of the adequate solvent channel structure required for catalysis. Graphical abstract Molecular docking, molecular dynamics (MD) simulations and MM-GBSA free energy calculations were employed to obtain quantitative estimates for the binding free energies of wild type Vasconcellea pubescens alcohol acyltransferase (VpAAT1-WT) and the protein mutants. Left VpAAT1

  9. Plasticity of the RNA Kink Turn Structural Motif

    SciTech Connect

    Antonioli, A.; Cochrane, J; Lipchock, S; Strobel, S

    2010-01-01

    The kink turn (K-turn) is an RNA structural motif found in many biologically significant RNAs. While most examples of the K-turn have a similar fold, the crystal structure of the Azoarcus group I intron revealed a novel RNA conformation, a reverse kink turn bent in the direction opposite that of a consensus K-turn. The reverse K-turn is bent toward the major grooves rather than the minor grooves of the flanking helices, yet the sequence differs from the K-turn consensus by only a single nucleotide. Here we demonstrate that the reverse bend direction is not solely defined by internal sequence elements, but is instead affected by structural elements external to the K-turn. It bends toward the major groove under the direction of a tetraloop-tetraloop receptor. The ability of one sequence to form two distinct structures demonstrates the inherent plasticity of the K-turn sequence. Such plasticity suggests that the K-turn is not a primary element in RNA folding, but instead is shaped by other structural elements within the RNA or ribonucleoprotein assembly.

  10. Invisible RNA state dynamically couples distant motifs

    PubMed Central

    Lee, Janghyun; Dethoff, Elizabeth A.; Al-Hashimi, Hashim M.

    2014-01-01

    Using on- and off-resonance carbon and nitrogen R1ρ NMR relaxation dispersion in concert with mutagenesis and NMR chemical shift fingerprinting, we show that the transactivation response element RNA from the HIV-1 exists in dynamic equilibrium with a transient state that has a lifetime of ∼2 ms and population of ∼0.4%, which simultaneously remodels the structure of a bulge, stem, and apical loop. This is accomplished by a global change in strand register, in which bulge residues pair up with residues in the upper stem, causing a reshuffling of base pairs that propagates to the tip of apical loop, resulting in the creation of three noncanonical base pairs. Our results show that transient states can remodel distant RNA motifs and possibly give rise to mechanisms for rapid long-range communication in RNA that can be harnessed in processes such as cooperative folding and ribonucleoprotein assembly. PMID:24979799

  11. An RNA motif that binds ATP

    NASA Technical Reports Server (NTRS)

    Sassanfar, M.; Szostak, J. W.

    1993-01-01

    RNAs that contain specific high-affinity binding sites for small molecule ligands immobilized on a solid support are present at a frequency of roughly one in 10^10 to 10^11 in pools of random sequence RNA molecules. Here we describe a new in vitro selection procedure designed to ensure the isolation of RNAs that bind the ligand of interest in solution as well as on a solid support. We have used this method to isolate a remarkably small RNA motif that binds ATP, a substrate in numerous biological reactions and the universal biological high-energy intermediate. The selected ATP-binding RNAs contain a consensus sequence, embedded in a common secondary structure. The binding properties of ATP analogues and modified RNAs show that the binding interaction is characterized by a large number of close contacts between the ATP and RNA, and by a change in the conformation of the RNA.

  12. Encoded Expansion: An Efficient Algorithm to Discover Identical String Motifs

    PubMed Central

    Azmi, Aqil M.; Al-Ssulami, Abdulrakeeb

    2014-01-01

    A major task in computational biology is the discovery of short recurring string patterns known as motifs. Most of the schemes to discover motifs are either stochastic or combinatorial in nature. Stochastic approaches do not guarantee finding the correct motifs, while the combinatorial schemes tend to have an exponential time complexity with respect to motif length. To alleviate the cost, the combinatorial approach exploits dynamic data structures such as trees or graphs. Recently (Karci (2009) Efficient automatic exact motif discovery algorithms for biological sequences, Expert Systems with Applications 36:7952–7963) devised a deterministic algorithm that finds all the identical copies of string motifs of all sizes in theoretical time complexity of and a space complexity of where is the length of the input sequence and is the length of the longest possible string motif. In this paper, we present a significant improvement on Karci's original algorithm. The algorithm that we propose reports all identical string motifs of sizes that occur at least times. Our algorithm starts with string motifs of size 2, and at each iteration it expands the candidate string motifs by one symbol throwing out those that occur less than times in the entire input sequence. We use a simple array and data encoding to achieve theoretical worst-case time complexity of and a space complexity of Encoding of the substrings can speed up the process of comparison between string motifs. Experimental results on random and real biological sequences confirm that our algorithm has indeed a linear time complexity and it is more scalable in terms of sequence length than the existing algorithms. PMID:24871320
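
    The expansion strategy outlined in this abstract (start from candidates of size 2, extend each surviving candidate by one symbol, and discard those that fall below an occurrence threshold) can be sketched as below. This is a plain dictionary-based illustration, not the authors' array-and-encoding scheme, and it makes no claim about matching their complexity bounds. The function name and the `min_count` parameter are hypothetical stand-ins for the threshold whose symbol is missing from the text above.

```python
def repeated_motifs(sequence, min_count):
    """Return {motif: [start positions]} for every substring of length >= 2
    that occurs at least `min_count` times (overlaps allowed)."""
    # Seed with every size-2 substring and its start positions.
    candidates = {}
    for i in range(len(sequence) - 1):
        candidates.setdefault(sequence[i:i + 2], []).append(i)
    current = {m: s for m, s in candidates.items() if len(s) >= min_count}

    found = {}
    length = 2
    while current:
        found.update(current)
        # Extend each surviving occurrence by the symbol that follows it.
        extended = {}
        for motif, starts in current.items():
            for start in starts:
                end = start + length
                if end < len(sequence):
                    extended.setdefault(motif + sequence[end], []).append(start)
        # Keep only extensions that still meet the threshold.
        current = {m: s for m, s in extended.items() if len(s) >= min_count}
        length += 1
    return found

hits = repeated_motifs("ACGTACGTAC", min_count=2)
longest = max(hits, key=len)
print(longest, hits[longest])
```

    The pruning is safe because any substring that occurs often enough must have a prefix, one symbol shorter, that occurs at least as often, so no qualifying motif is lost when a candidate is discarded.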

  13. The Q Motif Is Involved in DNA Binding but Not ATP Binding in ChlR1 Helicase

    PubMed Central

    Ding, Hao; Guo, Manhong; Vidhyasagar, Venkatasubramanian; Talwar, Tanu; Wu, Yuliang

    2015-01-01

    Helicases are molecular motors that couple the energy of ATP hydrolysis to the unwinding of structured DNA or RNA and chromatin remodeling. The conversion of energy derived from ATP hydrolysis into unwinding and remodeling is coordinated by seven sequence motifs (I, Ia, II, III, IV, V, and VI). The Q motif, consisting of nine amino acids (GFXXPXPIQ) with an invariant glutamine (Q) residue, has been identified in some, but not all helicases. Compared to the seven well-recognized conserved helicase motifs, the role of the Q motif is less acknowledged. Mutations in the human ChlR1 (DDX11) gene are associated with a unique genetic disorder known as Warsaw Breakage Syndrome, which is characterized by cellular defects in genome maintenance. To examine the roles of the Q motif in ChlR1 helicase, we performed site directed mutagenesis of glutamine to alanine at residue 23 in the Q motif of ChlR1. ChlR1 recombinant protein was overexpressed and purified from HEK293T cells. ChlR1-Q23A mutant abolished the helicase activity of ChlR1 and displayed reduced DNA binding ability. The mutant showed impaired ATPase activity but normal ATP binding. A thermal shift assay revealed that ChlR1-Q23A has a melting point value similar to ChlR1-WT. Partial proteolysis mapping demonstrated that ChlR1-WT and Q23A have a similar globular structure, although some subtle conformational differences in these two proteins are evident. Finally, we found ChlR1 exists and functions as a monomer in solution, which is different from FANCJ, in which the Q motif is involved in protein dimerization. Taken together, our results suggest that the Q motif is involved in DNA binding but not ATP binding in ChlR1 helicase. PMID:26474416

  14. Crystal structure of SEL1L: Insight into the roles of SLR motifs in ERAD pathway

    PubMed Central

    Jeong, Hanbin; Sim, Hyo Jung; Song, Eun Kyung; Lee, Hakbong; Ha, Sung Chul; Jun, Youngsoo; Park, Tae Joo; Lee, Changwook

    2016-01-01

    Terminally misfolded proteins are selectively recognized and cleared by the endoplasmic reticulum-associated degradation (ERAD) pathway. SEL1L, a component of the ERAD machinery, plays an important role in selecting and transporting ERAD substrates for degradation. We have determined the crystal structure of the mouse SEL1L central domain comprising five Sel1-Like Repeats (SLR motifs 5 to 9; hereafter called SEL1Lcent). Strikingly, SEL1Lcent forms a homodimer with two-fold symmetry in a head-to-tail manner. Particularly, the SLR motif 9 plays an important role in dimer formation by adopting a domain-swapped structure and providing an extensive dimeric interface. We identified that the full-length SEL1L forms a self-oligomer through the SEL1Lcent domain in mammalian cells. Furthermore, we discovered that the SLR-C, comprising SLR motifs 10 and 11, of SEL1L directly interacts with the N-terminus luminal loops of HRD1. Therefore, we propose that certain SLR motifs of SEL1L play a unique role in membrane bound ERAD machinery. PMID:27064360

  15. Cellular microRNAs up-regulate transcription via interaction with promoter TATA-box motifs

    PubMed Central

    Zhang, Yijun; Fan, Miaomiao; Zhang, Xue; Huang, Feng; Wu, Kang; Zhang, Junsong; Liu, Jun; Huang, Zhuoqiong; Luo, Haihua; Tao, Liang; Zhang, Hui

    2014-01-01

    The TATA box represents one of the most prevalent core promoters where the pre-initiation complexes (PICs) for gene transcription are assembled. This assembly is crucial for transcription initiation and well regulated. Here we show that some cellular microRNAs (miRNAs) are associated with RNA polymerase II (Pol II) and TATA box-binding protein (TBP) in human peripheral blood mononuclear cells (PBMCs). Among them, let-7i sequence specifically binds to the TATA-box motif of interleukin-2 (IL-2) gene and elevates IL-2 mRNA and protein production in CD4+ T-lymphocytes in vitro and in vivo. Through direct interaction with the TATA-box motif, let-7i facilitates the PIC assembly and transcription initiation of IL-2 promoter. Several other cellular miRNAs, such as mir-138, mir-92a or mir-181d, also enhance the promoter activities via binding to the TATA-box motifs of insulin, calcitonin or c-myc, respectively. In agreement with the finding that an HIV-1–encoded miRNA could enhance viral replication through targeting the viral promoter TATA-box motif, our data demonstrate that the interaction with core transcription machinery is a novel mechanism for miRNAs to regulate gene expression. PMID:25336585

  16. Coupling caspase cleavage and proteasomal degradation of proteins carrying PEST motif.

    PubMed

    Belizario, José E; Alves, Juliano; Garay-Malpartida, Miguel; Occhiucci, João Marcelo

    2008-06-01

    Protein degradation is critical to the activation and deactivation of regulatory proteins involved in signaling pathways that control cell growth, differentiation, stress responses and physiological cell death. Proteins carry domains and sequence motifs that function as prerequisites for their proteolysis by either individual proteases or the 26S multicomplex proteasomes. Two models for entry of substrates into the proteasomes have been considered. In one model, it is proposed that the ubiquitin chain attached to the protein serves as a recognition element to drag it into the 19S regulatory particle, which promotes the unfolding required for its access into the 20S catalytic chamber. In the second model, it is proposed that an unstructured tail located at the amino or carboxyl terminus directly tracks proteins into the 26S/20S proteasomes. Caspases are cysteinyl aspartate proteases that control diverse signaling pathways, promoting the cleavage at one or two sites of hundreds of structural and regulatory protein substrates. Caspase cleavage sites are commonly found within PEST motifs, which are segments rich in proline (P), glutamic acid (E), aspartic acid (D) and serine (S) or threonine (T) residues. Considering that N- and C-terminal peptides carrying PEST motifs form disordered loops in the globular proteins after caspase cleavage, it is postulated here that these exposed termini serve as unstructured initiation sites, coupling caspase cleavage and ubiquitin-proteasome dependent and independent degradation of short-lived proteins. This could explain the inherent susceptibility to proteolysis among proteins containing PEST motifs. PMID:18537676

  17. Identifying novel sequence variants of RNA 3D motifs

    PubMed Central

    Zirbel, Craig L.; Roll, James; Sweeney, Blake A.; Petrov, Anton I.; Pirrung, Meg; Leontis, Neocles B.

    2015-01-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723

  18. Stochastic EM-based TFBS motif discovery with MITSU

    PubMed Central

    Kilpatrick, Alastair M.; Ward, Bruce; Aitken, Stuart

    2014-01-01

    Motivation: The Expectation–Maximization (EM) algorithm has been successfully applied to the problem of transcription factor binding site (TFBS) motif discovery and underlies the most widely used motif discovery algorithms. In the wider field of probabilistic modelling, the stochastic EM (sEM) algorithm has been used to overcome some of the limitations of the EM algorithm; however, the application of sEM to motif discovery has not been fully explored. Results: We present MITSU (Motif discovery by ITerative Sampling and Updating), a novel algorithm for motif discovery, which combines sEM with an improved approximation to the likelihood function, which is unconstrained with regard to the distribution of motif occurrences within the input dataset. The algorithm is evaluated quantitatively on realistic synthetic data and several collections of characterized prokaryotic TFBS motifs and shown to outperform EM and an alternative sEM-based algorithm, particularly in terms of site-level positive predictive value. Availability and implementation: Java executable available for download at http://www.sourceforge.net/p/mitsu-motif/, supported on Linux/OS X. Contact: a.m.kilpatrick@sms.ed.ac.uk PMID:24931999

  19. Identifying novel sequence variants of RNA 3D motifs.

    PubMed

    Zirbel, Craig L; Roll, James; Sweeney, Blake A; Petrov, Anton I; Pirrung, Meg; Leontis, Neocles B

    2015-09-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson-Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723

  20. The phenomenon of astral motifs on late mediaeval tombstones

    NASA Astrophysics Data System (ADS)

    Mijatović, V.; Ninković, S.; Vemić, D.

    2003-10-01

    The authors study astral motifs present on some mediaeval tombstones found in present-day Serbia and Montenegro and in the neighbouring countries (especially in Bosnia and Herzegovina). The authors discern some important astral motifs, explain them and present a short review concerning their frequency.

  1. DETAIL VIEW, MAIN ENTRANCE GATES, SHOWING A WINGED HOURGLASS MOTIF, ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    DETAIL VIEW, MAIN ENTRANCE GATES, SHOWING A WINGED HOURGLASS MOTIF, WHICH REFERS TO THE QUICK PASSAGE OF TIME AND THE SHORTNESS OF HUMAN LIFE. USE OF THIS MOTIF WAS A CARRYOVER FROM THE MCARTHUR GATES. - Woodlands Cemetery, 4000 Woodlands Avenue, Philadelphia, Philadelphia County, PA

  2. Role of GxxxG Motifs in Transmembrane Domain Interactions.

    PubMed

    Teese, Mark G; Langosch, Dieter

    2015-08-25

    Transmembrane (TM) helices of integral membrane proteins can facilitate strong and specific noncovalent protein-protein interactions. Mutagenesis and structural analyses have revealed numerous examples in which the interaction between TM helices of single-pass membrane proteins is dependent on a GxxxG or (small)xxx(small) motif. It is therefore tempting to use the presence of these simple motifs as an indicator of TM helix interactions. In this Current Topic review, we point out that these motifs are quite common, with more than 50% of single-pass TM domains containing a (small)xxx(small) motif. However, the actual interaction strength of motif-containing helices depends strongly on sequence context and membrane properties. In addition, recent studies have revealed several GxxxG-containing TM domains that interact via alternative interfaces involving hydrophobic, polar, aromatic, or even ionizable residues that do not form recognizable motifs. In multipass membrane proteins, GxxxG motifs can be important for protein folding, and not just oligomerization. Our current knowledge thus suggests that the presence of a GxxxG motif alone is a weak predictor of protein dimerization in the membrane. PMID:26244771
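
    Because the abstract stresses how often these simple patterns occur, it helps to see how little is needed to find one. The following is a minimal sketch using regular expressions; it assumes that "small" means glycine, alanine or serine, which is a common but not universal convention, and the test sequence is a made-up TM-like string rather than a real protein.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
GXXXG = re.compile(r"(?=(G...G))")
# Assumption: "small" residues are G, A and S.
SMALL_XXX_SMALL = re.compile(r"(?=([GAS]...[GAS]))")

def find_motifs(tm_sequence, pattern):
    """Return (start index, matched 5-residue segment) for every match."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(tm_sequence)]

# Hypothetical transmembrane-like sequence, for illustration only.
helix = "LLIAGVLLGAVLSLLL"

print(find_motifs(helix, GXXXG))
print(find_motifs(helix, SMALL_XXX_SMALL))
```

    Matching on sequence alone says nothing about sequence context or membrane properties, which is the abstract's point: a hit from a search like this is a starting hypothesis, not evidence of dimerization.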

  3. Differences in local genomic context of bound and unbound motifs

    PubMed Central

    Hansen, Loren; Mariño-Ramírez, Leonardo; Landsman, David

    2012-01-01

    Understanding gene regulation is a major objective in molecular biology research. Frequently, transcription is driven by transcription factors (TFs) that bind to specific DNA sequences. These motifs are usually short and degenerate, so multiple copies are highly likely to occur throughout the genome by chance alone. Despite this, TFs only bind to a small subset of sites, thus prompting our investigation into the differences between motifs that are bound by TFs and those that remain unbound. Here we constructed vectors representing various chromatin- and sequence-based features for a published set of bound and unbound motifs representing nine TFs in the budding yeast Saccharomyces cerevisiae. Using a machine learning approach, we identified a set of features that can be used to discriminate between bound and unbound motifs. We also discovered that some TFs bind most or all of their strong motifs in intergenic regions. Our data demonstrate that local sequence context can be strikingly different around motifs that are bound compared to motifs that are unbound. We concluded that there are multiple combinations of genomic features that characterize bound or unbound motifs. PMID:22692006

  4. ELM: the status of the 2010 eukaryotic linear motif resource

    PubMed Central

    Gould, Cathryn M.; Diella, Francesca; Via, Allegra; Puntervoll, Pål; Gemünd, Christine; Chabanis-Davidson, Sophie; Michael, Sushama; Sayadi, Ahmed; Bryne, Jan Christian; Chica, Claudia; Seiler, Markus; Davey, Norman E.; Haslam, Niall; Weatheritt, Robert J.; Budd, Aidan; Hughes, Tim; Paś, Jakub; Rychlewski, Leszek; Travé, Gilles; Aasland, Rein; Helmer-Citterich, Manuela; Linding, Rune; Gibson, Toby J.

    2010-01-01

    Linear motifs are short segments of multidomain proteins that provide regulatory functions independently of protein tertiary structure. Much of intracellular signalling passes through protein modifications at linear motifs. Many thousands of linear motif instances, most notably phosphorylation sites, have now been reported. Although clearly very abundant, linear motifs are difficult to predict de novo in protein sequences due to the difficulty of obtaining robust statistical assessments. The ELM resource at http://elm.eu.org/ provides an expanding knowledge base, currently covering 146 known motifs, with annotation that includes >1300 experimentally reported instances. ELM is also an exploratory tool for suggesting new candidates of known linear motifs in proteins of interest. Information about protein domains, protein structure and native disorder, cellular and taxonomic contexts is used to reduce or deprecate false positive matches. Results are graphically displayed in a ‘Bar Code’ format, which also displays known instances from homologous proteins through a novel ‘Instance Mapper’ protocol based on PHI-BLAST. ELM server output provides links to the ELM annotation as well as to a number of remote resources. Using the links, researchers can explore the motifs, proteins, complex structures and associated literature to evaluate whether candidate motifs might be worth experimental investigation. PMID:19920119

  5. Aztec, Incan and Mayan Motifs...Lead to Distinctive Designs.

    ERIC Educational Resources Information Center

    Shields, Joanne

    2001-01-01

    Describes an art project for seventh-grade students in which they choose motifs based on Incan, Aztec, and Mayan Indian materials to incorporate into two-dimensional designs. Explains that the activity objective is to create a unified, balanced and pleasing composition using a minimum of three motifs. (CMK)

  6. De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes

    PubMed Central

    Zolotarov, Yevgen; Strömvik, Martina

    2015-01-01

    Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved. PMID:26114291

  7. Functional conservation of cis-regulatory elements of heat-shock genes over long evolutionary distances.

    PubMed

    He, Zhengying; Eichel, Kelsie; Ruvinsky, Ilya

    2011-01-01

    Transcriptional control of gene regulation is an intricate process that requires precise orchestration of a number of molecular components. Studying its evolution can serve as a useful model for understanding how complex molecular machines evolve. One way to investigate evolution of transcriptional regulation is to test the functions of cis-elements from one species in a distant relative. Previous results suggested that few, if any, tissue-specific promoters from Drosophila are faithfully expressed in C. elegans. Here we show that, in contrast, promoters of fly and human heat-shock genes are upregulated in C. elegans upon exposure to heat. Inducibility under conditions of heat shock may represent a relatively simple "on-off" response, whereas complex expression patterns require integration of multiple signals. Our results suggest that simpler aspects of regulatory logic may be retained over longer periods of evolutionary time, while more complex ones may be diverging more rapidly. PMID:21799932

  8. Cis-regulatory programs in the development and evolution of vertebrate paired appendages.

    PubMed

    Gehrke, Andrew R; Shubin, Neil H

    2016-09-01

    Differential gene expression is the core of development, mediating the genetic changes necessary for determining cell identity. The regulation of gene activity by cis-acting elements (e.g., enhancers) is a crucial mechanism for determining differential gene activity by precise control of gene expression in embryonic space and time. Modifications to regulatory regions can have profound impacts on phenotype, and therefore developmental and evolutionary biologists have increasingly focused on elucidating the transcriptional control of genes that build and pattern body plans. Here, we trace the evolutionary history of transcriptional control of three loci key to vertebrate appendage development (Fgf8, Shh, and HoxD/A). Within and across these regulatory modules, we find both complex and flexible regulation in contrast with more fixed enhancers that appear unchanged over vast timescales of vertebrate evolution. The transcriptional control of vertebrate appendage development was likely already incredibly complex in the common ancestor of fish, implying that subtle changes to regulatory networks were more likely responsible for alterations in phenotype rather than the de novo addition of whole regulatory domains. Finally, we discuss the dangers of relying on inter-species transgenesis when testing enhancer function, and call for more controlled regulatory swap experiments when inferring the evolutionary history of enhancer elements. PMID:26783722

  9. A cis-Regulatory Mutation of PDSS2 Causes Silky-Feather in Chickens

    PubMed Central

    Feng, Chungang; Gao, Yu; Dorshorst, Ben; Song, Chi; Gu, Xiaorong; Li, Qingyuan; Li, Jinxiu; Liu, Tongxin; Rubin, Carl-Johan; Zhao, Yiqiang; Wang, Yanqiang; Fei, Jing; Li, Huifang; Chen, Kuanwei; Qu, Hao; Shu, Dingming; Ashwell, Chris; Da, Yang; Andersson, Leif; Hu, Xiaoxiang; Li, Ning

    2014-01-01

    Silky-feather has been selected and fixed in some breeds due to its unique appearance. This phenotype is caused by a single recessive gene (hookless, h). Here we map the silky-feather locus to chromosome 3 by linkage analysis and subsequently fine-map it to an 18.9 kb interval using the identical by descent (IBD) method. Further analysis reveals that a C to G transversion located upstream of the prenyl (decaprenyl) diphosphate synthase, subunit 2 (PDSS2) gene is causing silky-feather. All silky-feather birds are homozygous for the G allele. The silky-feather mutation significantly decreases the expression of PDSS2 during feather development in vivo. Consistent with the regulatory effect, the C to G transversion is shown to remarkably reduce PDSS2 promoter activity in vitro. We report a new example of feather structure variation associated with a spontaneous mutation and provide new insight into the PDSS2 function. PMID:25166907

  10. Exaptation of Transposable Elements into Novel Cis-Regulatory Elements: Is the Evidence Always Strong?

    PubMed Central

    de Souza, Flávio S.J.; Franchini, Lucía F.; Rubinstein, Marcelo

    2013-01-01

    Transposable elements (TEs) are mobile genetic sequences that can jump around the genome from one location to another, behaving as genomic parasites. TEs have been particularly effective in colonizing mammalian genomes, and such heavy TE load is expected to have conditioned genome evolution. Indeed, studies conducted both at the gene and genome levels have uncovered TE insertions that seem to have been co-opted—or exapted—by providing transcription factor binding sites (TFBSs) that serve as promoters and enhancers, leading to the hypothesis that TE exaptation is a major factor in the evolution of gene regulation. Here, we critically review the evidence for exaptation of TE-derived sequences as TFBSs, promoters, enhancers, and silencers/insulators both at the gene and genome levels. We classify the functional impact attributed to TE insertions into four categories of increasing complexity and argue that so far very few studies have conclusively demonstrated exaptation of TEs as transcriptional regulatory regions. We also contend that many genome-wide studies dealing with TE exaptation in recent lineages of mammals are still inconclusive and that the hypothesis of rapid transcriptional regulatory rewiring mediated by TE mobilization must be taken with caution. Finally, we suggest experimental approaches that may help attributing higher-order functions to candidate exapted TEs. PMID:23486611

  11. PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation

    PubMed Central

    Portales-Casamar, Elodie; Kirov, Stefan; Lim, Jonathan; Lithwick, Stuart; Swanson, Magdalena I; Ticoll, Amy; Snoddy, Jay; Wasserman, Wyeth W

    2007-01-01

    PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation with associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR, located at , is open for business. PMID:17916232

  12. Mapping cis-Regulatory Domains in the Human Genome Using Multi-Species Conservation of Synteny

    SciTech Connect

    Ahituv, Nadav; Prabhakar, Shyam; Poulin, Francis; Rubin, Edward M.; Couronne, Olivier

    2005-06-13

    Our inability to associate distant regulatory elements with the genes that they regulate has largely precluded their examination for sequence alterations contributing to human disease. One major obstacle is the large genomic space surrounding targeted genes in which such elements could potentially reside. In order to delineate gene regulatory boundaries we used whole-genome human-mouse-chicken (HMC) and human-mouse-frog (HMF) multiple alignments to compile conserved blocks of synteny (CBS), under the hypothesis that these blocks have been kept intact throughout evolution at least in part by the requirement of regulatory elements to stay linked to the genes that they regulate. A total of 2,116 and 1,942 CBS >200 kb were assembled for HMC and HMF respectively, encompassing 1.53 and 0.86 Gb of human sequence. To support the existence of complex long-range regulatory domains within these CBS we analyzed the prevalence and distribution of chromosomal aberrations leading to position effects (disruption of a gene's regulatory environment), observing a clear bias not only for mapping onto CBS but also for longer CBS size. Our results provide a genome-wide data set characterizing the regulatory domains of genes and the conserved regulatory elements within them.

  13. A cis-regulatory mutation of PDSS2 causes silky-feather in chickens.

    PubMed

    Feng, Chungang; Gao, Yu; Dorshorst, Ben; Song, Chi; Gu, Xiaorong; Li, Qingyuan; Li, Jinxiu; Liu, Tongxin; Rubin, Carl-Johan; Zhao, Yiqiang; Wang, Yanqiang; Fei, Jing; Li, Huifang; Chen, Kuanwei; Qu, Hao; Shu, Dingming; Ashwell, Chris; Da, Yang; Andersson, Leif; Hu, Xiaoxiang; Li, Ning

    2014-08-01

    Silky-feather has been selected and fixed in some breeds due to its unique appearance. This phenotype is caused by a single recessive gene (hookless, h). Here we map the silky-feather locus to chromosome 3 by linkage analysis and subsequently fine-map it to an 18.9 kb interval using the identical by descent (IBD) method. Further analysis reveals that a C to G transversion located upstream of the prenyl (decaprenyl) diphosphate synthase, subunit 2 (PDSS2) gene is causing silky-feather. All silky-feather birds are homozygous for the G allele. The silky-feather mutation significantly decreases the expression of PDSS2 during feather development in vivo. Consistent with the regulatory effect, the C to G transversion is shown to remarkably reduce PDSS2 promoter activity in vitro. We report a new example of feather structure variation associated with a spontaneous mutation and provide new insight into the PDSS2 function. PMID:25166907

  14. Characterization of cis-regulatory elements (cRE) associated with mammary gland function

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Bos taurus genome assembly has propelled dairy science into a new era; still, most of the information encoded in the genome has not yet been decoded. The human Encyclopedia of DNA Elements (ENCODE) project has spearheaded the identification and annotation of functional genomic elements in the hu...

  15. Evolving New Skeletal Traits by cis-Regulatory Changes in Bone Morphogenetic Proteins.

    PubMed

    Indjeian, Vahan B; Kingman, Garrett A; Jones, Felicity C; Guenther, Catherine A; Grimwood, Jane; Schmutz, Jeremy; Myers, Richard M; Kingsley, David M

    2016-01-14

    Changes in bone size and shape are defining features of many vertebrates. Here we use genetic crosses and comparative genomics to identify specific regulatory DNA alterations controlling skeletal evolution. Armor bone-size differences in sticklebacks map to a major effect locus overlapping BMP family member GDF6. Freshwater fish express more GDF6 due in part to a transposon insertion, and transgenic overexpression of GDF6 phenocopies evolutionary changes in armor-plate size. The human GDF6 locus also has undergone distinctive regulatory evolution, including complete loss of an enhancer that is otherwise highly conserved between chimps and other mammals. Functional tests show that the ancestral enhancer drives expression in hindlimbs but not forelimbs, in locations that have been specifically modified during the human transition to bipedalism. Both gain and loss of regulatory elements can localize BMP changes to specific anatomical locations, providing a flexible regulatory basis for evolving species-specific changes in skeletal form. PMID:26774823

  16. New cis-regulatory elements in the Rht-D1b locus region of wheat

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Fifteen gene-containing BACs with an accumulated length of 1.82 Mb from the Rht-D1b locus region were sequenced and compared in detail with the orthologous regions of rice, sorghum, and maize. Our results show that Rht-D1b represents a conserved genomic region as implied by high gene sequence identity...

  17. Long-range evolutionary constraints reveal cis-regulatory interactions on the human X chromosome

    PubMed Central

    Naville, Magali; Ishibashi, Minaka; Ferg, Marco; Bengani, Hemant; Rinkwitz, Silke; Krecsmarik, Monika; Hawkins, Thomas A.; Wilson, Stephen W.; Manning, Elizabeth; Chilamakuri, Chandra S. R.; Wilson, David I.; Louis, Alexandra; Lucy Raymond, F.; Rastegar, Sepand; Strähle, Uwe; Lenhard, Boris; Bally-Cuif, Laure; van Heyningen, Veronica; FitzPatrick, David R.; Becker, Thomas S.; Roest Crollius, Hugues

    2015-01-01

    Enhancers can regulate the transcription of genes over long genomic distances. This is thought to lead to selection against genomic rearrangements within such regions that may disrupt this functional linkage. Here we test this concept experimentally using the human X chromosome. We describe a scoring method to identify evolutionary maintenance of linkage between conserved noncoding elements and neighbouring genes. Chromatin marks associated with enhancer function are strongly correlated with this linkage score. We test >1,000 putative enhancers by transgenesis assays in zebrafish to ascertain the identity of the target gene. The majority of active enhancers drive a transgenic expression in a pattern consistent with the known expression of a linked gene. These results show that evolutionary maintenance of linkage is a reliable predictor of an enhancer's function, and provide new information to discover the genetic basis of diseases caused by the mis-regulation of gene expression. PMID:25908307

  18. Tripartite motif 32 prevents pathological cardiac hypertrophy.

    PubMed

    Chen, Lijuan; Huang, Jia; Ji, Yanxiao; Zhang, Xiaojing; Wang, Pixiao; Deng, Keqiong; Jiang, Xi; Ma, Genshan; Li, Hongliang

    2016-05-01

    TRIM32 (tripartite motif 32) is widely accepted to be an E3 ligase that interacts with and eventually ubiquitylates multiple substrates. TRIM32 mutants have been associated with LGMD-2H (limb girdle muscular dystrophy 2H). However, whether TRIM32 is involved in cardiac hypertrophy induced by biomechanical stresses and neurohumoral mediators remains unclear. We generated mice and isolated NRCMs (neonatal rat cardiomyocytes) that overexpressed or were deficient in TRIM32 to investigate the effect of TRIM32 on AB (aortic banding) or AngII (angiotensin II)-mediated cardiac hypertrophy. Echocardiography and both pathological and molecular analyses were used to determine the extent of cardiac hypertrophy and subsequent fibrosis. Our results showed that overexpression of TRIM32 in the heart significantly alleviated the hypertrophic response induced by pressure overload, whereas TRIM32 deficiency dramatically aggravated pathological cardiac remodelling. Similar results were also found in cultured NRCMs incubated with AngII. Mechanistically, the present study suggests that TRIM32 exerts cardioprotective action by interruption of Akt- but not MAPK (mitogen-activated protein kinase)-dependent signalling pathways. Additionally, inactivation of Akt by LY294002 offset the exacerbated hypertrophic response induced by AB in TRIM32-deficient mice. In conclusion, the present study indicates that TRIM32 plays a protective role in AB-induced pathological cardiac remodelling by blocking Akt-dependent signalling. Therefore TRIM32 could be a novel therapeutic target for the prevention of cardiac hypertrophy and heart failure. PMID:26884348

  19. Tripartite motif 32 prevents pathological cardiac hypertrophy

    PubMed Central

    Huang, Jia; Ji, Yanxiao; Zhang, Xiaojing; Wang, Pixiao; Deng, Keqiong; Jiang, Xi; Ma, Genshan

    2016-01-01

    TRIM32 (tripartite motif 32) is widely accepted to be an E3 ligase that interacts with and eventually ubiquitylates multiple substrates. TRIM32 mutants have been associated with LGMD-2H (limb girdle muscular dystrophy 2H). However, whether TRIM32 is involved in cardiac hypertrophy induced by biomechanical stresses and neurohumoral mediators remains unclear. We generated mice and isolated NRCMs (neonatal rat cardiomyocytes) that overexpressed or were deficient in TRIM32 to investigate the effect of TRIM32 on AB (aortic banding) or AngII (angiotensin II)-mediated cardiac hypertrophy. Echocardiography and both pathological and molecular analyses were used to determine the extent of cardiac hypertrophy and subsequent fibrosis. Our results showed that overexpression of TRIM32 in the heart significantly alleviated the hypertrophic response induced by pressure overload, whereas TRIM32 deficiency dramatically aggravated pathological cardiac remodelling. Similar results were also found in cultured NRCMs incubated with AngII. Mechanistically, the present study suggests that TRIM32 exerts cardioprotective action by interruption of Akt- but not MAPK (mitogen-activated protein kinase)-dependent signalling pathways. Additionally, inactivation of Akt by LY294002 offset the exacerbated hypertrophic response induced by AB in TRIM32-deficient mice. In conclusion, the present study indicates that TRIM32 plays a protective role in AB-induced pathological cardiac remodelling by blocking Akt-dependent signalling. Therefore TRIM32 could be a novel therapeutic target for the prevention of cardiac hypertrophy and heart failure. PMID:26884348

  20. Recurrent Structural Motifs in Non-Homologous Protein Structures

    PubMed Central

    Johansson, Maria U.; Zoete, Vincent; Guex, Nicolas

    2013-01-01

    We have extracted an extensive collection of recurrent structural motifs (RSMs), which consist of sequentially non-contiguous structural motifs (4–6 residues), each of which appears with very similar conformation in three or more mutually unrelated protein structures. We find that the proteins in our set are covered to a substantial extent by the recurrent non-contiguous structural motifs, especially the helix and strand regions. Computational alanine scanning calculations indicate that the average folding free energy changes upon alanine mutation for most types of non-alanine residues are higher for amino acids that are present in recurrent structural motifs than for amino acids that are not. The non-alanine amino acids that are most common in the recurrent structural motifs, i.e., phenylalanine, isoleucine, leucine, valine and tyrosine and the less abundant methionine and tryptophan, have the largest folding free energy changes. This indicates that the recurrent structural motifs, as we define them, describe recurrent structural patterns that are important for protein stability. In view of their properties, such structural motifs are potentially useful for inter-residue contact prediction and protein structure refinement. PMID:23574940

  1. BlockLogo: visualization of peptide and sequence motif conservation.

    PubMed

    Olsen, Lars Rønn; Kudahl, Ulrich Johan; Simon, Christian; Sun, Jing; Schönbach, Christian; Reinherz, Ellis L; Zhang, Guang Lan; Brusic, Vladimir

    2013-12-31

    BlockLogo is a web-server application for the visualization of protein and nucleotide fragments, continuous protein sequence motifs, and discontinuous sequence motifs using calculation of block entropy from multiple sequence alignments. The user input consists of a multiple sequence alignment, selection of motif positions, type of sequence, and output format definition. The output has BlockLogo along with the sequence logo, and a table of motif frequencies. We deployed BlockLogo as an online application and have demonstrated its utility through examples that show visualization of T-cell epitopes and B-cell epitopes (both continuous and discontinuous). Our additional example shows a visualization and analysis of structural motifs that determine the specificity of peptide binding to HLA-DR molecules. The BlockLogo server also employs selected experimentally validated prediction algorithms to enable on-the-fly prediction of MHC binding affinity to 15 common HLA class I and class II alleles as well as visual analysis of discontinuous epitopes from multiple sequence alignments. It enables the visualization and analysis of structural and functional motifs that are usually described as regular expressions. It provides a compact view of discontinuous motifs composed of distant positions within biological sequences. BlockLogo is available at: http://research4.dfci.harvard.edu/cvc/blocklogo/ and http://met-hilab.bu.edu/blocklogo/. PMID:24001880
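
    The idea of treating selected alignment columns as a single block can be sketched as follows. This is an illustration of Shannon entropy computed over blocks, under the assumption that entropy is reported in bits; it is not the BlockLogo server's code, and the function names and toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def block_frequencies(alignment, positions):
    """Frequency of each block, i.e. the residues at `positions` read as one unit."""
    blocks = ["".join(seq[i] for i in positions) for seq in alignment]
    total = len(blocks)
    return {block: n / total for block, n in Counter(blocks).most_common()}


def block_entropy(alignment, positions):
    """Shannon entropy (bits) of the block distribution at the chosen positions."""
    freqs = block_frequencies(alignment, positions)
    return -sum(p * log2(p) for p in freqs.values())


# Toy alignment (made up); positions 0 and 4 form a discontinuous motif.
alignment = ["ACDEF", "ACDEY", "GCDEF", "ACDKF"]
print(block_frequencies(alignment, [0, 4]))  # {'AF': 0.5, 'AY': 0.25, 'GF': 0.25}
print(block_entropy(alignment, [0, 4]))      # 1.5
```

    Because the positions need not be adjacent, the same two functions cover both continuous and discontinuous motifs; the frequency table corresponds to the kind of motif-frequency output the abstract mentions.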

  2. Regulatory Elements of the Floral Homeotic Gene AGAMOUS Identified by Phylogenetic Footprinting and Shadowing

    PubMed Central

    Hong, Ray L.; Hamaguchi, Lynn; Busch, Maximilian A.; Weigel, Detlef

    2003-01-01

    In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3-kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae species, several other motifs, but not the LFY and WUS binding sites identified previously, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for the activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection but also demonstrate that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites. PMID:12782724

  3. Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

    SciTech Connect

    Hong, R. L., Hamaguchi, L., Busch, M. A., and Weigel, D.

    2003-06-01

    OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.

  4. Coherent feedforward transcriptional regulatory motifs enhance drug resistance

    NASA Astrophysics Data System (ADS)

    Charlebois, Daniel A.; Balázsi, Gábor; Kærn, Mads

    2014-05-01

    Fluctuations in gene expression give identical cells access to a spectrum of phenotypes that can serve as a transient, nongenetic basis for natural selection by temporarily increasing drug resistance. In this study, we demonstrate using mathematical modeling and simulation that certain gene regulatory network motifs, specifically coherent feedforward loop motifs, can facilitate the development of nongenetic resistance by increasing cell-to-cell variability and the time scale at which beneficial phenotypic states can be maintained. Our results highlight how regulatory network motifs enabling transient, nongenetic inheritance play an important role in defining reproductive fitness in adverse environments and provide a selective advantage subject to evolutionary pressure.

  5. Seeing the B-A-C-H motif

    NASA Astrophysics Data System (ADS)

    Catravas, Palmyra

    2005-09-01

    Musical compositions can be thought of as complex, multidimensional data sets. Compositions based on the B-A-C-H motif (a four-note motif of the pitches of the last name of Johann Sebastian Bach) span several centuries of evolving compositional styles and provide an intriguing set for analysis since they contain a common feature, the motif, buried in dissimilar contexts. We will present analyses which highlight the content of this unusual set of pieces, with emphasis on visual display of information.

  6. Emergence of Connectivity Motifs in Networks of Model Neurons with Short- and Long-Term Plastic Synapses

    PubMed Central

    2014-01-01

    Recent experimental data from the rodent cerebral cortex and olfactory bulb indicate that specific connectivity motifs are correlated with short-term dynamics of excitatory synaptic transmission. It was observed that neurons with short-term facilitating synapses form predominantly reciprocal pairwise connections, while neurons with short-term depressing synapses form predominantly unidirectional pairwise connections. The cause of these structural differences in excitatory synaptic microcircuits is unknown. We show that these connectivity motifs emerge in networks of model neurons, from the interactions between short-term synaptic dynamics (SD) and long-term spike-timing dependent plasticity (STDP). While the impact of STDP on SD was shown in simultaneous neuronal pair recordings in vitro, the mutual interactions between STDP and SD in large networks are still the subject of intense research. Our approach combines an SD phenomenological model with an STDP model that faithfully captures long-term plasticity dependence on both spike times and frequency. As a proof of concept, we first simulate and analyze recurrent networks of spiking neurons with random initial connection efficacies and where synapses are either all short-term facilitating or all depressing. For identical external inputs to the network, and as a direct consequence of internally generated activity, we find that networks with depressing synapses evolve unidirectional connectivity motifs, while networks with facilitating synapses evolve reciprocal connectivity motifs. We then show that the same results hold for heterogeneous networks, including both facilitating and depressing synapses. This does not contradict a recent theory that proposes that motifs are shaped by external inputs, but rather complements it by examining the role of both the external inputs and the internally generated network activity. Our study highlights the conditions under which SD-STDP might explain the correlation between

  7. Conserved rhodopsin intradiscal structural motifs mediate stabilization: effects of zinc.

    PubMed

    Gleim, Scott; Stojanovic, Aleksandar; Arehart, Eric; Byington, Daniel; Hwa, John

    2009-03-01

    Retinitis pigmentosa (RP), a neurodegenerative disorder, can arise from single point mutations in rhodopsin, leading to a cascade of protein instability, misfolding, aggregation, rod cell death, retinal degeneration, and ultimately blindness. Divalent cations, such as zinc and copper, have allosteric effects on misfolded aggregates of comparable neurodegenerative disorders including Alzheimer disease, prion diseases, and ALS. We report that two structurally conserved low-affinity zinc coordination motifs, located among a cluster of RP mutations in the intradiscal loop region, mediate dose-dependent rhodopsin destabilization. Disruption of native interactions involving histidines 100 and 195, through site-directed mutagenesis or exogenous zinc coordination, results in significant loss of receptor stability. Furthermore, chelation with EDTA stabilizes the structure of both wild-type rhodopsin and the most prevalent rhodopsin RP mutation, P23H. These interactions suggest that homeostatic regulation of trace metal concentrations in the rod outer segment of the retina may be important both physiologically and for an important cluster of RP mutations. Furthermore, with a growing awareness of allosteric zinc binding domains on a diverse range of GPCRs, such principles may apply to many other receptors and their associated diseases. PMID:19206210

  8. Conserved rhodopsin intradiscal structural motifs mediate stabilization; effects of zinc

    PubMed Central

    Gleim, Scott; Stojanovic, Aleksandar; Arehart, Eric; Byington, Daniel; Hwa, John

    2009-01-01

    Retinitis pigmentosa (RP), a neurodegenerative disorder, can arise from single point mutations in rhodopsin, leading to a cascade of protein instability, misfolding, aggregation, rod cell death, retinal degeneration, and ultimately blindness. Divalent cations, such as zinc and copper, have allosteric effects on misfolded aggregates of comparable neurodegenerative disorders including Alzheimer disease, prion diseases, and ALS. We report that two structurally conserved low-affinity zinc coordination motifs, located among a cluster of RP mutations in the intradiscal loop region, mediate dose-dependent rhodopsin destabilization. Disruption of native interactions involving histidines 100 and 195, through site-directed mutagenesis or exogenous zinc coordination, results in significant loss of receptor stability. Furthermore, chelation with EDTA stabilizes the structure of both wild type rhodopsin and the most prevalent rhodopsin RP mutation, P23H. These interactions suggest that homeostatic regulation of trace metal concentrations in the rod outer segment of the retina may be important both physiologically and for an important cluster of RP mutations. Furthermore, with a growing awareness of allosteric zinc binding domains on a diverse range of GPCRs, such principles may apply to many other receptors and their associated diseases. PMID:19206210

  9. Native characterization of nucleic acid motif thermodynamics via non-covalent catalysis

    PubMed Central

    Wang, Chunyan; Bae, Jin H.; Zhang, David Yu

    2016-01-01

    DNA hybridization thermodynamics is critical for accurate design of oligonucleotides for biotechnology and nanotechnology applications, but parameters currently in use are inaccurately extrapolated based on limited quantitative understanding of thermal behaviours. Here, we present a method to measure the ΔG° of DNA motifs at temperatures and buffer conditions of interest, with significantly better accuracy (6- to 14-fold lower s.e.) than prior methods. The equilibrium constant of a reaction with thermodynamics closely approximating that of a desired motif is numerically calculated from directly observed reactant and product equilibrium concentrations; a DNA catalyst is designed to accelerate equilibration. We measured the ΔG° of terminal fluorophores, single-nucleotide dangles and multinucleotide dangles, in temperatures ranging from 10 to 45 °C. PMID:26782977
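    The relation between an observed equilibrium constant and ΔG° that the abstract relies on can be sketched in Python as follows. This is only an illustration of the textbook identity ΔG° = -RT ln K, not the authors' analysis pipeline; the function names and the example concentrations are hypothetical, and the sketch assumes ideal behaviour and a reaction with equal numbers of reactant and product species, so that K is dimensionless.

```python
import math

# Gas constant in kcal/(mol*K); ΔG° values for DNA are usually quoted in kcal/mol.
R_KCAL = 1.987204e-3


def equilibrium_constant(reactants, products):
    """K = (product of product concentrations) / (product of reactant concentrations).

    All concentrations must share the same unit (assumption of this sketch).
    """
    return math.prod(products) / math.prod(reactants)


def delta_g_standard(k_eq, temp_celsius):
    """Return ΔG° in kcal/mol from an equilibrium constant via ΔG° = -RT ln K."""
    if k_eq <= 0:
        raise ValueError("equilibrium constant must be positive")
    temp_kelvin = temp_celsius + 273.15
    return -R_KCAL * temp_kelvin * math.log(k_eq)


if __name__ == "__main__":
    # Hypothetical equilibrium concentrations (molar) for A + B <-> C + D.
    k = equilibrium_constant(reactants=[1e-8, 1e-8], products=[2e-8, 2e-8])
    print(f"K = {k:.2f}, dG = {delta_g_standard(k, 25.0):.3f} kcal/mol")
```

    Because ΔG° depends on temperature only through RT ln K, repeating the calculation at each temperature of interest (10 to 45 °C in the abstract) requires a separately observed K at that temperature.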

  10. An essential GT motif in the lamin A promoter mediates activation by CREB-binding protein

    SciTech Connect

    Janaki Ramaiah, M.; Parnaik, Veena K. (E-mail: veenap@ccmb.res.in)

    2006-09-29

    Lamin A is an important component of nuclear architecture in mammalian cells. Mutations in the human lamin A gene lead to highly degenerative disorders that affect specific tissues. In studies directed towards understanding the mode of regulation of the lamin A promoter, we have identified an essential GT motif at -55 position by reporter gene assays and mutational analysis. Binding of this sequence to Sp transcription factors has been observed in electrophoretic mobility shift assays and by chromatin immunoprecipitation studies. Further functional analysis by co-expression of recombinant proteins and ChIP assays has shown an important regulatory role for CREB-binding protein in promoter activation, which is mediated by the GT motif.

  11. A Common Structural Motif in the Binding of Virulence Factors to Bacterial Secretion Chaperones

    SciTech Connect

    Lilic, M.; Vujanac, M.; Stebbins, C.

    2006-01-01

    Salmonella invasion protein A (SipA) is translocated into host cells by a type III secretion system (T3SS) and comprises two regions: one domain binds its cognate type III secretion chaperone, InvB, in the bacterium to facilitate translocation, while a second domain functions in the host cell, contributing to bacterial uptake by polymerizing actin. We present here the crystal structures of the SipA chaperone binding domain (CBD) alone and in complex with InvB. The SipA CBD is found to consist of a nonglobular polypeptide as well as a large globular domain, both of which are necessary for binding to InvB. We also identify a structural motif that may direct virulence factors to their cognate chaperones in a diverse range of pathogenic bacteria. Disruption of this structural motif leads to a destabilization of several chaperone-substrate complexes from different species, as well as an impairment of secretion in Salmonella.

  12. Native characterization of nucleic acid motif thermodynamics via non-covalent catalysis

    NASA Astrophysics Data System (ADS)

    Wang, Chunyan; Bae, Jin H.; Zhang, David Yu

    2016-01-01

    DNA hybridization thermodynamics is critical for accurate design of oligonucleotides for biotechnology and nanotechnology applications, but parameters currently in use are inaccurately extrapolated based on limited quantitative understanding of thermal behaviours. Here, we present a method to measure the ΔG° of DNA motifs at temperatures and buffer conditions of interest, with significantly better accuracy (6- to 14-fold lower s.e.) than prior methods. The equilibrium constant of a reaction with thermodynamics closely approximating that of a desired motif is numerically calculated from directly observed reactant and product equilibrium concentrations; a DNA catalyst is designed to accelerate equilibration. We measured the ΔG° of terminal fluorophores, single-nucleotide dangles and multinucleotide dangles, in temperatures ranging from 10 to 45 °C.

  13. A Convex Atomic-Norm Approach to Multiple Sequence Alignment and Motif Discovery

    PubMed Central

    Yen, Ian E. H.; Lin, Xin; Zhang, Jiong; Ravikumar, Pradeep; Dhillon, Inderjit S.

    2016-01-01

    Multiple Sequence Alignment and Motif Discovery, known as NP-hard problems, are two fundamental tasks in Bioinformatics. Existing approaches to these two problems are based on either local search methods such as Expectation Maximization (EM), Gibbs Sampling or greedy heuristic methods. In this work, we develop a convex relaxation approach to both problems based on the recent concept of atomic norm and develop a new algorithm, termed Greedy Direction Method of Multiplier, for solving the convex relaxation with two convex atomic constraints. Experiments show that our convex relaxation approach produces solutions of higher quality than those standard tools widely-used in Bioinformatics community on the Multiple Sequence Alignment and Motif Discovery problems. PMID:27559428

  14. Beta-turn propensities as paradigms for the analysis of structural motifs to engineer protein stability.

    PubMed Central

    Ohage, E. C.; Graml, W.; Walter, M. M.; Steinbacher, S.; Steipe, B.

    1997-01-01

    The thermodynamic stability of a protein provides an experimental metric for the relationship of protein sequence and native structure. We have investigated an approach based on an analysis of the structural database for stability engineering of an immunoglobulin variable domain. The most frequently occurring residues in specific positions of beta-turn motifs were predicted to increase the folding stability of mutants that were constructed by site-directed mutagenesis. Even in positions in which different residues are conserved in immunoglobulin sequences, the predictions were confirmed. Frequently, mutants with increased beta-turn propensities display increased folding cooperativities, suggesting pronounced effects on the unfolded state independent of the expected effect on conformational entropy. We conclude that structural motifs with predominantly local interactions can serve as templates with which patterns of sequence preferences can be extracted from the database of protein structures. Such preferences can predict the stability effects of mutations for protein engineering and design. PMID:9007995

  15. A million peptide motifs for the molecular biologist.

    PubMed

    Tompa, Peter; Davey, Norman E; Gibson, Toby J; Babu, M Madan

    2014-07-17

    A molecular description of functional modules in the cell is the focus of many high-throughput studies in the postgenomic era. A large portion of biomolecular interactions in virtually all cellular processes is mediated by compact interaction modules, referred to as peptide motifs. Such motifs are typically less than ten residues in length, occur within intrinsically disordered regions, and are recognized and/or posttranslationally modified by structured domains of the interacting partner. In this review, we suggest that there might be over a million instances of peptide motifs in the human proteome. While this staggering number suggests that peptide motifs are numerous and the most understudied functional module in the cell, it also holds great opportunities for new discoveries. PMID:25038412

  16. 10. DETAIL OF CORNICE MOULDING WITH RAM'S HEAD MOTIF. EIGHT ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    10. DETAIL OF CORNICE MOULDING WITH RAM'S HEAD MOTIF. EIGHT SHADES OF GOLD LEAF AND BURNISHED GOLD LEAF WERE USED FOR THE INTERIOR FINISHES - Anaconda Historic District, Washoe Theater, 305 Main Street, Anaconda, Deer Lodge County, MT

  17. DETAIL OF CORNICE MOULDING WITH RAM'S HEAD MOTIF. EIGHT SHADES ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    DETAIL OF CORNICE MOULDING WITH RAM'S HEAD MOTIF. EIGHT SHADES OF GOLD LEAF AND BURNISHED GOLD LEAF WERE USED FOR THE INTERIOR FINISHES. - Anaconda Historic District, Washoe Theater, 305 Main Street, Anaconda, Deer Lodge County, MT

  18. The building blocks and motifs of RNA architecture

    PubMed Central

    Leontis, Neocles B; Lescoute, Aurelie; Westhof, Eric

    2010-01-01

    RNA motifs can be defined broadly as recurrent structural elements containing multiple intramolecular RNA–RNA interactions, as observed in atomic-resolution RNA structures. They constitute the modular building blocks of RNA architecture, which is organized hierarchically. Recent work has focused on analyzing RNA backbone conformations to identify, define and search for new instances of recurrent motifs in X-ray structures. One current view asserts that recurrent RNA strand segments with characteristic backbone configurations qualify as independent motifs. Other considerations indicate that, to characterize modular motifs, one must take into account the larger structural context of such strand segments. This follows the biologically relevant motivation, which is to identify RNA structural characteristics that are subject to sequence constraints and that thus relate RNA architectures to sequences. PMID:16713707

  19. Identification of Internal Transcribed Spacer Sequence Motifs in Truffles: a First Step toward Their DNA Bar Coding

    PubMed Central

    El Karkouri, Khalid; Murat, Claude; Zampieri, Elisa; Bonfante, Paola

    2007-01-01

    This work presents DNA sequence motifs from the internal transcribed spacer (ITS) of the nuclear rRNA repeat unit which are useful for the identification of five European and Asiatic truffles (Tuber magnatum, T. melanosporum, T. indicum, T. aestivum, and T. mesentericum). Truffles are edible mycorrhizal ascomycetes that show similar morphological characteristics but that have distinct organoleptic and economic values. A total of 36 out of 46 ITS1 or ITS2 sequence motifs have allowed an accurate in silico distinction of the five truffles to be made (i.e., by pattern matching and/or BLAST analysis on downloaded GenBank sequences and directly against GenBank databases). The motifs considered the intraspecific genetic variability of each species, including rare haplotypes, and assigned their respective species from either the ascocarps or ectomycorrhizas. The data indicate that short ITS1 or ITS2 motifs (≤50 bp in size) can be considered promising tools for truffle species identification. A dot blot hybridization analysis of T. magnatum and T. melanosporum compared with other close relatives or distant lineages allowed at least one highly specific motif to be identified for each species. These results were confirmed in a blind test which included new field isolates. The current work has provided a reliable new tool for a truffle oligonucleotide bar code and identification in ecological and evolutionary studies. PMID:17601808

  20. Identification of internal transcribed spacer sequence motifs in truffles: a first step toward their DNA bar coding.

    PubMed

    El Karkouri, Khalid; Murat, Claude; Zampieri, Elisa; Bonfante, Paola

    2007-08-01

    This work presents DNA sequence motifs from the internal transcribed spacer (ITS) of the nuclear rRNA repeat unit which are useful for the identification of five European and Asiatic truffles (Tuber magnatum, T. melanosporum, T. indicum, T. aestivum, and T. mesentericum). Truffles are edible mycorrhizal ascomycetes that show similar morphological characteristics but that have distinct organoleptic and economic values. A total of 36 out of 46 ITS1 or ITS2 sequence motifs have allowed an accurate in silico distinction of the five truffles to be made (i.e., by pattern matching and/or BLAST analysis on downloaded GenBank sequences and directly against GenBank databases). The motifs considered the intraspecific genetic variability of each species, including rare haplotypes, and assigned their respective species from either the ascocarps or ectomycorrhizas. The data indicate that short ITS1 or ITS2 motifs (≤50 bp in size) can be considered promising tools for truffle species identification. A dot blot hybridization analysis of T. magnatum and T. melanosporum compared with other close relatives or distant lineages allowed at least one highly specific motif to be identified for each species. These results were confirmed in a blind test which included new field isolates. The current work has provided a reliable new tool for a truffle oligonucleotide bar code and identification in ecological and evolutionary studies. PMID:17601808

  1. Motif-Synchronization: A new method for analysis of dynamic brain networks with EEG

    NASA Astrophysics Data System (ADS)

    Rosário, R. S.; Cardoso, P. T.; Muñoz, M. A.; Montoya, P.; Miranda, J. G. V.

    2015-12-01

    The major aim of this work was to propose a new association method known as Motif-Synchronization. This method was developed to provide information about the synchronization degree and direction between two nodes of a network by counting the number of occurrences of some patterns between any two time series. The second objective of this work was to present a new methodology for the analysis of dynamic brain networks, by combining the Time-Varying Graph (TVG) method with a directional association method. We further applied the new algorithms to a set of human electroencephalogram (EEG) signals to perform a dynamic analysis of the brain functional networks (BFN).

  2. Robust and Adaptive MicroRNA-Mediated Incoherent Feedforward Motifs

    NASA Astrophysics Data System (ADS)

    Xu, Feng-Dan; Liu, Zeng-Rong; Zhang, Zhi-Yong; Shen, Jian-Wei

    2009-02-01

    We integrate transcriptional and post-transcriptional regulation into microRNA-mediated incoherent feedforward motifs and analyse their dynamical behaviour and functions. The analysis shows that the behaviour of the system is almost uninfluenced by variation of the input within certain ranges and by the introduction of delay and noise. The results indicate that microRNA-mediated incoherent feedforward motifs greatly enhance the robustness of gene regulation.

  3. Network motif-based method for identifying coronary artery disease

    PubMed Central

    Li, Yin; Cong, Yan; Zhao, Yun

    2016-01-01

    The present study aimed to develop a more efficient method for identifying coronary artery disease (CAD) than the conventional method using individual differentially expressed genes (DEGs). GSE42148 gene microarray data were downloaded, preprocessed and screened for DEGs. Additionally, based on transcriptional regulation data obtained from ENCODE database and protein-protein interaction data from the HPRD, the common genes were downloaded and compared with genes annotated from gene microarrays to screen additional common genes in order to construct an integrated regulation network. FANMOD was then used to detect significant three-gene network motifs. Subsequently, GlobalAncova was used to screen differential three-gene network motifs between the CAD group and the normal control data from GSE42148. Genes involved in the differential network motifs were then subjected to functional annotation and pathway enrichment analysis. Finally, clustering analysis of the CAD and control samples was performed based on individual DEGs and the top 20 network motifs identified. In total, 9,008 significant three-node network motifs were detected from the integrated regulation network; these were categorized into 22 interaction modes, each containing a minimum of one transcription factor. Subsequently, 1,132 differential network motifs involving 697 genes were screened between the CAD and control group. The 697 genes were enriched in 154 gene ontology terms, including 119 biological processes, and 14 KEGG pathways. Identifying patients with CAD based on the top 20 network motifs provided increased accuracy compared with the conventional method based on individual DEGs. The results of the present study indicate that the network motif-based method is more efficient and accurate for identifying CAD patients than the conventional method based on individual DEGs. PMID:27347046

  4. Transcriptional Network Growing Models Using Motif-Based Preferential Attachment

    PubMed Central

    Abdelzaher, Ahmed F.; Al-Musawi, Ahmad F.; Ghosh, Preetam; Mayo, Michael L.; Perkins, Edward J.

    2015-01-01

    Understanding relationships between architectural properties of gene-regulatory networks (GRNs) has been one of the major goals in systems biology and bioinformatics, as it can provide insights into, e.g., disease dynamics and drug development. Such GRNs are characterized by their scale-free degree distributions and existence of network motifs – i.e., small-node subgraphs that occur more abundantly in GRNs than expected from chance alone. Because these transcriptional modules represent “building blocks” of complex networks and exhibit a wide range of functional and dynamical properties, they may contribute to the remarkable robustness and dynamical stability associated with the whole of GRNs. Here, we developed network-construction models to better understand this relationship, which produce randomized GRNs by using transcriptional motifs as the fundamental growth unit in contrast to other methods that construct similar networks on a node-by-node basis. Because this model produces networks with a prescribed lower bound on the number of choice transcriptional motifs (e.g., downlinks, feed-forward loops), its fidelity to the motif distributions observed in model organisms represents an improvement over existing methods, which we validated by contrasting their resultant motif and degree distributions against existing network-growth models and data from the model organism of the bacterium Escherichia coli. These models may therefore serve as novel testbeds for further elucidating relationships between the topology of transcriptional motifs and network-wide dynamical properties. PMID:26528473
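    As a small illustration of what counting a transcriptional motif means, the Python sketch below counts feed-forward loops (X regulates Y, Y regulates Z, and X also regulates Z) in a directed edge list. It is a brute-force sketch for toy networks, not the network-growth model of the abstract; the function name and the example edges are hypothetical.

```python
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count feed-forward loops in a directed graph.

    A feed-forward loop is an ordered triple (x, y, z) of distinct nodes
    with edges x->y, y->z and x->z. Each role assignment is counted once.
    """
    edge_set = set(edges)
    nodes = {node for edge in edges for node in edge}
    count = 0
    for x, y, z in permutations(nodes, 3):
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set:
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical toy network: two feed-forward loops share regulator A and target C.
    toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D"), ("D", "C")]
    print(count_feed_forward_loops(toy_edges))
```

    The triple loop scales cubically with the number of nodes, so dedicated motif-detection tools are the better choice for genome-scale networks.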

  5. Discovering Motifs in Biological Sequences Using the Micron Automata Processor.

    PubMed

    Roy, Indranil; Aluru, Srinivas

    2016-01-01

    Finding approximately conserved sequences, called motifs, across multiple DNA or protein sequences is an important problem in computational biology. In this paper, we consider the (l, d) motif search problem of identifying one or more motifs of length l present in at least q of the n given sequences, with each occurrence differing from the motif in at most d substitutions. The problem is known to be NP-complete, and the largest solved instance reported to date is (26, 11). We propose a novel algorithm for the (l, d) motif search problem using streaming execution over a large set of non-deterministic finite automata (NFA). This solution is designed to take advantage of the Micron Automata Processor, a new technology close to deployment that can simultaneously execute multiple NFA in parallel. We demonstrate the capability for solving much larger instances of the (l, d) motif search problem using the resources available within a single automata processor board, by estimating run-times for problem instances (39, 18) and (40, 17). The paper serves as a useful guide to solving problems using this new accelerator technology. PMID:26886735
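    The (l, d) motif search problem defined in this abstract can be stated compactly in code. The Python sketch below is a plain exhaustive search over all candidate l-mers, which is only feasible for very small l; it is not the automata-based algorithm of the paper, and the function names and test sequences are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, seq, d):
    """True if some window of seq differs from motif in at most d substitutions."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d for i in range(len(seq) - l + 1))


def ld_motif_search(seqs, l, d, q=None, alphabet="ACGT"):
    """Return every l-mer that occurs, with at most d substitutions,
    in at least q of the given sequences (q defaults to all of them)."""
    q = len(seqs) if q is None else q
    hits = []
    for candidate in product(alphabet, repeat=l):
        motif = "".join(candidate)
        support = sum(occurs_within(motif, s, d) for s in seqs)
        if support >= q:
            hits.append(motif)
    return hits


if __name__ == "__main__":
    # Hypothetical sequences, each carrying a copy of ACGT with at most one substitution.
    sequences = ["TTACGTTT", "GGACCTGG", "CCAGGTCC"]
    print("ACGT" in ld_motif_search(sequences, l=4, d=1))
```

    The candidate space grows as 4^l for DNA, which is why instances such as (26, 11) need specialised algorithms or hardware rather than enumeration.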

  6. Finding specific RNA motifs: Function in a zeptomole world?

    PubMed Central

    Knight, Rob; Yarus, Michael

    2003-01-01

    We have developed a new method for estimating the abundance of any modular (piecewise) RNA motif within a longer random region. We have used this method to estimate the size of the active motifs available to modern SELEX experiments (picomoles of unique sequences) and to a plausible RNA World (zeptomoles of unique sequences: 1 zmole = 602 sequences). Unexpectedly, activities such as specific isoleucine binding are almost certainly present in zeptomoles of molecules, and even ribozymes such as self-cleavage motifs may appear (depending on assumptions about the minimal structures). The number of specified nucleotides is not the only important determinant of a motif’s rarity: The number of modules into which it is divided, and the details of this division, are also crucial. We propose three maxims for easily isolated motifs: the Maxim of Minimization, the Maxim of Multiplicity, and the Maxim of the Median. These maxims together state that selected motifs should be small and composed of as many separate, equally sized modules as possible. For evenly divided motifs with four modules, the largest accessible activity in picomole scale (1–1000 pmole) pools of length 100 is about 34 nucleotides; while for zeptomole scale (1–1000 zmole) pools it is about 20 specific nucleotides (50% probability of occurrence). This latter figure includes some ribozymes and aptamers. Consequently, an RNA metabolism apparently could have begun with only zeptomoles of RNA molecules. PMID:12554865
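    The conversion quoted in the abstract (1 zmole = 602 sequences) follows directly from the Avogadro constant, as this short Python check shows. The helper name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)

ZEPTO = 1e-21  # SI prefix zepto
PICO = 1e-12   # SI prefix pico


def molecules_in(moles):
    """Number of molecules contained in the given amount of substance."""
    return moles * AVOGADRO


if __name__ == "__main__":
    print(round(molecules_in(1 * ZEPTO)))   # 602 molecules in one zeptomole
    print(f"{molecules_in(1 * PICO):.3e}")  # molecules in one picomole
```

    A picomole-scale SELEX pool therefore holds about a billion times more molecules than a zeptomole-scale pool, which is the gap the abstract's two size estimates (about 34 versus about 20 specified nucleotides) reflect.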

  7. Motif for controllable toggle switch in gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Zhao, Chen; Bin, Ao; Ye, Weiming; Fan, Ying; Di, Zengru

    2015-02-01

    The toggle switch, a common phenomenon in gene regulatory networks, has been recognized as important for biological functions. Despite much effort dedicated to understanding the toggle switch and designing synthetic biology circuits to achieve this function, we still lack a comprehensive understanding of the intrinsic dynamics behind the phenomenon and of the minimum structure required to produce a toggle switch. In this paper, we discover a minimum structure, a motif that enables a controllable toggle switch. In particular, the motif consists of a transformative double negative feedback loop (DNFL) that is regulated by an additional driver node. By enumerating all possible regulatory configurations from the driver node, we identify two types of motifs associated with the toggle switch, which is captured by the existence of bistable states. The toggle switch is controllable in the sense that the gap between the bistable states is adjustable, as determined by the regulatory strength from the driver nodes. We test the effect of the motifs in a self-oscillating gene regulatory network (SON) with respect to the interplay between the motifs and the other genes, and find that the switching dynamics of the whole network can be successfully controlled insofar as the network contains a single motif. Our findings help uncover the underlying nonlinear dynamics of the controllable toggle switch and may have implications for devising biological circuits in the field of synthetic biology.
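    The bistability that characterizes a double negative feedback loop can be illustrated with a generic two-gene mutual-repression model. The Python sketch below integrates the textbook equations dx/dt = a/(1 + y^n) - x and dy/dt = a/(1 + x^n) - y with a simple Euler scheme. It is an assumption-laden toy model, not the driver-regulated motif of the paper: the parameter values, the function name and the absence of a driver node are all choices made for this sketch.

```python
def simulate_toggle(x0, y0, a=10.0, n=2, dt=0.01, steps=5000):
    """Integrate a symmetric mutual-repression circuit with forward Euler.

    x and y each repress the other's production through a Hill function;
    both decay at unit rate. Returns the final (x, y) state.
    """
    x, y = x0, y0
    for _ in range(steps):
        dx = a / (1.0 + y ** n) - x
        dy = a / (1.0 + x ** n) - y
        x, y = x + dt * dx, y + dt * dy
    return x, y


if __name__ == "__main__":
    # Two nearby initial conditions end in opposite stable states.
    print(simulate_toggle(1.0, 0.5))
    print(simulate_toggle(0.5, 1.0))
```

    With these assumed parameters the symmetric steady state (x = y = 2) is unstable, so whichever gene starts slightly ahead ends up highly expressed while the other is repressed. The distance between the two stable states is the "gap" that the abstract describes as adjustable through the driver node.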

  8. Heparin-Binding Motifs and Biofilm Formation by Candida albicans

    PubMed Central

    Green, Julianne V.; Orsborn, Kris I.; Zhang, Minlu; Tan, Queenie K. G.; Greis, Kenneth D.; Porollo, Alexey; Andes, David R.; Long Lu, Jason; Hostetter, Margaret K.

    2013-01-01

    Candida albicans is a leading pathogen in infections of central venous catheters, which are frequently infused with heparin. Binding of C. albicans to medically relevant concentrations of soluble and plate-bound heparin was demonstrable by confocal microscopy and enzyme-linked immunosorbent assay (ELISA). A sequence-based search identified 34 C. albicans surface proteins containing ≥1 match to linear heparin-binding motifs. The virulence factor Int1 contained the most putative heparin-binding motifs (n = 5); peptides encompassing 2 of 5 motifs bound to heparin-Sepharose. Alanine substitution of lysine residues K805/K806 in 804QKKHQIHK811 (motif 1 of Int1) markedly attenuated biofilm formation in central venous catheters in rats, whereas alanine substitution of K1595/R1596 in 1593FKKRFFKL1600 (motif 4 of Int1) did not impair biofilm formation. Affinity-purified immunoglobulin G (IgG) recognizing motif 1 abolished biofilm formation in central venous catheters; preimmune IgG had no effect. After heparin treatment of C. albicans, soluble peptides from multiple C. albicans surface proteins were detected, such as Eno1, Pgk1, Tdh3, and Ssa1/2 but not Int1, suggesting that heparin changes candidal surface structures and may modify some antigens critical for immune recognition. These studies define a new mechanism of biofilm formation for C. albicans and a novel strategy for inhibiting catheter-associated biofilms. PMID:23904295

  9. cWINNOWER Algorithm for Finding Fuzzy DNA Motifs

    NASA Technical Reports Server (NTRS)

    Liang, Shoudan

    2003-01-01

    The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if multiple mutated copies of the motif (i.e., the signals) are present in the DNA sequence in sufficient abundance. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum number of detectable motifs qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc, by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12000 for (l,d) = (15,4).

  10. Motif types, motif locations and base composition patterns around the RNA polyadenylation site in microorganisms, plants and animals

    PubMed Central

    2014-01-01

    Background The polyadenylation of RNA is critical for gene functioning, but the conserved sequence motifs (often called signal or signature motifs), motif locations and abundances, and base composition patterns around mRNA polyadenylation [poly(A)] sites are still uncharacterized in most species. The evolutionary tendency for poly(A) site selection is still largely unknown. Results We analyzed the poly(A) site regions of 31 species or phyla. Different groups of species showed different poly(A) signal motifs: UUACUU at the poly(A) site in the parasite Trypanosoma cruzi; UGUAAC (approximately 13 bases upstream of the site) in the alga Chlamydomonas reinhardtii; UGUUUG (or UGUUUGUU) at mainly the fourth base downstream of the poly(A) site in the parasite Blastocystis hominis; and AAUAAA at approximately 16 bases and approximately 19 bases upstream of the poly(A) site in animals and plants, respectively. Polyadenylation signal motifs are usually several hundred times more abundant around poly(A) sites than in whole genomes. These predominant motifs usually had very specific locations, whether upstream of, at, or downstream of poly(A) sites, depending on the species or phylum. The poly(A) site was usually an adenosine (A) in all analyzed species except for B. hominis, and there was weak A predominance in C. reinhardtii. Fungi, animals, plants, and the protist Phytophthora infestans shared a general base abundance pattern (or base composition pattern) of “U-rich—A-rich—U-rich—Poly(A) site—U-rich regions”, or U-A-U-A-U for short, with some variation for each kingdom or subkingdom. Conclusion This study identified the poly(A) signal motifs, motif locations, and base composition patterns around mRNA poly(A) sites in protists, fungi, plants, and animals and provided insight into poly(A) site evolution. PMID:25052519
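    Locating a signal motif relative to a poly(A) site, as in the positional analysis summarized above, can be sketched in Python as follows. The function name, the 40-base search window and the convention of measuring from the first base of the signal to the poly(A) site are assumptions of this sketch, not details taken from the study.

```python
def signal_offsets(rna, polya_site, signal="AAUAAA", window=40):
    """Return how many bases upstream of the poly(A) site each signal copy starts.

    rna        -- RNA sequence as a string
    polya_site -- 0-based index of the poly(A) site in rna
    window     -- number of upstream bases to search (assumed value)
    """
    start = max(0, polya_site - window)
    region = rna[start:polya_site]
    offsets = []
    pos = region.find(signal)
    while pos != -1:
        offsets.append(polya_site - (start + pos))
        pos = region.find(signal, pos + 1)  # step by one to allow overlaps
    return offsets


if __name__ == "__main__":
    # Hypothetical transcript with AAUAAA starting 16 bases upstream of the site.
    transcript = "G" * 20 + "AAUAAA" + "C" * 10 + "A" + "U" * 5
    print(signal_offsets(transcript, polya_site=36))
```

    Collecting these offsets over many annotated poly(A) sites and plotting their histogram is one simple way to see the position-specific enrichment the abstract reports.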

  11. Calmodulation meta-analysis: predicting calmodulin binding via canonical motif clustering.

    PubMed

    Mruk, Karen; Farley, Brian M; Ritacco, Alan W; Kobertz, William R

    2014-07-01

    The calcium-binding protein calmodulin (CaM) directly binds to membrane transport proteins to modulate their function in response to changes in intracellular calcium concentrations. Because CaM recognizes and binds to a wide variety of target sequences, identifying CaM-binding sites is difficult, requiring intensive sequence gazing and extensive biochemical analysis. Here, we describe a straightforward computational script that rapidly identifies canonical CaM-binding motifs within an amino acid sequence. Analysis of the target sequences from high resolution CaM-peptide structures using this script revealed that CaM often binds to sequences that have multiple overlapping canonical CaM-binding motifs. The addition of a positive charge discriminator to this meta-analysis resulted in a tool that identifies potential CaM-binding domains within a given sequence. To allow users to search for CaM-binding motifs within a protein of interest, perform the meta-analysis, and then compare the results to target peptide-CaM structures deposited in the Protein Data Bank, we created a website and online database. The availability of these tools and analyses will facilitate the design of CaM-related studies of ion channels and membrane transport proteins. PMID:24935744
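    A script of the kind described in this abstract can be approximated with regular expressions. The Python sketch below scans a protein sequence for spacing-based motifs named after the positions of their hydrophobic anchor residues (for example 1-10 or 1-8-14). The anchor residue set, the list of motif classes and all function names are assumptions made for illustration; the sketch does not reproduce the authors' script or its positive-charge discriminator.

```python
import re

HYDROPHOBIC = "FILVW"  # assumed set of anchor residues

# Motif classes named by the 1-based positions of their anchors (assumed list).
MOTIFS = {
    "1-10": (1, 10),
    "1-14": (1, 14),
    "1-5-10": (1, 5, 10),
    "1-8-14": (1, 8, 14),
}


def anchor_pattern(positions):
    """Build a regex with a hydrophobic residue at each given position.

    A lookahead wrapper lets overlapping matches be reported.
    """
    parts = []
    previous = 0
    for pos in positions:
        gap = pos - previous - 1
        if gap:
            parts.append(f".{{{gap}}}")
        parts.append(f"[{HYDROPHOBIC}]")
        previous = pos
    return re.compile("(?=(" + "".join(parts) + "))")


def find_motifs(sequence):
    """Return (motif name, 1-based start, matched segment) for every hit."""
    hits = []
    for name, positions in MOTIFS.items():
        for match in anchor_pattern(positions).finditer(sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    # Hypothetical peptide with anchors at positions 1, 5 and 10.
    for hit in find_motifs("LAAAIAAAAL"):
        print(hit)
```

    Reporting overlapping hits matters here because the abstract notes that calmodulin often binds sequences containing multiple overlapping canonical motifs.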

  12. Mutational analysis of the adeno-associated virus type 2 Rep68 protein helicase motifs.

    PubMed

    Walker, S L; Wonderling, R S; Owens, R A

    1997-09-01

    The adeno-associated virus type 2 (AAV) Rep78 and Rep68 proteins are required for viral replication. These proteins are encoded by unspliced and spliced transcripts, respectively, from the p5 promoter of AAV and therefore have overlapping amino acid sequences. The Rep78 and Rep68 proteins share a variety of activities including endonuclease, helicase, and ATPase activities and the ability to bind AAV hairpin DNA. The part of the amino acid sequence which is identical in Rep78 and Rep68 contains consensus helicase motifs that are conserved among the parvovirus replication proteins. In the present study, we mutated highly conserved amino acids within these helicase motifs. The mutant proteins were synthesized as maltose binding protein-Rep68 fusions in Escherichia coli cells and affinity purified on amylose resin. The fusion proteins were assayed in vitro, and their activities were directly compared to those of the fusion protein MBP-Rep68 delta, which contains most of the amino acid sequences common to Rep78 and Rep68 and was demonstrated previously to have all of the in vitro activities of wild-type Rep78 and Rep68. Our analysis showed that almost all mutations in the putative helicase motifs severely reduced or abolished helicase activity in vitro. Most mutants also had ATPase activity less than one-eighth of the wild-type levels and lacked endonuclease activity. PMID:9261429

  13. Tetratricopeptide Repeat Motifs in the World of Bacterial Pathogens: Role in Virulence Mechanisms

    PubMed Central

    Straskova, Adela; Dankova, Vera; Hartlova, Anetta; Ceckova, Martina; Staud, Frantisek; Stulik, Jiri

    2013-01-01

    The tetratricopeptide repeat (TPR) structural motif is known to occur in a wide variety of proteins present in prokaryotic and eukaryotic organisms. The TPR motif represents an elegant module for the assembly of various multiprotein complexes, and thus, TPR-containing proteins often play roles in vital cell processes. As the TPR profile is well defined, the complete TPR protein repertoire of a bacterium with a known genomic sequence can be predicted. This provides a tremendous opportunity for investigators to identify new TPR-containing proteins and study them in detail. In the past decade, TPR-containing proteins of bacterial pathogens have been reported to be directly related to virulence-associated functions. In this minireview, we summarize the current knowledge of the TPR-containing proteins involved in virulence mechanisms of bacterial pathogens while highlighting the importance of TPR motifs for the proper functioning of class II chaperones of a type III secretion system in the pathogenesis of Yersinia, Pseudomonas, and Shigella. PMID:23264049

  14. Ubiquitous presence of the hammerhead ribozyme motif along the tree of life

    PubMed Central

    de la Peña, Marcos; García-Robles, Inmaculada

    2010-01-01

    Examples of small self-cleaving RNAs embedded in noncoding regions already have been found to be involved in the control of gene expression, although their origin remains uncertain. In this work, we show the widespread occurrence of the hammerhead ribozyme (HHR) motif among genomes from the Bacteria, Chromalveolata, Plantae, and Metazoa kingdoms. Intergenic HHRs were detected in three different bacterial genomes, whereas metagenomic data from Galapagos Islands showed the occurrence of similar ribozymes that could be regarded as direct relics from the RNA world. Among eukaryotes, HHRs were detected in the genomes of three water molds as well as 20 plant species, ranging from unicellular algae to vascular plants. These HHRs were very similar to those previously described in small RNA plant pathogens and, in some cases, appeared as close tandem repetitions. A parallel situation of tandemly repeated HHR motifs was also detected in the genomes of lower metazoans from cnidarians to invertebrates, with special emphasis among hematophagous and parasitic organisms. Altogether, these findings unveil the HHR as a widespread motif in DNA genomes, which would be involved in new forms of retrotransposable elements. PMID:20705646

  15. In vivo analysis of Caenorhabditis elegans noncoding RNA promoter motifs

    PubMed Central

    Li, Tiantian; He, Housheng; Wang, Yunfei; Zheng, Haixia; Skogerbø, Geir; Chen, Runsheng

    2008-01-01

    Background Noncoding RNAs (ncRNAs) play important roles in a variety of cellular processes. Characterizing the transcriptional activity of ncRNA promoters is therefore a critical step toward understanding the complex cellular roles of ncRNAs. Results Here we present an in vivo transcriptional analysis of three C. elegans ncRNA upstream motifs (UM1-3). Transcriptional activity of all three motifs has been demonstrated, and mutational analysis revealed differential contributions of different parts of each motif. We showed that upstream motif 1 (UM1) can drive the expression of green fluorescent protein (GFP), and utilized this for detailed analysis of temporal and spatial expression patterns of 5 SL2 RNAs. Upstream motifs 2 and 3 do not drive GFP expression, and termination at consecutive T runs suggests transcription by RNA polymerase III. The UM2 sequence resembles the tRNA promoter, and is actually embedded within its own short-lived, primary transcript. This is a structure which is also found at a few plant and yeast loci, and may indicate an evolutionarily very old dicistronic transcription pattern in which a tRNA serves as a promoter for an adjacent snoRNA. Conclusion The study has demonstrated that the three upstream motifs UM1-3 have promoter activity. The UM1 sequence can drive expression of GFP, which allows for the use of UM1::GFP fusion constructs to study temporal-spatial expression patterns of UM1 ncRNA loci. The UM1 loci appear to act in concert with other upstream sequences, whereas the transcriptional activities of the UM2 and UM3 are confined to the motifs themselves. PMID:18680611

  16. The C-terminal CGHC motif of protein disulfide isomerase supports thrombosis

    PubMed Central

    Zhou, Junsong; Wu, Yi; Wang, Lu; Rauova, Lubica; Hayes, Vincent M.; Poncz, Mortimer; Essex, David W.

    2015-01-01

    Protein disulfide isomerase (PDI) has two distinct CGHC redox-active sites; however, the contribution of these sites during different physiologic reactions, including thrombosis, is unknown. Here, we evaluated the role of PDI and redox-active sites of PDI in thrombosis by generating mice with blood cells and vessel wall cells lacking PDI (Mx1-Cre Pdifl/fl mice) and transgenic mice harboring PDI that lacks a functional C-terminal CGHC motif [PDI(ss-oo) mice]. Both mouse models showed decreased fibrin deposition and platelet accumulation in laser-induced cremaster arteriole injury, and PDI(ss-oo) mice had attenuated platelet accumulation in FeCl3-induced mesenteric arterial injury. These defects were rescued by infusion of recombinant PDI containing only a functional C-terminal CGHC motif [PDI(oo-ss)]. PDI infusion restored fibrin formation, but not platelet accumulation, in eptifibatide-treated wild-type mice, suggesting a direct role of PDI in coagulation. In vitro aggregation of platelets from PDI(ss-oo) mice and PDI-null platelets was reduced; however, this defect was rescued by recombinant PDI(oo-ss). In human platelets, recombinant PDI(ss-oo) inhibited aggregation, while recombinant PDI(oo-ss) potentiated aggregation. Platelet secretion assays demonstrated that the C-terminal CGHC motif of PDI is important for P-selectin expression and ATP secretion through a non-αIIbβ3 substrate. In summary, our results indicate that the C-terminal CGHC motif of PDI is important for platelet function and coagulation. PMID:26529254

  17. Structural motifs, mixing, and segregation effects in 38-atom binary clusters

    NASA Astrophysics Data System (ADS)

    Paz-Borbón, Lauro Oliver; Johnston, Roy L.; Barcaro, Giovanni; Fortunelli, Alessandro

    2008-04-01

    Thirty eight-atom binary clusters composed of elements from groups 10 and 11 of the Periodic Table mixing a second-row with a third-row transition metal (TM) (i.e., clusters composed of the four pairs: Pd-Pt, Ag-Au, Pd-Au, and Ag-Pt) are studied through a combined empirical-potential (EP)/density functional (DF) method. A "system comparison" approach is adopted in order to analyze a wide diversity of structural motifs, and the energy competition among different structural motifs is studied at the DF level for these systems, mainly focusing on the composition 24-14 (the first number refers to the second-row TM atom) but also considering selected motifs with compositions 19-19 (of interest for investigating surface segregation effects) and 32-6 (also 14-24 and 6-32 for the Pd-Au pair). The results confirm the EP predictions about the stability of crystalline structures at this size for the Au-Pd pair but with decahedral or mixed fivefold-symmetric/closed-packed structures in close competition with fcc motifs for the Ag-Au or Ag-Pt and Pd-Pt pairs, respectively. Overall, the EP description is found to be reasonably accurate for the Pd-Pt and Au-Pd pairs, whereas it is less reliable for the Ag-Au and Ag-Pt pairs due to electronic structure (charge transfer or directionality) effects. The driving force to core-shell chemical ordering is put on a quantitative basis, and surface segregation of the most cohesive element into the core is confirmed, with the exception of the Ag-Au pair for which charge transfer effects favor the segregation of Au to the surface of the clusters.

  18. Discovering Motifs in Ranked Lists of DNA Sequences

    PubMed Central

    Eden, Eran; Lipson, Doron; Yogev, Sivan; Yakhini, Zohar

    2007-01-01

    Computational methods for discovery of sequence elements that are enriched in a target set compared with a background set are fundamental in molecular biology research. One example is the discovery of transcription factor binding motifs that are inferred from ChIP–chip (chromatin immuno-precipitation on a microarray) measurements. Several major challenges in sequence motif discovery still require consideration: (i) the need for a principled approach to partitioning the data into target and background sets; (ii) the lack of rigorous models and of an exact p-value for measuring motif enrichment; (iii) the need for an appropriate framework for accounting for motif multiplicity; (iv) the tendency, in many of the existing methods, to report presumably significant motifs even when applied to randomly generated data. In this paper we present a statistical framework for discovering enriched sequence elements in ranked lists that resolves these four issues. We demonstrate the implementation of this framework in a software application, termed DRIM (discovery of rank imbalanced motifs), which identifies sequence motifs in lists of ranked DNA sequences. We applied DRIM to ChIP–chip and CpG methylation data and obtained the following results. (i) Identification of 50 novel putative transcription factor (TF) binding sites in yeast ChIP–chip data. The biological function of some of them was further investigated to gain new insights on transcription regulation networks in yeast. For example, our discoveries enable the elucidation of the network of the TF ARO80. Another finding concerns a systematic TF binding enhancement to sequences containing CA repeats. (ii) Discovery of novel motifs in human cancer CpG methylation data. Remarkably, most of these motifs are similar to DNA sequence elements bound by the Polycomb complex that promotes histone methylation. Our findings thus support a model in which histone methylation and CpG methylation are mechanistically linked. Overall
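
    The ranked-list enrichment idea in this abstract can be illustrated with a minimum hypergeometric (mHG) style score, which scans every cutoff of the ranked list and keeps the most significant hypergeometric tail. This is a minimal sketch under stated assumptions, not DRIM's implementation: the function names are hypothetical, the input is assumed to be a 0/1 vector marking which ranked sequences contain a candidate motif, and the exact p-value correction mentioned in the abstract is not computed.

```python
from math import comb


def hypergeom_tail(N, B, n, b):
    """P(X >= b) when n items are drawn from N, of which B are motif hits."""
    total = comb(N, n)
    upper = min(n, B)
    hits = sum(comb(B, k) * comb(N - B, n - k) for k in range(b, upper + 1))
    return hits / total


def min_hypergeom_score(labels):
    """Scan all cutoffs of a ranked 0/1 list (hypothetical helper).

    labels[i] == 1 means the i-th ranked sequence contains the motif.
    Returns (smallest tail probability, cutoff at which it occurs).
    The score is NOT a p-value by itself, because many cutoffs are tried.
    """
    N, B = len(labels), sum(labels)
    best, best_cutoff, b = 1.0, 0, 0
    for n, label in enumerate(labels, start=1):
        b += label  # motif hits among the top n sequences
        p = hypergeom_tail(N, B, n, b)
        if p < best:
            best, best_cutoff = p, n
    return best, best_cutoff


if __name__ == "__main__":
    ranked = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # all hits at the top of the list
    print(min_hypergeom_score(ranked))
```

    Letting the data choose the cutoff is what removes the need to partition the sequences into target and background sets by hand, which is issue (i) in the abstract.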

  19. Crystal structure of bacterial cell-surface alginate-binding protein with an M75 peptidase motif

    SciTech Connect

    Maruyama, Yukie; Ochiai, Akihito; Mikami, Bunzo; Hashimoto, Wataru; Murata, Kousaku

    2011-02-18

    Research highlights: Bacterial alginate-binding Algp7 is similar to component EfeO of Fe2+ transporter. We determined the crystal structure of Algp7 with a metal-binding motif. Algp7 consists of two helical bundles formed through duplication of a single bundle. A deep cleft involved in alginate binding locates around the metal-binding site. Algp7 may function as a Fe2+-chelated alginate-binding protein. Abstract: A gram-negative Sphingomonas sp. A1 directly incorporates alginate polysaccharide into the cytoplasm via the cell-surface pit and ABC transporter. A cell-surface alginate-binding protein, Algp7, functions as a concentrator of the polysaccharide in the pit. Based on the primary structure and genetic organization in the bacterial genome, Algp7 was found to be homologous to an M75 peptidase motif-containing EfeO, a component of a ferrous ion transporter. Despite the presence of an M75 peptidase motif with high similarity, the Algp7 protein purified from recombinant Escherichia coli cells was inert on insulin B chain and N-benzoyl-Phe-Val-Arg-p-nitroanilide, both of which are substrates for a typical M75 peptidase, imelysin, from Pseudomonas aeruginosa. The X-ray crystallographic structure of Algp7 was determined at 2.10 Å resolution by single-wavelength anomalous diffraction. Although a metal-binding motif, HxxE, conserved in zinc ion-dependent M75 peptidases is also found in Algp7, the crystal structure of Algp7 contains no metal even at the motif. The protein consists of two structurally similar up-and-down helical bundles as the basic scaffold. A deep cleft between the bundles is sufficiently large to accommodate macromolecules such as alginate polysaccharide. This is the first structural report on a bacterial cell-surface alginate-binding protein with an M75 peptidase motif.

  20. Elongated Polyproline Motifs Facilitate Enamel Evolution through Matrix Subunit Compaction

    PubMed Central

    Luan, Xianghong; Dangaria, Smit; Walker, Cameron; Allen, Michael; Kulkarni, Ashok; Gibson, Carolyn; Braatz, Richard; Liao, Xiubei; Diekwisch, Thomas G. H.

    2009-01-01

    Vertebrate body designs rely on hydroxyapatite as the principal mineral component of relatively light-weight, articulated endoskeletons and sophisticated tooth-bearing jaws, facilitating rapid movement and efficient predation. Biological mineralization and skeletal growth are frequently accomplished through proteins containing polyproline repeat elements. Through their well-defined yet mobile and flexible structure polyproline-rich proteins control mineral shape and contribute many other biological functions including Alzheimer's amyloid aggregation and prolamine plant storage. In the present study we have hypothesized that polyproline repeat proteins exert their control over biological events such as mineral growth, plaque aggregation, or viscous adhesion by altering the length of their central repeat domain, resulting in dramatic changes in supramolecular assembly dimensions. In order to test our hypothesis, we have used the vertebrate mineralization protein amelogenin as an exemplar and determined the biological effect of the four-fold increased polyproline tandem repeat length in the amphibian/mammalian transition. To study the effect of polyproline repeat length on matrix assembly, protein structure, and apatite crystal growth, we have measured supramolecular assembly dimensions in various vertebrates using atomic force microscopy, tested the effect of protein assemblies on crystal growth by electron microscopy, generated a transgenic mouse model to examine the effect of an abbreviated polyproline sequence on crystal growth, and determined the structure of polyproline repeat elements using 3D NMR. Our study shows that an increase in PXX/PXQ tandem repeat motif length results (i) in a compaction of protein matrix subunit dimensions, (ii) reduced conformational variability, (iii) an increase in polyproline II helices, and (iv) promotion of apatite crystal length. 
    Together, these findings establish a direct relationship between polyproline tandem repeat fragment

  1. Interconnected Network Motifs Control Podocyte Morphology and Kidney Function

    PubMed Central

    Azeloglu, Evren U.; Hardy, Simon V.; Eungdamrong, Narat John; Chen, Yibang; Jayaraman, Gomathi; Chuang, Peter Y.; Fang, Wei; Xiong, Huabao; Neves, Susana R.; Jain, Mohit R.; Li, Hong; Ma’ayan, Avi; Gordon, Ronald E.; He, John Cijiang; Iyengar, Ravi

    2014-01-01

    Podocytes are kidney cells with specialized morphology that is required for glomerular filtration. Diseases, such as diabetes, or drug exposure that causes disruption of the podocyte foot process morphology results in kidney pathophysiology. Proteomic analysis of glomeruli isolated from rats with puromycin-induced kidney disease and control rats indicated that protein kinase A (PKA), which is activated by adenosine 3′,5′-monophosphate (cAMP), is a key regulator of podocyte morphology and function. In podocytes, cAMP signaling activates cAMP response element–binding protein (CREB) to enhance expression of the gene encoding a differentiation marker, synaptopodin, a protein that associates with actin and promotes its bundling. We constructed and experimentally verified a β-adrenergic receptor–driven network with multiple feedback and feedforward motifs that controls CREB activity. To determine how the motifs interacted to regulate gene expression, we mapped multicompartment dynamical models, including information about protein subcellular localization, onto the network topology using Petri net formalisms. These computational analyses indicated that the juxtaposition of multiple feedback and feedforward motifs enabled the prolonged CREB activation necessary for synaptopodin expression and actin bundling. Drug-induced modulation of these motifs in diseased rats led to recovery of normal morphology and physiological function in vivo. Thus, analysis of regulatory motifs using network dynamics can provide insights into pathophysiology that enable predictions for drug intervention strategies to treat kidney disease. PMID:24497609

  2. BC1 RNA motifs required for dendritic transport in vivo

    PubMed Central

    Robeck, Thomas; Skryabin, Boris V.; Rozhdestvensky, Timofey S.; Skryabin, Anastasiya B.; Brosius, Jürgen

    2016-01-01

    BC1 RNA is a small brain-specific non-protein-coding RNA. It is transported from the cell body into dendrites, where it is involved in the fine-tuning of translational control. Due to its compactness and established secondary structure, BC1 RNA is an ideal model for investigating the motifs necessary for dendritic localization. Previously, microinjection of in vitro transcribed BC1 RNA mutants into the soma of cultured primary neurons suggested the importance of RNA motifs for dendritic targeting. These ex vivo experiments identified a single bulged nucleotide (U22) and a putative K-turn (GA motif) structure required for dendritic localization or distal transport, respectively. We generated six transgenic mouse lines (three founders each) containing neuronally expressing BC1 RNA variants on a BC1 RNA knockout mouse background. In contrast to ex vivo data, we did not find indications of reduction or abolition of dendritic BC1 RNA localization in the mutants devoid of the GA motif or the bulged nucleotide. We confirmed the ex vivo data, which showed that the triloop terminal sequence had no consequence on dendritic transport. Interestingly, changing the triloop supporting structure completely abolished dendritic localization of BC1 RNA. We propose a novel RNA motif important for dendritic transport in vivo. PMID:27350115

  3. MALISAM: a database of structurally analogous motifs in proteins.

    PubMed

    Cheng, Hua; Kim, Bong-Hyun; Grishin, Nick V

    2008-01-01

    MALISAM (manual alignments for structurally analogous motifs) represents the first database containing pairs of structural analogs and their alignments. To find reliable analogs, we developed an approach based on three ideas. First, an insertion together with a part of the evolutionary core of one domain family (a hybrid motif) is analogous to a similar motif contained within the core of another domain family. Second, a motif at an interface, formed by secondary structural elements (SSEs) contributed by two or more domains or subunits contacting along that interface, is analogous to a similar motif present in the core of a single domain. Third, an artificial protein obtained through selection from random peptides or in sequence design experiments not biased by sequences of a particular homologous family, is analogous to a structurally similar natural protein. Each analogous pair is superimposed and aligned manually, as well as by several commonly used programs. Applications of this database may range from protein evolution studies, e.g. development of remote homology inference tools and discriminators between homologs and analogs, to protein-folding research, since in the absence of evolutionary reasons, similarity between proteins is caused by structural and folding constraints. The database is publicly available at http://prodata.swmed.edu/malisam. PMID:17855399

  4. DynaMIT: the dynamic motif integration toolkit

    PubMed Central

    Dassi, Erik; Quattrone, Alessandro

    2016-01-01

    De novo motif search is a frequently applied bioinformatics procedure to identify and prioritize recurrent elements in sequence sets for biological investigation, such as the ones derived from high-throughput differential expression experiments. Several algorithms have been developed to perform motif search, employing widely different approaches and often giving divergent results. In order to maximize the power of these investigations and ultimately be able to draft solid biological hypotheses, there is a need to apply multiple tools to the same sequences and merge the obtained results. However, motif reporting formats and statistical evaluation methods currently make such an integration task difficult to perform and mostly restricted to specific scenarios. We thus introduce here the Dynamic Motif Integration Toolkit (DynaMIT), an extremely flexible platform that allows users to identify motifs employing multiple algorithms, integrate them by means of a user-selected strategy and visualize results in several ways; furthermore, the platform is user-extendible in all its aspects. DynaMIT is freely available at http://cibioltg.bitbucket.org. PMID:26253738

  5. Mining tertiary structural motifs for assessment of designability.

    PubMed

    Zhang, Jian; Grigoryan, Gevorg

    2013-01-01

    The observation of a limited secondary-structural alphabet in native proteins, with significant sequence preferences, has profoundly influenced the fields of protein design and structure prediction (Simons, Kooperberg, Huang, & Baker, 1997; Verschueren et al., 2011). In the era of structural genomics, as the size of the structural dataset continues to grow rapidly, it is becoming possible to extend this analysis to tertiary structural motifs and their sequences. For a hypothetical tertiary motif, the rate of its utilization in natural proteins may be used to assess its designability, that is, the ease with which the motif can be realized with natural amino acids. This requires a structural similarity search methodology, which rather than looking for global topological agreement (more appropriate for categorization of full proteins or domains), identifies detailed geometric matches. In this chapter, we introduce such a method, called MaDCaT, and demonstrate its use by assessing the designability landscapes of two tertiary structural motifs. We also show that such analysis can establish structure/sequence links by providing the sequence constraints necessary to encode designable motifs. As a logical extension of their secondary-structure counterparts, tertiary structural preferences will likely prove extremely useful in de novo protein design and structure prediction. PMID:23422424

  6. BC1 RNA motifs required for dendritic transport in vivo.

    PubMed

    Robeck, Thomas; Skryabin, Boris V; Rozhdestvensky, Timofey S; Skryabin, Anastasiya B; Brosius, Jürgen

    2016-01-01

    BC1 RNA is a small brain-specific non-protein-coding RNA. It is transported from the cell body into dendrites, where it is involved in the fine-tuning of translational control. Due to its compactness and established secondary structure, BC1 RNA is an ideal model for investigating the motifs necessary for dendritic localization. Previously, microinjection of in vitro transcribed BC1 RNA mutants into the soma of cultured primary neurons suggested the importance of RNA motifs for dendritic targeting. These ex vivo experiments identified a single bulged nucleotide (U22) and a putative K-turn (GA motif) structure required for dendritic localization or distal transport, respectively. We generated six transgenic mouse lines (three founders each) containing neuronally expressing BC1 RNA variants on a BC1 RNA knockout mouse background. In contrast to ex vivo data, we did not find indications of reduction or abolition of dendritic BC1 RNA localization in the mutants devoid of the GA motif or the bulged nucleotide. We confirmed the ex vivo data, which showed that the triloop terminal sequence had no consequence on dendritic transport. Interestingly, changing the triloop supporting structure completely abolished dendritic localization of BC1 RNA. We propose a novel RNA motif important for dendritic transport in vivo. PMID:27350115

  7. cWINNOWER algorithm for finding fuzzy DNA motifs

    NASA Technical Reports Server (NTRS)

    Liang, S.; Samanta, M. P.; Biegel, B. A.

    2004-01-01

    The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if a clique consisting of a sufficiently large number of mutated copies of the motif (i.e., the signals) is present in the DNA sequence. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum detectable clique size qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12,000 for (l, d) = (15, 4). Copyright Imperial College Press.
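
    The signal definition in this abstract (an l-mer within d mutations of a motif) can be made concrete with a Hamming-distance check. The sketch below is only illustrative and is not the cWINNOWER implementation: the function names are hypothetical, mutations are assumed to be substitutions only, and the consensus constraint and sub-clique counting are not reproduced. The 2d edge threshold follows from the triangle inequality: two copies that each differ from the motif in at most d positions differ from each other in at most 2d.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_signals(sequence, motif, d):
    """Start positions of l-mers within d substitutions of a known motif."""
    l = len(motif)
    return [i for i in range(len(sequence) - l + 1)
            if hamming(sequence[i:i + l], motif) <= d]


def signal_graph(lmers, d):
    """Edges between l-mers that could be mutated copies of one motif.

    When the motif is unknown, candidate signals are linked if they are
    within 2*d mismatches; a large clique then hints at a shared motif.
    """
    return [(i, j)
            for (i, a), (j, b) in combinations(enumerate(lmers), 2)
            if hamming(a, b) <= 2 * d]


if __name__ == "__main__":
    seq = "TTT" + "ACGTACGT" + "TTT" + "ACGAACGT" + "TT"
    print(find_signals(seq, "ACGTACGT", d=1))
    print(signal_graph(["ACGTACGT", "ACGAACGT", "TTTTTTTT"], d=1))
```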

  8. Identification of a promoter motif involved in Curtovirus sense-gene expression in transgenic Arabidopsis.

    PubMed

    Hur, Jingyung; Choi, Eunseok; Buckley, Kenneth J; Lee, Sukchan; Davis, Keith R

    2008-08-31

    Expression of the seven open reading frames (ORFs) of single-stranded DNA Curtoviruses such as Beet curly top virus (BCTV) and Beet severe curly top virus (BSCTV) is driven by a bi-directional promoter. To investigate this bi-directional promoter activity with respect to viral late gene expression, transgenic Arabidopsis plants expressing a GUS reporter gene under the control of either the BCTV or BSCTV bi-directional promoter were constructed. Transgenic plants harboring constructs showed higher expression levels when the promoter of the less virulent BCTV was used than when the promoter of the more virulent BSCTV was used. In transgenic seedlings, the reporter gene constructs were expressed primarily in actively dividing tissues such as root tips and apical meristems. As the transgenic plants matured, reporter gene expression diminished but viral infection of mature transgenic plants restored reporter gene expression, particularly in transgenic plants containing BCTV virion-sense gene promoter constructs. A 30 base pair conserved late element (CLE) motif was identified that was present three times in tandem in the BCTV promoter and once in that of BSCTV. Progressive deletion of these repeats from the BCTV promoter resulted in decreased reporter gene expression, but BSCTV promoters in which one or two extra copies of this motif were inserted did not exhibit increased late gene promoter activity. These results demonstrate that Curtovirus late gene expression by virion-sense promoters depends on the developmental stage of the host plant as well as on the number of CLE motifs present in the promoter. PMID:18596416

  9. Robustness to noise in synchronization of network motifs: Experimental results

    NASA Astrophysics Data System (ADS)

    Buscarino, Arturo; Fortuna, Luigi; Frasca, Mattia; Iachello, Marco; Pham, Viet-Thanh

    2012-12-01

    In this work, we experimentally investigate the robustness to noise of synchronization in all the four-node network motifs. The experimental setup consists of four Chua's circuits diffusively coupled in order to implement the six different undirected network motifs that can be obtained with four nodes. In this experimental setup, synchronization in the presence of noise injected in one of the network nodes is investigated and network motifs are compared in terms of the synchronization error obtained. The analysis was then extended to some selected case studies of networks with five and six nodes. Numerical simulations were also performed, and their results agree with the experiments. A correlation between node degree and robustness to noise was found in these networks as well.
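
    The count of six undirected four-node motifs can be checked by brute force: enumerate every connected edge set on four labeled nodes and collapse those that differ only by relabeling. This is a small sketch for illustration; the function names are hypothetical and nothing here models the Chua's circuits or the coupling used in the experiments.

```python
from itertools import combinations, permutations

NODES = range(4)
ALL_EDGES = list(combinations(NODES, 2))  # the 6 possible undirected edges


def is_connected(edges):
    """Depth-first search from node 0 over an undirected edge list."""
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for a, b in edges:
            for v, w in ((a, b), (b, a)):
                if v == u and w not in seen:
                    seen.add(w)
                    stack.append(w)
    return len(seen) == 4


def canonical(edges):
    """Smallest relabeled edge tuple over all 24 node permutations."""
    return min(
        tuple(sorted(tuple(sorted((p[a], p[b]))) for a, b in edges))
        for p in permutations(NODES)
    )


def four_node_motifs():
    """Connected four-node graphs, one representative per isomorphism class."""
    motifs = set()
    for k in range(3, 7):  # a connected graph on 4 nodes needs >= 3 edges
        for edges in combinations(ALL_EDGES, k):
            if is_connected(edges):
                motifs.add(canonical(edges))
    return sorted(motifs, key=len)


if __name__ == "__main__":
    for motif in four_node_motifs():
        print(len(motif), "edges:", motif)
```

    The six classes are the path, the star, the cycle, the triangle with a pendant node, the cycle with one chord, and the complete graph.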

  10. Identification of high-molecular-weight proteins with multiple EGF-like motifs by motif-trap screening.

    PubMed

    Nakayama, M; Nakajima, D; Nagase, T; Nomura, N; Seki, N; Ohara, O

    1998-07-01

    To identify large proteins with an EGF-like-motif in a systematic manner, we developed a computer-assisted method called motif-trap screening. The method exploits 5'-end single-pass sequence data obtained from a pool of cDNAs whose sizes exceed 5 kb. Using this screening procedure, we were able to identify five known and nine new genes for proteins with multiple EGF-like-motifs from 8000 redundant human brain cDNA clones. These new genes were found to encode a novel mammalian homologue of Drosophila fat protein, two seven-transmembrane proteins containing multiple cadherin and EGF-like motifs, two mammalian homologues of Drosophila slit protein, an unidentified LDL receptor-like protein, and three totally uncharacterized proteins. The organization of the domains in the proteins, together with their expression profiles and fine chromosomal locations, has indicated their biological significance, demonstrating that motif-trap screening is a powerful tool for the discovery of new genes that have been difficult to identify by conventional methods. PMID:9693030

  11. Selection against spurious promoter motifs correlates with translational efficiency across bacteria

    SciTech Connect

    Froula, Jeffrey L.; Francino, M. Pilar

    2007-05-01

    Because binding of RNAP to misplaced sites could compromise the efficiency of transcription, natural selection for the optimization of gene expression should regulate the distribution of DNA motifs capable of RNAP-binding across the genome. Here we analyze the distribution of the -10 promoter motifs that bind the σ70 subunit of RNAP in 42 bacterial genomes. We show that selection on these motifs operates across the genome, maintaining an over-representation of -10 motifs in regulatory sequences while eliminating them from the nonfunctional and, in most cases, from the protein coding regions. In some genomes, however, -10 sites are over-represented in the coding sequences; these sites could induce pauses effecting regulatory roles throughout the length of a transcriptional unit. For nonfunctional sequences, the extent of motif under-representation varies across genomes in a manner that broadly correlates with the number of tRNA genes, a good indicator of translational speed and growth rate. This suggests that minimizing the time invested in gene transcription is an important selective pressure against spurious binding. However, selection against spurious binding is detectable in the reduced genomes of host-restricted bacteria that grow at slow rates, indicating that components of efficiency other than speed may also be important. Minimizing the number of RNAP molecules per cell required for transcription, and the corresponding energetic expense, may be most relevant in slow growers. These results indicate that genome-level properties affecting the efficiency of transcription and translation can respond in an integrated manner to optimize gene expression. The detection of selection against promoter motifs in nonfunctional regions also implies that no sequence may evolve free of selective constraints, at least in the relatively small and unstructured genomes of bacteria.
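
    Over- or under-representation of a motif is commonly expressed as the ratio of observed to expected occurrences. The sketch below shows one simple way to compute such a ratio and is not the authors' procedure: it assumes the exact consensus hexamer TATAAT as a stand-in for the -10 motif, a background model based only on single-base frequencies, and a single strand; all function names are hypothetical.

```python
from collections import Counter


def count_overlapping(seq, motif):
    """Count motif occurrences, allowing overlaps."""
    return sum(seq.startswith(motif, i)
               for i in range(len(seq) - len(motif) + 1))


def expected_count(seq, motif):
    """Expected occurrences if bases were drawn independently
    with the frequencies observed in seq (assumed background model)."""
    freq = Counter(seq)
    n = len(seq)
    p = 1.0
    for base in motif:
        p *= freq[base] / n
    return p * (n - len(motif) + 1)


def representation_ratio(seq, motif="TATAAT"):
    """Observed/expected; > 1 suggests over-representation, < 1 under."""
    exp = expected_count(seq, motif)
    return count_overlapping(seq, motif) / exp if exp else float("nan")


if __name__ == "__main__":
    region = "GCTATAATGC"  # toy sequence, not real genomic data
    print(representation_ratio(region))
```

    A real analysis would need a richer background model and a statistical test, since short sequences give very noisy ratios.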

  12. Analysis of interactions between ribosomal proteins and RNA structural motifs

    PubMed Central

    2010-01-01

    Background One important goal of structural bioinformatics is to recognize and predict the interactions between protein binding sites and RNA. Recently, a comprehensive analysis of ribosomal proteins and their interactions with rRNA has been done. Interesting results emerged from the comparison of r-proteins within the small subunit in T. thermophilus and E. coli, supporting the idea of a core made by both RNA and proteins, conserved by evolution. Recent work showed also that ribosomal RNA is modularly composed. Motifs are generally single-stranded sequences of consecutive nucleotides (ssRNA) with characteristic folding. The role of these motifs in protein-RNA interactions has been so far only sparsely investigated. Results This work explores the role of RNA structural motifs in the interaction of proteins with ribosomal RNA (rRNA). We analyze composition, local geometries and conformation of interface regions involving motifs such as tetraloops, kink turns and single extruded nucleotides. We construct an interaction map of protein binding sites that allows us to identify the common types of shared 3-D physicochemical binding patterns for tetraloops. Furthermore, we investigate the protein binding pockets that accommodate single extruded nucleotides either involved in kink-turns or in arbitrary RNA strands. This analysis reveals a new structural motif, called tripod. It corresponds to small pockets consisting of three amino acids arranged at the vertices of an almost equilateral triangle. We developed a search procedure for the recognition of tripods, based on an empirical tripod fingerprint. Conclusion A comparative analysis with the overall RNA surface and interfaces shows that contact surfaces involving RNA motifs have distinctive features that may be useful for the recognition and prediction of interactions. PMID:20122215

  13. Specific RNA self-assembly with minimal paranemic motifs

    PubMed Central

    Afonin, Kirill A.; Cieply, Dennis J.; Leontis, Neocles B.

    2016-01-01

    The paranemic crossover (PX) is a motif for assembling two nucleic acid molecules using Watson-Crick (WC) basepairing without unfolding pre-formed secondary structure in the individual molecules. Once formed, the paranemic assembly motif comprises adjacent parallel double helices that cross over at every possible point over the length of the motif. The interaction is reversible as it does not require denaturation of basepairs internal to each interacting molecular unit. Paranemic assembly has been demonstrated for DNA but not for RNA, and only for motifs with four or more cross-over points and lengths of five or more helical half-turns. Here we report the design of RNA molecules that paranemically assemble with the minimum number of two cross-overs spanning the major groove to form paranemic motifs with a length of three half-turns (3HT). Dissociation constants (Kds) were measured for series of molecules in which the number of basepairs between the cross-over points was varied from five to eight basepairs. The paranemic 3HT complex with six basepairs (3HT_6M) was found to be the most stable with Kd = 1×10^−8 M. The half-time for kinetic exchange of the 3HT_6M complex was determined to be ~100 minutes, from which we calculated association and dissociation rate constants ka = 5.11×10^3 M^−1 sec^−1 and kd = 5.11×10^−5 sec^−1. RNA paranemic assembly of 3HT and 5HT complexes is blocked by single-base substitutions that disrupt individual inter-molecular Watson-Crick basepairs and is restored by compensatory substitutions that restore those basepairs. The 3HT motif appears suitable for specific, programmable, and reversible tecto-RNA self-assembly for constructing artificial RNA molecular machines. PMID:18072767
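
    The rate constants and the equilibrium dissociation constant quoted in this record are mutually consistent, which a short calculation can show. The sketch assumes simple reversible 1:1 binding, for which Kd equals the dissociation rate constant divided by the association rate constant; the function name is hypothetical, and the abstract does not say how the authors did the calculation.

```python
def dissociation_constant(k_off: float, k_on: float) -> float:
    """Equilibrium dissociation constant Kd = k_off / k_on.

    Assumes simple reversible 1:1 binding (an assumption, not from the source).
    """
    return k_off / k_on


k_on = 5.11e3    # association rate constant ka, M^-1 s^-1 (value from the abstract)
k_off = 5.11e-5  # dissociation rate constant kd, s^-1 (value from the abstract)

kd_molar = dissociation_constant(k_off, k_on)
print(f"Kd = {kd_molar:.2e} M")
```

    The ratio reproduces the reported Kd of about 1×10^−8 M for the 3HT_6M complex.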

  14. SPIC: A novel similarity metric for comparing transcription factor binding site motifs based on information contents

    PubMed Central

    2013-01-01

    Background Discovering transcription factor binding sites (TFBS) is one of the primary challenges in deciphering complex gene regulatory networks encrypted in a genome. A set of short DNA sequences identified by a transcription factor (TF) is known as a motif, which can be expressed accurately in matrix form such as a position-specific scoring matrix (PSSM) and a position frequency matrix (PFM). Very frequently, we need to query a motif in a database of motifs by seeking its similar motifs, merge similar TFBS motifs possibly identified by the same TF, separate irrelevant motifs, or filter out spurious motifs. Therefore, a novel metric is required to seize slight differences between irrelevant motifs and highlight the similarity between motifs of the same group in all these applications. While several metrics for motif similarity have already been proposed, their performance is still far from satisfactory for these applications. Methods A novel metric named SPIC (Similarity with Position Information Contents) is proposed in this paper for measuring the similarity between a column of a motif and a column of another motif. When defining this similarity score, we consider the likelihood that the column of the first motif's PFM can be produced by the column of the second motif's PSSM, and multiply the likelihood by the information content of the column of the second motif's PSSM, and vice versa. We evaluated the performance of SPIC combined with a local or a global alignment method having a function for affine gap penalty, for computing the similarity between two motifs. We also compared SPIC with seven existing state-of-the-art metrics for their capability of clustering motifs from the same group and retrieving motifs from a database on three datasets. Results When used jointly with the Smith-Waterman local alignment method with an affine gap penalty function (gap open penalty is equal to 1, gap extension penalty is equal to 0.5), SPIC outperforms the seven
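
    The "information content" of a motif column, which this record uses as a weighting term, can be sketched as follows. This is the standard Shannon-based measure against a uniform background, shown only to make the term concrete; it is not the SPIC formula itself, which the abstract does not give, and the function name and pseudocount parameter are hypothetical.

```python
import math


def column_information_content(counts, pseudocount: float = 0.0) -> float:
    """Information content (bits) of one motif column.

    counts: base counts for the column, e.g. [A, C, G, T].
    Computes log2(alphabet size) minus the Shannon entropy, assuming a
    uniform background (an illustrative assumption).
    """
    alphabet = len(counts)
    total = sum(counts) + alphabet * pseudocount
    ic = math.log2(alphabet)
    for c in counts:
        p = (c + pseudocount) / total
        if p > 0:
            ic += p * math.log2(p)  # adds -entropy term for this base
    return ic
```

    A fully conserved column scores 2 bits and a uniform column scores 0 bits, so weighting by this quantity emphasises the positions that carry most of a motif's specificity.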

  15. Using the Gibbs Motif Sampler for Phylogenetic Footprinting

    SciTech Connect

    Thompson, William; Conlan, Sean; McCue, Lee Ann; Lawrence, Charles

    2007-07-01

    The Gibbs Motif Sampler (Gibbs) (1) is a software package used to predict conserved elements in biopolymer sequences. While the software can be used to locate conserved motifs in protein sequences, its most common use is the prediction of transcription factor binding sites (TFBSs) in promoters upstream of gene sequences. We will describe approaches that use Gibbs to locate TFBSs in a collection of orthologous nucleotide sequences, i.e. phylogenetic footprinting. To illustrate this technique, we present examples that use Gibbs to detect binding sites for the transcription factor LexA in orthologous sequence data from representative species belonging to two different proteobacterial divisions.

  16. A General RNA Motif for Cellular Transfection

    PubMed Central

    Magalhães, Maria LB; Byrom, Michelle; Yan, Amy; Kelly, Linsley; Li, Na; Furtado, Raquel; Palliser, Deborah; Ellington, Andrew D; Levy, Matthew

    2012-01-01

    We have developed a selection scheme to generate nucleic acid sequences that recognize and directly internalize into mammalian cells without the aid of conventional delivery methods. To demonstrate the generality of the technology, two independent selections with different starting pools were performed against distinct target cells. Each selection yielded a single highly functional sequence, both of which folded into a common core structure. This internalization signal can be adapted for use as a general purpose reagent for transfection into a wide variety of cell types including primary cells. PMID:22233578

  17. Application of Synthetic Peptide Arrays To Uncover Cyclic Di-GMP Binding Motifs

    PubMed Central

    Düvel, Juliane; Bense, Sarina; Möller, Stefan; Bertinetti, Daniela; Schwede, Frank; Morr, Michael; Eckweiler, Denitsa; Genieser, Hans-Gottfried; Jänsch, Lothar; Herberg, Friedrich W.; Frank, Ronald

    2015-01-01

    ABSTRACT High levels of the universal bacterial second messenger cyclic di-GMP (c-di-GMP) promote the establishment of surface-attached growth in many bacteria. Not only can c-di-GMP bind to nucleic acids and directly control gene expression, but it also binds to a diverse array of proteins of specialized functions and orchestrates their activity. Since its development in the early 1990s, the synthetic peptide array technique has become a powerful tool for high-throughput approaches and was successfully applied to investigate the binding specificity of protein-ligand interactions. In this study, we used peptide arrays to uncover the c-di-GMP binding site of a Pseudomonas aeruginosa protein (PA3740) that was isolated in a chemical proteomics approach. PA3740 was shown to bind c-di-GMP with a high affinity, and peptide arrays uncovered LKKALKKQTNLR to be a putative c-di-GMP binding motif. Most interestingly, different from the previously identified c-di-GMP binding motif of the PilZ domain (RXXXR) or the I site of diguanylate cyclases (RXXD), two leucine residues and a glutamine residue and not the charged amino acids provided the key residues of the binding sequence. Those three amino acids are highly conserved across PA3740 homologs, and their singular exchange to alanine reduced c-di-GMP binding within the full-length protein. IMPORTANCE In many bacterial pathogens the universal bacterial second messenger c-di-GMP governs the switch from the planktonic, motile mode of growth to the sessile, biofilm mode of growth. Bacteria adapt their intracellular c-di-GMP levels to a variety of environmental challenges. Several classes of c-di-GMP binding proteins have been structurally characterized, and diverse c-di-GMP binding domains have been identified. Nevertheless, for several c-di-GMP receptors, the binding motif remains to be determined. Here we show that the use of a synthetic peptide array allowed the identification of a c-di-GMP binding motif of a putative c
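
    The two previously known c-di-GMP binding motifs named in this record, RXXXR and RXXD, are short patterns in which X stands for any residue, so they translate directly into regular expressions. The following is a minimal sketch with hypothetical names; a sequence match alone says nothing about whether a site actually binds c-di-GMP.

```python
import re

# Patterns from the abstract; "X" (any residue) becomes "." in regex syntax.
MOTIFS = {
    "PilZ_RXXXR": r"R...R",
    "I_site_RXXD": r"R..D",
}


def find_motifs(protein_seq: str) -> dict:
    """Return {motif name: [(0-based start, matched text), ...]}.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    hits = {}
    for name, pattern in MOTIFS.items():
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start(), m.group(1)) for m in regex.finditer(protein_seq)]
    return hits


# The putative PA3740 motif from the abstract matches neither classical pattern.
print(find_motifs("LKKALKKQTNLR"))
```

    That the PA3740 peptide LKKALKKQTNLR yields no hits is consistent with the abstract's point that its binding motif differs from the PilZ and I-site motifs.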

  18. Novel recognition motifs and biological functions of the RNA-binding protein HuD revealed by genome-wide identification of its targets

    PubMed Central

    Bolognani, Federico; Contente-Cuomo, Tania; Perrone-Bizzozero, Nora I.

    2010-01-01

    HuD is a neuronal ELAV-like RNA-binding protein (RBP) involved in nervous system development, regeneration, and learning and memory. This protein stabilizes mRNAs by binding to AU-rich instability elements (AREs) in their 3′ untranslated regions (3′ UTR). To isolate its in vivo targets, messenger ribonucleoprotein (mRNP) complexes containing HuD were first immunoprecipitated from brain extracts and directly bound mRNAs identified by subsequent GST-HuD pull downs and microarray assays. Using the 3′ UTR sequences of the most enriched targets and the known sequence restrictions of the HuD ARE-binding site, we discovered three novel recognition motifs. Motifs 2 and 3 are U-rich whereas motif 1 is C-rich. In vitro binding assays indicated that HuD binds motif 3 with the highest affinity, followed by motifs 2 and 1, with less affinity. These motifs were found to be over-represented in brain mRNAs that are upregulated in HuD overexpressor mice, supporting the biological function of these sequences. Gene ontology analyses revealed that HuD targets are enriched in signaling pathways involved in neuronal differentiation and that many of these mRNAs encode other RBPs, translation factors and actin-binding proteins. These findings provide further insights into the post-transcriptional mechanisms by which HuD promotes neural development and synaptic plasticity. PMID:19846595

  19. Nephila clavipes Flagelliform silk-like GGX motifs contribute to extensibility and spacer motifs contribute to strength in synthetic spider silk fibers.

    PubMed

    Adrianos, Sherry L; Teulé, Florence; Hinman, Michael B; Jones, Justin A; Weber, Warner S; Yarger, Jeffery L; Lewis, Randolph V

    2013-06-10

    Flagelliform spider silk is the most extensible silk fiber produced by orb weaver spiders, though not as strong as the dragline silk of the spider. The motifs found in the core of the Nephila clavipes flagelliform Flag protein are GGX, spacer, and GPGGX. Flag does not contain the polyalanine motif known to provide the strength of dragline silk. To investigate the source of flagelliform fiber strength, four recombinant proteins were produced containing variations of the three core motifs of the Nephila clavipes flagelliform Flag protein that produces this type of fiber. The as-spun fibers were processed in 80% aqueous isopropanol using a standardized process for all four fiber types, which produced improved mechanical properties. Mechanical testing of the recombinant proteins determined that the GGX motif contributes extensibility and the spacer motif contributes strength to the recombinant fibers. Recombinant protein fibers containing the spacer motif were stronger than the proteins constructed without the spacer that contained only the GGX motif or the combination of the GGX and GPGGX motifs. The mechanical and structural X-ray diffraction analysis of the recombinant fibers provide data that suggests a functional role of the spacer motif that produces tensile strength, though the spacer motif is not clearly defined structurally. These results indicate that the spacer is likely a primary contributor of strength, with the GGX motif supplying mobility to the protein network of native N. clavipes flagelliform silk fibers. PMID:23646825

  20. Novel DNA Motif Binding Activity Observed In Vivo With an Estrogen Receptor α Mutant Mouse

    PubMed Central

    Li, Leping; Grimm, Sara A.; Winuthayanon, Wipawee; Hamilton, Katherine J.; Pockette, Brianna; Rubel, Cory A.; Pedersen, Lars C.; Fargo, David; Lanz, Rainer B.; DeMayo, Francesco J.; Schütz, Günther; Korach, Kenneth S.

    2014-01-01

    Estrogen receptor α (ERα) interacts with DNA directly or indirectly via other transcription factors, referred to as “tethering.” Evidence for tethering is based on in vitro studies and a widely used “KIKO” mouse model containing mutations that prevent direct estrogen response element DNA-binding. KIKO mice are infertile, due in part to the inability of estradiol (E2) to induce uterine epithelial proliferation. To elucidate the molecular events that prevent KIKO uterine growth, regulation of the pro-proliferative E2 target gene Klf4 and of Klf15, a progesterone (P4) target gene that opposes the pro-proliferative activity of KLF4, was evaluated. Klf4 induction was impaired in KIKO uteri; however, Klf15 was induced by E2 rather than by P4. Whole uterine chromatin immunoprecipitation-sequencing revealed enrichment of KIKO ERα binding to hormone response element (HRE) motifs. KIKO binding to HRE motifs was verified using reporter gene and DNA-binding assays. Because the KIKO ERα has HRE DNA-binding activity, we evaluated the “EAAE” ERα, which has more severe DNA-binding domain mutations, and demonstrated a lack of estrogen response element or HRE reporter gene induction or DNA-binding. The EAAE mouse has an ERα null–like phenotype, with impaired uterine growth and transcriptional activity. Our findings demonstrate that the KIKO mouse model, which has been used by numerous investigators, cannot be used to establish biological functions for ERα tethering, because KIKO ERα effectively stimulates transcription using HRE motifs. The EAAE-ERα DNA-binding domain mutant mouse demonstrates that ERα DNA-binding is crucial for biological and transcriptional processes in reproductive tissues and that ERα tethering may not contribute to estrogen responsiveness in vivo. PMID:24713037

  1. Novel DNA motif binding activity observed in vivo with an estrogen receptor α mutant mouse.

    PubMed

    Hewitt, Sylvia C; Li, Leping; Grimm, Sara A; Winuthayanon, Wipawee; Hamilton, Katherine J; Pockette, Brianna; Rubel, Cory A; Pedersen, Lars C; Fargo, David; Lanz, Rainer B; DeMayo, Francesco J; Schütz, Günther; Korach, Kenneth S

    2014-06-01

    Estrogen receptor α (ERα) interacts with DNA directly or indirectly via other transcription factors, referred to as "tethering." Evidence for tethering is based on in vitro studies and a widely used "KIKO" mouse model containing mutations that prevent direct estrogen response element DNA-binding. KIKO mice are infertile, due in part to the inability of estradiol (E2) to induce uterine epithelial proliferation. To elucidate the molecular events that prevent KIKO uterine growth, regulation of the pro-proliferative E2 target gene Klf4 and of Klf15, a progesterone (P4) target gene that opposes the pro-proliferative activity of KLF4, was evaluated. Klf4 induction was impaired in KIKO uteri; however, Klf15 was induced by E2 rather than by P4. Whole uterine chromatin immunoprecipitation-sequencing revealed enrichment of KIKO ERα binding to hormone response element (HRE) motifs. KIKO binding to HRE motifs was verified using reporter gene and DNA-binding assays. Because the KIKO ERα has HRE DNA-binding activity, we evaluated the "EAAE" ERα, which has more severe DNA-binding domain mutations, and demonstrated a lack of estrogen response element or HRE reporter gene induction or DNA-binding. The EAAE mouse has an ERα null-like phenotype, with impaired uterine growth and transcriptional activity. Our findings demonstrate that the KIKO mouse model, which has been used by numerous investigators, cannot be used to establish biological functions for ERα tethering, because KIKO ERα effectively stimulates transcription using HRE motifs. The EAAE-ERα DNA-binding domain mutant mouse demonstrates that ERα DNA-binding is crucial for biological and transcriptional processes in reproductive tissues and that ERα tethering may not contribute to estrogen responsiveness in vivo. PMID:24713037

  2. The Membrane-Bound NAC Transcription Factor ANAC013 Functions in Mitochondrial Retrograde Regulation of the Oxidative Stress Response in Arabidopsis

    PubMed Central

    De Clercq, Inge; Vermeirssen, Vanessa; Van Aken, Olivier; Vandepoele, Klaas; Murcha, Monika W.; Law, Simon R.; Inzé, Annelies; Ng, Sophia; Ivanova, Aneta; Rombaut, Debbie; van de Cotte, Brigitte; Jaspers, Pinja; Van de Peer, Yves; Kangasjärvi, Jaakko; Whelan, James; Van Breusegem, Frank

    2013-01-01

    Upon disturbance of their function by stress, mitochondria can signal to the nucleus to steer the expression of responsive genes. This mitochondria-to-nucleus communication is often referred to as mitochondrial retrograde regulation (MRR). Although reactive oxygen species and calcium are likely candidate signaling molecules for MRR, the protein signaling components in plants remain largely unknown. Through meta-analysis of transcriptome data, we detected a set of genes that are common and robust targets of MRR and used them as a bait to identify its transcriptional regulators. In the upstream regions of these mitochondrial dysfunction stimulon (MDS) genes, we found a cis-regulatory element, the mitochondrial dysfunction motif (MDM), which is necessary and sufficient for gene expression under various mitochondrial perturbation conditions. Yeast one-hybrid analysis and electrophoretic mobility shift assays revealed that the transmembrane domain–containing NO APICAL MERISTEM/ARABIDOPSIS TRANSCRIPTION ACTIVATION FACTOR/CUP-SHAPED COTYLEDON transcription factors (ANAC013, ANAC016, ANAC017, ANAC053, and ANAC078) bound to the MDM cis-regulatory element. We demonstrate that ANAC013 mediates MRR-induced expression of the MDS genes by direct interaction with the MDM cis-regulatory element and triggers increased oxidative stress tolerance. In conclusion, we characterized ANAC013 as a regulator of MRR upon stress in Arabidopsis thaliana. PMID:24045019

  3. Tertiary structure and function of an RNA motif required for plant vascular entry to initiate systemic trafficking.

    PubMed

    Zhong, Xuehua; Tao, Xiaorong; Stombaugh, Jesse; Leontis, Neocles; Ding, Biao

    2007-08-22

    Vascular entry is a decisive step for the initiation of long-distance movement of infectious and endogenous RNAs, silencing signals and developmental/defense signals in plants. However, the mechanisms remain poorly understood. We used Potato spindle tuber viroid (PSTVd) as a model to investigate the direct role of the RNA itself in vascular entry. We report here the identification of an RNA motif that is required for PSTVd to traffic from nonvascular into the vascular tissue phloem to initiate systemic infection. This motif consists of nucleotides U/C that form a water-inserted cis Watson-Crick/Watson-Crick base pair flanked by short helices that comprise canonical Watson-Crick/Watson-Crick base pairs. This tertiary structural model was inferred by comparison with X-ray crystal structures of similar motifs in rRNAs and is supported by combined mutagenesis and covariation analyses. Hydration pattern analysis suggests that water insertion induces a widened minor groove conducive to protein and/or RNA interactions. Our model and approaches have broad implications to investigate the RNA structural motifs in other RNAs for vascular entry and to study the basic principles of RNA structure-function relationships. PMID:17660743

  4. The Regulatory Factor ZFHX3 Modifies Circadian Function in SCN via an AT Motif-Driven Axis

    PubMed Central

    Parsons, Michael J.; Brancaccio, Marco; Sethi, Siddharth; Maywood, Elizabeth S.; Satija, Rahul; Edwards, Jessica K.; Jagannath, Aarti; Couch, Yvonne; Finelli, Mattéa J.; Smyllie, Nicola J.; Esapa, Christopher; Butler, Rachel; Barnard, Alun R.; Chesham, Johanna E.; Saito, Shoko; Joynson, Greg; Wells, Sara; Foster, Russell G.; Oliver, Peter L.; Simon, Michelle M.; Mallon, Ann-Marie; Hastings, Michael H.; Nolan, Patrick M.

    2015-01-01

    Summary We identified a dominant missense mutation in the SCN transcription factor Zfhx3, termed short circuit (Zfhx3Sci), which accelerates circadian locomotor rhythms in mice. ZFHX3 regulates transcription via direct interaction with predicted AT motifs in target genes. The mutant protein has a decreased ability to activate consensus AT motifs in vitro. Using RNA sequencing, we found minimal effects on core clock genes in Zfhx3Sci/+ SCN, whereas the expression of neuropeptides critical for SCN intercellular signaling was significantly disturbed. Moreover, mutant ZFHX3 had a decreased ability to activate AT motifs in the promoters of these neuropeptide genes. Lentiviral transduction of SCN slices showed that the ZFHX3-mediated activation of AT motifs is circadian, with decreased amplitude and robustness of these oscillations in Zfhx3Sci/+ SCN slices. In conclusion, by cloning Zfhx3Sci, we have uncovered a circadian transcriptional axis that determines the period and robustness of behavioral and SCN molecular rhythms. PMID:26232227

  5. Conserved motifs II to VI of DNA helicase II from Escherichia coli are all required for biological activity.

    PubMed Central

    Zhang, G; Deng, E; Baugh, L R; Hamilton, C M; Maples, V F; Kushner, S R

    1997-01-01

    There are seven conserved motifs (IA, IB, and II to VI) in DNA helicase II of Escherichia coli that have high homology among a large family of proteins involved in DNA metabolism. To address the functional importance of motifs II to VI, we employed site-directed mutagenesis to replace the charged amino acid residues in each motif with alanines. Cells carrying these mutant alleles exhibited higher UV and methyl methanesulfonate sensitivity, increased rates of spontaneous mutagenesis, and elevated levels of homologous recombination, indicating defects in both the excision repair and mismatch repair pathways. In addition, we also changed the highly conserved tyrosine(600) in motif VI to phenylalanine (uvrD309, Y600F). This mutant displayed a moderate increase in UV sensitivity but a decrease in spontaneous mutation rate, suggesting that DNA helicase II may have different functions in the two DNA repair pathways. Furthermore, a mutation in domain IV (uvrD307, R284A) significantly reduced the viability of some E. coli K-12 strains at 30 degrees C but not at 37 degrees C. The implications of these observations are discussed. PMID:9393722

  6. Skelemin, a cytoskeletal M-disc periphery protein, contains motifs of adhesion/recognition and intermediate filament proteins.

    PubMed

    Price, M G; Gomer, R H

    1993-10-15

    In striated muscle, myofibrils are anchored to an interconnecting cytoskeleton of desmin intermediate filaments. Skelemin (195 kDa) may be a link between myofibrils and the intermediate filament cytoskeleton. Skelemin partitions with desmin to the insoluble cytoskeleton, and increases the thickness of reconstituted intermediate filaments. Concentrated at the M-disc periphery, skelemin may also contact myosin filaments. We used immunoscreening to isolate a mouse muscle cDNA which encodes a protein with a calculated molecular mass of 185 kDa. Anti-skelemin antibodies bound to the protein products of each of three nonoverlapping regions of the open reading frame. Antibodies directed against the protein products of each one-third of the cDNA react with a 195-kDa muscle protein and stain the M-disc indistinguishably from the original anti-skelemin antibodies, suggesting that the cDNA encodes skelemin. A single skelemin mRNA is detected in muscle but not non-muscle tissues, consistent with immunostaining results. Skelemin is a member of a family of myosin-associated proteins containing fibronectin type III and immunoglobulin superfamily C2 motifs. Skelemin is unique in this family in having intermediate filament core-like motifs, one near each terminus. We hypothesize that skelemin could interact with myosin or myosin-associated proteins through its fibronectin and/or immunoglobulin motifs, and with intermediate filaments through intermediate filament-like motifs. PMID:8408035

  7. Characterization of TtALV2, an Essential Charged Repeat Motif Protein of the Tetrahymena thermophila Membrane Skeleton

    PubMed Central

    El-Haddad, Houda; Przyborski, Jude M.; Kraft, Lesleigh G. K.; McFadden, Geoffrey I.; Waller, Ross F.

    2013-01-01

    Alveolins are a recently described class of proteins common to all members of the superphylum Alveolata that are characterized by conserved charged repeat motifs (CRMs) but whose exact function remains unknown. We have analyzed the smaller of the two alveolins of Tetrahymena thermophila, TtALV2. The protein localizes to dispersed, broken patches arranged between the rows of the longitudinal microtubules. Macronuclear knockdown of Ttalv2 leads to multinuclear cells with no apparent cell polarity and randomly occurring cell protrusions, either by interrupting pellicle integrity or by disturbing cytokinesis. Correct association of TtALV2 with the alveoli or the pellicle is complex and depends on both the termini as well as the charged repeat motifs of the protein. Proteins containing similar CRMs are a dominant part of the ciliate membrane cytoskeleton, suggesting that these motifs may play a more general role in mediating membrane attachment and/or cytoskeletal association. To better understand their integration into the cytoskeleton, we localized a range of CRM-based fusion proteins, which suggested there is an inherent tendency for proteins with CRMs to be located in the peripheral cytoskeleton, some nucleating as filaments at the basal bodies. Even a synthetic protein, mimicking the charge and repeat pattern of these proteins, directed a reporter protein to a variety of peripheral cytoskeletal structures in Tetrahymena. These motifs might provide a blueprint for membrane and cytoskeleton affiliation in the complex pellicles of Alveolata. PMID:23606287

  8. Characterization of TtALV2, an essential charged repeat motif protein of the Tetrahymena thermophila membrane skeleton.

    PubMed

    El-Haddad, Houda; Przyborski, Jude M; Kraft, Lesleigh G K; McFadden, Geoffrey I; Waller, Ross F; Gould, Sven B

    2013-06-01

    Alveolins are a recently described class of proteins common to all members of the superphylum Alveolata that are characterized by conserved charged repeat motifs (CRMs) but whose exact function remains unknown. We have analyzed the smaller of the two alveolins of Tetrahymena thermophila, TtALV2. The protein localizes to dispersed, broken patches arranged between the rows of the longitudinal microtubules. Macronuclear knockdown of Ttalv2 leads to multinuclear cells with no apparent cell polarity and randomly occurring cell protrusions, either by interrupting pellicle integrity or by disturbing cytokinesis. Correct association of TtALV2 with the alveoli or the pellicle is complex and depends on both the termini as well as the charged repeat motifs of the protein. Proteins containing similar CRMs are a dominant part of the ciliate membrane cytoskeleton, suggesting that these motifs may play a more general role in mediating membrane attachment and/or cytoskeletal association. To better understand their integration into the cytoskeleton, we localized a range of CRM-based fusion proteins, which suggested there is an inherent tendency for proteins with CRMs to be located in the peripheral cytoskeleton, some nucleating as filaments at the basal bodies. Even a synthetic protein, mimicking the charge and repeat pattern of these proteins, directed a reporter protein to a variety of peripheral cytoskeletal structures in Tetrahymena. These motifs might provide a blueprint for membrane and cytoskeleton affiliation in the complex pellicles of Alveolata. PMID:23606287

  9. Redemptive Journey: The Storytelling Motif in Andersen's "The Snow Queen."

    ERIC Educational Resources Information Center

    Misheff, Sue

    1989-01-01

    Discusses how Hans Christian Andersen's "The Snow Queen" uses the motif of storytelling to describe the journey taken by the heroine Gerda. Identifies a story as that which is alive and active and which causes catharsis for those who participate in it. (MG)

  10. Fast, Sensitive Discovery of Conserved Genome-Wide Motifs

    PubMed Central

    Ihuegbu, Nnamdi E.; Buhler, Jeremy

    2012-01-01

    Abstract Regulatory sites that control gene expression are essential to the proper functioning of cells, and identifying them is critical for modeling regulatory networks. We have developed Magma (Multiple Aligner of Genomic Multiple Alignments), a software tool for multiple species, multiple gene motif discovery. Magma identifies putative regulatory sites that are conserved across multiple species and occur near multiple genes throughout a reference genome. Magma takes as input multiple alignments that can include gaps. It uses efficient clustering methods that make it about 70 times faster than PhyloNet, a previous program for this task, with slightly greater sensitivity. We ran Magma on all non-coding DNA conserved between Caenorhabditis elegans and five additional species, about 70 Mbp in total, in <4 h. We obtained 2,309 motifs with lengths of 6–20 bp, each occurring at least 10 times throughout the genome, which collectively covered about 566 kbp of the genomes, approximately 0.8% of the input. Predicted sites occurred in all types of non-coding sequence but were especially enriched in the promoter regions. Comparisons to several experimental datasets show that Magma motifs correspond to a variety of known regulatory motifs. PMID:22300316

  11. Motifs in triadic random graphs based on Steiner triple systems

    NASA Astrophysics Data System (ADS)

    Winkler, Marco; Reichardt, Jörg

    2013-08-01

    Conventionally, pairwise relationships between nodes are considered to be the fundamental building blocks of complex networks. However, over the last decade, the overabundance of certain subnetwork patterns, i.e., the so-called motifs, has attracted much attention. It has been hypothesized that these motifs, instead of links, serve as the building blocks of network structures. Although the relation between a network's topology and the general properties of the system, such as its function, its robustness against perturbations, or its efficiency in spreading information, is the central theme of network science, there is still a lack of sound generative models needed for testing the functional role of subgraph motifs. Our work aims to overcome this limitation. We employ the framework of exponential random graph models (ERGMs) to define models based on triadic substructures. The fact that only a small portion of triads can actually be set independently poses a challenge for the formulation of such models. To overcome this obstacle, we use Steiner triple systems (STSs). These are partitions of sets of nodes into pair-disjoint triads, which thus can be specified independently. Combining the concepts of ERGMs and STSs, we suggest generative models capable of generating ensembles of networks with nontrivial triadic Z-score profiles. Further, we discover inevitable correlations between the abundance of triad patterns, which occur solely for statistical reasons and need to be taken into account when discussing the functional implications of motif statistics. Moreover, we calculate the degree distributions of our triadic random graphs analytically.
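
    The defining property of a Steiner triple system, that every pair of nodes lies in exactly one triad, can be checked mechanically. The sketch below builds the smallest nontrivial example (7 nodes) from the well-known cyclic construction {i, i+1, i+3} mod 7 and verifies the property; it only illustrates the combinatorial object and is not the authors' model code, and the function names are hypothetical.

```python
from itertools import combinations


def cyclic_sts7():
    """Steiner triple system on nodes 0..6 from the base block {0, 1, 3} mod 7."""
    return [frozenset({i % 7, (i + 1) % 7, (i + 3) % 7}) for i in range(7)]


def is_steiner_triple_system(n: int, triples) -> bool:
    """True if every unordered pair of the n nodes occurs in exactly one triple."""
    seen = set()
    for triple in triples:
        if len(triple) != 3:
            return False
        for pair in combinations(sorted(triple), 2):
            if pair in seen:      # pair shared by two triads: not pair-disjoint
                return False
            seen.add(pair)
    return len(seen) == n * (n - 1) // 2  # all pairs covered


triads = cyclic_sts7()
print(len(triads), is_steiner_triple_system(7, triads))
```

    Because no two triads of such a system share a pair of nodes, each triad's internal link pattern can be specified without constraining any other triad, which is the independence property the abstract relies on.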

  12. DNA containing CpG motifs induces angiogenesis

    NASA Astrophysics Data System (ADS)

    Zheng, Mei; Klinman, Dennis M.; Gierynska, Malgorzata; Rouse, Barry T.

    2002-06-01

    New blood vessel formation in the cornea is an essential step in the pathogenesis of a blinding immunoinflammatory reaction caused by ocular infection with herpes simplex virus (HSV). By using a murine corneal micropocket assay, we found that HSV DNA (which contains a significant excess of potentially bioactive "CpG" motifs when compared with mammalian DNA) induces angiogenesis. Moreover, synthetic oligodeoxynucleotides containing CpG motifs attract inflammatory cells and stimulate the release of vascular endothelial growth factor (VEGF), which in turn triggers new blood vessel formation. In vitro, CpG DNA induces the J774A.1 murine macrophage cell line to produce VEGF. In vivo CpG-induced angiogenesis was blocked by the administration of anti-mVEGF Ab or the inclusion of "neutralizing" oligodeoxynucleotides that specifically oppose the stimulatory activity of CpG DNA. These findings establish that DNA containing bioactive CpG motifs induces angiogenesis, and suggest that CpG motifs in HSV DNA may contribute to the blinding lesions of stromal keratitis.

  13. 5. DETAIL VIEW OF THE EGYPTIAN MOTIF DECORATIVE ELEMENTS OF ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    5. DETAIL VIEW OF THE EGYPTIAN MOTIF DECORATIVE ELEMENTS OF BUILDING 1'S MAIN ENTRY TOWER (INCLUDING THE ENGAGED COLUMN CAPITALS, PILASTERS & CAPITALS, CORNICES, AND TERRA COTTA EAGLES); LOOKING SW FROM THE E WING ROOF. (Ryan) - Veterans Administration Medical Center, Building No. 1, Old State Route 13 West, Marion, Williamson County, IL

  14. Variable structure motifs for transcription factor binding sites

    PubMed Central

    2010-01-01

    Background Classically, models of DNA-transcription factor binding sites (TFBSs) have been based on relatively few known instances and have treated them as sites of fixed length using position weight matrices (PWMs). Various extensions to this model have been proposed, most of which take account of dependencies between the bases in the binding sites. However, some transcription factors are known to exhibit some flexibility and bind to DNA in more than one possible physical configuration. In some cases this variation is known to affect the function of binding sites. With the increasing volume of ChIP-seq data available it is now possible to investigate models that incorporate this flexibility. Previous work on variable length models has been constrained by: a focus on specific zinc finger proteins in yeast using restrictive models; a reliance on hand-crafted models for just one transcription factor at a time; and a lack of evaluation on realistically sized data sets. Results We re-analysed binding sites from the TRANSFAC database and found motivating examples where our new variable length model provides a better fit. We analysed several ChIP-seq data sets with a novel motif search algorithm and compared the results to one of the best standard PWM finders and a recently developed alternative method for finding motifs of variable structure. All the methods performed comparably in held-out cross validation tests. Known motifs of variable structure were recovered for p53, Stat5a and Stat5b. In addition our method recovered a novel generalised version of an existing PWM for Sp1 that allows for variable length binding. This motif improved classification performance. Conclusions We have presented a new gapped PWM model for variable length DNA binding sites that is not too restrictive nor over-parameterised. Our comparison with existing tools shows that on average it does not have better predictive accuracy than existing methods. However, it does provide more interpretable
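
    For orientation, the classical fixed-length PWM model that this record takes as its starting point can be sketched as follows. This is a minimal illustration, assuming a uniform background and a simple pseudocount. It is not the authors' gapped, variable-length model, and the function names are hypothetical.

```python
import math

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log2-odds position weight matrix."""
    pwm = []
    for column in counts:
        total = sum(column.get(base, 0) for base in "ACGT") + 4 * pseudocount
        pwm.append({
            base: math.log2((column.get(base, 0) + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm

def best_hit(pwm, sequence):
    """Slide the fixed-length PWM along the sequence; return (score, start) of the best window.

    The sequence must be at least as long as the PWM and contain only A, C, G, T.
    """
    width = len(pwm)
    scores = [
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scores)
```

    Every window has the same width here, which is exactly the restriction that variable-length models aim to relax.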

  15. An Intronic Flk1 Enhancer Directs Arterial-Specific Expression via RBPJ-Mediated Venous Repression

    PubMed Central

    Becker, Philipp W.; Sacilotto, Natalia; Nornes, Svanhild; Neal, Alice; Thomas, Max O.; Liu, Ke; Preece, Chris; Ratnayaka, Indrika; Davies, Benjamin; Bou-Gharios, George

    2016-01-01

    Objective— The vascular endothelial growth factor (VEGF) receptor Flk1 is essential for vascular development, but the signaling and transcriptional pathways by which its expression is regulated in endothelial cells remain unclear. Although previous studies have identified 2 Flk1 regulatory enhancers, these are dispensable for Flk1 expression, indicating that additional enhancers contribute to Flk1 regulation in endothelial cells. In the present study, we sought to identify Flk1 enhancers contributing to expression in endothelial cells. Approach and Results— A region of the 10th intron of the Flk1 gene (Flk1in10) was identified as a putative enhancer and tested in mouse and zebrafish transgenic models. This region robustly directed reporter gene expression in arterial endothelial cells. Using a combination of targeted mutagenesis of transcription factor–binding sites and gene silencing of transcription factors, we found that Gata and Ets factors are required for Flk1in10 enhancer activity in all endothelial cells. Furthermore, we showed that activity of the Flk1in10 enhancer is restricted to arteries through repression of gene expression in venous endothelial cells by the Notch pathway transcriptional regulator Rbpj. Conclusions— This study demonstrates a novel mechanism of arterial–venous identity acquisition, indicates a direct link between the Notch and VEGF signaling pathways, and illustrates how cis-regulatory diversity permits differential expression outcomes from a limited repertoire of transcriptional regulators. PMID:27079877

  16. Discovering common stem–loop motifs in unaligned RNA sequences

    PubMed Central

    Gorodkin, Jan; Stricklin, Shawn L.; Stormo, Gary D.

    2001-01-01

    Post-transcriptional regulation of gene expression is often accomplished by proteins binding to specific sequence motifs in mRNA molecules, to affect their translation or stability. The motifs are often composed of a combination of sequence and structural constraints such that the overall structure is preserved even though much of the primary sequence is variable. While several methods exist to discover transcriptional regulatory sites in the DNA sequences of coregulated genes, the RNA motif discovery problem is much more difficult because of covariation in the positions. We describe the combined use of two approaches for RNA structure prediction, FOLDALIGN and COVE, that together can discover and model stem–loop RNA motifs in unaligned sequences, such as UTRs from post-transcriptionally coregulated genes. We evaluate the method on two datasets, one a section of rRNA genes with randomly truncated ends so that a global alignment is not possible, and the other a hyper-variable collection of IRE-like elements that were inserted into randomized UTR sequences. In both cases the combined method identified the motifs correctly, and in the rRNA example we show that it is capable of determining the structure, which includes bulge and internal loops as well as a variable length hairpin loop. Those automated results are quantitatively evaluated and found to agree closely with structures contained in curated databases, with correlation coefficients up to 0.9. A basic server, Stem–Loop Align SearcH (SLASH), which will perform stem–loop searches in unaligned RNA sequences, is available at http://www.bioinf.au.dk/slash/. PMID:11353083

  17. Mutation of the Zinc-Binding Metalloprotease Motif Affects Bacteroides fragilis Toxin Activity but Does Not Affect Propeptide Processing

    PubMed Central

    Franco, Augusto A.; Buckwold, Simy L.; Shin, Jai W.; Ascon, Miguel; Sears, Cynthia L.

    2005-01-01

    To evaluate the role of the zinc-binding metalloprotease in Bacteroides fragilis toxin (BFT) processing and activity, the zinc-binding consensus sequences (H348, E349, H352, G355, H358, and M366) were mutated by site-directed mutagenesis. Our results indicated that single point mutations in the zinc-binding metalloprotease motif do not affect BFT processing but do reduce or eliminate BFT biologic activity in vitro. PMID:16041055

  18. Composite motifs integrating multiple protein structures increase sensitivity for function prediction.

    PubMed

    Chen, Brian Y; Bryant, Drew H; Cruess, Amanda E; Bylund, Joseph H; Fofanov, Viacheslav Y; Kristensen, David M; Kimmel, Marek; Lichtarge, Olivier; Kavraki, Lydia E

    2007-01-01

    The study of disease often hinges on the biological function of proteins, but determining protein function is a difficult experimental process. To minimize duplicated effort, algorithms for function prediction seek characteristics indicative of possible protein function. One approach is to identify substructural matches of geometric and chemical similarity between motifs representing known active sites and target protein structures with unknown function. In earlier work, statistically significant matches of certain effective motifs have identified functionally related active sites. Effective motifs must be carefully designed to maintain similarity to functionally related sites (sensitivity) and avoid incidental similarities to functionally unrelated protein geometry (specificity). Existing motif design techniques use the geometry of a single protein structure. Poor selection of this structure can limit motif effectiveness if the selected functional site lacks similarity to functionally related sites. To address this problem, this paper presents composite motifs, which combine structures of functionally related active sites to potentially increase sensitivity. Our experimentation compares the effectiveness of composite motifs with simple motifs designed from single protein structures. On six distinct families of functionally related proteins, leave-one-out testing showed that composite motifs had sensitivity comparable to the most sensitive of all simple motifs and specificity comparable to the average simple motif. On our data set, we observed that composite motifs simultaneously capture variations in active site conformation, diminish the problem of selecting motif structures, and enable the fusion of protein structures from diverse data sources. PMID:17951837

  19. Exploring water binding motifs to an excess electron via X2(-)(H2O) [X = O, F].

    PubMed

    Chiou, Mong-Feng; Sheu, Wen-Shyan

    2012-07-26

    X2(-)(H2O) [X = O, F] is utilized to explore water binding motifs to an excess electron via ab initio calculations at the MP4(SDQ)/aug-cc-pVDZ + diffs(2s2p,2s2p) level of theory. X2(-)(H2O) can be regarded as a water molecule that binds to an excess electron, the distribution of which is gauged by X2. By varying the interatomic distance of X2, r(X1-X2), the distribution of the excess electron is altered, and the water binding motifs to the excess electron are then examined. Depending on r(X1-X2), binding motifs of both Cs and C2v forms are found, with critical distances of ∼1.37 Å and ∼1.71 Å for O2(-)(H2O) and F2(-)(H2O), respectively. The energetic and geometrical features of O2(-)(H2O) and F2(-)(H2O) are compared. In addition, various electronic properties of X2(-)(H2O) are examined. For both O2(-)(H2O) and F2(-)(H2O), the Cs binding motif appears to prevail at a compact distribution of the excess electron. However, when the electron is diffuse, characterized by the radius of gyration in the direction of the X2 bond axis with a threshold of ∼0.84 Å, the C2v binding motif is formed. PMID:22762788

  20. The RNA recognition motif domains of RBM5 are required for RNA binding and cancer cell proliferation inhibition

    SciTech Connect

    Zhang, Lei; Zhang, Qing; Yang, Yu; Wu, Chuanfang

    2014-02-14

    Highlights: • RNA recognition motif domains of RBM5 are essential for cell proliferation inhibition. • RNA recognition motif domains of RBM5 are essential for apoptosis induction. • RNA recognition motif domains of RBM5 are essential for RNA binding. • RNA recognition motif domains of RBM5 are essential for caspase-2 alternative splicing. - Abstract: RBM5 is a putative tumor suppressor gene that has been shown to function in cell growth inhibition by modulating apoptosis. RBM5 also plays a critical role in alternative splicing as an RNA binding protein. However, it is still unclear which domains of RBM5 are required for RNA binding and related functional activities. We hypothesized that the two putative RNA recognition motif (RRM) domains of RBM5, spanning amino acids 98–178 and 231–315, are essential for RBM5-mediated cell growth inhibition, apoptosis regulation, and RNA binding. To investigate this hypothesis, we evaluated the activities of wild-type and mutant RBM5 gene transfer in low-RBM5-expressing A549 cells. We found that, unlike wild-type RBM5 (RBM5-wt), an RBM5 mutant lacking the two RRM domains (RBM5-ΔRRM) is unable to bind RNA, has compromised caspase-2 alternative splicing activity, and lacks cell proliferation inhibition and apoptosis induction function in A549 cells. These data provide direct evidence that the two RRM domains of RBM5 are required for RNA binding and that the RNA binding activity of RBM5 contributes to its function in apoptosis induction and cell growth inhibition.

  1. Glycines from the APP GXXXG/GXXXA Transmembrane Motifs Promote Formation of Pathogenic Aβ Oligomers in Cells.

    PubMed

    Decock, Marie; Stanga, Serena; Octave, Jean-Noël; Dewachter, Ilse; Smith, Steven O; Constantinescu, Stefan N; Kienlen-Campard, Pascal

    2016-01-01

    Alzheimer's disease (AD) is the most common neurodegenerative disorder characterized by progressive cognitive decline leading to dementia. The amyloid precursor protein (APP) is a ubiquitous type I transmembrane (TM) protein sequentially processed to generate the β-amyloid peptide (Aβ), the major constituent of senile plaques that are typical AD lesions. There is a growing body of evidence that soluble Aβ oligomers correlate with clinical symptoms associated with the disease. The Aβ sequence begins in the extracellular juxtamembrane region of APP and includes roughly half of the TM domain. This region contains GXXXG and GXXXA motifs, which are critical for both TM protein interactions and fibrillogenic properties of peptides derived from TM α-helices. Glycine-to-leucine mutations of these motifs were previously shown to affect APP processing and Aβ production in cells. However, the detailed contribution of these motifs to APP dimerization, their relation to processing, and the conformational changes they can induce within Aβ species remain undefined. Here, we describe highly resistant Aβ42 oligomers that are produced in cellular membrane compartments. They are formed in cells by processing of the APP amyloidogenic C-terminal fragment (C99), or by direct expression of a peptide corresponding to Aβ42, but not to Aβ40. By a point-mutation approach, we demonstrate that glycine-to-leucine mutations in the G29XXXG33 and G38XXXA42 motifs dramatically affect the Aβ oligomerization process. G33 and G38 in these motifs are specifically involved in Aβ oligomerization; the G33L mutation strongly promotes oligomerization, while G38L blocks it with a dominant effect on G33 residue modification. Finally, we report that the secreted Aβ42 oligomers display pathological properties consistent with their suggested role in AD, but do not induce toxicity in survival assays with neuronal cells. Exposure of neurons to these Aβ42 oligomers dramatically affects

  2. Glycines from the APP GXXXG/GXXXA Transmembrane Motifs Promote Formation of Pathogenic Aβ Oligomers in Cells

    PubMed Central

    Decock, Marie; Stanga, Serena; Octave, Jean-Noël; Dewachter, Ilse; Smith, Steven O.; Constantinescu, Stefan N.; Kienlen-Campard, Pascal

    2016-01-01

    Alzheimer’s disease (AD) is the most common neurodegenerative disorder characterized by progressive cognitive decline leading to dementia. The amyloid precursor protein (APP) is a ubiquitous type I transmembrane (TM) protein sequentially processed to generate the β-amyloid peptide (Aβ), the major constituent of senile plaques that are typical AD lesions. There is a growing body of evidence that soluble Aβ oligomers correlate with clinical symptoms associated with the disease. The Aβ sequence begins in the extracellular juxtamembrane region of APP and includes roughly half of the TM domain. This region contains GXXXG and GXXXA motifs, which are critical for both TM protein interactions and fibrillogenic properties of peptides derived from TM α-helices. Glycine-to-leucine mutations of these motifs were previously shown to affect APP processing and Aβ production in cells. However, the detailed contribution of these motifs to APP dimerization, their relation to processing, and the conformational changes they can induce within Aβ species remain undefined. Here, we describe highly resistant Aβ42 oligomers that are produced in cellular membrane compartments. They are formed in cells by processing of the APP amyloidogenic C-terminal fragment (C99), or by direct expression of a peptide corresponding to Aβ42, but not to Aβ40. By a point-mutation approach, we demonstrate that glycine-to-leucine mutations in the G29XXXG33 and G38XXXA42 motifs dramatically affect the Aβ oligomerization process. G33 and G38 in these motifs are specifically involved in Aβ oligomerization; the G33L mutation strongly promotes oligomerization, while G38L blocks it with a dominant effect on G33 residue modification. Finally, we report that the secreted Aβ42 oligomers display pathological properties consistent with their suggested role in AD, but do not induce toxicity in survival assays with neuronal cells. Exposure of neurons to these Aβ42 oligomers dramatically affects neuronal
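
    The GXXXG and GXXXA motifs named in this record can be located with a short pattern search. The sketch below is illustrative only: the Aβ42 sequence is supplied here as an assumption (it is not given in the abstract), and the function name is hypothetical.

```python
import re

# Human Abeta42 sequence in one-letter code (assumed here; not quoted in the abstract).
# Residue numbering starts at 1.
ABETA42 = "DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVVIA"

# A lookahead is used so that overlapping motifs, which share a glycine, are all reported.
GXXX_GA = re.compile(r"(?=(G...[GA]))")

def find_gxxxg_motifs(sequence):
    """Return (1-based start, matched 5-mer) for every GXXXG or GXXXA occurrence."""
    return [(match.start() + 1, match.group(1)) for match in GXXX_GA.finditer(sequence)]
```

    Applied to the sequence above, the search reports motifs starting at residues 29 and 38, which correspond to the G29XXXG33 and G38XXXA42 motifs discussed in the abstract, along with further overlapping GXXXG occurrences.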

  3. FPGA implementation of motifs-based neuronal network and synchronization analysis

    NASA Astrophysics Data System (ADS)

    Deng, Bin; Zhu, Zechen; Yang, Shuangming; Wei, Xile; Wang, Jiang; Yu, Haitao

    2016-06-01

    Motifs in complex networks play a crucial role in determining brain functions. In this paper, 13 kinds of motifs are implemented on a Field Programmable Gate Array (FPGA) to investigate the relationships between network properties and motif properties. We use a discretization method and a pipelined architecture to construct various motifs with the Hindmarsh-Rose (HR) neuron as the node model. We also build a small-world network based on these motifs and conduct a synchronization analysis of the motifs as well as the constructed network. We find that the synchronization properties of a motif determine those of the motif-based small-world network, which demonstrates the effectiveness of our proposed hardware simulation platform. By imitating some vital nuclei in the brain to generate normal discharges, our proposed FPGA-based artificial neuronal networks have the potential to replace injured nuclei and complete the brain function in the treatment of Parkinson's disease and epilepsy.

  4. Characterization of an RNA receptor motif that recognizes a GCGA tetraloop.

    PubMed

    Furukawa, Airi; Maejima, Takaya; Matsumura, Shigeyoshi; Ikawa, Yoshiya

    2016-07-01

    Tertiary interactions between a new RNA motif and RNA tetraloops were analyzed to determine whether this new motif shows preference for a GCGA tetraloop. In the structural context of a ligase ribozyme, this motif discriminated GCGA loop from 3 other tetraloops. The affinity between the GCGA loop and its receptor is strong enough to carry out the ribozyme activity. PMID:26967268

  5. Identifiability and inference of pathway motifs by epistasis analysis.

    PubMed

    Phenix, Hilary; Perkins, Theodore; Kærn, Mads

    2013-06-01

    The accuracy of genetic network inference is limited by the assumptions used to determine if one hypothetical model is better than another in explaining experimental observations. Most previous work on epistasis analysis, in which one attempts to infer pathway relationships by determining equivalences among traits following mutations, has been based on Boolean or linear models. Here, we delineate the ultimate limits of epistasis-based inference by systematically surveying all two-gene network motifs and use symbolic algebra with arbitrary regulation functions to examine trait equivalences. Our analysis divides the motifs into equivalence classes, where different genetic perturbations result in indistinguishable experimental outcomes. We demonstrate that this partitioning can reveal important information about network architecture, and show, using simulated data, that it greatly improves the accuracy of genetic network inference methods. Because of the minimal assumptions involved, equivalence partitioning has broad applicability for gene network inference. PMID:23822501

  6. Distance conservation of transcriptional and splicing regulatory motifs

    NASA Astrophysics Data System (ADS)

    Lu, Jun; Ding, Changjiang

    2012-09-01

    Distance conservation is a new kind of genomic evolutionary conservation. Transcriptional and splicing regulatory k-mer motifs are functionally important DNA sequence elements. We demonstrated that the distance between pairs of these k-mers is evolutionarily conserved in genomic sequences. This kind of conservation is not based on the strict location of bases in genome sequences, and does not depend on an excess frequency of occurrence of k-mers. By utilizing the conservation of k-mer distance, it is possible to design a non-alignment-based approach to quickly identify transcriptional or splicing regulatory motifs on a genome-wide scale. In this paper we summarize our previous studies on distance conservation, introduce the method of distance conservation, and indicate the prospects of its application.
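
    The idea of comparing k-mer pair distances without an alignment can be sketched as follows. This is a rough illustration under simple assumptions (exact k-mer matches, start-to-start distances, a hypothetical distance cap of 50 bases) and should not be read as the authors' method; all function names are made up for the example.

```python
def kmer_positions(sequence, kmer):
    """Return all 0-based start positions of kmer in sequence (overlaps allowed)."""
    positions = []
    start = sequence.find(kmer)
    while start != -1:
        positions.append(start)
        start = sequence.find(kmer, start + 1)
    return positions

def pair_distances(sequence, kmer_a, kmer_b, max_distance=50):
    """Return sorted start-to-start distances from kmer_a to downstream kmer_b occurrences."""
    return sorted(
        b - a
        for a in kmer_positions(sequence, kmer_a)
        for b in kmer_positions(sequence, kmer_b)
        if 0 < b - a <= max_distance
    )

def shared_distances(seq_1, seq_2, kmer_a, kmer_b, max_distance=50):
    """Distances between the k-mer pair that occur in both sequences (a crude conservation proxy)."""
    first = set(pair_distances(seq_1, kmer_a, kmer_b, max_distance))
    second = set(pair_distances(seq_2, kmer_a, kmer_b, max_distance))
    return sorted(first & second)
```

    Only the spacing between the two k-mers is compared, so the pair can sit at different absolute positions in each sequence, which is the sense in which the approach does not depend on base-by-base alignment.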

  7. A new motif for inhibitors of geranylgeranyl diphosphate synthase.

    PubMed

    Foust, Benjamin J; Allen, Cheryl; Holstein, Sarah A; Wiemer, David F

    2016-08-15

    The enzyme geranylgeranyl diphosphate synthase (GGDPS) is believed to receive the substrate farnesyl diphosphate through one lipophilic channel and release the product geranylgeranyl diphosphate through another. Bisphosphonates with two isoprenoid chains positioned on the α-carbon have proven to be effective inhibitors of this enzyme. Now a new motif has been prepared with one isoprenoid chain on the α-carbon, a second included as a phosphonate ester, and the potential for a third at the α-carbon. The pivaloyloxymethyl prodrugs of several compounds based on this motif have been prepared and the resulting compounds have been tested for their ability to disrupt protein geranylgeranylation and induce cytotoxicity in myeloma cells. The initial biological studies reveal activity consistent with GGDPS inhibition, and demonstrate a structure-function relationship which is dependent on the nature of the alkyl group at the α-carbon. PMID:27338660

  8. Identifiability and inference of pathway motifs by epistasis analysis

    NASA Astrophysics Data System (ADS)

    Phenix, Hilary; Perkins, Theodore; Kærn, Mads

    2013-06-01

    The accuracy of genetic network inference is limited by the assumptions used to determine if one hypothetical model is better than another in explaining experimental observations. Most previous work on epistasis analysis—in which one attempts to infer pathway relationships by determining equivalences among traits following mutations—has been based on Boolean or linear models. Here, we delineate the ultimate limits of epistasis-based inference by systematically surveying all two-gene network motifs and use symbolic algebra with arbitrary regulation functions to examine trait equivalences. Our analysis divides the motifs into equivalence classes, where different genetic perturbations result in indistinguishable experimental outcomes. We demonstrate that this partitioning can reveal important information about network architecture, and show, using simulated data, that it greatly improves the accuracy of genetic network inference methods. Because of the minimal assumptions involved, equivalence partitioning has broad applicability for gene network inference.

  9. Motif, the basics: an overview of the widget set

    SciTech Connect

    McClurg, F.R.

    1992-10-01

    The Motif library provides programmers with a rich set of tools for building a graphical user interface with a three-dimensional appearance and a consistent method of interaction for controlling a Unix application. This Xt-based, high-level library presents an "object-oriented" approach to program design for programmers and allows end-users the flexibility to modify attributes of the interface.

  10. Motif, the basics: an overview of the widget set

    SciTech Connect

    McClurg, F.R.

    1992-10-01

    The Motif library provides programmers with a rich set of tools for building a graphical user interface with a three-dimensional appearance and a consistent method of interaction for controlling a Unix application. This Xt-based, high-level library presents an "object-oriented" approach to program design for programmers and allows end-users the flexibility to modify attributes of the interface.

  11. Biosynthesis of caffeine underlying the diversity of motif B' methyltransferase.

    PubMed

    Nakayama, Fumiyo; Mizuno, Kouichi; Kato, Misako

    2015-05-01

    Caffeine (1,3,7-trimethylxanthine) and theobromine (3,7-dimethylxanthine) are well-known purine alkaloids in Camellia, Coffea, Cola, Paullinia, Ilex, and Theobroma spp. The caffeine biosynthetic pathway depends on the substrate specificity of N-methyltransferases, which are members of the motif B' methyltransferase family. The caffeine biosynthetic pathways in purine alkaloid-containing plants might have evolved in parallel with one another, consistent with different catalytic properties of the enzymes involved in these pathways. PMID:26058161

  12. Graph animals, subgraph sampling, and motif search in large networks

    NASA Astrophysics Data System (ADS)

    Baskerville, Kim; Grassberger, Peter; Paczuski, Maya

    2007-09-01

    We generalize a sampling algorithm for lattice animals (connected clusters on a regular lattice) to a Monte Carlo algorithm for “graph animals,” i.e., connected subgraphs in arbitrary networks. As with the algorithm in [N. Kashtan et al., Bioinformatics 20, 1746 (2004)], it provides a weighted sample, but the computation of the weights is much faster (linear in the size of subgraphs, instead of superexponential). This allows subgraphs with up to ten or more nodes to be sampled with very high statistics, from arbitrarily large networks. Using this together with a heuristic algorithm for rapidly classifying isomorphic graphs, we present results for two protein interaction networks obtained using the tandem affinity purification (TAP) method: one of Escherichia coli with 230 nodes and 695 links, and one for yeast (Saccharomyces cerevisiae) with roughly ten times more nodes and links. We find in both cases that most connected subgraphs are strong motifs (Z scores > 10) or antimotifs (Z scores < -10) when the null model is the ensemble of networks with fixed degree sequence. Strong differences appear between the two networks, with dominant motifs in E. coli being (nearly) bipartite graphs and having many pairs of nodes that connect to the same neighbors, while dominant motifs in yeast tend towards completeness or contain large cliques. We also explore a number of methods that do not rely on measurements of Z scores or comparisons with null models. For instance, we discuss the influence of specific complexes like the 26S proteasome in yeast, where a small number of complexes dominate the k cores with large k and have a decisive effect on the strongest motifs with 6-8 nodes. We also present Zipf plots of counts versus rank. They show broad distributions that are not power laws, in contrast to the case when disconnected subgraphs are included.
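
    The Z-score criterion quoted above can be written out as a small sketch. It assumes that subgraph counts from an ensemble of null-model networks are already available (how they are generated is outside this example), and the function names and default threshold argument are illustrative rather than taken from the paper.

```python
from statistics import mean, stdev

def motif_z_score(observed_count, null_counts):
    """Z score of a subgraph count relative to counts from null-model networks.

    null_counts must contain at least two values.
    """
    sigma = stdev(null_counts)
    if sigma == 0:
        raise ValueError("null-model counts have zero variance; Z score is undefined")
    return (observed_count - mean(null_counts)) / sigma

def classify(z_score, threshold=10.0):
    """Label a subgraph using the |Z| > 10 cut-off quoted in the abstract."""
    if z_score > threshold:
        return "strong motif"
    if z_score < -threshold:
        return "strong antimotif"
    return "neither"
```

    The result depends entirely on the null ensemble, here networks with a fixed degree sequence, so the same observed count can change category under a different null model.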

  13. A Monte Carlo-based framework enhances the discovery and interpretation of regulatory sequence motifs

    PubMed Central

    2012-01-01

    Background Discovery of functionally significant short, statistically overrepresented subsequence patterns (motifs) in a set of sequences is a challenging problem in bioinformatics. Oftentimes, not all sequences in the set contain a motif. These non-motif-containing sequences complicate the algorithmic discovery of motifs. Filtering the non-motif-containing sequences from the larger set of sequences while simultaneously determining the identity of the motif is, therefore, desirable and a non-trivial problem in motif discovery research. Results We describe MotifCatcher, a framework that extends the sensitivity of existing motif-finding tools by employing random sampling to effectively remove non-motif-containing sequences from the motif search. We developed two implementations of our algorithm; each built around a commonly used motif-finding tool, and applied our algorithm to three diverse chromatin immunoprecipitation (ChIP) data sets. In each case, the motif finder with the MotifCatcher extension demonstrated improved sensitivity over the motif finder alone. Our approach organizes candidate functionally significant discovered motifs into a tree, which allowed us to make additional insights. In all cases, we were able to support our findings with experimental work from the literature. Conclusions Our framework demonstrates that additional processing at the sequence entry level can significantly improve the performance of existing motif-finding tools. For each biological data set tested, we were able to propose novel biological hypotheses supported by experimental work from the literature. Specifically, in Escherichia coli, we suggested binding site motifs for 6 non-traditional LexA protein binding sites; in Saccharomyces cerevisiae, we hypothesize 2 disparate mechanisms for novel binding sites of the Cse4p protein; and in Halobacterium sp. NRC-1, we discovered subtle differences in a general transcription factor (GTF) binding site motif across several data sets. We

  14. Linear motifs confer functional diversity onto splice variants

    PubMed Central

    Weatheritt, Robert J.; Davey, Norman E.; Gibson, Toby J.

    2012-01-01

    The pre-translational modification of messenger ribonucleic acids (mRNAs) by alternative promoter usage and alternative splicing is an important source of pleiotropy. Despite intensive efforts, our understanding of the functional implications of this dynamically created diversity is still incomplete. Using the available knowledge of interaction modules, particularly within intrinsically disordered regions (IDRs), we analysed the occurrences of protein modules within alternative exons. We find that regions removed or included by pre-translational variation are enriched in linear motifs suggesting that the removal or inclusion of exons containing these interaction modules is an important regulatory mechanism. In particular, we observe that PDZ-, PTB-, SH2- and WW-domain binding motifs are more likely to occur within alternative exons. We also determine that regions removed or included by alternative promoter usage are enriched in IDRs suggesting that protein isoform diversity is tightly coupled to the modulation of IDRs. This study, therefore, demonstrates that short linear motifs are key components for establishing protein diversity between splice variants. PMID:22638587

  15. Structure and ubiquitin binding of the ubiquitin-interacting motif

    SciTech Connect

    Fisher, R.; Wang, B.; Alam, S.; Higginson, D.; Robinson, H.; Sundquist, C.; Hill, C.

    2003-01-01

    Ubiquitylation is used to target proteins into a large number of different biological processes including proteasomal degradation, endocytosis, virus budding, and vacuolar protein sorting (Vps). Ubiquitylated proteins are typically recognized using one of several different conserved ubiquitin binding modules. Here, we report the crystal structure and ubiquitin binding properties of one such module, the ubiquitin-interacting motif (UIM). We found that UIM peptides from several proteins involved in endocytosis and vacuolar protein sorting including Hrs, Vps27p, Stam1, and Eps15 bound specifically, but with modest affinity (K_d = 0.1-1 mM), to free ubiquitin. Full affinity ubiquitin binding required the presence of conserved acidic patches at the N and C terminus of the UIM, as well as highly conserved central alanine and serine residues. NMR chemical shift perturbation mapping experiments demonstrated that all of these UIM peptides bind to the I44 surface of ubiquitin. The 1.45 Å resolution crystal structure of the second yeast Vps27p UIM (Vps27p-2) revealed that the ubiquitin-interacting motif forms an amphipathic helix. Although Vps27p-2 is monomeric in solution, the motif unexpectedly crystallized as an antiparallel four-helix bundle, and the potential biological implications of UIM oligomerization are therefore discussed.

  16. Maximum likelihood density modification by pattern recognition of structural motifs

    DOEpatents

    Terwilliger, Thomas C.

    2004-04-13

    An electron density for a crystallographic structure having protein regions and solvent regions is improved by maximizing the log likelihood of a set of structure factors $\{F_h\}$ using a local log-likelihood function: $\ln[\,p(\rho(x)\mid\mathrm{PROT})\,p_{\mathrm{PROT}}(x) + p(\rho(x)\mid\mathrm{SOLV})\,p_{\mathrm{SOLV}}(x) + p(\rho(x)\mid H)\,p_{H}(x)\,]$, where $p_{\mathrm{PROT}}(x)$ is the probability that $x$ is in the protein region, $p(\rho(x)\mid\mathrm{PROT})$ is the conditional probability for $\rho(x)$ given that $x$ is in the protein region, and $p_{\mathrm{SOLV}}(x)$ and $p(\rho(x)\mid\mathrm{SOLV})$ are the corresponding quantities for the solvent region; $p_{H}(x)$ is the probability that there is a structural motif at a known location, with a known orientation, in the vicinity of the point $x$; and $p(\rho(x)\mid H)$ is the probability distribution for electron density at this point given that the structural motif actually is present. One appropriate structural motif is a helical structure within the crystallographic structure.
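
    The local log-likelihood described in this record can be illustrated with a minimal sketch. It assumes that the truncated leading term of the expression is the protein term (the product of the two PROT quantities the abstract defines) and that the function is the natural logarithm of the three-term sum. The function name, argument names, and numbers below are hypothetical and are not taken from the patent.

```python
import math

def local_log_likelihood(p_rho_prot, p_prot, p_rho_solv, p_solv,
                         p_rho_motif=0.0, p_motif=0.0):
    """Log of the mixture probability of the density value at one point x.

    Each pair is (conditional probability of rho(x) given the region,
    probability that x lies in that region). All names are illustrative.
    """
    total = (p_rho_prot * p_prot
             + p_rho_solv * p_solv
             + p_rho_motif * p_motif)
    if total <= 0.0:
        raise ValueError("mixture probability must be positive")
    return math.log(total)

# Toy values only: 0.2*0.5 + 0.8*0.4 + 0.5*0.1 = 0.47
ll = local_log_likelihood(0.2, 0.5, 0.8, 0.4, 0.5, 0.1)
```

    In this reading, a full density-modification procedure would sum such terms over grid points; that outer loop is omitted here.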

  17. TOPDOM: database of conservatively located domains and motifs in proteins

    PubMed Central

    Varga, Julia; Dobson, László; Tusnády, Gábor E.

    2016-01-01

    Summary: The TOPDOM database—originally created as a collection of domains and motifs located consistently on the same side of the membranes in α-helical transmembrane proteins—has been updated and extended by taking into consideration consistently localized domains and motifs in globular proteins, too. By taking advantage of the recently developed CCTOP algorithm to determine the type of a protein and predict topology in case of transmembrane proteins, and by applying a thorough search for domains and motifs as well as utilizing the most up-to-date version of all source databases, we managed to reach a 6-fold increase in the size of the whole database and a 2-fold increase in the number of transmembrane proteins. Availability and implementation: TOPDOM database is available at http://topdom.enzim.hu. The webpage utilizes the common Apache, PHP5 and MySQL software to provide the user interface for accessing and searching the database. The database itself is generated on a high performance computer. Contact: tusnady.gabor@ttk.mta.hu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153630

  18. Event Networks and the Identification of Crime Pattern Motifs.

    PubMed

    Davies, Toby; Marchione, Elio

    2015-01-01

    In this paper we demonstrate the use of network analysis to characterise patterns of clustering in spatio-temporal events. Such clustering is of both theoretical and practical importance in the study of crime, and forms the basis for a number of preventative strategies. However, existing analytical methods show only that clustering is present in data, while offering little insight into the nature of the patterns present. Here, we show how the classification of pairs of events as close in space and time can be used to define a network, thereby generalising previous approaches. The application of graph-theoretic techniques to these networks can then offer significantly deeper insight into the structure of the data than previously possible. In particular, we focus on the identification of network motifs, which have clear interpretation in terms of spatio-temporal behaviour. Statistical analysis is complicated by the nature of the underlying data, and we provide a method by which appropriate randomised graphs can be generated. Two datasets are used as case studies: maritime piracy at the global scale, and residential burglary in an urban area. In both cases, the same significant 3-vertex motif is found; this result suggests that incidents tend to occur not just in pairs, but in fact in larger groups within a restricted spatio-temporal domain. In the 4-vertex case, different motifs are found to be significant in each case, suggesting that this technique is capable of discriminating between clustering patterns at a finer granularity than previously possible. PMID:26605544
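
    The network construction described in this record (linking pairs of events that are close in both space and time) can be sketched as follows. This is a simplified, undirected illustration under assumed Euclidean coordinates; the thresholds, the event tuples, and the function names are hypothetical, and the authors' actual procedure, including their randomisation method, may differ.

```python
from itertools import combinations
from math import hypot

def build_event_network(events, max_dist, max_dt):
    """events: list of (x, y, t). Link two events if they are close in space AND time."""
    adj = {i: set() for i in range(len(events))}
    for i, j in combinations(range(len(events)), 2):
        xi, yi, ti = events[i]
        xj, yj, tj = events[j]
        if hypot(xi - xj, yi - yj) <= max_dist and abs(ti - tj) <= max_dt:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def count_triangles(adj):
    """Count fully connected 3-vertex subgraphs (one possible 3-vertex motif)."""
    return sum(1 for a, b, c in combinations(sorted(adj), 3)
               if b in adj[a] and c in adj[a] and c in adj[b])

# Made-up events: three clustered incidents plus two that are near in space but far in time.
events = [(0, 0, 0), (1, 0, 1), (0, 1, 2), (10, 10, 1), (10, 11, 30)]
net = build_event_network(events, max_dist=2.0, max_dt=3)
triangles = count_triangles(net)
```

    The pairwise loop is quadratic in the number of events, which is acceptable for a sketch but would likely need spatial indexing for large datasets.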

  19. Motif structure and cooperation in real-world complex networks

    NASA Astrophysics Data System (ADS)

    Salehi, Mostafa; Rabiee, Hamid R.; Jalili, Mahdi

    2010-12-01

    Networks of dynamical nodes serve as generic models for real-world systems in many branches of science ranging from mathematics to physics, technology, sociology and biology. Collective behavior of agents interacting over complex networks is important in many applications. The cooperation between selfish individuals is one of the most interesting collective phenomena. In this paper we address the interplay between the motifs’ cooperation properties and their abundance in a number of real-world networks including yeast protein-protein interaction, human brain, protein structure, email communication, dolphins’ social interaction, Zachary karate club and Net-science coauthorship networks. First, the cooperativity of every possible undirected subgraph with three to six nodes is calculated. To this end, the evolutionary dynamics of the Prisoner’s Dilemma game is considered and the cooperativity of each subgraph is calculated as the percentage of cooperating agents at the end of the simulation time. Then, the three- to six-node motifs are extracted for each network. The significance of the abundance of a motif, represented by a Z-value, is obtained by comparing its abundance with that in properly randomized versions of the original network. We found that there is always a group of motifs showing a significant inverse correlation between their cooperativity and Z-value, i.e. the higher the Z-value, the lower the cooperativity. This suggests that networks composed of well-structured units do not have good cooperativity properties.
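
    The Z-value mentioned in this record is conventionally the observed motif count minus the mean count in randomized networks, divided by the standard deviation of the randomized counts. The sketch below assumes that convention and uses the sample standard deviation; the abstract does not state which estimator the authors used, and the counts are invented for illustration.

```python
from statistics import mean, stdev

def motif_z_score(observed_count, randomized_counts):
    """Z = (N_observed - mean(N_random)) / std(N_random)."""
    mu = mean(randomized_counts)
    sigma = stdev(randomized_counts)  # sample standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("randomized counts have zero variance; Z is undefined")
    return (observed_count - mu) / sigma

# Hypothetical counts of one motif in the real network and in five randomized copies.
z = motif_z_score(30, [10, 12, 8, 11, 9])
```

    A large positive value indicates that the subgraph is over-represented relative to the randomized ensemble.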

  20. Event Networks and the Identification of Crime Pattern Motifs

    PubMed Central

    2015-01-01

    In this paper we demonstrate the use of network analysis to characterise patterns of clustering in spatio-temporal events. Such clustering is of both theoretical and practical importance in the study of crime, and forms the basis for a number of preventative strategies. However, existing analytical methods show only that clustering is present in data, while offering little insight into the nature of the patterns present. Here, we show how the classification of pairs of events as close in space and time can be used to define a network, thereby generalising previous approaches. The application of graph-theoretic techniques to these networks can then offer significantly deeper insight into the structure of the data than previously possible. In particular, we focus on the identification of network motifs, which have clear interpretation in terms of spatio-temporal behaviour. Statistical analysis is complicated by the nature of the underlying data, and we provide a method by which appropriate randomised graphs can be generated. Two datasets are used as case studies: maritime piracy at the global scale, and residential burglary in an urban area. In both cases, the same significant 3-vertex motif is found; this result suggests that incidents tend to occur not just in pairs, but in fact in larger groups within a restricted spatio-temporal domain. In the 4-vertex case, different motifs are found to be significant in each case, suggesting that this technique is capable of discriminating between clustering patterns at a finer granularity than previously possible. PMID:26605544

  1. GxxxG motifs hold the TIM23 complex together.

    PubMed

    Demishtein-Zohary, Keren; Marom, Milit; Neupert, Walter; Mokranjac, Dejana; Azem, Abdussalam

    2015-06-01

    Approximately 99% of the mitochondrial proteome is nucleus-encoded, synthesized in the cytosol, and subsequently imported into and sorted to the correct compartment in the organelle. The translocase of the inner mitochondrial membrane 23 (TIM23) complex is the major protein translocase of the inner membrane, and is responsible for translocation of proteins across the inner membrane and their insertion into the inner membrane. Tim23 is the central component of the complex that forms the import channel. A high-resolution structure of the import channel is still missing, and structural elements important for its function are unknown. In the present study, we analyzed the importance of the highly abundant GxxxG motifs in the transmembrane segments of Tim23 for the structural integrity of the TIM23 complex. Of 10 glycines present in the GxxxG motifs in the first, second and third transmembrane segments of Tim23, mutations of three of them in transmembrane segments 1 and 2 resulted in a lethal phenotype, and mutations of three others in a temperature-sensitive phenotype. The remaining four caused no obvious growth phenotype. Importantly, none of the mutations impaired the import and membrane integration of Tim23 precursor into mitochondria. However, the severity of growth impairment correlated with the destabilization of the TIM23 complex. We conclude that the GxxxG motifs found in the first and second transmembrane segments of Tim23 are necessary for the structural integrity of the TIM23 complex. PMID:25765297
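
    A GxxxG motif is two glycines separated by any three residues. As a hedged illustration of how such motifs could be located in a sequence, the sketch below uses a lookahead regular expression so that overlapping occurrences are reported. The example sequence is made up and is not the Tim23 sequence; the function name is hypothetical.

```python
import re

# Lookahead with a capture group so that overlapping GxxxG hits are all found.
GXXXG = re.compile(r"(?=(G...G))")

def find_gxxxg(sequence):
    """Return (1-based start position, matched five-residue window) for every GxxxG."""
    return [(m.start() + 1, m.group(1)) for m in GXXXG.finditer(sequence.upper())]

# Invented transmembrane-like stretch, for illustration only.
hits = find_gxxxg("ALGAVLGIAAGLL")
```

    Here the second hit starts at the glycine that ends the first, which a plain non-overlapping search would miss.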

  2. QuateXelero: An Accelerated Exact Network Motif Detection Algorithm

    PubMed Central

    Khakabimamaghani, Sahand; Sharafuddin, Iman; Dichter, Norbert; Koch, Ina; Masoudi-Nejad, Ali

    2013-01-01

    Finding motifs in biological, social, technological, and other types of networks has become a widespread method to gain more knowledge about these networks’ structure and function. However, this task is very computationally demanding, because it is closely tied to graph isomorphism, which is an NP problem (not yet known to be in P or to be NP-complete). Accordingly, this research seeks to reduce the number of calls to the NAUTY isomorphism detection method, which is the most time-consuming step in many existing algorithms. The work provides an extremely fast motif detection algorithm called QuateXelero, which has a Quaternary Tree data structure at its heart. The proposed algorithm is based on the well-known ESU (FANMOD) motif detection algorithm. The results of experiments on some standard model networks confirm the overall superiority of the proposed algorithm, QuateXelero, compared with two of the fastest existing algorithms, G-Tries and Kavosh. QuateXelero is especially fast in constructing the central data structure of the algorithm from scratch based on the input network. PMID:23874498
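
    For orientation, the ESU enumeration step on which the abstract says QuateXelero is based can be sketched as below. This is plain ESU for undirected graphs as commonly described, not QuateXelero itself: it contains no quaternary tree and makes no NAUTY calls, and it assumes integer (comparable) node labels and an adjacency dict of sets. All names are illustrative.

```python
def esu(adj, k):
    """Enumerate every connected k-node vertex set of an undirected graph exactly once."""
    found = []

    def extend(sub, ext, root):
        if len(sub) == k:
            found.append(frozenset(sub))
            return
        ext = set(ext)  # work on a copy so the caller's set is untouched
        while ext:
            w = ext.pop()
            # Nodes already in the subgraph or adjacent to it are not "exclusive" to w.
            closed = sub | {u for s in sub for u in adj[s]}
            # Add only exclusive neighbours of w whose label exceeds the root's.
            new_ext = ext | {u for u in adj[w] if u > root and u not in closed}
            extend(sub | {w}, new_ext, root)

    for v in adj:
        extend({v}, {u for u in adj[v] if u > v}, v)
    return found

# Path graph 0-1-2-3 has two connected 3-node subgraphs; the complete graph K4 has four.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
path_subgraphs = esu(path, 3)
k4_subgraphs = esu(k4, 3)
```

    Motif detection would then group the enumerated subgraphs into isomorphism classes, which is the costly step the abstract aims to reduce.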

  3. An update on cell surface proteins containing extensin-motifs.

    PubMed

    Borassi, Cecilia; Sede, Ana R; Mecchia, Martin A; Salgado Salter, Juan D; Marzol, Eliana; Muschietti, Jorge P; Estevez, Jose M

    2016-01-01

    In recent years it has become clear that there are several molecular links that interconnect the plant cell surface continuum, which is highly important in many biological processes such as plant growth, development, and interaction with the environment. The plant cell surface continuum can be defined as the space that contains and interlinks the cell wall, plasma membrane and cytoskeleton compartments. In this review, we provide an updated view of cell surface proteins that include modular domains with an extensin (EXT)-motif followed by a cytoplasmic kinase-like domain, known as PERKs (for proline-rich extensin-like receptor kinases); with an EXT-motif and an actin binding domain, known as formins; and with extracellular hybrid-EXTs. We focus our attention on the EXT-motifs with the short sequence Ser-Pro(3-5), which is found in several different protein contexts within the same extracellular space, highlighting a putative conserved structural and functional role. A closer understanding of the dynamic regulation of plant cell surface continuum and its relationship with the downstream signalling cascade is a crucial forthcoming challenge. PMID:26475923

  4. QuateXelero: an accelerated exact network motif detection algorithm.

    PubMed

    Khakabimamaghani, Sahand; Sharafuddin, Iman; Dichter, Norbert; Koch, Ina; Masoudi-Nejad, Ali

    2013-01-01

    Finding motifs in biological, social, technological, and other types of networks has become a widespread method to gain more knowledge about these networks' structure and function. However, this task is very computationally demanding, because it is closely tied to graph isomorphism, which is an NP problem (not yet known to be in P or to be NP-complete). Accordingly, this research seeks to reduce the number of calls to the NAUTY isomorphism detection method, which is the most time-consuming step in many existing algorithms. The work provides an extremely fast motif detection algorithm called QuateXelero, which has a Quaternary Tree data structure at its heart. The proposed algorithm is based on the well-known ESU (FANMOD) motif detection algorithm. The results of experiments on some standard model networks confirm the overall superiority of the proposed algorithm, QuateXelero, compared with two of the fastest existing algorithms, G-Tries and Kavosh. QuateXelero is especially fast in constructing the central data structure of the algorithm from scratch based on the input network. PMID:23874498

  5. The mammalian heterochromatin protein 1 binds diverse nuclear proteins through a common motif that targets the chromoshadow domain

    SciTech Connect

    Lechner, Mark S. (E-mail: msl27@drexel.edu); Schultz, David C.; Negorev, Dmitri; Maul, Gerd G.; Rauscher, Frank J.

    2005-06-17

    The HP1 proteins regulate epigenetic gene silencing by promoting and maintaining chromatin condensation. The HP1 chromodomain binds to methylated histone H3. More enigmatic is the chromoshadow domain (CSD), which mediates dimerization, transcription repression, and interaction with multiple nuclear proteins. Here we show that KAP-1, CAF-1 p150, and NIPBL carry a canonical amino acid motif, PxVxL, which binds directly to the CSD with high affinity. We also define a new class of variant PxVxL CSD-binding motifs in Sp100A, LBR, and ATRX. Both canonical and variant motifs recognize a similar surface of the CSD dimer, as demonstrated by a panel of CSD mutants. These in vitro binding results were confirmed by the analysis of polypeptides found associated with nuclear HP1 complexes, and we provide the first evidence of the NIPBL/delangin protein in human cells, a protein recently implicated in the developmental disorder Cornelia de Lange syndrome. NIPBL is related to Nipped-B, a factor participating in gene activation by remote enhancers in Drosophila melanogaster. Thus, this spectrum of direct binding partners suggests an expanded role for HP1 as a factor participating in promoter-enhancer communication, chromatin remodeling/assembly, and sub-nuclear compartmentalization.

  6. A conserved secondary structural motif in 23S rRNA defines the site of interaction of amicetin, a universal inhibitor of peptide bond formation.

    PubMed Central

    Leviev, I G; Rodriguez-Fonseca, C; Phan, H; Garrett, R A; Heilek, G; Noller, H F; Mankin, A S

    1994-01-01

    The binding site and probable site of action have been determined for the universal antibiotic amicetin which inhibits peptide bond formation. Evidence from in vivo mutants, site-directed mutations and chemical footprinting all implicate a highly conserved motif in the secondary structure of the 23S-like rRNA close to the central circle of domain V. We infer that this motif lies at, or close to, the catalytic site in the peptidyl transfer centre. The binding site of amicetin is the first of a group of functionally related hexose-cytosine inhibitors to be localized on the ribosome. PMID:8157007

  7. An Isoprenylation and Palmitoylation Motif Promotes Intraluminal Vesicle Delivery of Proteins in Cells from Distant Species

    PubMed Central

    Oeste, Clara L.; Pinar, Mario; Schink, Kay O.; Martínez-Turrión, Javier; Stenmark, Harald; Peñalva, Miguel A.; Pérez-Sala, Dolores

    2014-01-01

    The C-terminal ends of small GTPases contain hypervariable sequences which may be posttranslationally modified by defined lipid moieties. The diverse structural motifs generated direct proteins towards specific cellular membranes or organelles. However, knowledge on the factors that determine these selective associations is limited. Here we show, using advanced microscopy, that the isoprenylation and palmitoylation motif of human RhoB (–CINCCKVL) targets chimeric proteins to intraluminal vesicles of endolysosomes in human cells, displaying preferential co-localization with components of the late endocytic pathway. Moreover, this distribution is conserved in distant species, including cells from amphibians, insects and fungi. Blocking lipidic modifications results in accumulation of CINCCKVL chimeras in the cytosol, from where they can reach endolysosomes upon release of this block. Remarkably, CINCCKVL constructs are sorted to intraluminal vesicles in a cholesterol-dependent process. In the lower species, neither the C-terminal sequence of RhoB, nor the endosomal distribution of its homologs are conserved; in spite of this, CINCCKVL constructs also reach endolysosomes in Xenopus laevis and insect cells. Strikingly, this behavior is prominent in the filamentous ascomycete fungus Aspergillus nidulans, in which GFP-CINCCKVL is sorted into endosomes and vacuoles in a lipidation-dependent manner and allows monitoring endosomal movement in live fungi. In summary, the isoprenylated and palmitoylated CINCCKVL sequence constitutes a specific structure which delineates an endolysosomal sorting strategy operative in phylogenetically diverse organisms. PMID:25207810

  8. Spontaneous cortical activity alternates between motifs defined by regional axonal projections

    PubMed Central

    Mohajerani, Majid H.; Chan, Allen W.; Mohsenvand, Mostafa; LeDue, Jeffrey; Liu, Rui; McVea, David A.; Boyd, Jamie D.; Wang, Yu Tian; Reimers, Mark; Murphy, Timothy H.

    2014-01-01

    In lightly anaesthetized or awake adult mice using millisecond timescale voltage sensitive dye imaging, we show that a palette of sensory-evoked and hemisphere-wide activity motifs are represented in spontaneous activity. These motifs can reflect multiple modes of sensory processing including vision, audition, and touch. Similar cortical networks were found with direct cortical activation using channelrhodopsin-2. Regional analysis of activity spread indicated modality specific sources such as primary sensory areas, and a common posterior-medial cortical sink where sensory activity was extinguished within the parietal association area, and a secondary anterior medial sink within the cingulate/secondary motor cortices for visual stimuli. Correlation analysis between functional circuits and intracortical axonal projections indicated a common framework corresponding to long-range mono-synaptic connections between cortical regions. Maps of intracortical mono-synaptic structural connections predicted hemisphere-wide patterns of spontaneous and sensory-evoked depolarization. We suggest that an intracortical monosynaptic connectome shapes the ebb and flow of spontaneous cortical activity. PMID:23974708

  9. A Catalytically Essential Motif in External Loop 5 of the Bacterial Oligosaccharyltransferase PglB

    PubMed Central

    Lizak, Christian; Gerber, Sabina; Zinne, Daria; Michaud, Gaëlle; Schubert, Mario; Chen, Fan; Bucher, Monika; Darbre, Tamis; Zenobi, Renato; Reymond, Jean-Louis; Locher, Kaspar P.

    2014-01-01

    Asparagine-linked glycosylation is a post-translational protein modification that is conserved in all domains of life. The initial transfer of a lipid-linked oligosaccharide (LLO) onto acceptor asparagines is catalyzed by the integral membrane protein oligosaccharyltransferase (OST). The previously reported structure of a single-subunit OST enzyme, the Campylobacter lari protein PglB, revealed a partially disordered external loop (EL5), whose role in catalysis was unclear. We identified a new and functionally important sequence motif in EL5 containing a conserved tyrosine residue (Tyr293) whose aromatic side chain is essential for catalysis. A synthetic peptide containing the conserved motif can partially but specifically rescue in vitro activity of mutated PglB lacking Tyr293. Using site-directed disulfide cross-linking, we show that disengagement of the structurally ordered part of EL5 is an essential step of the glycosylation reaction, probably by allowing sequon binding or glyco-product release. Our findings define two distinct mechanistic roles of EL5 in OST-catalyzed glycosylation. These functions, exerted by the two halves of EL5, are independent, because the loop can be cleaved by specific proteolysis with only slight reduction in activity. PMID:24275651

  10. A catalytically essential motif in external loop 5 of the bacterial oligosaccharyltransferase PglB.

    PubMed

    Lizak, Christian; Gerber, Sabina; Zinne, Daria; Michaud, Gaëlle; Schubert, Mario; Chen, Fan; Bucher, Monika; Darbre, Tamis; Zenobi, Renato; Reymond, Jean-Louis; Locher, Kaspar P

    2014-01-10

    Asparagine-linked glycosylation is a post-translational protein modification that is conserved in all domains of life. The initial transfer of a lipid-linked oligosaccharide (LLO) onto acceptor asparagines is catalyzed by the integral membrane protein oligosaccharyltransferase (OST). The previously reported structure of a single-subunit OST enzyme, the Campylobacter lari protein PglB, revealed a partially disordered external loop (EL5), whose role in catalysis was unclear. We identified a new and functionally important sequence motif in EL5 containing a conserved tyrosine residue (Tyr293) whose aromatic side chain is essential for catalysis. A synthetic peptide containing the conserved motif can partially but specifically rescue in vitro activity of mutated PglB lacking Tyr293. Using site-directed disulfide cross-linking, we show that disengagement of the structurally ordered part of EL5 is an essential step of the glycosylation reaction, probably by allowing sequon binding or glyco-product release. Our findings define two distinct mechanistic roles of EL5 in OST-catalyzed glycosylation. These functions, exerted by the two halves of EL5, are independent, because the loop can be cleaved by specific proteolysis with only slight reduction in activity. PMID:24275651

  11. Kalata B8, a novel antiviral circular protein, exhibits conformational flexibility in the cystine knot motif.

    PubMed

    Daly, Norelle L; Clark, Richard J; Plan, Manuel R; Craik, David J

    2006-02-01

    The cyclotides are a family of circular proteins with a range of biological activities and potential pharmaceutical and agricultural applications. The biosynthetic mechanism of cyclization is unknown and the discovery of novel sequences may assist in achieving this goal. In the present study, we have isolated a new cyclotide from Oldenlandia affinis, kalata B8, which appears to be a hybrid of the two major subfamilies (Möbius and bracelet) of currently known cyclotides. We have determined the three-dimensional structure of kalata B8 and observed broadening of resonances directly involved in the cystine knot motif, suggesting flexibility in this region despite it being the core structural element of the cyclotides. The cystine knot motif is widespread throughout Nature and inherently stable, making this apparent flexibility a surprising result. Furthermore, there appears to be isomerization of the peptide backbone at an Asp-Gly sequence in the region involved in the cyclization process. Interestingly, such isomerization has been previously characterized in related cyclic knottins from Momordica cochinchinensis that have no sequence similarity to kalata B8 apart from the six conserved cysteine residues and may result from a common mechanism of cyclization. Kalata B8 also provides insight into the structure-activity relationships of cyclotides as it displays anti-HIV activity but lacks haemolytic activity. The 'uncoupling' of these two activities has not previously been observed for the cyclotides and may be related to the unusual hydrophilic nature of the peptide. PMID:16207177

  12. Spontaneous cortical activity alternates between motifs defined by regional axonal projections.

    PubMed

    Mohajerani, Majid H; Chan, Allen W; Mohsenvand, Mostafa; LeDue, Jeffrey; Liu, Rui; McVea, David A; Boyd, Jamie D; Wang, Yu Tian; Reimers, Mark; Murphy, Timothy H

    2013-10-01

    Using millisecond-timescale voltage-sensitive dye imaging in lightly anesthetized or awake adult mice, we show that a palette of sensory-evoked and hemisphere-wide activity motifs are represented in spontaneous activity. These motifs can reflect multiple modes of sensory processing, including vision, audition and touch. We found similar cortical networks with direct cortical activation using channelrhodopsin-2. Regional analysis of activity spread indicated modality-specific sources, such as primary sensory areas, a common posterior-medial cortical sink where sensory activity was extinguished within the parietal association area and a secondary anterior medial sink within the cingulate and secondary motor cortices for visual stimuli. Correlation analysis between functional circuits and intracortical axonal projections indicated a common framework corresponding to long-range monosynaptic connections between cortical regions. Maps of intracortical monosynaptic structural connections predicted hemisphere-wide patterns of spontaneous and sensory-evoked depolarization. We suggest that an intracortical monosynaptic connectome shapes the ebb and flow of spontaneous cortical activity. PMID:23974708

  13. Mechanism and a peptide motif for targeting peripheral proteins to the yeast inner nuclear membrane

    PubMed Central

    Lai, Tsung-Po; Stauffer, Karen A.; Murthi, Athulaprabha; Shaheen, Hussam H.; Peng, Gang; Martin, Nancy C.; Hopper, Anita K.

    2009-01-01

    Trm1 is a tRNA-specific m2,2G methyltransferase shared by nuclei and mitochondria in Saccharomyces cerevisiae. In nuclei Trm1 is peripherally associated with the inner nuclear membrane (INM). We investigated the mechanism delivering/tethering Trm1 to the INM. Analyses of mutations of the Ran pathway and nuclear pore components showed that Trm1 accesses the nucleoplasm via the classical nuclear import pathway. We identified a Trm1 cis-acting sequence sufficient to target passenger proteins to the INM. Detailed mutagenesis of this region uncovered specific amino acids necessary for authentic Trm1 to locate at the INM. The INM information is contained within a sequence of <20 amino acids, defining the first motif for addressing a peripheral protein to this important subnuclear location. The combined studies provide a multi-step process to direct Trm1 to the INM: (1) translation in the cytoplasm; (2) Ran-dependent import into the nucleoplasm; and (3) redistribution from the nucleoplasm to the INM via the INM motif. Furthermore, we demonstrate that the Trm1 mitochondrial targeting and nuclear localization signals are in competition with each other, as Trm1 becomes mitochondrial if prevented from entering the nucleus. PMID:19602197

  14. Structure of a (Cys3His) zinc ribbon, a ubiquitous motif in archaeal and eucaryal transcription.

    PubMed

    Chen, H T; Legault, P; Glushka, J; Omichinski, J G; Scott, R A

    2000-09-01

    Transcription factor IIB (TFIIB) is an essential component in the formation of the transcription initiation complex in eucaryal and archaeal transcription. TFIIB interacts with a promoter complex containing the TATA-binding protein (TBP) to facilitate interaction with RNA polymerase II (RNA pol II) and the associated transcription factor IIF (TFIIF). TFIIB contains a zinc-binding motif near the N-terminus that is directly involved in the interaction with RNA pol II/TFIIF and plays a crucial role in selecting the transcription initiation site. The solution structure of the N-terminal residues 2-59 of human TFIIB was determined by multidimensional NMR spectroscopy. The structure consists of a nearly tetrahedral Zn(Cys)3(His)1 site confined by type I and "rubredoxin" turns, three antiparallel beta-strands, and disordered loops. The structure is similar to the reported zinc-ribbon motifs in several transcription-related proteins from archaea and eucarya, including Pyrococcus furiosus transcription factor B (PfTFB), human and yeast transcription factor IIS (TFIIS), and Thermococcus celer RNA polymerase II subunit M (TcRPOM). The zinc-ribbon structure of TFIIB, in conjunction with the biochemical analyses, suggests that residues on the beta-sheet are involved in the interaction with RNA pol II/TFIIF, while the zinc-binding site may increase the stability of the beta-sheet. PMID:11045620

  15. Structure of a (Cys3His) zinc ribbon, a ubiquitous motif in archaeal and eucaryal transcription.

    PubMed Central

    Chen, H. T.; Legault, P.; Glushka, J.; Omichinski, J. G.; Scott, R. A.

    2000-01-01

    Transcription factor IIB (TFIIB) is an essential component in the formation of the transcription initiation complex in eucaryal and archaeal transcription. TFIIB interacts with a promoter complex containing the TATA-binding protein (TBP) to facilitate interaction with RNA polymerase II (RNA pol II) and the associated transcription factor IIF (TFIIF). TFIIB contains a zinc-binding motif near the N-terminus that is directly involved in the interaction with RNA pol II/TFIIF and plays a crucial role in selecting the transcription initiation site. The solution structure of the N-terminal residues 2-59 of human TFIIB was determined by multidimensional NMR spectroscopy. The structure consists of a nearly tetrahedral Zn(Cys)3(His)1 site confined by type I and "rubredoxin" turns, three antiparallel beta-strands, and disordered loops. The structure is similar to the reported zinc-ribbon motifs in several transcription-related proteins from archaea and eucarya, including Pyrococcus furiosus transcription factor B (PfTFB), human and yeast transcription factor IIS (TFIIS), and Thermococcus celer RNA polymerase II subunit M (TcRPOM). The zinc-ribbon structure of TFIIB, in conjunction with the biochemical analyses, suggests that residues on the beta-sheet are involved in the interaction with RNA pol II/TFIIF, while the zinc-binding site may increase the stability of the beta-sheet. PMID:11045620

  16. Mutation Conferring Apical-Targeting Motif on AE1 Exchanger Causes Autosomal Dominant Distal RTA

    PubMed Central

    Fry, Andrew C.; Su, Ya; Yiu, Vivian; Cuthbert, Alan W.; Trachtman, Howard

    2012-01-01

    Mutations in SLC4A1 that mislocalize its product, the chloride/bicarbonate exchanger AE1, away from its normal position on the basolateral membrane of the α-intercalated cell cause autosomal dominant distal renal tubular acidosis (dRTA). We studied a family exhibiting dominant inheritance and defined a mutation (AE1-M909T) that affects the C terminus of AE1, a region rich in potential targeting motifs that are incompletely characterized. Expression of AE1-M909T in Xenopus oocytes confirmed preservation of its anion exchange function. Wild-type GFP-tagged AE1 localized to the basolateral membrane of polarized MDCK cells, but AE1-M909T localized to both the apical and basolateral membranes. Wild-type AE1 trafficked directly to the basolateral membrane without apical passage, whereas AE1-M909T trafficked to both cell surfaces, implying the gain of an apical-targeting signal. We found that AE1-M909T acquired class 1 PDZ ligand activity that the wild type did not possess. In summary, the AE1-M909T mutation illustrates the role of abnormal targeting in dRTA and provides insight into C-terminal motifs that govern normal trafficking of AE1. PMID:22518001

  17. Mutation conferring apical-targeting motif on AE1 exchanger causes autosomal dominant distal RTA.

    PubMed

    Fry, Andrew C; Su, Ya; Yiu, Vivian; Cuthbert, Alan W; Trachtman, Howard; Karet Frankl, Fiona E

    2012-07-01

    Mutations in SLC4A1 that mislocalize its product, the chloride/bicarbonate exchanger AE1, away from its normal position on the basolateral membrane of the α-intercalated cell cause autosomal dominant distal renal tubular acidosis (dRTA). We studied a family exhibiting dominant inheritance and defined a mutation (AE1-M909T) that affects the C terminus of AE1, a region rich in potential targeting motifs that are incompletely characterized. Expression of AE1-M909T in Xenopus oocytes confirmed preservation of its anion exchange function. Wild-type GFP-tagged AE1 localized to the basolateral membrane of polarized MDCK cells, but AE1-M909T localized to both the apical and basolateral membranes. Wild-type AE1 trafficked directly to the basolateral membrane without apical passage, whereas AE1-M909T trafficked to both cell surfaces, implying the gain of an apical-targeting signal. We found that AE1-M909T acquired class 1 PDZ ligand activity that the wild type did not possess. In summary, the AE1-M909T mutation illustrates the role of abnormal targeting in dRTA and provides insight into C-terminal motifs that govern normal trafficking of AE1. PMID:22518001

  18. Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools.

    PubMed

    Cer, Regina Z; Donohue, Duncan E; Mudunuri, Uma S; Temiz, Nuri A; Loss, Michael A; Starner, Nathan J; Halusa, Goran N; Volfovsky, Natalia; Yi, Ming; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M

    2013-01-01

    The non-B DB, available at http://nonb.abcc.ncifcrf.gov, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance. PMID:23125372

  19. Structural analysis of cysteine S-nitrosylation: a modified acid-based motif and the emerging role of trans-nitrosylation

    PubMed Central

    Marino, Stefano M.; Gladyshev, Vadim N.

    2009-01-01

    S-nitrosylation, the selective and reversible addition of a nitric oxide (NO) moiety to cysteine (Cys) sulfur in proteins, regulates numerous cellular processes. In recent years, proteomic approaches have been developed that are capable of identifying nitrosylated Cys residues. However, the features underlying specificity of Cys modification with NO remain poorly defined. Previous studies suggested that S-nitrosylated Cys may be flanked by an acid-base motif or hydrophobic areas, and show high reactivity, low pKa and high sulfur atom exposure. In the current study, we prepared an extensive, manually curated dataset of proteins with S-nitrosothiols, accounting for a variety of biochemical functions, organisms of origin and physiological responses to NO. Analysis of this generic NO-Cys dataset revealed that a proximal acid-base motif, Cys pKa, sulfur atom exposure, Cys conservation or hydrophobicity in the vicinity of the modified Cys do not define the specificity of S-nitrosylation. Instead, this analysis revealed a revised acid-base motif, which is located farther from the Cys and has its charged groups exposed. We hypothesize that, rather than being strictly employed for direct activation of Cys, the modified acid-base motif is engaged in protein-protein interactions, thereby contributing to trans-nitrosylation as an important and widespread mechanism for reversible modification of Cys with an NO moiety. For proteins lacking the revised motif, we discuss alternative mechanisms including a potential role of nitrosoglutathione as a transacting agent. PMID:19854201

  20. The C-terminal portion of the cleaved HT motif is necessary and sufficient to mediate export of proteins from the malaria parasite into its host cell

    PubMed Central

    Tarr, Sarah J; Cryar, Adam; Thalassinos, Konstantinos; Haldar, Kasturi; Osborne, Andrew R

    2013-01-01

    The malaria parasite exports proteins across its plasma membrane and a surrounding parasitophorous vacuole membrane, into its host erythrocyte. Most exported proteins contain a Host Targeting motif (HT motif) that targets them for export. In the parasite secretory pathway, the HT motif is cleaved by the protease plasmepsin V, but the role of the newly generated N-terminal sequence in protein export is unclear. Using a model protein that is cleaved by an exogenous viral protease, we show that the new N-terminal sequence, normally generated by plasmepsin V cleavage, is sufficient to target a protein for export, and that cleavage by plasmepsin V is not coupled directly to the transfer of a protein to the next component in the export pathway. Mutations of the fourth and fifth positions of the HT motif, as well as of amino acids further downstream, block or affect the efficiency of protein export, indicating that this region is necessary for efficient export. We also show that the fifth position of the HT motif is important for plasmepsin V cleavage. Our results indicate that plasmepsin V cleavage is required to generate a new N-terminal sequence that is necessary and sufficient to mediate protein export by the malaria parasite. PMID:23279267

  1. Multiple Weak Linear Motifs Enhance Recruitment and Processivity in SPOP-Mediated Substrate Ubiquitination.

    PubMed

    Pierce, Wendy K; Grace, Christy R; Lee, Jihun; Nourse, Amanda; Marzahn, Melissa R; Watson, Edmond R; High, Anthony A; Peng, Junmin; Schulman, Brenda A; Mittag, Tanja

    2016-03-27

    Primary sequence motifs, with millimolar affinities for binding partners, are abundant in disordered protein regions. In multivalent interactions, such weak linear motifs can cooperate to recruit binding partners via avidity effects. If linear motifs recruit modifying enzymes, optimal placement of weak motifs may regulate access to modification sites. Weak motifs may thus exert physiological relevance stronger than that suggested by their affinities, but molecular mechanisms of their function are still poorly understood. Herein, we use the N-terminal disordered region of the Hedgehog transcriptional regulator Gli3 (Gli3(1-90)) to determine the role of weak motifs encoded in its primary sequence for the recruitment of its ubiquitin ligase CRL3(SPOP) and the subsequent effect on ubiquitination efficiency. The substrate adaptor SPOP binds linear motifs through its MATH (meprin and TRAF homology) domain and forms higher-order oligomers through its oligomerization domains, rendering SPOP multivalent for its substrates. Gli3 has multiple weak SPOP binding motifs. We map three such motifs in Gli3(1-90), the weakest of which has a millimolar dissociation constant. Multivalency of ligase and substrate for each other facilitates enhanced ligase recruitment and stimulates Gli3(1-90) ubiquitination in in vitro ubiquitination assays. We speculate that the weak motifs enable processivity through avidity effects and by providing steric access to lysine residues that are otherwise not prioritized for polyubiquitination. Weak motifs may generally be employed in multivalent systems to act as gatekeepers regulating post-translational modification. PMID:26475525

  2. A Novel Alignment-Free Method for Comparing Transcription Factor Binding Site Motifs

    PubMed Central

    Xu, Minli; Su, Zhengchang

    2010-01-01

    Background Transcription factor binding site (TFBS) motifs can be accurately represented by position frequency matrices (PFM) or other equivalent forms. We often need to compare TFBS motifs using their PFMs in order to search for similar motifs in a motif database, or cluster motifs according to their binding preference. The majority of current methods for motif comparison involve a similarity metric for column-to-column comparison and a method to find the optimal position alignment between the two compared motifs. In some applications, alignment-free methods might be preferred; however, few such methods with high accuracy have been described. Methodology/Principal Findings Here we describe a novel alignment-free method for quantifying the similarity of motifs using their PFMs by converting PFMs into k-mer vectors. The motifs could then be compared by measuring the similarity among their corresponding k-mer vectors. Conclusions/Significance We demonstrate that our method in general achieves similar performance or outperforms the existing methods for clustering motifs according to their binding preference and identifying similar motifs of transcription factors of the same family. PMID:20098703
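    The k-mer idea in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say how the k-mer vectors are built or compared, so everything here is an assumption made for illustration: each k-mer is scored by its average probability over all length-k windows of the motif, vectors are compared by cosine similarity, and the names consensus_pfm, kmer_vector and cosine_similarity are hypothetical. It is not the authors' published procedure.

```python
from itertools import product
from math import sqrt

BASES = "ACGT"


def consensus_pfm(consensus, count=20):
    """Toy PFM (hypothetical helper): every site carries the consensus base."""
    return [{b: (count if b == base else 0) for b in BASES} for base in consensus]


def pfm_to_probabilities(pfm, pseudocount=0.5):
    """Turn each column of counts into probabilities (pseudocount is an assumption)."""
    columns = []
    for column in pfm:
        total = sum(column[b] + pseudocount for b in BASES)
        columns.append({b: (column[b] + pseudocount) / total for b in BASES})
    return columns


def kmer_vector(pfm, k=3):
    """Map a PFM to a dict of k-mer -> mean probability over all length-k windows."""
    probs = pfm_to_probabilities(pfm)
    windows = len(probs) - k + 1
    if windows < 1:
        raise ValueError("motif is shorter than k")
    vector = {}
    for kmer in map("".join, product(BASES, repeat=k)):
        total = 0.0
        for start in range(windows):
            p = 1.0
            for offset, base in enumerate(kmer):
                p *= probs[start + offset][base]
            total += p
        vector[kmer] = total / windows
    return vector


def cosine_similarity(u, v):
    """Cosine similarity of two k-mer vectors that share the same keys."""
    dot = sum(u[key] * v[key] for key in u)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


# Made-up motifs: b contains a with one extra flanking base on each side,
# c is unrelated. No column-by-column alignment is computed at any point.
motif_a = kmer_vector(consensus_pfm("TGACTCA"))
motif_b = kmer_vector(consensus_pfm("ATGACTCAT"))
motif_c = kmer_vector(consensus_pfm("CCGGAAGT"))
print(cosine_similarity(motif_a, motif_b), cosine_similarity(motif_a, motif_c))
```

    Because each vector is indexed by k-mer rather than by motif position, two motifs of different lengths or with shifted cores can be compared directly, which is the property that makes such a scheme alignment-free.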

  3. Motif-based analysis of large nucleotide data sets using MEME-ChIP

    PubMed Central

    Ma, Wenxiu; Noble, William S; Bailey, Timothy L

    2014-01-01

    MEME-ChIP is a web-based tool for analyzing motifs in large DNA or RNA data sets. It can analyze peak regions identified by ChIP-seq, cross-linking sites identified by CLIP-seq and related assays, as well as sets of genomic regions selected using other criteria. MEME-ChIP performs de novo motif discovery, motif enrichment analysis, motif location analysis and motif clustering, providing a comprehensive picture of the DNA or RNA motifs that are enriched in the input sequences. MEME-ChIP performs two complementary types of de novo motif discovery: weight matrix–based discovery for high accuracy; and word-based discovery for high sensitivity. Motif enrichment analysis using DNA or RNA motifs from human, mouse, worm, fly and other model organisms provides even greater sensitivity. MEME-ChIP’s interactive HTML output groups and aligns significant motifs to ease interpretation. This protocol takes less than 3 h, and it provides motif discovery approaches that are distinct and complementary to other online methods. PMID:24853928

  4. Agonist and antagonist switch DNA motifs recognized by human androgen receptor in prostate cancer

    PubMed Central

    Chen, Zhong; Lan, Xun; Thomas-Ahner, Jennifer M; Wu, Dayong; Liu, Xiangtao; Ye, Zhenqing; Wang, Liguo; Sunkel, Benjamin; Grenade, Cassandra; Chen, Junsheng; Zynger, Debra L; Yan, Pearlly S; Huang, Jiaoti; Nephew, Kenneth P; Huang, Tim H-M; Lin, Shili; Clinton, Steven K; Li, Wei; Jin, Victor X; Wang, Qianben

    2015-01-01

    Human transcription factors recognize specific DNA sequence motifs to regulate transcription. It is unknown whether a single transcription factor is able to bind to distinctly different motifs on chromatin, and if so, what determines the usage of specific motifs. By using a motif-resolution chromatin immunoprecipitation-exonuclease (ChIP-exo) approach, we find that agonist-liganded human androgen receptor (AR) and antagonist-liganded AR bind to two distinctly different motifs, leading to distinct transcriptional outcomes in prostate cancer cells. Further analysis on clinical prostate tissues reveals that the binding of AR to these two distinct motifs is involved in prostate carcinogenesis. Together, these results suggest that unique ligands may switch DNA motifs recognized by ligand-dependent transcription factors in vivo. Our findings also provide a broad mechanistic foundation for understanding ligand-specific induction of gene expression profiles. PMID:25535248

  5. Agonist and antagonist switch DNA motifs recognized by human androgen receptor in prostate cancer.

    PubMed

    Chen, Zhong; Lan, Xun; Thomas-Ahner, Jennifer M; Wu, Dayong; Liu, Xiangtao; Ye, Zhenqing; Wang, Liguo; Sunkel, Benjamin; Grenade, Cassandra; Chen, Junsheng; Zynger, Debra L; Yan, Pearlly S; Huang, Jiaoti; Nephew, Kenneth P; Huang, Tim H-M; Lin, Shili; Clinton, Steven K; Li, Wei; Jin, Victor X; Wang, Qianben

    2015-02-12

    Human transcription factors recognize specific DNA sequence motifs to regulate transcription. It is unknown whether a single transcription factor is able to bind to distinctly different motifs on chromatin, and if so, what determines the usage of specific motifs. By using a motif-resolution chromatin immunoprecipitation-exonuclease (ChIP-exo) approach, we find that agonist-liganded human androgen receptor (AR) and antagonist-liganded AR bind to two distinctly different motifs, leading to distinct transcriptional outcomes in prostate cancer cells. Further analysis on clinical prostate tissues reveals that the binding of AR to these two distinct motifs is involved in prostate carcinogenesis. Together, these results suggest that unique ligands may switch DNA motifs recognized by ligand-dependent transcription factors in vivo. Our findings also provide a broad mechanistic foundation for understanding ligand-specific induction of gene expression profiles. PMID:25535248

  6. CAGEd-oPOSSUM: motif enrichment analysis from CAGE-derived TSSs

    PubMed Central

    Arenillas, David J.; Forrest, Alistair R. R.; Kawaji, Hideya; Lassmann, Timo; Wasserman, Wyeth W.; Mathelier, Anthony

    2016-01-01

    With the emergence of large-scale Cap Analysis of Gene Expression (CAGE) datasets from individual labs and the FANTOM consortium, one can now analyze the cis-regulatory regions associated with gene transcription at an unprecedented level of refinement. By coupling transcription factor binding site (TFBS) enrichment analysis with CAGE-derived genomic regions, CAGEd-oPOSSUM can identify TFs that act as key regulators of genes involved in specific mammalian cell and tissue types. The webtool allows for the analysis of CAGE-derived transcription start sites (TSSs) either provided by the user or selected from ∼1300 mammalian samples from the FANTOM5 project with pre-computed TFBSs predicted with JASPAR TF binding profiles. The tool helps power insights into the regulation of genes through the study of the specific usage of TSSs within specific cell types and/or under specific conditions. Availability and Implementation: The CAGEd-oPOSSUM web tool is implemented in Perl, MySQL and Apache and is available at http://cagedop.cmmt.ubc.ca/CAGEd_oPOSSUM. Contacts: anthony.mathelier@ncmm.uio.no or wyeth@cmmt.ubc.ca Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27334471
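    The abstract above does not state which statistic the tool uses for TFBS enrichment. As a generic illustration only, the Python sketch below applies a one-sided hypergeometric (Fisher-type) over-representation test to counts of regions that contain at least one predicted site; the function name enrichment_p_value and the example counts are hypothetical and are not taken from the tool or its data.

```python
from math import comb


def enrichment_p_value(target_hits, target_total, background_hits, background_total):
    """One-sided hypergeometric test: P(X >= target_hits).

    target_hits / target_total: regions with a predicted site / all regions
    in the set of interest; background_* are the same counts for a
    disjoint background set.
    """
    population = target_total + background_total
    successes = target_hits + background_hits
    upper = min(target_total, successes)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so counts outside the distribution's support contribute nothing.
    tail = sum(
        comb(successes, k) * comb(population - successes, target_total - k)
        for k in range(target_hits, upper + 1)
    )
    return tail / comb(population, target_total)


# Made-up counts: 8 of 10 target regions carry a site versus 10 of 90
# background regions, compared with a case showing no enrichment.
p_enriched = enrichment_p_value(8, 10, 10, 90)
p_flat = enrichment_p_value(2, 10, 16, 90)
print(p_enriched, p_flat)
```

    A small p-value suggests that sites for the factor are over-represented in the regions of interest relative to the background; in practice the choice of background set strongly affects the result.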

  7. Mutations in the highly conserved GGQ motif of class 1 polypeptide release factors abolish ability of human eRF1 to trigger peptidyl-tRNA hydrolysis.

    PubMed Central

    Frolova, L Y; Tsivkovskii, R Y; Sivolobova, G F; Oparina, N Y; Serpinsky, O I; Blinov, V M; Tatkov, S I; Kisselev, L L

    1999-01-01

    Although the primary structures of class 1 polypeptide release factors (RF1 and RF2 in prokaryotes, eRF1 in eukaryotes) are known, the molecular basis by which they function in translational termination remains obscure. Because all class 1 RFs promote a stop-codon-dependent and ribosome-dependent hydrolysis of peptidyl-tRNAs, one may anticipate that this common function relies on a common structural motif(s). We have compared amino acid sequences of the available class 1 RFs and found a novel, common, unique, and strictly conserved GGQ motif that should be in a loop (coil) conformation as deduced by programs predicting protein secondary structure. Site-directed mutagenesis of the human eRF1 as a representative of class 1 RFs shows that substitution of both glycyl residues in this motif, G183 and G184, causes complete inactivation of the protein as a release factor toward all three stop codons, whereas two adjacent amino acid residues, G181 and R182, are functionally nonessential. Inactive human eRF1 mutants compete in release assays with wild-type eRF1 and strongly inhibit their release activity. Mutations of the glycyl residues in this motif do not affect another function, the ability of eRF1 together with the ribosome to induce GTPase activity of human eRF3, a class 2 RF. We assume that the novel highly conserved GGQ motif is implicated directly or indirectly in the activity of class 1 RFs in translation termination. PMID:10445876

  8. A functional Small Ubiquitin-like Modifier (SUMO) interacting motif (SIM) in the gibberellin hormone receptor GID1 is conserved in cereal crops and disrupting this motif does not abolish hormone dependency of the DELLA-GID1 interaction

    PubMed Central

    Nelis, Stuart; Conti, Lucio; Zhang, Cunjin; Sadanandom, Ari

    2015-01-01

    Plants survive adversity by modulating their growth in response to changing environmental signals. The phytohormone gibberellic acid (GA) plays a central role in regulating these adaptive responses by stimulating the degradation of growth-repressing DELLA proteins, which accumulate during stress. The current model for GA signaling describes how this hormone binds to its receptor GID1, thereby promoting association of GID1 with DELLA, which then undergoes ubiquitin-mediated proteasomal degradation. Recent data revealed that conjugation of DELLAs to the Small Ubiquitin-like Modifier (SUMO) protein enables plants to modulate their abundance during environmental stress. This is achieved by SUMOylated DELLAs sequestering GID1 via its SUMO interacting motif (SIM), allowing non-SUMOylated DELLAs to accumulate, leading to growth restraint under stress and potential yield loss. We demonstrate that GID1 proteins across the major cereal crops contain a functional SIM able to bind SUMO1. Site-directed mutagenesis and yeast two-hybrid experiments reveal that it is possible to disrupt the SIM-SUMO interaction motif without affecting the GA-dependent DELLA–GID1 interaction and thereby uncouple SUMO-mediated inhibition from DELLA degradation. Arabidopsis plants overexpressing a SIM mutant allele of GID1 perform better at relieving DELLA restraint than wild-type GID1. This evidence suggests that manipulating the SIM motif in the GA receptor may provide a possible route to developing stress-tolerant crop plants. PMID:25761145

  9. DNA nanotechnology based on i-motif structures.

    PubMed

    Dong, Yuanchen; Yang, Zhongqiang; Liu, Dongsheng

    2014-06-17

    CONSPECTUS: Most biological processes happen at the nanometer scale, and understanding the energy transformations and material transportation mechanisms within living organisms has proved challenging. To better understand the secrets of life, researchers have investigated artificial molecular motors and devices over the past decade because such systems can mimic certain biological processes. DNA nanotechnology based on i-motif structures is one system that has played an important role in these investigations. In this Account, we summarize recent advances in functional DNA nanotechnology based on i-motif structures. The i-motif is a DNA quadruplex that forms when four stretches of cytosine repeat sequences form C·CH(+) base pairs, and its stabilization requires slightly acidic conditions. This unique property has produced the first DNA molecular motor driven by pH changes. The motor is reliable, and studies show that it is capable of millisecond running speeds, comparable to the speed of natural protein motors. With careful design, the output of these types of motors was combined to drive the bending of micrometer-sized cantilevers. Using established DNA nanostructure assembly and functionalization methods, researchers can easily integrate the motor within other DNA assembled structures and functional units, producing DNA molecular devices with new functions such as suprahydrophobic/suprahydrophilic smart surfaces that switch, intelligent nanopores triggered by pH changes, molecular logic gates, and DNA nanosprings. Recently, researchers have produced motors driven by light and electricity, which have allowed DNA motors to be integrated within silicon-based nanodevices. Moreover, some devices based on i-motif structures have proven useful for investigating processes within living cells. The pH-responsiveness of the i-motif structure also provides a way to control the stepwise assembly of DNA nanostructures. In addition, because of the stability of the i-motif, this

  10. Genomic analysis of membrane protein families: abundance and conserved motifs

    PubMed Central

    Liu, Yang; Engelman, Donald M; Gerstein, Mark

    2002-01-01

    Background Polytopic membrane proteins can be related to each other on the basis of the number of transmembrane helices and sequence similarities. Building on the Pfam classification of protein domain families, and using transmembrane-helix prediction and sequence-similarity searching, we identified a total of 526 well-characterized membrane protein families in 26 recently sequenced genomes. To this we added a clustering of a number of predicted but unclassified membrane proteins, resulting in a total of 637 membrane protein families. Results Analysis of the occurrence and composition of these families revealed several interesting trends. The number of assigned membrane protein domains has an approximately linear relationship to the total number of open reading frames (ORFs) in the 26 genomes studied. Caenorhabditis elegans is an apparent outlier, because of its high representation of seven-span transmembrane (7-TM) chemoreceptor families. In all genomes, including that of C. elegans, the number of distinct membrane protein families has a logarithmic relation to the number of ORFs. Glycine, proline, and tyrosine locations tend to be conserved in transmembrane regions within families, whereas isoleucine, valine, and methionine locations are relatively mutable. Analysis of motifs in putative transmembrane helices reveals that GxxxG and GxxxxxxG (which can be written GG4 and GG7, respectively; see Materials and methods) are among the most prevalent. Although this was noted in earlier studies, we now find that these motifs are particularly well conserved within families, especially those corresponding to transporters, symporters, and channels. Conclusions We carried out a genome-wide analysis on patterns of the classified polytopic membrane protein families and analyzed the distribution of conserved amino acids and motifs in the transmembrane helix regions in these families. PMID:12372142
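    The GG4 and GG7 notation in the abstract above (a glycine followed by another glycine four or seven positions later) lends itself to a short illustration. The Python sketch below is only a simple pattern scan over one helix sequence, not the statistical analysis described in the abstract; the function name find_motifs and the example sequence are made up for illustration.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
MOTIFS = {
    "GG4": re.compile(r"(?=G.{3}G)"),  # GxxxG: glycines at positions i and i+4
    "GG7": re.compile(r"(?=G.{6}G)"),  # GxxxxxxG: glycines at positions i and i+7
}


def find_motifs(helix):
    """Return {motif name: [0-based start positions]} for one helix sequence."""
    return {
        name: [match.start() for match in pattern.finditer(helix)]
        for name, pattern in MOTIFS.items()
    }


# Made-up transmembrane segment used only to exercise the patterns.
example_helix = "LGAVLGIAGLLVG"
print(find_motifs(example_helix))
```

    Counting such hits across the predicted helices of every member of a family, and comparing positions between members, would be one way to begin asking how well a motif is conserved.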

  11. SURVEY AND SUMMARY: Unusual DNA duplex and hairpin motifs

    PubMed Central

    Chou, Shan-Ho; Chin, Ko-Hsin; Wang, Andrew H.-J.

    2003-01-01

    Single-stranded DNA or double-stranded DNA has the potential to adopt a wide variety of unusual duplex and hairpin motifs in the presence (trans) or absence (cis) of ligands. Several principles for the formation of those unusual structures have been established through the observation of a number of recurring structural motifs associated with different sequences. These include: (i) internal loops of consecutive mismatches can occur in a B-DNA duplex when sheared base pairs are adjacent to each other to confer extensive cross- and intra-strand base stacking; (ii) interdigitated (zipper-like) duplex structures form instead when sheared G·A base pairs a