Science.gov

Sample records for cis-regulatory motif directs

  1. Computational discovery of cis-regulatory modules in Drosophila without prior knowledge of motifs

    PubMed Central

    Ivan, Andra; Halfon, Marc S; Sinha, Saurabh

    2008-01-01

    We consider the problem of predicting cis-regulatory modules without knowledge of motifs. We formulate this problem in a pragmatic setting, and create over 30 new data sets, using Drosophila modules, to use as a 'benchmark'. We propose two new methods for the problem, and evaluate these, as well as two existing methods, on our benchmark. We find that the challenge of predicting cis-regulatory modules ab initio, without any input of relevant motifs, is a realizable goal. PMID:18226245

  2. Using hexamers to predict cis-regulatory motifs in Drosophila

    PubMed Central

    Chan, Bob Y; Kibler, Dennis

    2005-01-01

    Background: Cis-regulatory modules (CRMs) are short stretches of DNA that help regulate gene expression in higher eukaryotes. They have been found up to 1 megabase away from the genes they regulate and can be located upstream, downstream, and even within their target genes. Due to the difficulty of finding CRMs using biological and computational techniques, even well-studied regulatory systems may contain CRMs that have not yet been discovered. Results: We present a simple, efficient method (HexDiff) based only on hexamer frequencies of known CRMs and non-CRM sequence to predict novel CRMs in regulatory systems. On a data set of 16 gap and pair-rule genes containing 52 known CRMs, predictions made by HexDiff had a higher correlation with the known CRMs than several existing CRM prediction algorithms: Ahab, Cluster Buster, MSCAN, MCAST, and LWF. After combining the results of the different algorithms, 10 putative CRMs were identified and are strong candidates for future study. The hexamers used by HexDiff to distinguish between CRMs and non-CRM sequence were also analyzed and were shown to be enriched in regulatory elements. Conclusion: HexDiff provides an efficient and effective means for finding new CRMs based on known CRMs, rather than known binding sites. PMID:16253142
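
    The abstract above describes scoring sequence by hexamer frequencies in known CRMs versus non-CRM sequence. The following is a minimal sketch of that general idea only, not the published HexDiff algorithm: the function names, the pseudocount, the frequency-ratio score, and the threshold of 2.0 are all illustrative assumptions.

```python
from collections import Counter


def hexamer_counts(seqs, k=6):
    """Count every overlapping k-mer (k=6 by default) across a list of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def enrichment(crm_seqs, bg_seqs, k=6, pseudo=1.0):
    """Ratio of CRM frequency to background frequency for each k-mer.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for k-mers seen in only one of the two sets.
    """
    crm = hexamer_counts(crm_seqs, k)
    bg = hexamer_counts(bg_seqs, k)
    crm_total = sum(crm.values())
    bg_total = sum(bg.values())
    scores = {}
    for h in set(crm) | set(bg):
        f_crm = (crm[h] + pseudo) / (crm_total + pseudo)
        f_bg = (bg[h] + pseudo) / (bg_total + pseudo)
        scores[h] = f_crm / f_bg
    return scores


def score_window(window, scores, threshold=2.0, k=6):
    """Count k-mers in a window whose enrichment ratio meets the threshold."""
    window = window.upper()
    return sum(
        1
        for i in range(len(window) - k + 1)
        if scores.get(window[i:i + k], 0.0) >= threshold
    )
```

    In use, one would slide `score_window` along a locus and flag high-scoring windows as candidate CRMs; how the real method selects hexamers and sets cutoffs is not stated in the abstract.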

  3. On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

    NASA Astrophysics Data System (ADS)

    Tarpine, Ryan; Istrail, Sorin

    The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.

  4. Bacterial regulon modeling and prediction based on systematic cis regulatory motif analyses

    NASA Astrophysics Data System (ADS)

    Liu, Bingqiang; Zhou, Chuan; Li, Guojun; Zhang, Hanyuan; Zeng, Erliang; Liu, Qi; Ma, Qin

    2016-03-01

    Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying the bacterial global transcriptional regulation network. In this study, we designed a novel co-regulation score between a pair of operons based on accurate operon identification and cis regulatory motif analyses, which can capture their co-regulation relationship much better than other scores. Taking full advantage of this discovery, we developed a new computational framework and built a novel graph model for regulon prediction. This model integrates the motif comparison and clustering and makes the regulon prediction problem substantially more solvable and accurate. To evaluate our prediction, a regulon coverage score was designed based on the documented regulons and their overlap with our prediction; and a modified Fisher Exact test was implemented to measure how well our predictions match the co-expressed modules derived from E. coli microarray gene-expression datasets collected under 466 conditions. The results indicate that our program consistently performed better than others in terms of the prediction accuracy. This suggests that our algorithms substantially improve the state-of-the-art, leading to a computational capability to reliably predict regulons for any bacteria.
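
    The evaluation above compares predicted regulons with co-expressed modules using a modified Fisher Exact test. The abstract does not describe the modification, so the sketch below shows only the standard one-sided (hypergeometric) overlap test as a point of reference; the function and parameter names are hypothetical.

```python
from math import comb


def overlap_p_value(population, annotated, predicted, overlap):
    """One-sided Fisher/hypergeometric p-value: P(X >= overlap).

    population: total number of operons considered
    annotated:  operons in the reference (e.g. co-expressed) module
    predicted:  operons in the predicted regulon
    overlap:    operons shared by the two sets
    """
    total = comb(population, predicted)
    upper = min(annotated, predicted)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, predicted - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

    A small p-value indicates that the predicted set overlaps the reference module more than expected by chance.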

  5. Mutagenesis of GATA motifs controlling the endoderm regulator elt-2 reveals distinct dominant and secondary cis-regulatory elements.

    PubMed

    Du, Lawrence; Tracy, Sharon; Rifkin, Scott A

    2016-04-01

    Cis-regulatory elements (CREs) are crucial links in developmental gene regulatory networks, but in many cases, it can be difficult to discern whether similar CREs are functionally equivalent. We found that despite similar conservation and binding capability to upstream activators, different GATA cis-regulatory motifs within the promoter of the C. elegans endoderm regulator elt-2 play distinctive roles in activating and modulating gene expression throughout development. We fused wild-type and mutant versions of the elt-2 promoter to a gfp reporter and inserted these constructs as single copies into the C. elegans genome. We then counted early embryonic gfp transcripts using single-molecule RNA FISH (smFISH) and quantified gut GFP fluorescence. We determined that a single primary dominant GATA motif located 527bp upstream of the elt-2 start codon was necessary for both embryonic activation and later maintenance of transcription, while nearby secondary GATA motifs played largely subtle roles in modulating postembryonic levels of elt-2. Mutation of the primary activating site increased low-level spatiotemporally ectopic stochastic transcription, indicating that this site acts repressively in non-endoderm cells. Our results reveal that CREs with similar GATA factor binding affinities in close proximity can play very divergent context-dependent roles in regulating the expression of a developmentally critical gene in vivo.

  6. Evolution of New cis-Regulatory Motifs Required for Cell-Specific Gene Expression in Caenorhabditis

    PubMed Central

    Félix, Marie-Anne

    2016-01-01

    Patterning of C. elegans vulval cell fates relies on inductive signaling. In this induction event, a single cell, the gonadal anchor cell, secretes LIN-3/EGF and induces three out of six competent precursor cells to acquire a vulval fate. We previously showed that this developmental system is robust to a four-fold variation in lin-3/EGF genetic dose. Here, using single-molecule FISH, we find that the mean level of expression of lin-3 in the anchor cell is remarkably conserved. No change in lin-3 expression level could be detected among C. elegans wild isolates and only a low level of change (less than 30%) in the Caenorhabditis genus and in Oscheius tipulae. In C. elegans, lin-3 expression in the anchor cell is known to require three transcription factor binding sites, specifically two E-boxes and a nuclear-hormone-receptor (NHR) binding site. Mutation of any of these three elements in C. elegans results in a dramatic decrease in lin-3 expression. Yet only a single E-box is found in the Drosophilae supergroup of Caenorhabditis species, including C. angaria, while the NHR-binding site likely only evolved at the base of the Elegans group. We find that a transgene from C. angaria bearing a single E-box is sufficient for normal expression in C. elegans. Even a short 58 bp cis-regulatory fragment from C. angaria with this single E-box is able to replace the three transcription factor binding sites at the endogenous C. elegans lin-3 locus, resulting in the wild-type expression level. Thus, regulatory evolution occurring in cis within a 58 bp lin-3 fragment results in a strict requirement for the NHR binding site and a second E-box in C. elegans. This single-cell, single-molecule, quantitative and functional evo-devo study demonstrates that conserved expression levels can hide extensive change in cis-regulatory site requirements and highlights the evolution of new cis-regulatory elements required for cell-specific gene expression. PMID:27588814

  7. A cis-regulatory module activating transcription in the suspensor contains five cis-regulatory elements.

    PubMed

    Henry, Kelli F; Kawashima, Tomokazu; Goldberg, Robert B

    2015-06-01

    Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the Scarlet Runner Bean (Phaseolus coccineus) G564 gene in order to understand how genes are activated specifically in the suspensor during early embryo development. Previously, we showed that a 54-bp fragment of the G564 upstream region is sufficient for suspensor transcription and contains at least three required cis-regulatory sequences, including the 10-bp motif (5'-GAAAAGCGAA-3'), the 10 bp-like motif (5'-GAAAAACGAA-3'), and Region 2 motif (partial sequence 5'-TTGGT-3'). Here, we use site-directed mutagenesis experiments in transgenic tobacco globular-stage embryos to identify two additional cis-regulatory elements within the 54-bp cis-regulatory module that are required for G564 suspensor transcription: the Fifth motif (5'-GAGTTA-3') and a third 10-bp-related sequence (5'-GAAAACCACA-3'). Further deletion of the 54-bp fragment revealed that a 47-bp fragment containing the five motifs (the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs) is sufficient for suspensor transcription, and represents a cis-regulatory module. A consensus sequence for each type of motif was determined by comparing motif sequences shown to activate suspensor transcription. Phylogenetic analyses suggest that the regulation of G564 is evolutionarily conserved. A homologous cis-regulatory module was found upstream of the G564 ortholog in the Common Bean (Phaseolus vulgaris), indicating that the regulation of G564 is evolutionarily conserved in closely related bean species.
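
    The five motif sequences quoted in the abstract above can be located in a candidate sequence with a plain string scan. This is a minimal sketch, assuming forward-strand exact matches only; the dictionary keys and the function name are labels chosen here, and any test sequence passed in is the user's own, not the G564 upstream region.

```python
# Motif sequences as quoted in the abstract (Region 2 is a partial sequence).
MOTIFS = {
    "10-bp": "GAAAAGCGAA",
    "10-bp-like": "GAAAAACGAA",
    "10-bp-related": "GAAAACCACA",
    "Region 2 (partial)": "TTGGT",
    "Fifth": "GAGTTA",
}


def find_motifs(seq):
    """Return sorted (start, name) pairs for every exact forward-strand match."""
    seq = seq.upper()
    hits = []
    for name, motif in MOTIFS.items():
        start = seq.find(motif)
        while start != -1:
            hits.append((start, name))
            # Step by one so overlapping occurrences are also reported.
            start = seq.find(motif, start + 1)
    return sorted(hits)
```

    Exact matching is the simplest choice; the abstract mentions consensus sequences for each motif type, which would call for degenerate matching that is not attempted here.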

  8. Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation

    PubMed Central

    Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P. M.; Zhu, Xin-Guang

    2016-01-01

    Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5′UTR, 3′UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5′UTR, 3′UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. PMID:27436282

  9. Retinoic acid-induced down-regulation of the interleukin-2 promoter via cis-regulatory sequences containing an octamer motif.

    PubMed Central

    Felli, M P; Vacca, A; Meco, D; Screpanti, I; Farina, A R; Maroder, M; Martinotti, S; Petrangeli, E; Frati, L; Gulino, A

    1991-01-01

    Retinoic acid (RA) is known to influence the proliferation and differentiation of a wide variety of transformed and developing cells. We found that RA and the specific RA receptor (RAR) ligand Ch55 inhibited the phorbol ester and calcium ionophore-induced expression of the T-cell growth factor interleukin-2 (IL-2) gene. Expression of transiently transfected chloramphenicol acetyltransferase vectors containing the 5'-flanking region of the IL-2 gene was also inhibited by RA. RA-induced down-regulation of the IL-2 enhancer is mediated by RAR, since overexpression of transfected RARs increased RA sensitivity of the IL-2 promoter. Functional analysis of chloramphenicol acetyltransferase vectors containing either internal deletion mutants of the region from -317 to +47 bp of the IL-2 enhancer or multimerized cis-regulatory elements showed that the RA-responsive element in the IL-2 promoter mapped to sequences containing an octamer motif. RAR also inhibited the transcriptional activity of the octamer motif of the immunoglobulin heavy chain enhancer. In spite of the transcriptional inhibition of the IL-2 octamer motif, RA did not decrease the in vitro DNA-binding capability of octamer-1 protein. These results identify a regulatory pathway within the IL-2 promoter which involves the octamer motif and RAR. PMID:1652063

  10. Twine: display and analysis of cis-regulatory modules

    PubMed Central

    Pearson, Joseph C.; Crews, Stephen T.

    2013-01-01

    Summary: Many algorithms analyze enhancers for overrepresentation of known and novel motifs, with the goal of identifying binding sites for direct regulators of gene expression. Twine is a Java GUI with multiple graphical representations (‘Views’) of enhancer alignments that displays motifs, as IUPAC consensus sequences or position frequency matrices, in the context of phylogenetic conservation to facilitate cis-regulatory element discovery. Thresholds of phylogenetic conservation and motif stringency can be altered dynamically to facilitate detailed analysis of enhancer architecture. Views can be exported to vector graphics programs to generate high-quality figures for publication. Twine can be extended via Java plugins to manipulate alignments and analyze sequences. Availability: Twine is freely available as a compiled Java .jar package or Java source code at http://labs.bio.unc.edu/crews/twine/. Contact: steve_crews@unc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23658420
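
    The summary above mentions motifs expressed as IUPAC consensus sequences. As an illustration of what such a consensus encodes, the sketch below expands standard IUPAC nucleotide codes into a regular expression and reports matches. It is not Twine code (Twine is a Java GUI); the function names and the example consensus are assumptions made here.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[c] for c in consensus.upper())


def find_sites(consensus, seq):
    """Return (start, matched_text) for every match, including overlapping ones."""
    # The lookahead makes each match zero-width so overlapping sites are kept.
    pattern = re.compile("(?=(" + iupac_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]
```

    For example, the hypothetical consensus `WGATAR` expands to `[AT]GATA[AG]`. Position frequency matrices, the other representation named in the summary, carry per-base weights that a consensus string cannot express.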

  11. cis-Regulatory control circuits in development.

    PubMed

    Howard, Meredith L; Davidson, Eric H

    2004-07-01

    During development, an organism undergoes many rounds of pattern formation, generating ever-greater complexity with each ensuing round of cell division and specification. The instructions for executing this process are encoded in the cis-regulatory modules that direct the expression of developmental transcription factors and signaling molecules. Each transcription factor binding site within a cis-regulatory module contributes information about when, where, or how much a gene is turned on, and by dissecting the modules driving a given gene, all the inputs governing expression of the gene can be accurately identified. Furthermore, by mapping the output of each gene to the inputs of other genes, it is possible to reverse engineer developmental circuits and even whole networks. At this higher level of organization, common bilaterian strategies for specifying progenitor fields, locking down regulatory states, and driving development forward emerge.

  12. A cis-Regulatory Signature for Chordate Anterior Neuroectodermal Genes

    PubMed Central

    Christiaen, Lionel; Joly, Jean-Stéphane

    2010-01-01

    One of the striking findings of comparative developmental genetics was that expression patterns of core transcription factors are extraordinarily conserved in bilaterians. However, it remains unclear whether cis-regulatory elements of their target genes also exhibit common signatures associated with conserved embryonic fields. To address this question, we focused on genes that are active in the anterior neuroectoderm and non-neural ectoderm of the ascidian Ciona intestinalis. Following the dissection of a prototypic anterior placodal enhancer, we searched all genomic conserved non-coding elements for duplicated motifs around genes showing anterior neuroectodermal expression. Strikingly, we identified an over-represented pentamer motif corresponding to the binding site of the homeodomain protein OTX, which plays a pivotal role in the anterior development of all bilaterian species. Using an in vivo reporter gene assay, we observed that 10 of 23 candidate cis-regulatory elements containing duplicated OTX motifs are active in the anterior neuroectoderm, thus showing that this cis-regulatory signature is predictive of neuroectodermal enhancers. These results show that a common cis-regulatory signature corresponding to K50-Paired homeodomain transcription factors is found in non-coding sequences flanking anterior neuroectodermal genes in chordate embryos. Thus, field-specific selector genes impose architectural constraints in the form of combinations of short tags on their target enhancers. This could account for the strong evolutionary conservation of the regulatory elements controlling field-specific selector genes responsible for body plan formation. PMID:20419150

  13. The Role of cis Regulatory Evolution in Maize Domestication

    PubMed Central

    Lemmon, Zachary H.; Bukowski, Robert; Sun, Qi; Doebley, John F.

    2014-01-01

    Gene expression differences between divergent lineages caused by modification of cis regulatory elements are thought to be important in evolution. We assayed genome-wide cis and trans regulatory differences between maize and its wild progenitor, teosinte, using deep RNA sequencing in F1 hybrid and parent inbred lines for three tissue types (ear, leaf and stem). Pervasive regulatory variation was observed with approximately 70% of ∼17,000 genes showing evidence of regulatory divergence between maize and teosinte. However, many fewer genes (1,079 genes) show consistent cis differences with all sampled maize and teosinte lines. For ∼70% of these 1,079 genes, the cis differences are specific to a single tissue. The number of genes with cis regulatory differences is greatest for ear tissue, which underwent a drastic transformation in form during domestication. As expected from the domestication bottleneck, maize possesses less cis regulatory variation than teosinte with this deficit greatest for genes showing maize-teosinte cis regulatory divergence, suggesting selection on cis regulatory differences during domestication. Consistent with selection on cis regulatory elements, genes with cis effects correlated strongly with genes under positive selection during maize domestication and improvement, while genes with trans regulatory effects did not. We observed a directional bias such that genes with cis differences showed higher expression of the maize allele more often than the teosinte allele, suggesting domestication favored up-regulation of gene expression. Finally, this work documents the cis and trans regulatory changes between maize and teosinte in over 17,000 genes for three tissues. PMID:25375861

  14. Genomic approaches to finding cis-regulatory modules in animals

    PubMed Central

    Hardison, Ross C.; Taylor, James

    2012-01-01

    Differential gene expression is the fundamental mechanism underlying animal development and cell differentiation. However, it is a challenge to identify comprehensively and accurately the DNA sequences required to regulate gene expression, called cis-regulatory modules (CRMs). Three major features (singly or in combination) are used to predict CRMs: clusters of transcription-factor binding-site motifs, noncoding DNA under evolutionary constraint, and biochemical marks associated with CRMs, such as histone modifications and protein occupancy. The validation rates for predictions indicate that identifying diagnostic biochemical marks is the most reliable method, and understanding is enhanced by analysis of motifs and conservation patterns within those predicted CRMs. PMID:22705667

  15. Enhancer divergence and cis-regulatory evolution in the human and chimp neural crest.

    PubMed

    Prescott, Sara L; Srinivasan, Rajini; Marchetto, Maria Carolina; Grishina, Irina; Narvaiza, Iñigo; Selleri, Licia; Gage, Fred H; Swigut, Tomek; Wysocka, Joanna

    2015-09-24

    cis-regulatory changes play a central role in morphological divergence, yet the regulatory principles underlying emergence of human traits remain poorly understood. Here, we use epigenomic profiling from human and chimpanzee cranial neural crest cells to systematically and quantitatively annotate divergence of craniofacial cis-regulatory landscapes. Epigenomic divergence is often attributable to genetic variation within TF motifs at orthologous enhancers, with a novel motif being most predictive of activity biases. We explore properties of this cis-regulatory change, revealing the role of particular retroelements, uncovering broad clusters of species-biased enhancers near genes associated with human facial variation, and demonstrating that cis-regulatory divergence is linked to quantitative expression differences of crucial neural crest regulators. Our work provides a wealth of candidates for future evolutionary studies and demonstrates the value of "cellular anthropology," a strategy of using in-vitro-derived embryonic cell types to elucidate both fundamental and evolving mechanisms underlying morphological variation in higher primates.

  16. Cis-regulatory mutations in human disease

    PubMed Central

    2009-01-01

    Cis-acting regulatory sequences are required for the proper temporal and spatial control of gene expression. Variation in gene expression is highly heritable and a significant determinant of human disease susceptibility. The diversity of human genetic diseases attributed, in whole or in part, to mutations in non-coding regulatory sequences is on the rise. Improvements in genome-wide methods of associating genetic variation with human disease and predicting DNA with cis-regulatory potential are two of the major reasons for these recent advances. This review will highlight select examples from the literature that have successfully integrated genetic and genomic approaches to uncover the molecular basis by which cis-regulatory mutations alter gene expression and contribute to human disease. The fine mapping of disease-causing variants has led to the discovery of novel cis-acting regulatory elements that, in some instances, are located as far away as 1.5 Mb from the target gene. In other cases, the prior knowledge of the regulatory landscape surrounding the gene of interest aided in the selection of enhancers for mutation screening. The success of these studies should provide a framework for following up on the large number of genome-wide association studies that have identified common variants in non-coding regions of the genome that associate with increased risk of human diseases including diabetes, autism, Crohn's, colorectal cancer, and asthma, to name a few. PMID:19641089

  17. Cis-regulatory mutations in human disease.

    PubMed

    Epstein, Douglas J

    2009-07-01

    Cis-acting regulatory sequences are required for the proper temporal and spatial control of gene expression. Variation in gene expression is highly heritable and a significant determinant of human disease susceptibility. The diversity of human genetic diseases attributed, in whole or in part, to mutations in non-coding regulatory sequences is on the rise. Improvements in genome-wide methods of associating genetic variation with human disease and predicting DNA with cis-regulatory potential are two of the major reasons for these recent advances. This review will highlight select examples from the literature that have successfully integrated genetic and genomic approaches to uncover the molecular basis by which cis-regulatory mutations alter gene expression and contribute to human disease. The fine mapping of disease-causing variants has led to the discovery of novel cis-acting regulatory elements that, in some instances, are located as far away as 1.5 Mb from the target gene. In other cases, the prior knowledge of the regulatory landscape surrounding the gene of interest aided in the selection of enhancers for mutation screening. The success of these studies should provide a framework for following up on the large number of genome-wide association studies that have identified common variants in non-coding regions of the genome that associate with increased risk of human diseases including diabetes, autism, Crohn's, colorectal cancer, and asthma, to name a few.

  18. Computational discovery of soybean promoter cis-regulatory elements for the construction of soybean cyst nematode inducible synthetic promoters

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Computational methods offer great hope but limited accuracy in the prediction of functional cis-regulatory elements; improvements are needed to enable synthetic promoter design. We applied an ensemble strategy for de novo soybean cyst nematode (SCN)-inducible motif discovery among promoters of 18 co...

  19. Functional cis-regulatory modules encoded by mouse-specific endogenous retrovirus

    PubMed Central

    Sundaram, Vasavi; Choudhary, Mayank N. K.; Pehrsson, Erica; Xing, Xiaoyun; Fiore, Christopher; Pandey, Manishi; Maricque, Brett; Udawatta, Methma; Ngo, Duc; Chen, Yujie; Paguntalan, Asia; Ray, Tammy; Hughes, Ava; Cohen, Barak A.; Wang, Ting

    2017-01-01

    Cis-regulatory modules contain multiple transcription factor (TF)-binding sites and integrate the effects of each TF to control gene expression in specific cellular contexts. Transposable elements (TEs) are uniquely equipped to deposit their regulatory sequences across a genome, which could also contain cis-regulatory modules that coordinate the control of multiple genes with the same regulatory logic. We provide the first evidence of mouse-specific TEs that encode a module of TF-binding sites in mouse embryonic stem cells (ESCs). The majority (77%) of the individual TEs tested exhibited enhancer activity in mouse ESCs. By mutating individual TF-binding sites within the TE, we identified a module of TF-binding motifs that cooperatively enhanced gene expression. Interestingly, we also observed the same motif module in the in silico constructed ancestral TE that also acted cooperatively to enhance gene expression. Our results suggest that ancestral TE insertions might have brought in cis-regulatory modules into the mouse genome. PMID:28348391

  20. Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines

    PubMed Central

    Xu, Xing; Ji, Yongmei; Stormo, Gary D.

    2009-01-01

    An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our exploration of gene regulation mechanisms and regulatory networks. We present a new computational program named RSSVM (RNA Sampler+Support Vector Machine), which employs Support Vector Machines (SVMs) for efficient identification of functional RNA motifs from random RNA secondary structures. RSSVM uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate common RNA secondary structure prediction, and is trained with functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast, and efficient pipeline for large-scale discovery of regulatory RNA motifs. We applied RSSVM to multiple Shewanella genomes and identified putative regulatory RNA motifs in the 5′ untranslated regions (UTRs) in S. oneidensis, an important bacterial organism with extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. From 1002 sets of 5′-UTRs of orthologous operons, we identified 166 putative regulatory RNA motifs, including 17 of the 19 known RNA motifs from Rfam, an additional 21 RNA motifs that are supported by literature evidence, 72 RNA motifs overlapping predicted transcription terminators or attenuators, and other candidate regulatory RNA

  1. Validation of skeletal muscle cis-regulatory module predictions reveals nucleotide composition bias in functional enhancers.

    PubMed

    Kwon, Andrew T; Chou, Alice Yi; Arenillas, David J; Wasserman, Wyeth W

    2011-12-01

    We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs was validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences indicates that nucleotide sequence composition can be an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions.

  2. Global identification of the genetic networks and cis-regulatory elements of the cold response in zebrafish

    PubMed Central

    Hu, Peng; Liu, Mingli; Zhang, Dong; Wang, Jinfeng; Niu, Hongbo; Liu, Yimeng; Wu, Zhichao; Han, Bingshe; Zhai, Wanying; Shen, Yu; Chen, Liangbiao

    2015-01-01

    The transcriptional programs of ectothermic teleosts are directly influenced by water temperature. However, the cis- and trans-factors governing cold responses are not well characterized. We profiled transcriptional changes in eight zebrafish tissues exposed to mildly and severely cold temperatures using RNA-Seq. A total of 1943 differentially expressed genes (DEGs) were identified, from which 34 clusters representing distinct tissue and temperature response expression patterns were derived using the k-means fuzzy clustering algorithm. The promoter regions of the clustered DEGs that demonstrated strong co-regulation were analysed for enriched cis-regulatory elements with a motif discovery program, DREME. Seventeen motifs, ten known and seven novel, were identified, which covered 23% of the DEGs. Two motifs predicted to be the binding sites for the transcription factors Bcl6 and Jun, respectively, were chosen for experimental verification, and they demonstrated the expected cold-induced and cold-repressed patterns of gene regulation. Protein interaction modeling of the network components followed by experimental validation suggested that Jun physically interacts with Bcl6 and might be a hub factor that orchestrates the cold response in zebrafish. Thus, the methodology used and the regulatory networks uncovered in this study provide a foundation for exploring the mechanisms of cold adaptation in teleosts. PMID:26227973

  3. A Cis-Regulatory Map of the Drosophila Genome

    PubMed Central

    Nègre, Nicolas; Brown, Christopher D.; Ma, Lijia; Bristow, Christopher Aaron; Miller, Steven W.; Wagner, Ulrich; Kheradpour, Pouya; Eaton, Matthew L.; Loriaux, Paul; Sealfon, Rachel; Li, Zirong; Ishii, Haruhiko; Spokony, Rebecca F.; Chen, Jia; Hwang, Lindsay; Cheng, Chao; Auburn, Richard P.; Davis, Melissa B.; Domanus, Marc; Shah, Parantu K.; Morrison, Carolyn A.; Zieba, Jennifer; Suchy, Sarah; Senderowicz, Lionel; Victorsen, Alec; Bild, Nicholas A.; Grundstad, A. Jason; Hanley, David; MacAlpine, David M.; Mannervik, Mattias; Venken, Koen; Bellen, Hugo; White, Robert; Russell, Steven; Grossman, Robert L.; Ren, Bing; Gerstein, Mark; Posakony, James W.; Kellis, Manolis; White, Kevin P.

    2011-01-01

    Systematic annotation of gene regulatory elements is a major challenge in genome science. Direct mapping of chromatin modification marks and transcriptional factor binding sites genome-wide [1,2] has successfully identified specific subtypes of regulatory elements [3]. In Drosophila several pioneering studies have provided genome-wide identification of Polycomb-Response Elements [4], chromatin states [5], transcription factor binding sites (TFBS) [6–9], PolII regulation [8], and insulator elements [10]; however, comprehensive annotation of the regulatory genome remains a significant challenge. Here we describe results from the modENCODE cis-regulatory annotation project. We produced a map of the Drosophila melanogaster regulatory genome based on more than 300 chromatin immuno-precipitation (ChIP) datasets for eight chromatin features, five histone deacetylases (HDACs) and thirty-eight site-specific transcription factors (TFs) at different stages of development. Using these data we inferred more than 20,000 candidate regulatory elements and we validated a subset of predictions for promoters, enhancers, and insulators in vivo. We also identified nearly 2,000 genomic regions of dense TF binding associated with chromatin activity and accessibility. We discovered hundreds of new TF co-binding relationships and defined a TF network with over 800 potential regulatory relationships. PMID:21430782

  4. The molecular signature and cis-regulatory architecture of a C. elegans gustatory neuron

    PubMed Central

    Etchberger, John F.; Lorch, Adam; Sleumer, Monica C.; Zapf, Richard; Jones, Steven J.; Marra, Marco A.; Holt, Robert A.; Moerman, Donald G.; Hobert, Oliver

    2007-01-01

    Taste receptor cells constitute a highly specialized cell type that perceives and conveys specific sensory information to the brain. The detailed molecular composition of these cells and the mechanisms that program their fate are, in general, poorly understood. We have generated serial analysis of gene expression (SAGE) libraries from two distinct populations of single, isolated sensory neuron classes, the gustatory neuron class ASE and the thermosensory neuron class AFD, from the nematode Caenorhabditis elegans. By comparing these two libraries, we have identified >1000 genes that define the ASE gustatory neuron class on a molecular level. This set of genes contains determinants of the differentiated state of the ASE neuron, such as a surprisingly complex repertoire of transcription factors (TFs), ion channels, neurotransmitters, and receptors, as well as seven-transmembrane receptor (7TMR)-type putative gustatory receptor genes. Through the in vivo dissection of the cis-regulatory regions of several ASE-expressed genes, we identified a small cis-regulatory motif, the “ASE motif,” that is required for the expression of many ASE-expressed genes. We demonstrate that the ASE motif is a binding site for the C2H2 zinc finger TF CHE-1, which is essential for the correct differentiation of the ASE gustatory neuron. Taken together, our results provide a unique view of the molecular landscape of a single neuron type and reveal an important aspect of the regulatory logic for gustatory neuron specification in C. elegans. PMID:17606643

  5. Population genetics of cis-regulatory sequences that operate during embryonic development in the sea urchin Strongylocentrotus purpuratus.

    PubMed

    Garfield, David; Haygood, Ralph; Nielsen, William J; Wray, Gregory A

    2012-01-01

    Despite the fact that noncoding sequences comprise a substantial fraction of functional sites within all genomes, the evolutionary mechanisms that operate on genetic variation within regulatory elements remain poorly understood. In this study, we examine the population genetics of the core, upstream cis-regulatory regions of eight genes (AN, CyIIa, CyIIIa, Endo16, FoxB, HE, SM30 a, and SM50) that function during the early development of the purple sea urchin, Strongylocentrotus purpuratus. Quantitative and qualitative measures of segregating variation are not conspicuously different between cis-regulatory and closely linked "proxy neutral" noncoding regions containing no known functional sites. Length and compound mutations are common in noncoding sequences; conventional descriptive statistics ignore such mutations, under-representing true genetic variation by approximately 28% for these loci in this population. Patterns of variation in the cis-regulatory regions of six of the genes examined (CyIIa, CyIIIa, Endo16, FoxB, AN, and HE) are consistent with directional selection. Genetic variation within annotated transcription factor binding sites is comparable to, and frequently greater than, that of surrounding sequences. Comparisons of two paralog pairs (CyIIa/CyIIIa and AN/HE) suggest that distinct evolutionary processes have operated on their cis-regulatory regions following gene duplication. Together, these analyses provide a detailed view of the evolutionary mechanisms operating on noncoding sequences within a natural population, and underscore how little is known about how these processes operate on cis-regulatory sequences.

  6. Abundant raw material for cis-regulatory evolution in humans.

    PubMed

    Rockman, Matthew V; Wray, Gregory A

    2002-11-01

    Changes in gene expression and regulation--due in particular to the evolution of cis-regulatory DNA sequences--may underlie many evolutionary changes in phenotypes, yet little is known about the distribution of such variation in populations. We present in this study the first survey of experimentally validated functional cis-regulatory polymorphism. These data are derived from more than 140 polymorphisms involved in the regulation of 107 genes in Homo sapiens, the eukaryote species with the most available data. We find that functional cis-regulatory variation is widespread in the human genome and that the consequent variation in gene expression is twofold or greater for 63% of the genes surveyed. Transcription factor-DNA interactions are highly polymorphic, and regulatory interactions have been gained and lost within human populations. On average, humans are heterozygous at more functional cis-regulatory sites (>16,000) than at amino acid positions (<13,000), in part because of an overrepresentation among the former in multiallelic tandem repeat variation, especially (AC)(n) dinucleotide microsatellites. The role of microsatellites in gene expression variation may provide a larger store of heritable phenotypic variation, and a more rapid mutational input of such variation, than has been realized. Finally, we outline the distinctive consequences of cis-regulatory variation for the genotype-phenotype relationship, including ubiquitous epistasis and genotype-by-environment interactions, as well as underappreciated modes of pleiotropy and overdominance. Ordinary small-scale mutations contribute to pervasive variation in transcription rates and consequently to patterns of human phenotypic variation.

  7. Abundant raw material for cis-regulatory evolution in humans

    NASA Technical Reports Server (NTRS)

    Rockman, Matthew V.; Wray, Gregory A.

    2002-01-01

    Changes in gene expression and regulation--due in particular to the evolution of cis-regulatory DNA sequences--may underlie many evolutionary changes in phenotypes, yet little is known about the distribution of such variation in populations. We present in this study the first survey of experimentally validated functional cis-regulatory polymorphism. These data are derived from more than 140 polymorphisms involved in the regulation of 107 genes in Homo sapiens, the eukaryote species with the most available data. We find that functional cis-regulatory variation is widespread in the human genome and that the consequent variation in gene expression is twofold or greater for 63% of the genes surveyed. Transcription factor-DNA interactions are highly polymorphic, and regulatory interactions have been gained and lost within human populations. On average, humans are heterozygous at more functional cis-regulatory sites (>16,000) than at amino acid positions (<13,000), in part because of an overrepresentation among the former in multiallelic tandem repeat variation, especially (AC)(n) dinucleotide microsatellites. The role of microsatellites in gene expression variation may provide a larger store of heritable phenotypic variation, and a more rapid mutational input of such variation, than has been realized. Finally, we outline the distinctive consequences of cis-regulatory variation for the genotype-phenotype relationship, including ubiquitous epistasis and genotype-by-environment interactions, as well as underappreciated modes of pleiotropy and overdominance. Ordinary small-scale mutations contribute to pervasive variation in transcription rates and consequently to patterns of human phenotypic variation.

  8. Expression-Guided In Silico Evaluation of Candidate Cis Regulatory Codes for Drosophila Muscle Founder Cells

    PubMed Central

    Gisselbrecht, Stephen S; He, Fangxue Sherry; Estrada, Beatriz; Michelson, Alan M; Bulyk, Martha L

    2006-01-01

    While combinatorial models of transcriptional regulation can be inferred for metazoan systems from a priori biological knowledge, validation requires extensive and time-consuming experimental work. Thus, there is a need for computational methods that can evaluate hypothesized cis regulatory codes before the difficult task of experimental verification is undertaken. We have developed a novel computational framework (termed “CodeFinder”) that integrates transcription factor binding site and gene expression information to evaluate whether a hypothesized transcriptional regulatory model (TRM; i.e., a set of co-regulating transcription factors) is likely to target a given set of co-expressed genes. Our basic approach is to simultaneously predict cis regulatory modules (CRMs) associated with a given gene set and quantify the enrichment for combinatorial subsets of transcription factor binding site motifs comprising the hypothesized TRM within these predicted CRMs. As a model system, we have examined a TRM experimentally demonstrated to drive the expression of two genes in a sub-population of cells in the developing Drosophila mesoderm, the somatic muscle founder cells. This TRM was previously hypothesized to be a general mode of regulation for genes expressed in this cell population. In contrast, the present analyses suggest that a modified form of this cis regulatory code applies to only a subset of founder cell genes, those whose gene expression responds to specific genetic perturbations in a similar manner to the gene on which the original model was based. We have confirmed this hypothesis by experimentally discovering six (out of 12 tested) new CRMs driving expression in the embryonic mesoderm, four of which drive expression in founder cells. PMID:16733548

  9. SMCis: An Effective Algorithm for Discovery of Cis-Regulatory Modules

    PubMed Central

    Guo, Haitao; Huo, Hongwei; Yu, Qiang

    2016-01-01

    The discovery of cis-regulatory modules (CRMs) is a challenging problem in computational biology. Limited by the difficulty of using an HMM to model dependent features in transcriptional regulatory sequences (TRSs), the probabilistic modeling methods based on HMMs cannot accurately represent the distance between regulatory elements in TRSs and are cumbersome to model the prevailing dependencies between motifs within CRMs. We propose a probabilistic modeling algorithm called SMCis, which builds a more powerful CRM discovery model based on a hidden semi-Markov model. Our model characterizes the regulatory structure of CRMs and effectively models dependencies between motifs at a higher level of abstraction based on segments rather than nucleotides. Experimental results on three benchmark datasets indicate that our method performs better than the compared algorithms. PMID:27637070

  10. A primer on regression methods for decoding cis-regulatory logic

    SciTech Connect

    Das, Debopriya; Pellegrini, Matteo; Gray, Joe W.

    2009-03-03

    The rapidly emerging field of systems biology is helping us to understand the molecular determinants of phenotype on a genomic scale [1]. Cis-regulatory elements are major sequence-based determinants of biological processes in cells and tissues [2]. For instance, during transcriptional regulation, transcription factors (TFs) bind to very specific regions on the promoter DNA [2,3] and recruit the basal transcriptional machinery, which ultimately initiates mRNA transcription (Figure 1A). Learning cis-Regulatory Elements from Omics Data: A vast amount of work over the past decade has shown that omics data can be used to learn cis-regulatory logic on a genome-wide scale [4-6]--in particular, by integrating sequence data with mRNA expression profiles. The most popular approach has been to identify over-represented motifs in promoters of genes that are coexpressed [4,7,8]. Though widely used, such an approach can be limiting for a variety of reasons. First, the combinatorial nature of gene regulation is difficult to explicitly model in this framework. Moreover, in many applications of this approach, expression data from multiple conditions are necessary to obtain reliable predictions. This can potentially limit the use of this method to only large data sets [9]. Although these methods can be adapted to analyze mRNA expression data from a pair of biological conditions, such comparisons are often confounded by the fact that primary and secondary response genes are clustered together--whereas only the primary response genes are expected to contain the functional motifs [10]. A set of approaches based on regression has been developed to overcome the above limitations [11-32]. These approaches have their foundations in certain biophysical aspects of gene regulation [26,33-35]. That is, the models are motivated by the expected transcriptional response of genes due to the binding of TFs to their promoters. While such methods have gathered popularity in the computational domain
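
    The regression idea described in this record can be illustrated with a minimal sketch. It assumes the simplest possible model, a single motif whose count in each promoter is related linearly to that gene's log expression ratio; the function name, the toy data, and the one-motif restriction are illustrative assumptions and are not taken from the primer itself.

```python
def fit_motif_regression(motif_counts, log_expression):
    """Ordinary least squares fit of log_expression = intercept + slope * motif_count.

    Assumption: one motif and a purely linear response; published methods
    typically handle many motifs and their combinations.
    """
    n = len(motif_counts)
    if n != len(log_expression) or n < 2:
        raise ValueError("need two equal-length inputs with at least 2 genes")
    mean_x = sum(motif_counts) / n
    mean_y = sum(log_expression) / n
    # Sum of squared deviations of the predictor (motif counts).
    sxx = sum((x - mean_x) ** 2 for x in motif_counts)
    if sxx == 0:
        raise ValueError("motif counts are constant; slope is undefined")
    # Cross-deviation between motif counts and expression.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(motif_counts, log_expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy data: motif occurrences per promoter and log expression ratios.
counts = [0, 1, 2, 3, 4]
log_ratios = [0.1, 0.6, 1.1, 1.6, 2.1]
intercept, slope = fit_motif_regression(counts, log_ratios)
```

    Under this assumed model, a positive fitted slope would suggest that the motif is associated with higher expression, and a slope near zero would suggest no detectable association.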

  11. Motif-directed redesign of enzyme specificity.

    PubMed

    Borgo, Benjamin; Havranek, James J

    2014-03-01

    Computational protein design relies on several approximations, including the use of fixed backbones and rotamers, to reduce protein design to a computationally tractable problem. However, allowing backbone and off-rotamer flexibility leads to more accurate designs and greater conformational diversity. Exhaustive sampling of this additional conformational space is challenging, and often impossible. Here, we report a computational method that utilizes a preselected library of native interactions to direct backbone flexibility to accommodate placement of these functional contacts. Using these native interaction modules, termed motifs, improves the likelihood that the interaction can be realized, provided that suitable backbone perturbations can be identified. Furthermore, it allows a directed search of the conformational space, reducing the sampling needed to find low energy conformations. We implemented the motif-based design algorithm in Rosetta, and tested the efficacy of this method by redesigning the substrate specificity of methionine aminopeptidase. In summary, native enzymes have evolved to catalyze a wide range of chemical reactions with extraordinary specificity. Computational enzyme design seeks to generate novel chemical activities by altering the target substrates of these existing enzymes. We have implemented a novel approach to redesign the specificity of an enzyme and demonstrated its effectiveness on a model system.

  12. Epistatic Interactions in the Arabinose Cis-Regulatory Element.

    PubMed

    Lagator, Mato; Igler, Claudia; Moreno, Anaísa B; Guet, Călin C; Bollback, Jonathan P

    2016-03-01

    Changes in gene expression are an important mode of evolution; however, the proximate mechanism of these changes is poorly understood. In particular, little is known about the effects of mutations within cis binding sites for transcription factors, or the nature of epistatic interactions between these mutations. Here, we tested the effects of single and double mutants in two cis binding sites involved in the transcriptional regulation of the Escherichia coli araBAD operon, a component of arabinose metabolism, using a synthetic system. This system decouples transcriptional control from any posttranslational effects on fitness, allowing a precise estimate of the effect of single and double mutations, and hence epistasis, on gene expression. We found that epistatic interactions between mutations in the araBAD cis-regulatory element are common, and that the predominant form of epistasis is negative. The magnitude of the interactions depended on whether the mutations are located in the same or in different operator sites. Importantly, these epistatic interactions were dependent on the presence of arabinose, a native inducer of the araBAD operon in vivo, with some interactions changing in sign (e.g., from negative to positive) in its presence. This study thus reveals that mutations in even relatively simple cis-regulatory elements interact in complex ways such that selection on the level of gene expression in one environment might perturb regulation in the other environment in an unpredictable and uncorrelated manner.
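
    As a rough illustration of how epistasis between two mutations can be quantified from expression measurements, the sketch below assumes a multiplicative null model on wild-type-normalized expression. The abstract does not state which null model the authors used, so the formula, the function name, and the example values are assumptions made for illustration only.

```python
import math


def log_epistasis(wild_type, single_a, single_b, double_ab):
    """Return epistasis as the log deviation from a multiplicative expectation.

    Assumption: without epistasis, the double mutant's relative expression
    equals the product of the two single mutants' relative expression.
    Negative values mean the double mutant expresses less than expected.
    """
    values = (wild_type, single_a, single_b, double_ab)
    if any(v <= 0 for v in values):
        raise ValueError("expression values must be positive")
    rel_a = single_a / wild_type
    rel_b = single_b / wild_type
    rel_ab = double_ab / wild_type
    return math.log(rel_ab) - (math.log(rel_a) + math.log(rel_b))


# Hypothetical expression levels in arbitrary units.
eps_none = log_epistasis(100.0, 50.0, 40.0, 20.0)      # double matches expectation
eps_negative = log_epistasis(100.0, 50.0, 40.0, 10.0)  # double is below expectation
```

    With this convention, comparing the sign of the value measured with and without arabinose would be one way to express the sign changes the abstract describes.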

  13. Genomic analysis reveals major determinants of cis-regulatory variation in Capsella grandiflora

    PubMed Central

    Steige, Kim A.; Laenen, Benjamin; Reimegård, Johan; Slotte, Tanja

    2017-01-01

    Understanding the causes of cis-regulatory variation is a long-standing aim in evolutionary biology. Although cis-regulatory variation has long been considered important for adaptation, we still have a limited understanding of the selective importance and genomic determinants of standing cis-regulatory variation. To address these questions, we studied the prevalence, genomic determinants, and selective forces shaping cis-regulatory variation in the outcrossing plant Capsella grandiflora. We first identified a set of 1,010 genes with common cis-regulatory variation using analyses of allele-specific expression (ASE). Population genomic analyses of whole-genome sequences from 32 individuals showed that genes with common cis-regulatory variation (i) are under weaker purifying selection and (ii) undergo less frequent positive selection than other genes. We further identified genomic determinants of cis-regulatory variation. Gene body methylation (gbM) was a major factor constraining cis-regulatory variation, whereas presence of nearby transposable elements (TEs) and tissue specificity of expression increased the odds of ASE. Our results suggest that most common cis-regulatory variation in C. grandiflora is under weak purifying selection, and that gene-specific functional constraints are more important for the maintenance of cis-regulatory variation than genome-scale variation in the intensity of selection. Our results agree with previous findings that suggest TE silencing affects nearby gene expression, and provide evidence for a link between gbM and cis-regulatory constraint, possibly reflecting greater dosage sensitivity of body-methylated genes. Given the extensive conservation of gbM in flowering plants, this suggests that gbM could be an important predictor of cis-regulatory variation in a wide range of plant species. PMID:28096395

  14. Genomic analysis reveals major determinants of cis-regulatory variation in Capsella grandiflora.

    PubMed

    Steige, Kim A; Laenen, Benjamin; Reimegård, Johan; Scofield, Douglas G; Slotte, Tanja

    2017-01-31

    Understanding the causes of cis-regulatory variation is a long-standing aim in evolutionary biology. Although cis-regulatory variation has long been considered important for adaptation, we still have a limited understanding of the selective importance and genomic determinants of standing cis-regulatory variation. To address these questions, we studied the prevalence, genomic determinants, and selective forces shaping cis-regulatory variation in the outcrossing plant Capsella grandiflora. We first identified a set of 1,010 genes with common cis-regulatory variation using analyses of allele-specific expression (ASE). Population genomic analyses of whole-genome sequences from 32 individuals showed that genes with common cis-regulatory variation (i) are under weaker purifying selection and (ii) undergo less frequent positive selection than other genes. We further identified genomic determinants of cis-regulatory variation. Gene body methylation (gbM) was a major factor constraining cis-regulatory variation, whereas presence of nearby transposable elements (TEs) and tissue specificity of expression increased the odds of ASE. Our results suggest that most common cis-regulatory variation in C. grandiflora is under weak purifying selection, and that gene-specific functional constraints are more important for the maintenance of cis-regulatory variation than genome-scale variation in the intensity of selection. Our results agree with previous findings that suggest TE silencing affects nearby gene expression, and provide evidence for a link between gbM and cis-regulatory constraint, possibly reflecting greater dosage sensitivity of body-methylated genes. Given the extensive conservation of gbM in flowering plants, this suggests that gbM could be an important predictor of cis-regulatory variation in a wide range of plant species.

  15. Evolution of lineage-specific functions in ancient cis-regulatory modules.

    PubMed

    Pauls, Stefan; Goode, Debbie K; Petrone, Libero; Oliveri, Paola; Elgar, Greg

    2015-11-01

    Morphological evolution is driven both by coding sequence variation and by changes in regulatory sequences. However, how cis-regulatory modules (CRMs) evolve to generate entirely novel expression domains is largely unknown. Here, we reconstruct the evolutionary history of a lens enhancer located within a CRM that not only predates the lens, a vertebrate innovation, but bilaterian animals in general. Alignments of orthologous sequences from different deuterostomes sub-divide the CRM into a deeply conserved core and a more divergent flanking region. We demonstrate that all deuterostome flanking regions, including invertebrate sequences, activate gene expression in the zebrafish lens through the same ancient cluster of activator sites. However, levels of gene expression vary between species due to the presence of repressor motifs in flanking region and core. These repressor motifs are responsible for the relatively weak enhancer activity of tetrapod flanking regions. Ray-finned fish, however, have gained two additional lineage-specific activator motifs which in combination with the ancient cluster of activators and the core constitute a potent lens enhancer. The exploitation and modification of existing regulatory potential in flanking regions but not in the highly conserved core might represent a more general model for the emergence of novel regulatory functions in complex CRMs.

  16. Evolution of lineage-specific functions in ancient cis-regulatory modules

    PubMed Central

    Pauls, Stefan; Goode, Debbie K.; Petrone, Libero; Oliveri, Paola; Elgar, Greg

    2015-01-01

    Morphological evolution is driven both by coding sequence variation and by changes in regulatory sequences. However, how cis-regulatory modules (CRMs) evolve to generate entirely novel expression domains is largely unknown. Here, we reconstruct the evolutionary history of a lens enhancer located within a CRM that not only predates the lens, a vertebrate innovation, but bilaterian animals in general. Alignments of orthologous sequences from different deuterostomes sub-divide the CRM into a deeply conserved core and a more divergent flanking region. We demonstrate that all deuterostome flanking regions, including invertebrate sequences, activate gene expression in the zebrafish lens through the same ancient cluster of activator sites. However, levels of gene expression vary between species due to the presence of repressor motifs in flanking region and core. These repressor motifs are responsible for the relatively weak enhancer activity of tetrapod flanking regions. Ray-finned fish, however, have gained two additional lineage-specific activator motifs which in combination with the ancient cluster of activators and the core constitute a potent lens enhancer. The exploitation and modification of existing regulatory potential in flanking regions but not in the highly conserved core might represent a more general model for the emergence of novel regulatory functions in complex CRMs. PMID:26538567

  17. Conserved cis-regulatory modules in promoters of genes encoding wheat high-molecular-weight glutenin subunits

    PubMed Central

    Ravel, Catherine; Fiquet, Samuel; Boudet, Julie; Dardevet, Mireille; Vincent, Jonathan; Merlino, Marielle; Michard, Robin; Martre, Pierre

    2014-01-01

    The concentration and composition of the gliadin and glutenin seed storage proteins (SSPs) in wheat flour are the most important determinants of its end-use value. In cereals, the synthesis of SSPs is predominantly regulated at the transcriptional level by a complex network involving at least five cis-elements in gene promoters. The high-molecular-weight glutenin subunits (HMW-GS) are encoded by two tightly linked genes located on the long arms of group 1 chromosomes. Here, we sequenced and annotated the HMW-GS gene promoters of 22 electrophoretic wheat alleles to identify putative cis-regulatory motifs. We focused on 24 motifs known to be involved in SSP gene regulation. Most of them were identified in at least one HMW-GS gene promoter sequence. A common regulatory framework was observed in all the HMW-GS gene promoters, as they shared conserved cis-regulatory modules (CCRMs) including all the five motifs known to regulate the transcription of SSP genes. This common regulatory framework comprises a composite box made of the GATA motifs and GCN4-like Motifs (GLMs) and was shown to be functional as the GLMs are able to bind a bZIP transcriptional factor SPA (Storage Protein Activator). In addition to this regulatory framework, each HMW-GS gene promoter had additional motifs organized differently. The promoters of most highly expressed x-type HMW-GS genes contain an additional box predicted to bind R2R3-MYB transcriptional factors. However, the differences in annotation between promoter alleles could not be related to their level of expression. In summary, we identified a common modular organization of HMW-GS gene promoters but the lack of correlation between the cis-motifs of each HMW-GS gene promoter and their level of expression suggests that other cis-elements or other mechanisms regulate HMW-GS gene expression. PMID:25429295

  18. Distant cis Regulatory Elements in Human Skeletal Muscle Differentiation

    PubMed Central

    McCord, Rachel Patton; Zhou, Vicky W.; Yuh, Tiffany; Bulyk, Martha L.

    2011-01-01

    Identifying gene regulatory elements and their target genes in human cells remains a significant challenge. Despite increasing evidence of physical interactions between distant regulatory elements and gene promoters in mammalian cells, many studies consider only promoter-proximal regulatory regions. We identify putative cis-regulatory modules (CRMs) in human skeletal muscle differentiation by combining myogenic TF binding data before and after differentiation with histone modification data in myoblasts. CRMs that are distant (>20 kb) from muscle gene promoters are common and are more likely than proximal promoter regions to show differentiation-specific changes in myogenic TF binding. We find that two of these distant CRMs, known to activate transcription in differentiating myoblasts, interact physically with gene promoters (PDLIM3 and ACTA1) during differentiation. Our results highlight the importance of considering distal CRMs in investigations of mammalian gene regulation and support the hypothesis that distant CRM-promoter looping contacts are a general mechanism of gene regulation. PMID:21907276

  19. TFM-Explorer: mining cis-regulatory regions in genomes

    PubMed Central

    Tonon, Laurie; Varré, Jean-Stéphane

    2010-01-01

    DNA-binding transcription factors (TFs) play a central role in transcription regulation, and computational approaches that help in elucidating complex mechanisms governing this basic biological process are of great use. In this perspective, we present the TFM-Explorer web server that is a toolbox to identify putative TF binding sites within a set of upstream regulatory sequences of genes sharing some regulatory mechanisms. TFM-Explorer finds local regions showing overrepresentation of binding sites. Accepted organisms are human, mouse, rat, chicken and drosophila. The server employs a number of features to help users to analyze their data: visualization of selected binding sites on genomic sequences, and selection of cis-regulatory modules. TFM-Explorer is available at http://bioinfo.lifl.fr/TFM. PMID:20522509

  20. Uncovering cis Regulatory Codes Using Synthetic Promoter Shuffling

    PubMed Central

    Kinkhabwala, Ali; Guet, Călin C.

    2008-01-01

    Revealing the spectrum of combinatorial regulation of transcription at individual promoters is essential for understanding the complex structure of biological networks. However, the computations represented by the integration of various molecular signals at complex promoters are difficult to decipher in the absence of simple cis regulatory codes. Here we synthetically shuffle the regulatory architecture — operator sequences binding activators and repressors — of a canonical bacterial promoter. The resulting library of complex promoters allows for rapid exploration of promoter encoded logic regulation. Among all possible logic functions, NOR and ANDN promoter encoded logics predominate. A simple transcriptional cis regulatory code determines both logics, establishing a straightforward map between promoter structure and logic phenotype. The regulatory code is determined solely by the type of transcriptional regulation combinations: two repressors generate a NOR: NOT (a OR b) whereas a repressor and an activator generate an ANDN: a AND NOT b. Three-input versions of both logics, having an additional repressor as an input, are also present in the library. The resulting complex promoters cover a wide dynamic range of transcriptional strengths. Synthetic promoter shuffling represents a fast and efficient method for exploring the spectrum of complex regulatory functions that can be encoded by complex promoters. From an engineering point of view, synthetic promoter shuffling enables the experimental testing of the functional properties of complex promoters that cannot necessarily be inferred ab initio from the known properties of the individual genetic components. Synthetic promoter shuffling may provide a useful experimental tool for studying naturally occurring promoter shuffling. PMID:18446205
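
    The mapping from promoter structure to logic stated in this record (two repressors give NOR; an activator plus a repressor give ANDN) can be sketched with a toy Boolean model. The rule used here, that a promoter is on only when every activator input is present and no repressor input is present, is a simplifying assumption chosen to reproduce the two logics named in the abstract; the function names are hypothetical.

```python
from itertools import product


def promoter_output(activators, repressors):
    """Toy rule: ON only if all activator inputs are present and no repressor is.

    Assumption: inputs are Boolean and combine without graded effects.
    """
    return all(activators) and not any(repressors)


def nor(a, b):
    """NOR as written in the abstract: NOT (a OR b)."""
    return not (a or b)


def andn(a, b):
    """ANDN as written in the abstract: a AND NOT b."""
    return a and not b


def truth_table(func):
    """Evaluate a two-input Boolean function on all four input combinations."""
    return {(a, b): func(a, b) for a, b in product((False, True), repeat=2)}


# Two repressors as inputs: the toy promoter behaves as NOR.
two_repressors = truth_table(lambda a, b: promoter_output([], [a, b]))
# Activator a and repressor b as inputs: the toy promoter behaves as ANDN.
activator_and_repressor = truth_table(lambda a, b: promoter_output([a], [b]))
```

    Adding one more repressor to either input list gives the three-input versions mentioned in the abstract under the same assumed rule.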

  1. Identifying cis-regulatory changes involved in the evolution of aerobic fermentation in yeasts.

    PubMed

    Lin, Zhenguo; Wang, Tzi-Yuan; Tsai, Bing-Shi; Wu, Fang-Ting; Yu, Fu-Jung; Tseng, Yu-Jung; Sung, Huang-Mo; Li, Wen-Hsiung

    2013-01-01

    Gene regulation change has long been recognized as an important mechanism for phenotypic evolution. We used the evolution of yeast aerobic fermentation as a model to explore how gene regulation has evolved and how this process has contributed to phenotypic evolution and adaptation. Most eukaryotes fully oxidize glucose to CO2 and H2O in mitochondria to maximize energy yield, whereas some yeasts, such as Saccharomyces cerevisiae and its relatives, predominantly ferment glucose into ethanol even in the presence of oxygen, a phenomenon known as aerobic fermentation. We examined the genome-wide gene expression levels among 12 different yeasts and found that a group of genes involved in the mitochondrial respiration process showed the largest reduction in gene expression level during the evolution of aerobic fermentation. Our analysis revealed that the downregulation of these genes was significantly associated with massive loss of binding motifs of Cbf1p in the fermentative yeasts. Our experimental assays confirmed the binding of Cbf1p to the predicted motif and the activator role of Cbf1p. In summary, our study laid a foundation to unravel the long-time mystery about the genetic basis of evolution of aerobic fermentation, providing new insights into understanding the role of cis-regulatory changes in phenotypic evolution.

  2. Nucleotide sequence conservation of novel and established cis-regulatory sites within the tyrosine hydroxylase gene promoter

    PubMed Central

    Wang, Meng; Banerjee, Kasturi; Baker, Harriet; Cave, John W.

    2015-01-01

    Tyrosine hydroxylase (TH) is the rate-limiting enzyme in catecholamine biosynthesis and its gene proximal promoter (<1 kb upstream from the transcription start site) is essential for regulating transcription in both the developing and adult nervous systems. Several putative regulatory elements within the TH proximal promoter have been reported, but evolutionary conservation of these elements has not been thoroughly investigated. Since many vertebrate species are used to model development, function and disorders of human catecholaminergic neurons, identifying evolutionarily conserved transcription regulatory mechanisms is a high priority. In this study, we align TH proximal promoter nucleotide sequences from several vertebrate species to identify evolutionarily conserved motifs. This analysis identified three elements (a TATA box, cyclic AMP response element (CRE) and a 5′-GGTGG-3′ site) that constitute the core of an ancient vertebrate TH promoter. Focusing on only eutherian mammals, two regions of high conservation within the proximal promoter were identified: a ∼250 bp region adjacent to the transcription start site and a ∼85 bp region located approximately 350 bp further upstream. Within both regions, conservation of previously reported cis-regulatory motifs and human single nucleotide variants was evaluated. Transcription reporter assays in a TH-expressing cell line demonstrated the functionality of highly conserved motifs in the proximal promoter regions and electromobility shift assays showed that brain-region specific complexes assemble on these motifs. These studies also identified a non-canonical CRE binding (CREB) protein recognition element in the proximal promoter. Together, these studies provide a detailed analysis of evolutionary conservation within the TH promoter and identify potential cis-regulatory motifs that underlie a core set of regulatory mechanisms in mammals. PMID:25774193

  3. Putative cis-regulatory elements in genes highly expressed in rice sperm cells

    PubMed Central

    2011-01-01

    Background The male germ line in flowering plants is initiated within developing pollen grains via asymmetric division. The smaller cell then becomes totally encased within a much larger vegetative cell, forming a unique "cell within a cell structure". The generative cell subsequently divides to give rise to two non-motile diminutive sperm cells, which take part in double fertilization and lead to the seed set. Sperm cells are difficult to investigate because of their presence within the confines of the larger vegetative cell. However, recently developed techniques for the isolation of rice sperm cells and the fully annotated rice genome sequence have allowed for the characterization of the transcriptional repertoire of sperm cells. Microarray gene expression data has identified a subset of rice genes that show unique or highly preferential expression in sperm cells. This information has led to the identification of cis-regulatory elements (CREs), which are conserved in sperm-expressed genes and are putatively associated with the control of cell-specific expression. Findings We aimed to identify the CREs associated with rice sperm cell-specific gene expression data using in silico prediction tools. We analyzed 1-kb upstream regions of the top 40 sperm cell co-expressed genes for over-represented conserved and novel motifs. Analysis of upstream regions with the SIGNALSCAN program with the PLACE database, MEME and the Mclip tool helped to find combinatorial sets of known transcriptional factor-binding sites along with two novel motifs putatively associated with the co-expression of sperm cell-specific genes. Conclusions Our data shows the occurrence of novel motifs, which are putative CREs and are likely targets of transcriptional factors regulating sperm cell gene expression. These motifs can be used to design the experimental verification of regulatory elements and the identification of transcriptional factors that regulate sperm cell-specific gene expression. 

  4. Detailed map of a cis-regulatory input function

    NASA Astrophysics Data System (ADS)

    Setty, Y.; Mayo, A. E.; Surette, M. G.; Alon, U.

    2003-06-01

    Most genes are regulated by multiple transcription factors that bind specific sites in DNA regulatory regions. These cis-regulatory regions perform a computation: the rate of transcription is a function of the active concentrations of each of the input transcription factors. Here, we used accurate gene expression measurements from living cell cultures, bearing GFP reporters, to map in detail the input function of the classic lacZYA operon of Escherichia coli, as a function of about a hundred combinations of its two inducers, cAMP and isopropyl β-D-thiogalactoside (IPTG). We found an unexpectedly intricate function with four plateau levels and four thresholds. This result compares well with a mathematical model of the binding of the regulatory proteins cAMP receptor protein (CRP) and LacI to the lac regulatory region. The model is also used to demonstrate that with few mutations, the same region could encode much purer AND-like or even OR-like functions. This possibility means that the wild-type region is selected to perform an elaborate computation in setting the transcription rate. The present approach can be generally used to map the input functions of other genes.
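
    To make the idea of a two-input function concrete, the sketch below evaluates idealized AND-like and OR-like functions of two inducer levels on a small grid. It is a toy illustration only: the Hill form, the half-maximal constants and the exponent are assumptions chosen for readability, not the binding model fitted in the paper, and it does not reproduce the four plateaus and four thresholds reported for the wild-type region.

```python
def hill(x, k, n):
    """Hill function: fraction of maximal response at input level x."""
    return x**n / (k**n + x**n)


def and_like(camp, iptg, k_camp=1.0, k_iptg=1.0, n=2.0):
    """High output only when both inducers are high (product of Hill terms)."""
    return hill(camp, k_camp, n) * hill(iptg, k_iptg, n)


def or_like(camp, iptg, k_camp=1.0, k_iptg=1.0, n=2.0):
    """High output when either inducer is high."""
    a = hill(camp, k_camp, n)
    b = hill(iptg, k_iptg, n)
    return a + b - a * b


if __name__ == "__main__":
    levels = [0.0, 0.5, 1.0, 2.0, 8.0]  # hypothetical inducer levels, arbitrary units
    for name, func in [("AND-like", and_like), ("OR-like", or_like)]:
        print(name, "(rows: cAMP, columns: IPTG)")
        for camp in levels:
            print("  " + " ".join(f"{func(camp, iptg):.2f}" for iptg in levels))
```

    Measuring expression over such a grid of inducer combinations is what "mapping the input function" refers to; the measured surface can then be compared with idealized shapes like these.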

  5. Assessing Computational Methods of Cis-Regulatory Module Prediction

    PubMed Central

    Su, Jing; Teichmann, Sarah A.; Down, Thomas A.

    2010-01-01

    Computational methods attempting to identify instances of cis-regulatory modules (CRMs) in the genome face a challenging problem of searching for potentially interacting transcription factor binding sites while knowledge of the specific interactions involved remains limited. Without a comprehensive comparison of their performance, the reliability and accuracy of these tools remains unclear. Faced with a large number of different tools that address this problem, we summarized and categorized them based on search strategy and input data requirements. Twelve representative methods were chosen and applied to predict CRMs from the Drosophila CRM database REDfly, and across the human ENCODE regions. Our results show that the optimal choice of method varies depending on species and composition of the sequences in question. When discriminating CRMs from non-coding regions, those methods considering evolutionary conservation have a stronger predictive power than methods designed to be run on a single genome. Different CRM representations and search strategies rely on different CRM properties, and different methods can complement one another. For example, some favour homotypical clusters of binding sites, while others perform best on short CRMs. Furthermore, most methods appear to be sensitive to the composition and structure of the genome to which they are applied. We analyze the principal features that distinguish the methods that performed well, identify weaknesses leading to poor performance, and provide a guide for users. We also propose key considerations for the development and evaluation of future CRM-prediction methods. PMID:21152003

  6. Multigenome DNA sequence conservation identifies Hox cis-regulatory elements

    PubMed Central

    Kuntz, Steven G.; Schwarz, Erich M.; DeModena, John A.; De Buysscher, Tristan; Trout, Diane; Shizuya, Hiroaki; Sternberg, Paul W.; Wold, Barbara J.

    2008-01-01

    To learn how well ungapped sequence comparisons of multiple species can predict cis-regulatory elements in Caenorhabditis elegans, we made such predictions across the large, complex ceh-13/lin-39 locus and tested them transgenically. We also examined how prediction quality varied with different genomes and parameters in our comparisons. Specifically, we sequenced ∼0.5% of the C. brenneri and C. sp. 3 PS1010 genomes, and compared five Caenorhabditis genomes (C. elegans, C. briggsae, C. brenneri, C. remanei, and C. sp. 3 PS1010) to find regulatory elements in 22.8 kb of noncoding sequence from the ceh-13/lin-39 Hox subcluster. We developed the MUSSA program to find ungapped DNA sequences with N-way transitive conservation, applied it to the ceh-13/lin-39 locus, and transgenically assayed 21 regions with both high and low degrees of conservation. This identified 10 functional regulatory elements whose activities matched known ceh-13/lin-39 expression, with 100% specificity and a 77% recovery rate. One element was so well conserved that a similar mouse Hox cluster sequence recapitulated the native nematode expression pattern when tested in worms. Our findings suggest that ungapped sequence comparisons can predict regulatory elements genome-wide. PMID:18981268
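
    The core operation behind ungapped comparison can be sketched as a sliding-window identity count. The code below is an illustrative sketch, not MUSSA itself: the window size, the threshold and the simple "hit in every other sequence" rule are assumptions, and MUSSA's actual N-way transitive criterion may differ.

```python
def window_matches(seq_a, seq_b, window, min_identical):
    """List (i, j, identical) for ungapped windows seq_a[i:i+window] and
    seq_b[j:j+window] that share at least min_identical identical positions."""
    hits = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            identical = sum(x == y for x, y in zip(win_a, win_b))
            if identical >= min_identical:
                hits.append((i, j, identical))
    return hits


def conserved_in_all(reference, others, window, min_identical):
    """Start positions in the reference whose window has a hit in every other sequence."""
    conserved = None
    for other in others:
        starts = {i for i, _, _ in window_matches(reference, other, window, min_identical)}
        conserved = starts if conserved is None else conserved & starts
    return sorted(conserved or [])


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from the ceh-13/lin-39 locus.
    reference = "ACGTACGTTT"
    others = ["GGACGTACAA", "TTACGTACGG"]
    print(conserved_in_all(reference, others, window=6, min_identical=6))
```

    Lowering `min_identical` below `window` tolerates mismatches while still forbidding gaps, which is the trade-off the abstract explores when it varies comparison parameters.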

  7. Pitx1 Broadly Associates with Limb Enhancers and is Enriched on Hindlimb cis-Regulatory Elements

    PubMed Central

    Infante, Carlos R.; Park, Sungdae; Mihala, Alexandra; Kingsley, David M.; Menke, Douglas B.

    2013-01-01

    Extensive functional analyses have demonstrated that the pituitary homeodomain transcription factor Pitx1 plays a critical role in specifying hindlimb morphology in vertebrates. However, much less is known regarding the target genes and cis-regulatory elements through which Pitx1 acts. Earlier studies suggested that the hindlimb transcription factors Tbx4, HoxC10, and HoxC11 might be transcriptional targets of Pitx1, but definitive evidence for direct regulatory interactions has been lacking. Using ChIP-Seq on embryonic mouse hindlimbs, we have pinpointed the genome-wide location of Pitx1 binding sites during mouse hindlimb development and identified potential gene targets for Pitx1. We determined that Pitx1 binding is significantly enriched near genes involved in limb morphogenesis, including Tbx4, HoxC10, and HoxC11. Notably, Pitx1 is bound to the previously identified HLEA and HLEB hindlimb enhancers of the Tbx4 gene and to a newly identified Tbx2 hindlimb enhancer. Moreover, Pitx1 binding is significantly enriched on hindlimb relative to forelimb-specific cis-regulatory features that are differentially marked by H3K27ac. However, our analysis revealed that Pitx1 also strongly associates with many functionally verified limb enhancers that exhibit similar levels of activity in the embryonic mesenchyme of forelimbs and hindlimbs. We speculate that Pitx1 influences hindlimb morphology both through the activation of hindlimb specific enhancers as well as through the hindlimb-specific modulation of enhancers that are active in both sets of limbs. PMID:23201014

  8. Global reorganisation of cis-regulatory units upon lineage commitment of human embryonic stem cells.

    PubMed

    Freire-Pritchett, Paula; Schoenfelder, Stefan; Várnai, Csilla; Wingett, Steven W; Cairns, Jonathan; Collier, Amanda J; García-Vílchez, Raquel; Furlan-Magaril, Mayra; Osborne, Cameron S; Fraser, Peter J; Rugg-Gunn, Peter J; Spivakov, Mikhail

    2017-03-23

    Long-range cis-regulatory elements such as enhancers coordinate cell-specific transcriptional programmes by engaging in DNA looping interactions with target promoters. Deciphering the interplay between the promoter connectivity and activity of cis-regulatory elements during lineage commitment is crucial for understanding developmental transcriptional control. Here, we use Promoter Capture Hi-C to generate a high-resolution atlas of chromosomal interactions involving ~22,000 gene promoters in human pluripotent and lineage-committed cells, identifying putative target genes for known and predicted enhancer elements. We reveal extensive dynamics of cis-regulatory contacts upon lineage commitment, including the acquisition and loss of promoter interactions. This spatial rewiring occurs preferentially with predicted changes in the activity of cis-regulatory elements, and is associated with changes in target gene expression. Our results provide a global and integrated view of promoter interactome dynamics during lineage commitment of human pluripotent cells.

  9. Identification of Cis-regulatory elements in the mouse Pax9/Nkx2-9 genomic region: implication for evolutionary conserved synteny.

    PubMed Central

    Santagati, Fabio; Abe, Kuniya; Schmidt, Volker; Schmitt-John, Thomas; Suzuki, Misao; Yamamura, Ken-Ichi; Imai, Kenji

    2003-01-01

    We previously reported close physical linkage between Pax9 and Nkx2-9 in the human, mouse, and pufferfish (Fugu rubripes) genomes. In this study, we analyzed cis-regulatory elements of the two genes by comparative sequencing in the three species and by transgenesis in the mouse. We identified two regions including conserved noncoding sequences that possessed specific enhancer activities for expression of Pax9 in the medial nasal process and of Nkx2-9 in the ventral neural tube. Remarkably, the latter contained the consensus Gli-binding motif. Interestingly, the identified Pax9 cis-regulatory sequences were located in an intron of the neighboring gene Slc25a21. Close examination of an extended genomic interval around Pax9 revealed the presence of strong synteny conservation in the human, mouse, and Fugu genomes. We propose such an intersecting organization of cis-regulatory sequences in multigenic regions as a possible mechanism that maintains evolutionary conserved synteny. PMID:14504231

  10. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements

    PubMed Central

    De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

    2015-01-01

    Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254488
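
    A word over the IUPAC alphabet is a degenerate motif in which each letter stands for a set of nucleotides. The sketch below shows how such a word can be matched against promoter sequences without an alignment. It is not BLSSpeller's implementation (which is written in Java and uses MapReduce): the species labels and sequences are hypothetical, only the forward strand is scanned, and the branch length score itself is not computed here.

```python
# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def matches_at(motif, seq, pos):
    """True if every motif letter admits the sequence base at its offset from pos."""
    return all(seq[pos + k] in IUPAC[code] for k, code in enumerate(motif))


def occurrences(motif, seq):
    """Forward-strand start positions where the degenerate motif matches seq."""
    seq = seq.upper()
    return [p for p in range(len(seq) - len(motif) + 1) if matches_at(motif, seq, p)]


def species_with_motif(motif, promoters):
    """Names of species whose promoter contains at least one motif occurrence."""
    return sorted(name for name, seq in promoters.items() if occurrences(motif, seq))


if __name__ == "__main__":
    # Hypothetical orthologous promoter fragments keyed by placeholder species names.
    promoters = {"sp1": "AATGACTT", "sp2": "CCCCCCCC", "sp3": "ttgatt"}
    print(occurrences("TGAY", "AATGACTTGATG"))
    print(species_with_motif("TGAY", promoters))
```

    The set of species returned by `species_with_motif` is the kind of input a conservation score over a phylogenetic tree would take.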

  11. The search for cis-regulatory driver mutations in cancer genomes.

    PubMed

    Poulos, Rebecca C; Sloane, Mathew A; Hesson, Luke B; Wong, Jason W H

    2015-10-20

    With the advent of high-throughput and relatively inexpensive whole-genome sequencing technology, the focus of cancer research has begun to shift toward analyses of somatic mutations in non-coding cis-regulatory elements of the cancer genome. Cis-regulatory elements play an important role in gene regulation, with mutations in these elements potentially resulting in changes to the expression of linked genes. The recent discoveries of recurrent TERT promoter mutations in melanoma, and recurrent mutations that create a super-enhancer regulating TAL1 expression in T-cell acute lymphoblastic leukaemia (T-ALL), have sparked significant interest in the search for other somatic cis-regulatory mutations driving cancer development. In this review, we look more closely at the TERT promoter and TAL1 enhancer alterations and use these examples to ask whether other cis-regulatory mutations may play a role in cancer susceptibility. In doing so, we make observations from the data emerging from recent research in this field, and describe the experimental and analytical approaches which could be adopted in the hope of better uncovering the true functional significance of somatic cis-regulatory mutations in cancer.

  12. Identification of distal cis-regulatory elements at mouse mitoferrin loci using zebrafish transgenesis.

    PubMed

    Amigo, Julio D; Yu, Ming; Troadec, Marie-Berengere; Gwynn, Babette; Cooney, Jeffrey D; Lambert, Amy J; Chi, Neil C; Weiss, Mitchell J; Peters, Luanne L; Kaplan, Jerry; Cantor, Alan B; Paw, Barry H

    2011-04-01

    Mitoferrin 1 (Mfrn1; Slc25a37) and mitoferrin 2 (Mfrn2; Slc25a28) function as essential mitochondrial iron importers for heme and Fe/S cluster biogenesis. A genetic deficiency of Mfrn1 results in a profound hypochromic anemia in vertebrate species. To map the cis-regulatory modules (CRMs) that control expression of the Mfrn genes, we utilized genome-wide chromatin immunoprecipitation (ChIP) datasets for the major erythroid transcription factor GATA-1. We identified the CRMs that faithfully drive the expression of Mfrn1 during blood and heart development and Mfrn2 ubiquitously. Through in vivo analyses of the Mfrn-CRMs in zebrafish and mouse, we demonstrate their functional and evolutionary conservation. Using knockdowns with morpholinos and cell sorting analysis in transgenic zebrafish embryos, we show that GATA-1 directly regulates the expression of Mfrn1. Mutagenesis of individual GATA-1 binding cis elements (GBE) demonstrated that at least two of the three GBE within this CRM are functionally required for GATA-mediated transcription of Mfrn1. Furthermore, ChIP assays demonstrate switching from GATA-2 to GATA-1 at these elements during erythroid maturation. Our results provide new insights into the genetic regulation of mitochondrial function and iron homeostasis and, more generally, illustrate the utility of genome-wide ChIP analysis combined with zebrafish transgenesis for identifying long-range transcriptional enhancers that regulate tissue development.

  13. Recurrent modification of a conserved cis-regulatory element underlies fruit fly pigmentation diversity.

    PubMed

    Rogers, William A; Salomone, Joseph R; Tacy, David J; Camino, Eric M; Davis, Kristen A; Rebeiz, Mark; Williams, Thomas M

    2013-08-01

    The development of morphological traits occurs through the collective action of networks of genes connected at the level of gene expression. As any node in a network may be a target of evolutionary change, the recurrent targeting of the same node would indicate that the path of evolution is biased for the relevant trait and network. Although examples of parallel evolution have implicated recurrent modification of the same gene and cis-regulatory element (CRE), little is known about the mutational and molecular paths of parallel CRE evolution. In Drosophila melanogaster fruit flies, the Bric-à-brac (Bab) transcription factors control the development of a suite of sexually dimorphic traits on the posterior abdomen. Female-specific Bab expression is regulated by the dimorphic element, a CRE that possesses direct inputs from body plan (ABD-B) and sex-determination (DSX) transcription factors. Here, we find that the recurrent evolutionary modification of this CRE underlies both intraspecific and interspecific variation in female pigmentation in the melanogaster species group. By reconstructing the sequence and regulatory activity of the ancestral Drosophila melanogaster dimorphic element, we demonstrate that a handful of mutations were sufficient to create independent CRE alleles with differing activities. Moreover, intraspecific and interspecific dimorphic element evolution proceeded with little to no alterations to the known body plan and sex-determination regulatory linkages. Collectively, our findings represent an example where the paths of evolution appear biased to a specific CRE, and drastic changes in function were accompanied by deep conservation of key regulatory linkages.

  14. Cis-regulatory analysis of the sea urchin pigment cell gene polyketide synthase.

    PubMed

    Calestani, Cristina; Rogers, David J

    2010-04-15

    The Strongylocentrotus purpuratus polyketide synthase gene (SpPks) encodes an enzyme required for the biosynthesis of the larval pigment echinochrome. SpPks is expressed exclusively in pigment cells and their precursors starting at blastula stage. The 7th-9th cleavage Delta-Notch signaling, required for pigment cell development, positively regulates SpPks. In previous studies, the transcription factors glial cell missing (SpGcm), SpGataE and kruppel-like (SpKrl/z13) have been shown to positively regulate SpPks. To uncover the structure of the Gene Regulatory Network (GRN) regulating the specification and differentiation processes of pigment cells, we experimentally analyzed the putative SpPks cis-regulatory region. We established that the -1.5 kb region is sufficient to recapitulate the correct spatial and temporal expression of SpPks. Predicted DNA-binding sites for SpGcm, SpGataE and SpKrl are located within this region. The mutagenesis of these DNA-binding sites indicated that SpGcm, SpGataE and SpKrl are direct positive regulators of SpPks. These results demonstrate that the sea urchin GRN for pigment cell development is quite shallow, which is typical of type I embryo development.

  15. cis-Regulatory Mutations Are a Genetic Cause of Human Limb Malformations

    PubMed Central

    VanderMeer, Julia E.; Ahituv, Nadav

    2011-01-01

    The underlying mutations that cause human limb malformations are often difficult to determine, particularly for limb malformations that occur as isolated traits. Evidence from a variety of studies shows that cis-regulatory mutations, specifically in enhancers, can lead to some of these isolated limb malformations. Here, we provide a review of human limb malformations that have been shown to be caused by enhancer mutations and propose that cis-regulatory mutations will continue to be identified as the cause of additional human malformations as our understanding of regulatory sequences improves. PMID:21509892

  16. Motif-role-fingerprints: the building-blocks of motifs, clustering-coefficients and transitivities in directed networks.

    PubMed

    McDonnell, Mark D; Yaveroğlu, Ömer Nebil; Schmerl, Brett A; Iannella, Nicolangelo; Ward, Lawrence M

    2014-01-01

    Complex networks are frequently characterized by metrics for which particular subgraphs are counted. One statistic from this category, which we refer to as motif-role fingerprints, differs from global subgraph counts in that the number of subgraphs in which each node participates is counted. As with global subgraph counts, it can be important to distinguish between motif-role fingerprints that are 'structural' (induced subgraphs) and 'functional' (partial subgraphs). Here we show mathematically that a vector of all functional motif-role fingerprints can readily be obtained from an arbitrary directed adjacency matrix, and then converted to structural motif-role fingerprints by multiplying that vector by a specific invertible conversion matrix. This result demonstrates that a unique structural motif-role fingerprint exists for any given functional motif-role fingerprint. We demonstrate a similar result for the cases of functional and structural motif-fingerprints without node roles, and global subgraph counts that form the basis of standard motif analysis. We also explicitly highlight that motif-role fingerprints are elemental to several popular metrics for quantifying the subgraph structure of directed complex networks, including motif distributions, directed clustering coefficient, and transitivity. The relationships between each of these metrics and motif-role fingerprints also suggest new subtypes of directed clustering coefficients and transitivities. Our results have potential utility in analyzing directed synaptic networks constructed from neuronal connectome data, such as in terms of centrality. Other potential applications include anomaly detection in networks, identification of similar networks and identification of similar nodes within networks. Matlab code for calculating all stated metrics following calculation of functional motif-role fingerprints is provided as S1 Matlab File.

  17. Motif-Role-Fingerprints: The Building-Blocks of Motifs, Clustering-Coefficients and Transitivities in Directed Networks

    PubMed Central

    McDonnell, Mark D.; Yaveroğlu, Ömer Nebil; Schmerl, Brett A.; Iannella, Nicolangelo; Ward, Lawrence M.

    2014-01-01

    Complex networks are frequently characterized by metrics for which particular subgraphs are counted. One statistic from this category, which we refer to as motif-role fingerprints, differs from global subgraph counts in that the number of subgraphs in which each node participates is counted. As with global subgraph counts, it can be important to distinguish between motif-role fingerprints that are ‘structural’ (induced subgraphs) and ‘functional’ (partial subgraphs). Here we show mathematically that a vector of all functional motif-role fingerprints can readily be obtained from an arbitrary directed adjacency matrix, and then converted to structural motif-role fingerprints by multiplying that vector by a specific invertible conversion matrix. This result demonstrates that a unique structural motif-role fingerprint exists for any given functional motif-role fingerprint. We demonstrate a similar result for the cases of functional and structural motif-fingerprints without node roles, and global subgraph counts that form the basis of standard motif analysis. We also explicitly highlight that motif-role fingerprints are elemental to several popular metrics for quantifying the subgraph structure of directed complex networks, including motif distributions, directed clustering coefficient, and transitivity. The relationships between each of these metrics and motif-role fingerprints also suggest new subtypes of directed clustering coefficients and transitivities. Our results have potential utility in analyzing directed synaptic networks constructed from neuronal connectome data, such as in terms of centrality. Other potential applications include anomaly detection in networks, identification of similar networks and identification of similar nodes within networks. Matlab code for calculating all stated metrics following calculation of functional motif-role fingerprints is provided as S1 Matlab File. PMID:25486535
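
    The statement that per-node subgraph participation counts follow from the adjacency matrix can be illustrated for one motif. The sketch below counts, for each node, how often it plays the source, middle or target role in a feed-forward loop (i→j, j→k, i→k), and how many directed 3-cycles it lies on. These are "functional" counts in the abstract's sense, because extra edges among the three nodes are ignored. It is an assumption-laden illustration for a 0/1 adjacency matrix without self-loops, not the authors' Matlab code, and it does not implement their conversion to structural counts.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def ffl_role_counts(adj):
    """Per-node counts of feed-forward-loop roles; adj[i][j] == 1 means edge i -> j."""
    n = len(adj)
    a2 = matmul(adj, adj)              # a2[i][k]: number of two-step paths i -> j -> k
    aat = matmul(adj, transpose(adj))  # aat[i][j]: number of common targets of i and j
    source = [sum(a2[i][k] * adj[i][k] for k in range(n)) for i in range(n)]
    target = [sum(a2[i][k] * adj[i][k] for i in range(n)) for k in range(n)]
    middle = [sum(adj[i][j] * aat[i][j] for i in range(n)) for j in range(n)]
    return {"source": source, "middle": middle, "target": target}


def cycle_counts(adj):
    """Number of directed 3-cycles through each node: the diagonal of adj cubed."""
    a3 = matmul(matmul(adj, adj), adj)
    return [a3[i][i] for i in range(len(adj))]


if __name__ == "__main__":
    feed_forward = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]  # 0 -> 1, 1 -> 2, 0 -> 2
    print(ffl_role_counts(feed_forward))
    print(cycle_counts(feed_forward))
```

    Stacking such per-node counts for all motifs and roles gives a vector of the kind the abstract describes.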

  18. Cis-regulatory mechanisms governing stem and progenitor cell transitions

    PubMed Central

    Johnson, Kirby D.; Kong, Guangyao; Gao, Xin; Chang, Yuan-I; Hewitt, Kyle J.; Sanalkumar, Rajendran; Prathibha, Rajalekshmi; Ranheim, Erik A.; Dewey, Colin N.; Zhang, Jing; Bresnick, Emery H.

    2015-01-01

    Cis-element encyclopedias provide information on phenotypic diversity and disease mechanisms. Although cis-element polymorphisms and mutations are instructive, deciphering function remains challenging. Mutation of an intronic GATA motif (+9.5) in GATA2, encoding a master regulator of hematopoiesis, underlies an immunodeficiency associated with myelodysplastic syndrome (MDS) and acute myeloid leukemia (AML). Whereas an inversion relocalizes another GATA2 cis-element (−77) to the proto-oncogene EVI1, inducing EVI1 expression and AML, whether this reflects ectopic or physiological activity is unknown. We describe a mouse strain that decouples −77 function from proto-oncogene deregulation. The −77−/− mice exhibited a novel phenotypic constellation including late embryonic lethality and anemia. The −77 established a vital sector of the myeloid progenitor transcriptome, conferring multipotentiality. Unlike the +9.5−/− embryos, hematopoietic stem cell genesis was unaffected in −77−/− embryos. These results illustrate a paradigm in which cis-elements in a locus differentially control stem and progenitor cell transitions, and therefore the individual cis-element alterations cause unique and overlapping disease phenotypes. PMID:26601269

  19. Characterization of the cis-regulatory region of the Drosophila homeotic gene Sex combs reduced

    SciTech Connect

    Gindhart, J.G. Jr.; King, N.A.; Kaufman, T.C.

    1995-02-01

    The Drosophila homeotic gene Sex combs reduced (Scr) controls the segmental identity of the labial and prothoracic segments in the embryo and adult. It encodes a sequence-specific transcription factor that controls, in concert with other gene products, differentiative pathways of tissues in which Scr is expressed. During embryogenesis, Scr accumulation is observed in a discrete spatiotemporal pattern that includes the labial and prothoracic ectoderm, the subesophageal ganglion of the ventral nerve cord and the visceral mesoderm of the anterior and posterior midgut. Previous analyses have demonstrated that breakpoint mutations located in a 75-kb interval, including the Scr transcription unit and 50 kb of upstream DNA, cause Scr misexpression during development, presumably because these mutations remove Scr cis-regulatory sequences from the proximity of the Scr promoter. To gain a better understanding of the regulatory interactions necessary for the control of Scr transcription during embryogenesis, we have begun a molecular analysis of the Scr regulatory interval. DNA fragments from this 75-kb region were subcloned into P-element vectors containing either an Scr-lacZ or hsp70-lacZ fusion gene, and patterns of reporter gene expression were assayed in transgenic embryos. Several fragments appear to contain Scr regulatory sequences, as they direct reporter gene expression in patterns similar to those normally observed for Scr, whereas other DNA fragments direct Scr reporter gene expression in developmentally interesting but non-Scr-like patterns during embryogenesis. Scr expression in some tissues appears to be controlled by multiple regulatory elements that are separated, in some cases, by more than 20 kb of intervening DNA. This analysis provides an entry point for the study of how Scr transcription is regulated at the molecular level. 60 refs., 7 figs., 1 tab.

  20. Complex patterns of cis-regulatory polymorphisms in ebony underlie standing pigmentation variation in Drosophila melanogaster.

    PubMed

    Miyagi, Ryutaro; Akiyama, Noriyoshi; Osada, Naoki; Takahashi, Aya

    2015-12-01

    Pigmentation traits in adult Drosophila melanogaster were used in this study to investigate how phenotypic variations in continuous ecological traits can be maintained in a natural population. First, pigmentation variation in the adult female was measured at seven different body positions in 20 strains from the Drosophila melanogaster Genetic Reference Panel (DGRP) originating from a natural population in North Carolina. Next, to assess the contributions of cis-regulatory polymorphisms of the genes involved in the melanin biosynthesis pathway, allele-specific expression levels of four genes were quantified by amplicon sequencing using a 454 GS Junior. Among those genes, ebony was significantly associated with pigmentation intensity of the thoracic segment. Detailed sequence analysis of the gene regulatory regions of this gene indicated that many different functional cis-regulatory alleles are segregating in the population and that variations outside the core enhancer element could potentially play important roles in the regulation of gene expression. In addition, a slight enrichment of distantly associated SNP pairs was observed in the ~10 kb cis-regulatory region of ebony, which suggested the presence of interacting elements scattered across the region. In contrast, sequence analysis in the core cis-regulatory region of tan indicated that SNPs within the region are significantly associated with allele-specific expression level of this gene. Collectively, the data suggest that the underlying genetic differences in the cis-regulatory regions that control intraspecific pigmentation variation can be more complex than those of interspecific pigmentation trait differences, where causal genetic changes are typically confined to modular enhancer elements.

  1. No Excess of Cis-Regulatory Variation Associated with Intraspecific Selection in Wild Pearl Millet (Cenchrus americanus)

    PubMed Central

    Rhoné, Bénédicte; Mariac, Cédric; Couderc, Marie; Berthouly-Salazar, Cécile; Ousseini, Issaka Salia

    2017-01-01

    Several studies suggest that cis-regulatory mutations are the favorite target of evolutionary changes, one reason being that cis-regulatory mutations might have fewer deleterious pleiotropic effects than protein-coding mutations. A review of the process also suggests that this bias towards adaptive cis-regulatory variation might be less pronounced at the intraspecific level compared with the interspecific level. In this study, we assessed the contribution of cis-regulatory variation to adaptation at the intraspecific level using populations of wild pearl millet (Cenchrus americanus ssp. monodii) sampled along an environmental gradient in Niger. From RNA sequencing of hybrids to assess allele-specific expression, we identified genes with cis-regulatory divergence between two parental accessions collected in contrasted environmental conditions. This revealed that ∼15% of transcribed genes showed cis-regulatory variation. Intersecting the gene set exhibiting cis-regulatory variation with the gene set identified as targets of selection revealed no excess of cis-acting mutations among the selected genes. We additionally found no excess of cis-regulatory variation among genes associated with adaptive traits. As our approach relied on methods identifying mainly genes submitted to strong selection pressure or with high phenotypic effect, the contribution of cis-regulatory changes to soft selection or polygenic adaptive traits remains to be tested. However, our results favor the hypothesis that enrichment of adaptive cis-regulatory divergence builds up over time. For short evolutionary time-scales, cis-acting mutations are not predominantly involved in adaptive evolution associated with strong selective signal. PMID:28137746

  2. Recurrent Modification of a Conserved Cis-Regulatory Element Underlies Fruit Fly Pigmentation Diversity

    PubMed Central

    Rogers, William A.; Salomone, Joseph R.; Tacy, David J.; Camino, Eric M.; Davis, Kristen A.; Rebeiz, Mark; Williams, Thomas M.

    2013-01-01

    The development of morphological traits occurs through the collective action of networks of genes connected at the level of gene expression. As any node in a network may be a target of evolutionary change, the recurrent targeting of the same node would indicate that the path of evolution is biased for the relevant trait and network. Although examples of parallel evolution have implicated recurrent modification of the same gene and cis-regulatory element (CRE), little is known about the mutational and molecular paths of parallel CRE evolution. In Drosophila melanogaster fruit flies, the Bric-à-brac (Bab) transcription factors control the development of a suite of sexually dimorphic traits on the posterior abdomen. Female-specific Bab expression is regulated by the dimorphic element, a CRE that possesses direct inputs from body plan (ABD-B) and sex-determination (DSX) transcription factors. Here, we find that the recurrent evolutionary modification of this CRE underlies both intraspecific and interspecific variation in female pigmentation in the melanogaster species group. By reconstructing the sequence and regulatory activity of the ancestral Drosophila melanogaster dimorphic element, we demonstrate that a handful of mutations were sufficient to create independent CRE alleles with differing activities. Moreover, intraspecific and interspecific dimorphic element evolution proceeded with little to no alterations to the known body plan and sex-determination regulatory linkages. Collectively, our findings represent an example where the paths of evolution appear biased to a specific CRE, and drastic changes in function were accompanied by deep conservation of key regulatory linkages. PMID:24009528

  3. Computational identification of new structured cis-regulatory elements in the 3'-untranslated region of human protein coding genes.

    PubMed

    Chen, Xiaowei Sylvia; Brown, Chris M

    2012-10-01

    Messenger ribonucleic acids (RNAs) contain a large number of cis-regulatory RNA elements that function in many types of post-transcriptional regulation. These cis-regulatory elements are often characterized by conserved structures and/or sequences. Although some classes are well known, given the wide range of RNA-interacting proteins in eukaryotes, it is likely that many new classes of cis-regulatory elements are yet to be discovered. An approach to this is to use computational methods that have the advantage of analysing genomic data, particularly comparative data on a large scale. In this study, a set of structural discovery algorithms was applied followed by support vector machine (SVM) classification. We trained a new classification model (CisRNA-SVM) on a set of known structured cis-regulatory elements from 3'-untranslated regions (UTRs) and successfully distinguished these, and groups of cis-regulatory elements it had not been trained on, from control genomic and shuffled sequences. The new method outperformed previous methods in classification of cis-regulatory RNA elements. This model was then used to predict new elements from cross-species conserved regions of human 3'-UTRs. Clustering of these elements identified new classes of potential cis-regulatory elements. The model, training and testing sets and novel human predictions are available at: http://mRNA.otago.ac.nz/CisRNA-SVM.

  4. Cis-Regulatory Timers for Developmental Gene Expression

    PubMed Central

    Christiaen, Lionel

    2013-01-01

    How does a fertilized egg decode its own genome to eventually develop into a mature animal? Each developing cell must activate a battery of genes in a timely manner and according to the function it will ultimately perform, but how? During development of the notochord—a structure akin to the vertebrate spine—in a simple marine invertebrate, an essential protein called Brachyury binds to specific sites in its target genes. A study just published in PLOS Biology reports that if the target gene contains multiple Brachyury-binding sites it will be activated early in development but if it contains only one site it will be activated later. Genes that contain no binding site can still be activated by Brachyury, but only indirectly by an earlier Brachyury-dependent gene product, so later than the directly activated genes. Thus, this study shows how several genes can interpret the presence of a single factor differently to become active at distinct times in development. PMID:24204213

  5. MyoD reprogramming requires Six1 and Six4 homeoproteins: genome-wide cis-regulatory module analysis

    PubMed Central

    Santolini, Marc; Sakakibara, Iori; Gauthier, Morgane; Ribas-Aulinas, Francesc; Takahashi, Hirotaka; Sawasaki, Tatsuya; Mouly, Vincent; Concordet, Jean-Paul; Defossez, Pierre-Antoine; Hakim, Vincent; Maire, Pascal

    2016-01-01

    Myogenic regulatory factors of the MyoD family have the ability to reprogram differentiated cells toward a myogenic fate. In this study, we demonstrate that Six1 or Six4 are required for the reprogramming by MyoD of mouse embryonic fibroblasts (MEFs). Using microarray experiments, we found 761 genes under the control of both Six and MyoD. Using MyoD ChIPseq data and a genome-wide search for Six1/4 MEF3 binding sites, we found significant co-localization of binding sites for MyoD and Six proteins on over a thousand mouse genomic DNA regions. The combination of both datasets yielded 82 genes which are synergistically activated by Six and MyoD, with 96 associated MyoD+MEF3 putative cis-regulatory modules (CRMs). Fourteen out of 19 of the CRMs that we tested demonstrated in Luciferase assays a synergistic action also observed for their cognate gene. We searched putative binding sites on these CRMs using available databases and de novo search of conserved motifs and demonstrated that the Six/MyoD synergistic activation takes place in a feedforward way. It involves the recruitment of these two families of transcription factors to their targets, together with partner transcription factors, encoded by genes that are themselves activated by Six and MyoD, including Mef2, Pbx-Meis and EBF. PMID:27302134

  6. The identification of cis-regulatory elements: A review from a machine learning perspective.

    PubMed

    Li, Yifeng; Chen, Chih-Yu; Kaye, Alice M; Wasserman, Wyeth W

    2015-12-01

    The majority of the human genome consists of non-coding regions that have been called junk DNA. However, recent studies have unveiled that these regions contain cis-regulatory elements, such as promoters, enhancers, silencers, insulators, etc. These regulatory elements can play crucial roles in controlling gene expressions in specific cell types, conditions, and developmental stages. Disruption to these regions could contribute to phenotype changes. Precisely identifying regulatory elements is key to deciphering the mechanisms underlying transcriptional regulation. Cis-regulatory events are complex processes that involve chromatin accessibility, transcription factor binding, DNA methylation, histone modifications, and the interactions between them. The development of next-generation sequencing techniques has allowed us to capture these genomic features in depth. Applied analysis of genome sequences for clinical genetics has increased the urgency for detecting these regions. However, the complexity of cis-regulatory events and the deluge of sequencing data require accurate and efficient computational approaches, in particular, machine learning techniques. In this review, we describe machine learning approaches for predicting transcription factor binding sites, enhancers, and promoters, primarily driven by next-generation sequencing data. Data sources are provided in order to facilitate testing of novel methods. The purpose of this review is to attract computational experts and data scientists to advance this field.

  7. Predominant contribution of cis-regulatory divergence in the evolution of mouse alternative splicing

    PubMed Central

    Gao, Qingsong; Sun, Wei; Ballegeer, Marlies; Libert, Claude; Chen, Wei

    2015-01-01

    Divergence of alternative splicing represents one of the major driving forces to shape phenotypic diversity during evolution. However, the extent to which these divergences could be explained by the evolving cis-regulatory versus trans-acting factors remains unresolved. To globally investigate the relative contributions of the two factors for the first time in mammals, we measured splicing difference between C57BL/6J and SPRET/EiJ mouse strains and allele-specific splicing pattern in their F1 hybrid. Out of 11,818 alternative splicing events expressed in the cultured fibroblast cells, we identified 796 with significant difference between the parental strains. After integrating allele-specific data from F1 hybrid, we demonstrated that these events could be predominately attributed to cis-regulatory variants, including those residing at and beyond canonical splicing sites. Contrary to previous observations in Drosophila, such predominant contribution was consistently observed across different types of alternative splicing. Further analysis of liver tissues from the same mouse strains and reanalysis of published datasets on other strains showed similar trends, implying in general the predominant contribution of cis-regulatory changes in the evolution of mouse alternative splicing. PMID:26134616
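
    The abstract above does not spell out how the parental and F1 hybrid measurements are combined. The following is a minimal sketch of one common way to do it, assuming splicing is summarized as a percent-spliced-in (PSI) value between 0 and 1. The function names, the PSI inputs and the 0.1 threshold are illustrative assumptions, not the authors' pipeline, which would rely on statistical testing rather than a fixed cutoff.

```python
def decompose_cis_trans(psi_parent_a, psi_parent_b, psi_allele_a, psi_allele_b):
    """Split a parental splicing difference into cis and trans components.

    psi_parent_*: PSI measured in each parental strain.
    psi_allele_*: PSI measured for each allele inside the F1 hybrid.
    """
    parental_diff = psi_parent_a - psi_parent_b
    # Both alleles share one trans environment in the hybrid, so a difference
    # that persists between them is attributed to cis-regulatory variants.
    cis = psi_allele_a - psi_allele_b
    # Whatever part of the parental difference is not explained by cis
    # is attributed to trans-acting factors.
    trans = parental_diff - cis
    return cis, trans


def classify(cis, trans, threshold=0.1):
    """Label an event using an arbitrary effect-size cutoff (hypothetical)."""
    has_cis = abs(cis) >= threshold
    has_trans = abs(trans) >= threshold
    if has_cis and has_trans:
        return "cis+trans"
    if has_cis:
        return "cis"
    if has_trans:
        return "trans"
    return "conserved"


if __name__ == "__main__":
    cis, trans = decompose_cis_trans(0.80, 0.40, 0.75, 0.40)
    print(classify(cis, trans))
```

    In this made-up example most of the parental difference persists between the two alleles in the hybrid, so the event is labelled as cis.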

  8. Creating and validating cis-regulatory maps of tissue-specific gene expression regulation.

    PubMed

    O'Connor, Timothy R; Bailey, Timothy L

    2014-01-01

    Predicting which genomic regions control the transcription of a given gene is a challenge. We present a novel computational approach for creating and validating maps that associate genomic regions (cis-regulatory modules-CRMs) with genes. The method infers regulatory relationships that explain gene expression observed in a test tissue using widely available genomic data for 'other' tissues. To predict the regulatory targets of a CRM, we use cross-tissue correlation between histone modifications present at the CRM and expression at genes within 1 Mbp of it. To validate cis-regulatory maps, we show that they yield more accurate models of gene expression than carefully constructed control maps. These gene expression models predict observed gene expression from transcription factor binding in the CRMs linked to that gene. We show that our maps are able to identify long-range regulatory interactions and improve substantially over maps linking genes and CRMs based on either the control maps or a 'nearest neighbor' heuristic. Our results also show that it is essential to include CRMs predicted in multiple tissues during map-building, that H3K27ac is the most informative histone modification, and that CAGE is the most informative measure of gene expression for creating cis-regulatory maps.
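
    The cross-tissue correlation step described above can be sketched roughly as follows. This is a simplified illustration, assuming one histone-modification signal per tissue at the CRM, one expression value per tissue for each gene, and a single chromosome. The data layout and the function names `pearson` and `link_crm_to_genes` are hypothetical and are not taken from the authors' software.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def link_crm_to_genes(crm_pos, crm_signal, genes, max_dist=1_000_000):
    """Rank candidate target genes for one CRM.

    crm_signal: histone-modification signal at the CRM, one value per tissue.
    genes: dict mapping gene name -> (TSS position, expression per tissue),
           with tissues in the same order as crm_signal.
    Only genes within max_dist bp (1 Mbp, as in the abstract) are scored.
    """
    scores = {}
    for name, (tss, expression) in genes.items():
        if abs(tss - crm_pos) <= max_dist:
            scores[name] = pearson(crm_signal, expression)
    # Highest correlation first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    genes = {
        "geneA": (500_000, [2.0, 4.0, 6.0, 8.0]),
        "geneB": (900_000, [8.0, 6.0, 4.0, 2.0]),
        "geneC": (5_000_000, [1.0, 2.0, 3.0, 4.0]),  # beyond 1 Mbp, ignored
    }
    for name, r in link_crm_to_genes(0, [1.0, 2.0, 3.0, 4.0], genes):
        print(name, round(r, 3))
```

    A 'nearest neighbor' map, by contrast, would link the CRM to the closest gene regardless of how its expression varies across tissues.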

  9. Dynamic SPR monitoring of yeast nuclear protein binding to a cis-regulatory element

    SciTech Connect

    Mao, Grace; Brody, James P.

    2007-11-09

    Gene expression is controlled by protein complexes binding to short specific sequences of DNA, called cis-regulatory elements. Expression of most eukaryotic genes is controlled by dozens of these elements. Comprehensive identification and monitoring of these elements is a major goal of genomics. In pursuit of this goal, we are developing a surface plasmon resonance (SPR) based assay to identify and monitor cis-regulatory elements. To test whether we could reliably monitor protein binding to a regulatory element, we immobilized a 16 bp region of Saccharomyces cerevisiae chromosome 5 onto a gold surface. This 16 bp region of DNA is known to bind several proteins and thought to control expression of the gene RNR1, which varies through the cell cycle. We synchronized yeast cell cultures, and then sampled these cultures at a regular interval. These samples were processed to purify nuclear lysate, which was then exposed to the sensor. We found that nuclear protein binds this particular element of DNA at a significantly higher rate (as compared to unsynchronized cells) during G1 phase. Other time points show levels of DNA-nuclear protein binding similar to the unsynchronized control. We also measured the apparent association complex of the binding to be 0.014 s⁻¹. We conclude that (1) SPR-based assays can monitor DNA-nuclear protein binding and that (2) for this particular cis-regulatory element, maximum DNA-nuclear protein binding occurs during G1 phase.

  10. Allelic imbalance identifies novel tissue specific cis-regulatory variation for human UGT2B15

    PubMed Central

    Sun, Chang; Southard, Catherine; Witonsky, David B.; Olopade, Olufunmilayo I.; Di Rienzo, Anna

    2010-01-01

    Allelic imbalance (AI) is a powerful tool to identify cis-regulatory variation for gene expression. UGT2B15 is an important enzyme involved in the metabolism of multiple endobiotics and xenobiotics. In this study, we measured the relative expression of two alleles at this gene by using SNP rs1902023:G>T. An excess of the G over the T allele was consistently observed in liver (P<0.001), but not in breast (P=0.06) samples, suggesting that SNPs in strong linkage disequilibrium with G253T regulate UGT2B15 expression in liver. Seven such SNPs were identified by resequencing the promoter and exon 1, which define two distinct haplotypes. Reporter gene assays confirmed that one haplotype displayed ~20% higher promoter activity compared to the other major haplotype in liver HepG2 (P<0.001), but not in breast MCF-7 (P=0.540) cells. Reporter gene assays with additional constructs pointed to rs34010522:G>T and rs35513228:C>T as the cis-regulatory variants; both SNPs were also evaluated in LNCaP and Caco-2 cells. By ChIP, we showed that the transcription factor Nrf2 binds to the region spanning rs34010522:G>T in all four cell lines. Our results provide a good example for how AI can be used to identify cis-regulatory variation and gain insights into the tissue specific regulation of gene expression. PMID:19847790

  11. Characterization of the Cis-Regulatory Region of the Drosophila Homeotic Gene Sex Combs Reduced

    PubMed Central

    Gindhart-Jr., J. G.; King, A. N.; Kaufman, T. C.

    1995-01-01

    The Drosophila homeotic gene Sex combs reduced (Scr) controls the segmental identity of the labial and prothoracic segments in the embryo and adult. It encodes a sequence-specific transcription factor that controls, in concert with other gene products, differentiative pathways of tissues in which Scr is expressed. During embryogenesis, Scr accumulation is observed in a discrete spatiotemporal pattern that includes the labial and prothoracic ectoderm, the subesophageal ganglion of the ventral nerve cord and the visceral mesoderm of the anterior and posterior midgut. Previous analyses have demonstrated that breakpoint mutations located in a 75-kb interval, including the Scr transcription unit and 50 kb of upstream DNA, cause Scr misexpression during development, presumably because these mutations remove Scr cis-regulatory sequences from the proximity of the Scr promoter. To gain a better understanding of the regulatory interactions necessary for the control of Scr transcription during embryogenesis, we have begun a molecular analysis of the Scr regulatory interval. DNA fragments from this 75-kb region were subcloned into P-element vectors containing either an Scr-lacZ or hsp70-lacZ fusion gene, and patterns of reporter gene expression were assayed in transgenic embryos. Several fragments appear to contain Scr regulatory sequences, as they direct reporter gene expression in patterns similar to those normally observed for Scr, whereas other DNA fragments direct Scr reporter gene expression in developmentally interesting but non-Scr-like patterns during embryogenesis. Scr expression in some tissues appears to be controlled by multiple regulatory elements that are separated, in some cases, by more than 20 kb of intervening DNA. Interestingly, regulatory sequences that direct reporter gene expression in an Scr-like pattern in the anterior and posterior midgut are imbedded in the regulatory region of the segmentation gene fushi tarazu (ftz), which is normally located

  12. Expression, subcellular localization, and cis-regulatory structure of duplicated phytoene synthase genes in melon (Cucumis melo L.).

    PubMed

    Qin, Xiaoqiong; Coku, Ardian; Inoue, Kentaro; Tian, Li

    2011-10-01

    Carotenoids perform many critical functions in plants, animals, and humans. It is therefore important to understand carotenoid biosynthesis and its regulation in plants. Phytoene synthase (PSY) catalyzes the first committed and rate-limiting step in carotenoid biosynthesis. While PSY is present as a single copy gene in Arabidopsis, duplicated PSY genes have been identified in many economically important monocot and dicot crops. CmPSY1 was previously identified from melon (Cucumis melo L.), but was not functionally characterized. We isolated a second PSY gene, CmPSY2, from melon in this work. CmPSY2 possesses a unique intron/exon structure that has not been observed in other plant PSYs. Both CmPSY1 and CmPSY2 are functional in vitro, but exhibit distinct expression patterns in different melon tissues and during fruit development, suggesting differential regulation of the duplicated melon PSY genes. In vitro chloroplast import assays verified the plastidic localization of CmPSY1 and CmPSY2 despite the lack of an obvious plastid target peptide in CmPSY2. Promoter motif analysis of the duplicated melon and tomato PSY genes and the Arabidopsis PSY revealed distinctive cis-regulatory structures of melon PSYs and identified gibberellin-responsive motifs in all PSYs except for SlPSY1, which has not been reported previously. Overall, these data provide new insights into the evolutionary history of plant PSY genes and the regulation of PSY expression by developmental and environmental signals that may involve different regulatory networks.

  13. Rapid evolution of cis-regulatory sequences via local point mutations

    NASA Technical Reports Server (NTRS)

    Stone, J. R.; Wray, G. A.

    2001-01-01

    Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
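
    The kind of simulation described above can be illustrated with a much reduced toy model. The sketch below follows a single sequence lineage and counts the generations until a given motif first appears through point mutations. It does not model populations or fixation, which the abstract's results depend on, and the mutation rate, sequence length and motif are arbitrary illustrative values rather than the authors' parameters.

```python
import random

BASES = "ACGT"


def mutate(seq, mu, rng):
    """Apply independent point mutations: each base changes with probability mu."""
    out = []
    for base in seq:
        if rng.random() < mu:
            # Replace with one of the three other bases, chosen uniformly.
            base = rng.choice([b for b in BASES if b != base])
        out.append(base)
    return "".join(out)


def evolve_until_site(seq, motif, mu=1e-3, max_gen=100_000, rng=None):
    """Return the first generation at which motif occurs in the lineage.

    Returns 0 if the starting sequence already contains the motif, and
    None if it has not appeared after max_gen generations.
    """
    rng = rng or random.Random()
    for generation in range(max_gen + 1):
        if motif in seq:
            return generation
        seq = mutate(seq, mu, rng)
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    start = "".join(rng.choice(BASES) for _ in range(200))
    print(evolve_until_site(start, "TGACTC", mu=1e-3, rng=rng))
```

    Repeating the run over many random starting sequences would give a distribution of waiting times for the toy model. Turning that into a statement about fixation in populations would need an explicit population model.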

  14. Quantitative functional interrelations within the cis-regulatory system of the S. purpuratus Endo16 gene.

    PubMed

    Yuh, C H; Moore, J G; Davidson, E H

    1996-12-01

    Embryonic expression of the Endo16 gene of Strongylocentrotus purpuratus is controlled by interactions with at least 13 different DNA-binding factors. These interactions occur within a cis-regulatory domain that extends about 2300 bp upstream from the transcription start site. A recent functional characterization of this domain reveals six different subregions, or cis-regulatory modules, each of which displays a specific regulatory subfunction when linked with the basal promoter and in some cases various other modules (C.-H. Yuh and E. Davidson (1996) Development 122, 1069-1082). In the present work, we analyzed quantitative time-course measurements of the CAT enzyme output of embryos bearing expression constructs controlled by various Endo16 regulatory modules, either singly or in combination. Three of these modules function positively in that, in isolation, each is capable of promoting expression in vegetal plate and adjacent cell lineages, though with different temporal profiles of activity. Models for the mode of interaction of the three positive modules with one another were tested by assuming mathematical relations that would generate, from the measured single module time courses, the experimentally observed profiles of activity obtained when the relevant modules are physically linked in the same construct. The generated and observed time functions were compared, and the differences were minimized by least squares adjustment of a scale parameter. When the modules were tested in context of the endogenous promoter region, one of the positive modules (A) was found to increase the output of the others (B and G), by a constant factor. In contrast, a solution in which the time-course data of modules A and B are multiplied by one another was required for the interrelations of the positive modules when a minimal SV40 promoter was used. One interpretation is that, in this construct, each module independently stimulates the basal transcription complex. We used a

  15. Relocation Facilitates the Acquisition of Short Cis-Regulatory Regions that Drive the Expression of Retrogenes during Spermatogenesis in Drosophila

    PubMed Central

    Sorourian, Mehran; Kunte, Mansi M.; Domingues, Susana; Gallach, Miguel; Özdil, Fulya; Río, Javier; Betrán, Esther

    2014-01-01

    Retrogenes are functional processed copies of genes that originate via the retrotranscription of an mRNA intermediate and often exhibit testis-specific expression. Although this expression pattern appears to be favored by selection, the origin of such expression bias remains unexplained. Here, we study the regulation of two young testis-specific Drosophila retrogenes, Dntf-2r and Pros28.1A, using genetic transformation and the enhanced green fluorescent protein reporter gene in Drosophila melanogaster. We show that two different short (<24 bp) regions upstream of the transcription start sites (TSSs) act as testis-specific regulatory motifs in these genes. The Dntf-2r regulatory region is similar to the known β2 tubulin 14-bp testis motif (β2-tubulin gene upstream element 1 [β2-UE1]). Comparative sequence analyses reveal that this motif was already present before the Dntf-2r insertion and was likely driving the transcription of a noncoding RNA. We also show that the β2-UE1 occurs in the regulatory regions of other testis-specific retrogenes, and is functional in either orientation. In contrast, the Pros28.1A testes regulatory region in D. melanogaster appears to be novel. Only Pros28.1B, an older paralog of the Pros28.1 gene family, seems to carry a similar regulatory sequence. It is unclear how the Pros28.1A regulatory region was acquired in D. melanogaster, but it might have evolved de novo from within a region that may have been preprimed for testes expression. We conclude that relocation is critical for the evolutionary origin of male germline-specific cis-regulatory regions of retrogenes because expression depends on either the site of the retrogene insertion or the sequence changes close to the TSS thereafter. As a consequence we infer that positive selection will play a role in the evolution of these regulatory regions and can often act from the moment of the retrocopy insertion. PMID:24855141

  16. How petals change their spots: cis-regulatory re-wiring in Clarkia (Onagraceae).

    PubMed

    Martins, Talline R; Jiang, Peng; Rausher, Mark D

    2016-09-06

    A long-standing question in evolutionary developmental biology is how new traits evolve. Although most floral pigmentation studies have focused on how pigment intensity and composition diversify, few, if any, have explored how a pattern element can shift position. In the present study, we examine the genetic changes underlying shifts in the position of petal spots in Clarkia. Comparative transcriptome analyses were used to identify potential candidate genes responsible for spot formation. Co-segregation analyses in F2 individuals segregating for different spot positions, quantitative PCR, and pyrosequencing, were used to confirm the role of the candidate gene in determining spot position. Transient expression assays were used to identify the expression domain of different alleles. An R2R3Myb transcription factor (CgMyb1) activated spot formation, and different alleles of CgMyb1 were expressed in different domains, leading to spot formation in different petal locations. Reporter assays revealed that promoters from different alleles determine different locations of expression. The evolutionary shift in spot position is due to one or more cis-regulatory changes in the promoter of CgMyb1, indicating that shifts in pattern element position can be caused by changes in a single gene, and that cis-regulatory rewiring can be used to alter the relative position of an existing character.

  17. Conservation and evolution of cis-regulatory systems in ascomycete fungi

    SciTech Connect

    Gasch, Audrey P.; Moses, Alan M.; Chiang, Derek Y.; Fraser, Hunter B.; Berardini, Mark; Eisen, Michael B.

    2004-03-15

    Relatively little is known about the mechanisms through which gene expression regulation evolves. To investigate this, we systematically explored the conservation of regulatory networks in fungi by examining the cis-regulatory elements that govern the expression of coregulated genes. We first identified groups of coregulated Saccharomyces cerevisiae genes enriched for genes with known upstream or downstream cis-regulatory sequences. Reasoning that many of these gene groups are coregulated in related species as well, we performed similar analyses on orthologs of coregulated S. cerevisiae genes in 13 other ascomycete species. We find that many species-specific gene groups are enriched for the same flanking regulatory sequences as those found in the orthologous gene groups from S. cerevisiae, indicating that those regulatory systems have been conserved in multiple ascomycete species. In addition to these clear cases of regulatory conservation, we find examples of cis-element evolution that suggest multiple modes of regulatory diversification, including alterations in transcription factor-binding specificity, incorporation of new gene targets into an existing regulatory system, and cooption of regulatory systems to control a different set of genes. We investigated one example in greater detail by measuring the in vitro activity of the S. cerevisiae transcription factor Rpn4p and its orthologs from Candida albicans and Neurospora crassa. Our results suggest that the DNA binding specificity of these proteins has coevolved with the sequences found upstream of the Rpn4p target genes and suggest that Rpn4p has a different function in N. crassa.

  18. Functionally conserved cis-regulatory elements of COL18A1 identified through zebrafish transgenesis.

    PubMed

    Kague, Erika; Bessling, Seneca L; Lee, Josephine; Hu, Gui; Passos-Bueno, Maria Rita; Fisher, Shannon

    2010-01-15

    Type XVIII collagen is a component of basement membranes, and expressed prominently in the eye, blood vessels, liver, and the central nervous system. Homozygous mutations in COL18A1 lead to Knobloch Syndrome, characterized by ocular defects and occipital encephalocele. However, relatively little has been described on the role of type XVIII collagen in development, and nothing is known about the regulation of its tissue-specific expression pattern. We have used zebrafish transgenesis to identify and characterize cis-regulatory sequences controlling expression of the human gene. Candidate enhancers were selected from non-coding sequence associated with COL18A1 based on sequence conservation among mammals. Although these displayed no overt conservation with orthologous zebrafish sequences, four regions nonetheless acted as tissue-specific transcriptional enhancers in the zebrafish embryo, and together recapitulated the major aspects of col18a1 expression. Additional post-hoc computational analysis on positive enhancer sequences revealed alignments between mammalian and teleost sequences, which we hypothesize predict the corresponding zebrafish enhancers; for one of these, we demonstrate functional overlap with the orthologous human enhancer sequence. Our results provide important insight into the biological function and regulation of COL18A1, and point to additional sequences that may contribute to complex diseases involving COL18A1. More generally, we show that combining functional data with targeted analyses for phylogenetic conservation can reveal conserved cis-regulatory elements in the large number of cases where computational alignment alone falls short.

  19. The evolution of cichlid fish egg-spots is linked with a cis-regulatory change.

    PubMed

    Santos, M Emília; Braasch, Ingo; Boileau, Nicolas; Meyer, Britta S; Sauteur, Loïc; Böhne, Astrid; Belting, Heinz-Georg; Affolter, Markus; Salzburger, Walter

    2014-10-09

    The origin of novel phenotypic characters is a key component in organismal diversification; yet, the mechanisms underlying the emergence of such evolutionary novelties are largely unknown. Here we examine the origin of egg-spots, an evolutionary innovation of the most species-rich group of cichlids, the haplochromines, where these conspicuous male fin colour markings are involved in mating. Applying a combination of RNAseq, comparative genomics and functional experiments, we identify two novel pigmentation genes, fhl2a and fhl2b, and show that especially the more rapidly evolving b-paralog is associated with egg-spot formation. We further find that egg-spot bearing haplochromines, but not other cichlids, feature a transposable element in the cis-regulatory region of fhl2b. Using transgenic zebrafish, we finally demonstrate that this region shows specific enhancer activities in iridophores, a type of pigment cells found in egg-spots, suggesting that a cis-regulatory change is causally linked to the gain of expression in egg-spot bearing haplochromines.

  20. Cell-type specific cis-regulatory networks: insights from Hox transcription factors.

    PubMed

    Polychronidou, Maria; Lohmann, Ingrid

    2013-01-01

    Hox proteins are a prominent class of transcription factors that specify cell and tissue identities in animal embryos. In sharp contrast to tissue-specifically expressed transcription factors, which coordinate regulatory pathways leading to the differentiation of a selected tissue, Hox proteins are active in many different cell types but are nonetheless able to differentially regulate gene expression in a context-dependent manner. This particular feature makes Hox proteins ideal candidates for elucidating the mechanisms employed by transcription factors to achieve tissue-specific functions in multi-cellular organisms. Here we discuss how the recent genome-wide identification and characterization of Hox cis-regulatory elements has provided insight concerning the molecular mechanisms underlying the high spatiotemporal specificity of Hox proteins. In particular, it was shown that Hox transcriptional outputs depend on the cell-type specific interplay of the different Hox proteins with co-regulatory factors as well as with epigenetic modifiers. Based on these observations it becomes clear that cell-type specific approaches are required for dissecting the tissue-specific Hox regulatory code. Identification and comparative analysis of Hox cis-regulatory elements driving target gene expression in different cell types in combination with analyses on how cofactors, epigenetic modifiers and protein-protein interactions mediate context-dependent Hox function will elucidate the mechanistic basis of tissue-specific gene regulation.

  1. Profiling of conserved non-coding elements upstream of SHOX and functional characterisation of the SHOX cis-regulatory landscape

    PubMed Central

    Verdin, Hannah; Fernández-Miñán, Ana; Benito-Sanz, Sara; Janssens, Sandra; Callewaert, Bert; Waele, Kathleen De; Schepper, Jean De; François, Inge; Menten, Björn; Heath, Karen E.; Gómez-Skarmeta, José Luis; Baere, Elfride De

    2015-01-01

    Genetic defects such as copy number variations (CNVs) in non-coding regions containing conserved non-coding elements (CNEs) outside the transcription unit of their target gene can underlie genetic disease. An example of this is the short stature homeobox (SHOX) gene, regulated by seven CNEs located downstream and upstream of SHOX, with proven enhancer capacity in chicken limbs. CNVs of the downstream CNEs have been reported in many idiopathic short stature (ISS) cases; however, only recently have a few CNVs of the upstream enhancers been identified. Here, we set out to provide insight into: (i) the cis-regulatory role of these upstream CNEs in human cells, (ii) the prevalence of upstream CNVs in ISS, and (iii) the chromatin architecture of the SHOX cis-regulatory landscape in chicken and human cells. Firstly, luciferase assays in human U2OS cells, and 4C-seq both in chicken limb buds and human U2OS cells, demonstrated cis-regulatory enhancer capacities of the upstream CNEs. Secondly, CNVs of these upstream CNEs were found in three of 501 ISS patients. Finally, our 4C-seq interaction map of the SHOX region reveals a cis-regulatory domain spanning more than 1 Mb and harbouring putative new cis-regulatory elements. PMID:26631348

  2. Evolutionary analysis of the cis-regulatory region of the spicule matrix gene SM50 in strongylocentrotid sea urchins.

    PubMed

    Walters, Jenna; Binkley, Elaine; Haygood, Ralph; Romano, Laura A

    2008-03-15

    An evolutionary analysis of transcriptional regulation is essential to understanding the molecular basis of phenotypic diversity. The sea urchin is an ideal system in which to explore the functional consequence of variation in cis-regulatory sequences. We are particularly interested in the evolution of genes involved in the patterning and synthesis of its larval skeleton. This study focuses on the cis-regulatory region of SM50, which has already been characterized to a considerable extent in the purple sea urchin, Strongylocentrotus purpuratus. We have isolated the cis-regulatory region from 15 individuals of S. purpuratus as well as seven closely related species in the family Strongylocentrotidae. We have performed a variety of statistical tests and present evidence that the cis-regulatory elements upstream of the SM50 gene have been subject to positive selection along the lineage leading to S. purpuratus. In addition, we have performed electrophoretic mobility shift assays (EMSAs) and demonstrate that nucleotide substitutions within Element C affect the ability of nuclear proteins to bind to this cis-regulatory element among members of the family Strongylocentrotidae. We speculate that such changes in SM50 and other genes could accumulate to produce altered patterns of gene expression with functional consequences during skeleton formation.

  3. Does Positive Selection Drive Transcription Factor Binding Site Turnover? A Test with Drosophila Cis-Regulatory Modules

    PubMed Central

    He, Bin Z.; Holloway, Alisha K.; Maerkl, Sebastian J.; Kreitman, Martin

    2011-01-01

    Transcription factor binding site(s) (TFBS) gain and loss (i.e., turnover) is a well-documented feature of cis-regulatory module (CRM) evolution, yet little attention has been paid to the evolutionary force(s) driving this turnover process. The predominant view, motivated by its widespread occurrence, emphasizes the importance of compensatory mutation and genetic drift. Positive selection, in contrast, although it has been invoked in specific instances of adaptive gene expression evolution, has not been considered as a general alternative to neutral compensatory evolution. In this study we evaluate the two hypotheses by analyzing patterns of single nucleotide polymorphism in the TFBS of well-characterized CRM in two closely related Drosophila species, Drosophila melanogaster and Drosophila simulans. An important feature of the analysis is classification of TFBS mutations according to the direction of their predicted effect on binding affinity, which allows gains and losses to be evaluated independently along the two phylogenetic lineages. The observed patterns of polymorphism and divergence are not compatible with neutral evolution for either class of mutations. Instead, multiple lines of evidence are consistent with contributions of positive selection to TFBS gain and loss as well as purifying selection in its maintenance. In discussion, we propose a model to reconcile the finding of selection driving TFBS turnover with constrained CRM function over long evolutionary time. PMID:21572512

  4. Establishment of a Developmental Compartment Requires Interactions between Three Synergistic Cis-regulatory Modules

    PubMed Central

    Bieli, Dimitri; Kanca, Oguz; Requena, David; Hamaratoglu, Fisun; Gohl, Daryl; Schedl, Paul; Affolter, Markus; Slattery, Matthew; Müller, Martin; Estella, Carlos

    2015-01-01

    The subdivision of cell populations into compartments is a key event during animal development. In Drosophila, the gene apterous (ap) divides the wing imaginal disc into dorsal and ventral cell lineages and is required for wing formation. ap function as a dorsal selector gene has been extensively studied. However, the regulation of its expression during wing development is poorly understood. In this study, we analyzed ap transcriptional regulation at the endogenous locus and identified three cis-regulatory modules (CRMs) essential for wing development. Robust ap expression is obtained only when the three CRMs are combined. In addition, we genetically and molecularly analyzed the trans-factors that regulate these CRMs. Our results suggest a three-step mechanism for the cell lineage compartment expression of ap that includes initial activation, positive autoregulation and Trithorax-mediated maintenance through separable CRMs. PMID:26468882

  5. Massively parallel cis-regulatory analysis in the mammalian central nervous system.

    PubMed

    Shen, Susan Q; Myers, Connie A; Hughes, Andrew E O; Byrne, Leah C; Flannery, John G; Corbo, Joseph C

    2016-02-01

    Cis-regulatory elements (CREs, e.g., promoters and enhancers) regulate gene expression, and variants within CREs can modulate disease risk. Next-generation sequencing has enabled the rapid generation of genomic data that predict the locations of CREs, but a bottleneck lies in functionally interpreting these data. To address this issue, massively parallel reporter assays (MPRAs) have emerged, in which barcoded reporter libraries are introduced into cells, and the resulting barcoded transcripts are quantified by next-generation sequencing. Thus far, MPRAs have been largely restricted to assaying short CREs in a limited repertoire of cultured cell types. Here, we present two advances that extend the biological relevance and applicability of MPRAs. First, we adapt exome capture technology to instead capture candidate CREs, thereby tiling across the targeted regions and markedly increasing the length of CREs that can be readily assayed. Second, we package the library into adeno-associated virus (AAV), thereby allowing delivery to target organs in vivo. As a proof of concept, we introduce a capture library of about 46,000 constructs, corresponding to roughly 3500 DNase I hypersensitive (DHS) sites, into the mouse retina by ex vivo plasmid electroporation and into the mouse cerebral cortex by in vivo AAV injection. We demonstrate tissue-specific cis-regulatory activity of DHSs and provide examples of high-resolution truncation mutation analysis for multiplex parsing of CREs. Our approach should enable massively parallel functional analysis of a wide range of CREs in any organ or species that can be infected by AAV, such as nonhuman primates and human stem cell-derived organoids.

  6. The Cis-regulatory Logic of the Mammalian Photoreceptor Transcriptional Network

    PubMed Central

    Hsiau, Timothy H.-C.; Diaconu, Claudiu; Myers, Connie A.; Lee, Jongwoo; Cepko, Constance L.; Corbo, Joseph C.

    2007-01-01

    The photoreceptor cells of the retina are subject to a greater number of genetic diseases than any other cell type in the human body. The majority of more than 120 cloned human blindness genes are highly expressed in photoreceptors. In order to establish an integrative framework in which to understand these diseases, we have undertaken an experimental and computational analysis of the network controlled by the mammalian photoreceptor transcription factors, Crx, Nrl, and Nr2e3. Using microarray and in situ hybridization datasets we have produced a model of this network which contains over 600 genes, including numerous retinal disease loci as well as previously uncharacterized photoreceptor transcription factors. To elucidate the connectivity of this network, we devised a computational algorithm to identify the photoreceptor-specific cis-regulatory elements (CREs) mediating the interactions between these transcription factors and their target genes. In vivo validation of our computational predictions resulted in the discovery of 19 novel photoreceptor-specific CREs near retinal disease genes. Examination of these CREs permitted the definition of a simple cis-regulatory grammar rule associated with high-level expression. To test the generality of this rule, we used an expanded form of it as a selection filter to evolve photoreceptor CREs from random DNA sequences in silico. When fused to fluorescent reporters, these evolved CREs drove strong, photoreceptor-specific expression in vivo. This study represents the first systematic identification and in vivo validation of CREs in a mammalian neuronal cell type and lays the groundwork for a systems biology of photoreceptor transcriptional regulation. PMID:17653270

  7. Intronic Cis-Regulatory Modules Mediate Tissue-Specific and Microbial Control of angptl4/fiaf Transcription

    PubMed Central

    Camp, J. Gray; Jazwa, Amelia L.; Trent, Chad M.; Rawls, John F.

    2012-01-01

    The intestinal microbiota enhances dietary energy harvest leading to increased fat storage in adipose tissues. This effect is caused in part by the microbial suppression of intestinal epithelial expression of a circulating inhibitor of lipoprotein lipase called Angiopoietin-like 4 (Angptl4/Fiaf). To define the cis-regulatory mechanisms underlying intestine-specific and microbial control of Angptl4 transcription, we utilized the zebrafish system in which host regulatory DNA can be rapidly analyzed in a live, transparent, and gnotobiotic vertebrate. We found that zebrafish angptl4 is transcribed in multiple tissues including the liver, pancreatic islet, and intestinal epithelium, which is similar to its mammalian homologs. Zebrafish angptl4 is also specifically suppressed in the intestinal epithelium upon colonization with a microbiota. In vivo transgenic reporter assays identified discrete tissue-specific regulatory modules within angptl4 intron 3 sufficient to drive expression in the liver, pancreatic islet β-cells, or intestinal enterocytes. Comparative sequence analyses and heterologous functional assays of angptl4 intron 3 sequences from 12 teleost fish species revealed differential evolution of the islet and intestinal regulatory modules. High-resolution functional mapping and site-directed mutagenesis defined the minimal set of regulatory sequences required for intestinal activity. Strikingly, the microbiota suppressed the transcriptional activity of the intestine-specific regulatory module similar to the endogenous angptl4 gene. These results suggest that the microbiota might regulate host intestinal Angptl4 protein expression and peripheral fat storage by suppressing the activity of an intestine-specific transcriptional enhancer. This study provides a useful paradigm for understanding how microbial signals interact with tissue-specific regulatory networks to control the activity and evolution of host gene transcription. PMID:22479192

  8. Directed Network Motifs in Alzheimer’s Disease and Mild Cognitive Impairment

    PubMed Central

    Friedman, Eric J.; Young, Karl; Tremper, Graham; Liang, Jason; Landsberg, Adam S.; Schuff, Norbert

    2015-01-01

    Directed network motifs are the building blocks of complex networks, such as human brain networks, and capture deep connectivity information that is not contained in standard network measures. In this paper we present the first application of directed network motifs in vivo to human brain networks, utilizing recently developed directed progression networks which are built upon rates of cortical thickness changes between brain regions. This is in contrast to previous studies which have relied on simulations and in vitro analysis of non-human brains. We show that frequencies of specific directed network motifs can be used to distinguish between patients with Alzheimer’s disease (AD) and normal control (NC) subjects. Especially interesting from a clinical standpoint, these motif frequencies can also distinguish between subjects with mild cognitive impairment who remained stable over three years (MCI) and those who converted to AD (CONV). Furthermore, we find that the entropy of the distribution of directed network motifs increased from MCI to CONV to AD, implying that the distribution of pathology is more structured in MCI but becomes less so as it progresses to CONV and further to AD. Thus, directed network motif frequencies and distributional properties provide new insights into the progression of Alzheimer’s disease as well as new imaging markers for distinguishing between normal controls, stable mild cognitive impairment, MCI converters and Alzheimer’s disease. PMID:25879535

  9. Distal cis-regulatory elements are required for tissue-specific expression of enamelin (Enam)

    PubMed Central

    Hu, Yuanyuan; Papagerakis, Petros; Ye, Ling; Feng, Jerry Q.; Simmer, James P.; Hu, Jan C-C.

    2009-01-01

    Enamel formation is orchestrated by the sequential expression of genes encoding enamel matrix proteins; however, the mechanisms sustaining the spatio–temporal order of gene transcription during amelogenesis are poorly understood. The aim of this study was to characterize the cis-regulatory sequences necessary for normal expression of enamelin (Enam). Several enamelin transcription regulatory regions, showing high sequence homology among species, were identified. DNA constructs containing 5.2 or 3.9 kb regions upstream of the enamelin translation initiation site were linked to a LacZ reporter and used to generate transgenic mice. Only the 5.2-Enam–LacZ construct was sufficient to recapitulate the endogenous pattern of enamelin tooth-specific expression. The 3.9-Enam–LacZ transgenic lines showed no expression in dental cells, but ectopic β-galactosidase activity was detected in osteoblasts. Potential transcription factor-binding sites were identified that may be important in controlling enamelin basal promoter activity and in conferring enamelin tissue-specific expression. Our study provides new insights into regulatory mechanisms governing enamelin expression. PMID:18353004

  10. Genetic Analysis of Transvection Effects Involving Cis-Regulatory Elements of the Drosophila Ultrabithorax Gene

    PubMed Central

    Micol, J. L.; Castelli-Gair, J. E.; Garcia-Bellido, A.

    1990-01-01

    The Ultrabithorax (Ubx) gene of Drosophila melanogaster contains two functionally distinguishable regions: the protein-coding Ubx transcription unit and, upstream of it, the transcribed but non-protein-coding bxd region. Numerous recessive, partial loss-of-function mutations which appear to be regulatory mutations map within the bxd region and within the introns of the Ubx transcription unit. In addition, mutations within the Ubx unit exons are known and most of these behave as null alleles. Ubx(1) is one such allele. We have confirmed that, although the Ubx(1) allele does not produce detectable Ubx proteins (UBX), it does retain other genetic functions detectable by their effects on the expression of a paired, homologous Ubx allele, i.e., by transvection. We have extended previous analyses made by E. B. Lewis by mapping the critical elements of the Ubx gene which participate in transvection effects. Our results show that the Ubx(1) allele retains wild-type functions whose effectiveness can be reduced (1) by additional cis mutations in the bxd region or in introns of the Ubx transcription unit, as well as (2) by rearrangements disturbing pairing between homologous Ubx genes. Our results suggest that those remnant functions in Ubx(1) are able to modulate the activity of the allele located in the homologous chromosome. We discuss the normal cis regulatory role of these functions involved in trans interactions between homologous Ubx genes, as well as the implications of our results for the current models on transvection. PMID:2123161

  11. PReMod: a database of genome-wide mammalian cis-regulatory module predictions.

    PubMed

    Ferretti, Vincent; Poitras, Christian; Bergeron, Dominique; Coulombe, Benoit; Robert, François; Blanchette, Mathieu

    2007-01-01

    We describe PReMod, a new database of genome-wide cis-regulatory module (CRM) predictions for both the human and the mouse genomes. The prediction algorithm, described previously in Blanchette et al. (2006) Genome Res., 16, 656-668, exploits the fact that many known CRMs are made of clusters of phylogenetically conserved and repeated transcription factor (TF) binding sites. Contrary to other existing databases, PReMod is not restricted to modules located proximal to genes, but in fact mostly contains distal predicted CRMs (pCRMs). Through its web interface, PReMod allows users to (i) identify pCRMs around a gene of interest; (ii) identify pCRMs that have binding sites for a given TF (or a set of TFs); or (iii) download the entire dataset for local analyses. Queries can also be refined by filtering for specific chromosomal regions, for specific regions relative to genes or for the presence of CpG islands. The output includes information about the binding sites predicted within the selected pCRMs, and a graphical display of their distribution within the pCRMs. It also provides a visual depiction of the chromosomal context of the selected pCRMs in terms of neighboring pCRMs and genes, all of which are linked to the UCSC Genome Browser and the NCBI. PReMod: http://genomequebec.mcgill.ca/PReMod.

  12. Genome-wide Computational Analysis Reveals Cardiomyocyte-specific Transcriptional Cis-regulatory Motifs That Enable Efficient Cardiac Gene Therapy

    PubMed Central

    Rincon, Melvin Y; Sarcar, Shilpita; Danso-Abeam, Dina; Keyaerts, Marleen; Matrai, Janka; Samara-Kuko, Ermira; Acosta-Sanchez, Abel; Athanasopoulos, Takis; Dickson, George; Lahoutte, Tony; De Bleser, Pieter; VandenDriessche, Thierry; Chuah, Marinee K

    2015-01-01

    Gene therapy is a promising emerging therapeutic modality for the treatment of cardiovascular diseases and hereditary diseases that afflict the heart. Hence, there is a need to develop robust cardiac-specific expression modules that allow for stable expression of the gene of interest in cardiomyocytes. We therefore explored a new approach based on a genome-wide bioinformatics strategy that revealed novel cardiac-specific cis-acting regulatory modules (CS-CRMs). These transcriptional modules contained evolutionary-conserved clusters of putative transcription factor binding sites that correspond to a “molecular signature” associated with robust gene expression in the heart. We then validated these CS-CRMs in vivo using an adeno-associated viral vector serotype 9 that drives a reporter gene from a quintessential cardiac-specific α-myosin heavy chain promoter. Most de novo designed CS-CRMs resulted in a >10-fold increase in cardiac gene expression. The most robust CRMs enhanced cardiac-specific transcription 70- to 100-fold. Expression was sustained and restricted to cardiomyocytes. We then combined the most potent CS-CRM4 with a synthetic heart and muscle-specific promoter (SPc5-12) and obtained a significant 20-fold increase in cardiac gene expression compared to the cytomegalovirus promoter. This study underscores the potential of rational vector design to improve the robustness of cardiac gene therapy. PMID:25195597

  13. Genome-wide computational analysis reveals cardiomyocyte-specific transcriptional Cis-regulatory motifs that enable efficient cardiac gene therapy.

    PubMed

    Rincon, Melvin Y; Sarcar, Shilpita; Danso-Abeam, Dina; Keyaerts, Marleen; Matrai, Janka; Samara-Kuko, Ermira; Acosta-Sanchez, Abel; Athanasopoulos, Takis; Dickson, George; Lahoutte, Tony; De Bleser, Pieter; VandenDriessche, Thierry; Chuah, Marinee K

    2015-01-01

    Gene therapy is a promising emerging therapeutic modality for the treatment of cardiovascular diseases and hereditary diseases that afflict the heart. Hence, there is a need to develop robust cardiac-specific expression modules that allow for stable expression of the gene of interest in cardiomyocytes. We therefore explored a new approach based on a genome-wide bioinformatics strategy that revealed novel cardiac-specific cis-acting regulatory modules (CS-CRMs). These transcriptional modules contained evolutionary-conserved clusters of putative transcription factor binding sites that correspond to a "molecular signature" associated with robust gene expression in the heart. We then validated these CS-CRMs in vivo using an adeno-associated viral vector serotype 9 that drives a reporter gene from a quintessential cardiac-specific α-myosin heavy chain promoter. Most de novo designed CS-CRMs resulted in a >10-fold increase in cardiac gene expression. The most robust CRMs enhanced cardiac-specific transcription 70- to 100-fold. Expression was sustained and restricted to cardiomyocytes. We then combined the most potent CS-CRM4 with a synthetic heart and muscle-specific promoter (SPc5-12) and obtained a significant 20-fold increase in cardiac gene expression compared to the cytomegalovirus promoter. This study underscores the potential of rational vector design to improve the robustness of cardiac gene therapy.

  14. Changes in cis-regulatory elements of a key floral regulator are associated with divergence of inflorescence architectures.

    PubMed

    Kusters, Elske; Della Pina, Serena; Castel, Rob; Souer, Erik; Koes, Ronald

    2015-08-15

    Higher plant species diverged extensively with regard to the moment (flowering time) and position (inflorescence architecture) at which flowers are formed. This seems largely caused by variation in the expression patterns of conserved genes that specify floral meristem identity (FMI), rather than changes in the encoded proteins. Here, we report a functional comparison of the promoters of homologous FMI genes from Arabidopsis, petunia, tomato and Antirrhinum. Analysis of promoter-reporter constructs in petunia and Arabidopsis, as well as complementation experiments, showed that the divergent expression of leafy (LFY) and the petunia homolog aberrant leaf and flower (ALF) results from alterations in the upstream regulatory network rather than cis-regulatory changes. The divergent expression of unusual floral organs (UFO) from Arabidopsis, and the petunia homolog double top (DOT), however, is caused by the loss or gain of cis-regulatory promoter elements, which respond to trans-acting factors that are expressed in similar patterns in both species. Introduction of pUFO:UFO causes no obvious defects in Arabidopsis, but in petunia it causes the precocious and ectopic formation of flowers. This provides an example of how a change in a cis-regulatory region can account for a change in the plant body plan.

  15. cis-Regulatory Circuits Regulating NEK6 Kinase Overexpression in Transformed B Cells Are Super-Enhancer Independent.

    PubMed

    Huang, Yue; Koues, Olivia I; Zhao, Jiang-Yang; Liu, Regina; Pyfrom, Sarah C; Payton, Jacqueline E; Oltz, Eugene M

    2017-03-21

    Alterations in distal regulatory elements that control gene expression underlie many diseases, including cancer. Epigenomic analyses of normal and diseased cells have produced correlative predictions for connections between dysregulated enhancers and target genes involved in pathogenesis. However, with few exceptions, these predicted cis-regulatory circuits remain untested. Here, we dissect cis-regulatory circuits that lead to overexpression of NEK6, a mitosis-associated kinase, in human B cell lymphoma. We find that only a minor subset of predicted enhancers is required for NEK6 expression. Indeed, an annotated super-enhancer is dispensable for NEK6 overexpression and for maintaining the architecture of a B cell-specific regulatory hub. A CTCF cluster serves as a chromatin and architectural boundary to block communication of the NEK6 regulatory hub with neighboring genes. Our findings emphasize that validation of predicted cis-regulatory circuits and super-enhancers is needed to prioritize transcriptional control elements as therapeutic targets.

  16. Cis-regulatory elements are harbored in Intron5 of the RUNX1 gene

    PubMed Central

    2014-01-01

    Background The human RUNX1 gene is one of the most frequent targets for chromosomal translocations associated with acute myeloid leukemia (AML) and acute lymphoid leukemia (ALL). The highest prevalence in AML is noted with the t(8;21) translocation, which represents 12 to 15% of all AML cases. Interestingly, all the breakpoints mapped to date in t(8;21) are clustered in intron 5 of the RUNX1 gene and intron 1 of the ETO gene. No homologous sequences have been found at the recombination regions, but DNase I hypersensitive sites (DHS) have been mapped to the areas of the genes involved in t(8;21). The presence of DHS is commonly associated with regulatory elements such as promoters, enhancers and silencers, among others. Results In this study we used a combination of comparative genomics, cloning and transfection assays to evaluate potential regulatory elements located in intron 5 of the RUNX1 gene. Our genomic analysis identified nine non-coding sequences that are evolutionarily conserved among rat, mouse and human. We cloned two of these regions in the pGL-3 Promoter plasmid in order to analyze their transcriptional regulatory activity. Our results demonstrate that the identified regions can indeed regulate transcription of a reporter gene in a distance- and position-independent manner; moreover, their transcriptional effect is cell-type specific. Conclusions We have identified nine conserved non-coding sequences that are harbored in intron 5 of the RUNX1 gene. We have also demonstrated that two of these regions can regulate transcriptional activity in vitro. Taken together, our results suggest that intron 5 of the RUNX1 gene contains multiple potential cis-regulatory elements. PMID:24655352

  17. Deciphering Cis-Regulatory Element Mediated Combinatorial Regulation in Rice under Blast Infected Condition

    PubMed Central

    Deb, Arindam; Kundu, Sudip

    2015-01-01

    Combinations of cis-regulatory elements (CREs) present at the promoters facilitate the binding of several transcription factors (TFs), thereby altering the consequent gene expression. Due to the eminent complexity of the regulatory mechanism, the combinatorics of CRE-mediated transcriptional regulation has been elusive. In this work, we have developed a new methodology that quantifies the co-occurrence tendencies of CREs present in a set of promoter sequences; these co-occurrence scores are filtered in three consecutive steps to test their statistical significance; and the significantly co-occurring CRE pairs are presented as networks. These networks of co-occurring CREs are further transformed to derive higher order of regulatory combinatorics. We have further applied this methodology on the differentially up-regulated gene-sets of rice tissues under fungal (Magnaporthe) infected conditions to demonstrate how it helps to understand the CRE-mediated combinatorial gene regulation. Our analysis includes a wide spectrum of biologically important results. The CRE pairs having a strong tendency to co-occur often exhibit very similar joint distribution patterns at the promoters of rice. We couple the network approach with experimental results of plant gene regulation and defense mechanisms and find evidence of auto and cross regulation among TF families, cross-talk among multiple hormone signaling pathways, similarities and dissimilarities in regulatory combinatorics between different tissues, etc. Our analyses have pointed to a highly distributed nature of the combinatorial gene regulation facilitating an efficient alteration in response to fungal attack. Altogether, our proposed methodology could be an important approach in understanding the combinatorial gene regulation. It can be further applied to unravel the tissue and/or condition specific combinatorial gene regulation in other eukaryotic systems with the availability of annotated genomic sequences and suitable
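
    The idea of scoring CRE co-occurrence across a set of promoters can be illustrated with a minimal sketch. It is not the authors' scoring scheme or their three-step filter: a one-sided hypergeometric tail is swapped in as an assumed stand-in for the significance test, and the function name, input layout and CRE labels are hypothetical.

```python
from itertools import combinations
from math import comb


def cooccurrence_pvalues(promoter_cres):
    """Score every CRE pair by how often both occur in the same promoter.

    promoter_cres: dict mapping a promoter id to the set of CRE names found in it.
    Returns {(cre_a, cre_b): (n_both, p_value)}, where p_value is the
    hypergeometric probability of seeing at least n_both shared promoters
    by chance (an assumed stand-in for the source's significance filter).
    """
    n = len(promoter_cres)
    cres = sorted(set().union(*promoter_cres.values()))
    # For each CRE, the set of promoters that carry it.
    has = {c: {p for p, found in promoter_cres.items() if c in found} for c in cres}
    out = {}
    for a, b in combinations(cres, 2):
        na, nb = len(has[a]), len(has[b])
        both = len(has[a] & has[b])
        # P(X >= both) when nb promoters are drawn from n, of which na carry CRE a.
        tail = sum(comb(na, k) * comb(n - na, nb - k)
                   for k in range(both, min(na, nb) + 1))
        out[(a, b)] = (both, tail / comb(n, nb))
    return out


# Hypothetical toy data: CREs "A" and "B" always occur together, "C" never joins them.
toy = {"p1": {"A", "B"}, "p2": {"A", "B"}, "p3": {"C"}, "p4": {"C"}}
scores = cooccurrence_pvalues(toy)
```

    Pairs passing a chosen p-value cutoff could then be treated as edges of a co-occurrence network; the cutoff itself would be a further assumption.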

  18. Identification and Characterization of a cis-Regulatory Element for Zygotic Gene Expression in Chlamydomonas reinhardtii

    PubMed Central

    Hamaji, Takashi; Lopez, David; Pellegrini, Matteo; Umen, James

    2016-01-01

    Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. We predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes. PMID:27172209

  19. Identification and characterization of a cis-regulatory element for zygotic gene expression in Chlamydomonas reinhardtii

    DOE PAGES

    Hamaji, Takashi; Lopez, David; Pellegrini, Matteo; ...

    2016-03-26

    Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. Furthermore, we predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes.

  20. Evaluation of phylogenetic footprint discovery for predicting bacterial cis-regulatory elements and revealing their evolution

    PubMed Central

    Janky, Rekin's; van Helden, Jacques

    2008-01-01

    Background The detection of conserved motifs in promoters of orthologous genes (phylogenetic footprints) has become a common strategy to predict cis-acting regulatory elements. Several software tools are routinely used to raise hypotheses about regulation. However, these tools are generally used as black boxes, with default parameters. A systematic evaluation of optimal parameters for a footprint discovery strategy can bring a sizeable improvement to the predictions. Results We evaluate the performances of a footprint discovery approach based on the detection of over-represented spaced motifs. This method is particularly suitable for (but not restricted to) Bacteria, since such motifs are typically bound by factors containing a Helix-Turn-Helix domain. We evaluated footprint discovery in 368 Escherichia coli K12 genes with annotated sites, under 40 different combinations of parameters (taxonomical level, background model, organism-specific filtering, operon inference). Motifs are assessed both at the levels of correctness and significance. We further report a detailed analysis of 181 bacterial orthologs of the LexA repressor. Distinct motifs are detected at various taxonomical levels, including the 7 previously characterized taxon-specific motifs. In addition, we highlight a significantly stronger conservation of half-motifs in Actinobacteria, relative to Firmicutes, suggesting an intermediate state in specificity switching between the two Gram-positive phyla, and thereby revealing the on-going evolution of LexA auto-regulation. Conclusion The footprint discovery method proposed here shows excellent results with E. coli and can readily be extended to predict cis-acting regulatory signals and propose testable hypotheses in bacterial genomes for which nothing is known about regulation. PMID:18215291
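
    Counting spaced motifs (two short words separated by a spacer of fixed width) can be sketched as follows. This is an illustrative sketch, not the tool evaluated in the record: the word length, spacing range and the pseudocount-smoothed frequency ratio are assumptions, and the ratio only stands in for a proper significance test against a background model.

```python
from collections import Counter


def count_dyads(seqs, k=3, spacings=range(0, 21)):
    """Count spaced dyads: two k-mers separated by a spacer of s arbitrary bases."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for s in spacings:
            span = 2 * k + s  # total width of left word + spacer + right word
            for i in range(len(seq) - span + 1):
                left = seq[i:i + k]
                right = seq[i + k + s:i + span]
                counts[(left, s, right)] += 1
    return counts


def enrichment(fg, bg, pseudo=1.0):
    """Foreground/background frequency ratio per dyad, smoothed by a pseudocount."""
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    return {
        dyad: ((c + pseudo) / (fg_total + pseudo))
        / ((bg[dyad] + pseudo) / (bg_total + pseudo))
        for dyad, c in fg.items()
    }


# Hypothetical toy promoters sharing the dyad CTG-N5-CAG, and a toy background.
fg_counts = count_dyads(["CTGAAAAACAG", "CTGTTTTTCAG"], spacings=[5])
bg_counts = count_dyads(["AAAAAAAAAAA"], spacings=[5])
ratios = enrichment(fg_counts, bg_counts)
```

    In a footprint setting the foreground would be the promoters of orthologous genes at a chosen taxonomical level, and the background choice is one of the parameters the record reports evaluating.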

  1. Asymmetrically reduced expression of hand1 homeologs involving a single nucleotide substitution in a cis-regulatory element.

    PubMed

    Ochi, Haruki; Suzuki, Nanoka; Kawaguchi, Akane; Ogino, Hajime

    2017-03-28

    During vertebrate evolution, whole genome duplications resulted in a number of duplicated genes, some of which eventually changed their expression patterns and/or levels via alteration of cis-regulatory sequences. However, the initial process involved in such cis-regulatory changes remains unclear. Therefore, we investigated this process by analyzing the duplicated hand1 genes of Xenopus laevis (hand1.L and hand1.S), which were generated by allotetraploidization 17-18 million years ago, and compared these with their single ortholog in the ancestral-type diploid species X. tropicalis. A dN/dS analysis indicated that hand1.L and hand1.S are still under purifying selection, and thus, their products appear to retain ancestral functional properties. RNA-seq and in situ hybridization analyses revealed that hand1.L and hand1.S have similar expression patterns to each other and to X. tropicalis hand1, but the hand1.S expression level was much lower than the hand1.L expression level in the primordial heart. A comparative sequence analysis, luciferase reporter analysis, ChIP-PCR analysis, and transgenic reporter analysis showed that a single nucleotide substitution in the hand1.S promoter was responsible for the reduced expression in the heart. These findings demonstrated that a small change in the promoter sequence can trigger diversification of duplicated gene expression prior to diversification of their encoded protein functions in a young duplicated genome.

  2. cis regulatory requirements for hypodermal cell-specific expression of the Caenorhabditis elegans cuticle collagen gene dpy-7.

    PubMed Central

    Gilleard, J S; Barry, J D; Johnstone, I L

    1997-01-01

    The Caenorhabditis elegans cuticle collagens are encoded by a multigene family of between 50 and 100 members and are the major component of the nematode cuticular exoskeleton. They are synthesized in the hypodermis prior to secretion and incorporation into the cuticle and exhibit complex patterns of spatial and temporal expression. We have investigated the cis regulatory requirements for tissue- and stage-specific expression of the cuticle collagen gene dpy-7 and have identified a compact regulatory element which is sufficient to specify hypodermal cell reporter gene expression. This element appears to be a true tissue-specific promoter element, since it encompasses the dpy-7 transcription initiation sites and functions in an orientation-dependent manner. We have also shown, by interspecies transformation experiments, that the dpy-7 cis regulatory elements are functionally conserved between C. elegans and C. briggsae, and comparative sequence analysis supports the importance of the regulatory sequence that we have identified by reporter gene analysis. All of our data suggest that the spatial expression of the dpy-7 cuticle collagen gene is established essentially by a small tissue-specific promoter element and does not require upstream activator or repressor elements. In addition, we have found the DPY-7 polypeptide is very highly conserved between the two species and that the C. briggsae polypeptide can function appropriately within the C. elegans cuticle. This finding suggests a remarkably high level of conservation of individual cuticle components, and their interactions, between these two nematode species. PMID:9121480

  3. An ancient yet flexible cis-regulatory architecture allows localized Hedgehog tuning by patched/Ptch1

    PubMed Central

    Lorberbaum, David S; Ramos, Andrea I; Peterson, Kevin A; Carpenter, Brandon S; Parker, David S; De, Sandip; Hillers, Lauren E; Blake, Victoria M; Nishi, Yuichi; McFarlane, Matthew R; Chiang, Ason CY; Kassis, Judith A; Allen, Benjamin L; McMahon, Andrew P; Barolo, Scott

    2016-01-01

    The Hedgehog signaling pathway is part of the ancient developmental-evolutionary animal toolkit. Frequently co-opted to pattern new structures, the pathway is conserved among eumetazoans yet flexible and pleiotropic in its effects. The Hedgehog receptor, Patched, is transcriptionally activated by Hedgehog, providing essential negative feedback in all tissues. Our locus-wide dissections of the cis-regulatory landscapes of fly patched and mouse Ptch1 reveal abundant, diverse enhancers with stage- and tissue-specific expression patterns. The seemingly simple, constitutive Hedgehog response of patched/Ptch1 is driven by a complex regulatory architecture, with batteries of context-specific enhancers engaged in promoter-specific interactions to tune signaling individually in each tissue, without disturbing patterning elsewhere. This structure—one of the oldest cis-regulatory features discovered in animal genomes—explains how patched/Ptch1 can drive dramatic adaptations in animal morphology while maintaining its essential core function. It may also suggest a general model for the evolutionary flexibility of conserved regulators and pathways. DOI: http://dx.doi.org/10.7554/eLife.13550.001 PMID:27146892

  4. Modular Utilization of Distal cis-Regulatory Elements Controls Ifng Gene Expression in T Cells Activated by Distinct Stimuli

    PubMed Central

    Balasubramani, Anand; Shibata, Yoichiro; Crawford, Gregory E.; Baldwin, Albert S.; Hatton, Robin D.; Weaver, Casey T.

    2010-01-01

    SUMMARY Distal cis-regulatory elements play essential roles in the T lineage-specific expression of cytokine genes. We have mapped interactions of three transacting factors – NF-κB, STAT4 and T-bet – with cis elements in the Ifng locus. We find that RelA is critical for optimal Ifng expression and is differentially recruited to multiple elements contingent upon T cell receptor (TCR) or interleukin-12 (IL-12) plus IL-18 signaling. RelA recruitment to at least four elements is dependent on T-bet-dependent remodeling of the Ifng locus and co-recruitment of STAT4. STAT4 and NF-κB therefore cooperate at multiple cis elements to enable NF-κB–dependent enhancement of Ifng expression. RelA recruitment to distal elements was similar in Th1 and Tc1 effector cells, although T-bet was dispensable in CD8 effectors. These results support a model of Ifng regulation in which distal cis-regulatory elements differentially recruit key transcription factors in a modular fashion to initiate gene transcription induced by distinct activation signals. PMID:20643337

  5. Differential contribution of cis-regulatory elements to higher order chromatin structure and expression of the CFTR locus

    PubMed Central

    Yang, Rui; Kerschner, Jenny L.; Gosalia, Nehal; Neems, Daniel; Gorsic, Lidija K.; Safi, Alexias; Crawford, Gregory E.; Kosak, Steven T.; Leir, Shih-Hsing; Harris, Ann

    2016-01-01

    Higher order chromatin structure establishes domains that organize the genome and coordinate gene expression. However, the molecular mechanisms controlling transcription of individual loci within a topological domain (TAD) are not fully understood. The cystic fibrosis transmembrane conductance regulator (CFTR) gene provides a paradigm for investigating these mechanisms. CFTR occupies a TAD bordered by CTCF/cohesin binding sites within which are cell-type-selective cis-regulatory elements for the locus. We showed previously that intronic and extragenic enhancers, when occupied by specific transcription factors, are recruited to the CFTR promoter by a looping mechanism to drive gene expression. Here we use a combination of CRISPR/Cas9 editing of cis-regulatory elements and siRNA-mediated depletion of architectural proteins to determine the relative contribution of structural elements and enhancers to the higher order structure and expression of the CFTR locus. We found the boundaries of the CFTR TAD are conserved among diverse cell types and are dependent on CTCF and cohesin complex. Removal of an upstream CTCF-binding insulator alters the interaction profile, but has little effect on CFTR expression. Within the TAD, intronic enhancers recruit cell-type selective transcription factors and deletion of a pivotal enhancer element dramatically decreases CFTR expression, but has minor effect on its 3D structure. PMID:26673704

  6. Inheritance of gene expression level and selective constraints on trans- and cis-regulatory changes in yeast.

    PubMed

    Schaefke, Bernhard; Emerson, J J; Wang, Tzi-Yuan; Lu, Mei-Yeh Jade; Hsieh, Li-Ching; Li, Wen-Hsiung

    2013-09-01

    Gene expression evolution can be caused by changes in cis- or trans-regulatory elements or both. As cis and trans regulation operate through different molecular mechanisms, cis and trans mutations may show different inheritance patterns and may be subjected to different selective constraints. To investigate these issues, we obtained and analyzed gene expression data from two Saccharomyces cerevisiae strains and their hybrid, using high-throughput sequencing. Our data indicate that compared with other types of genes, those with antagonistic cis-trans interactions are more likely to exhibit over- or underdominant inheritance of expression level. Moreover, in accordance with previous studies, genes with trans variants tend to have a dominant inheritance pattern, whereas cis variants are enriched for additive inheritance. In addition, cis regulatory differences contribute more to expression differences between species than within species, whereas trans regulatory differences show a stronger association between divergence and polymorphism. Our data indicate that in the trans component of gene expression differences genes subjected to weaker selective constraints tend to have an excess of polymorphism over divergence compared with those subjected to stronger selective constraints. In contrast, in the cis component, this difference between genes under stronger and weaker selective constraint is mostly absent. To explain these observations, we propose that purifying selection more strongly shapes trans changes than cis changes and that positive selection may have significantly contributed to cis regulatory divergence.
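
    A common way to separate cis and trans effects from parental and hybrid expression data can be sketched as below. The record does not spell out its own procedure, so this decomposition is an assumption: the two alleles in a hybrid share one trans environment, so their log ratio is attributed to cis, and the remainder of the parental difference to trans. The function name and read counts are hypothetical.

```python
import math


def cis_trans(parent1, parent2, hybrid_allele1, hybrid_allele2):
    """Split a log2 expression difference into cis and trans components.

    parent1, parent2: expression of the gene in each parental strain.
    hybrid_allele1, hybrid_allele2: allele-specific expression in the hybrid.
    Returns (total, cis, trans) in log2 units, with total = cis + trans.
    """
    total = math.log2(parent1 / parent2)
    cis = math.log2(hybrid_allele1 / hybrid_allele2)
    trans = total - cis
    return total, cis, trans


# Hypothetical counts: cis and trans effects act in the same direction.
reinforcing = cis_trans(400, 100, 200, 100)
# Hypothetical counts: equal parental expression hides opposing cis and trans effects.
antagonistic = cis_trans(100, 100, 200, 100)
```

    The second call shows the antagonistic case the abstract refers to: the parents look identical, yet the hybrid alleles differ.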

  7. Numb directs the subcellular localization of EAAT3 through binding the YxNxxF motif.

    PubMed

    Su, Jin-Feng; Wei, Jian; Li, Pei-Shan; Miao, Hong-Hua; Ma, Yong-Chao; Qu, Yu-Xiu; Xu, Jie; Qin, Jie; Li, Bo-Liang; Song, Bao-Liang; Xu, Zheng-Ping; Luo, Jie

    2016-08-15

    Excitatory amino acid transporter type 3 (EAAT3, also known as SLC1A1) is a high-affinity, Na(+)-dependent glutamate carrier that localizes primarily within the cell and at the apical plasma membrane. Although previous studies have reported proteins and sequence regions involved in EAAT3 trafficking, the detailed molecular mechanism by which EAAT3 is distributed to the correct location still remains elusive. Here, we identify that the YVNGGF sequence in the C-terminus of EAAT3 is responsible for its intracellular localization and apical sorting in rat hepatoma cells CRL1601 and Madin-Darby canine kidney (MDCK) cells, respectively. We further demonstrate that Numb, a clathrin adaptor protein, directly binds the YVNGGF motif and regulates the localization of EAAT3. Mutation of Y503, N505 and F508 within the YVNGGF motif to alanine residues or silencing Numb by use of small interfering RNA (siRNA) results in the aberrant localization of EAAT3. Moreover, both Numb and the YVNGGF motif mediate EAAT3 endocytosis in CRL1601 cells. In summary, our study suggests that Numb is a pivotal adaptor protein that mediates the subcellular localization of EAAT3 through binding the YxNxxF (where x stands for any amino acid) motif.
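
    The YxNxxF pattern (x standing for any amino acid) translates directly into a regular expression. The sketch below is illustrative only: the helper name is hypothetical and the peptide is a made-up string, not the EAAT3 sequence.

```python
import re

# Tyr, any residue, Asn, any two residues, Phe. The lookahead lets
# overlapping occurrences be reported as well.
YXNXXF = re.compile(r"(?=(Y[A-Z]N[A-Z]{2}F))")


def find_yxnxxf(protein):
    """Return (1-based start position, matched hexapeptide) for every hit."""
    return [(m.start() + 1, m.group(1)) for m in YXNXXF.finditer(protein.upper())]


# Hypothetical toy peptide for illustration.
hits = find_yxnxxf("MKLYVNGGFAAYANAAF")
# Replacing the Y, N and F positions with alanine removes the match.
mutant_hits = find_yxnxxf("MKLAVAGGAAA")
```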

  8. Long-range DNase I hypersensitivity mapping reveals the imprinted Igf2r and Air promoters share cis-regulatory elements

    PubMed Central

    Pauler, Florian M.; Stricker, Stefan H.; Warczok, Katarzyna E.; Barlow, Denise P.

    2005-01-01

    Epigenetic mechanisms restrict the expression of imprinted genes to one parental allele in diploid cells. At the Igf2r/Air imprinted cluster on mouse chromosome 17, paternal-specific expression of the Air noncoding RNA has been shown to silence three genes in cis: Igf2r, Slc22a2, and Slc22a3. By an unbiased mapping of DNase I hypersensitive sites (DHS) in a 192-kb region flanking Igf2r and Air, we identified 21 DHS, of which nine mapped to evolutionarily conserved sequences. Based on the hypothesis that silencing effects of Air would be directed towards cis regulatory elements used to activate genes, DHS are potential key players in the control of imprinted expression. However, in this 192-kb region only the two DHS mapping to the Igf2r and Air promoters show parental specificity. The remaining 19 DHS were present on both parental alleles and, thus, have the potential to activate Igf2r on the maternal allele and Air on the paternal allele. The possibility that the Igf2r and Air promoters share the same cis-acting regulatory elements, albeit on opposite parental chromosomes, was supported by the similar expression profiles of Igf2r and Air in vivo. These results refine our understanding of the onset of imprinted silencing at this cluster and indicate the Air noncoding RNA may specifically target silencing to the Igf2r promoter. PMID:16204191

  9. Shared Enhancer Activity in the Limbs and Phallus and Functional Divergence of a Limb-Genital cis-Regulatory Element in Snakes.

    PubMed

    Infante, Carlos R; Mihala, Alexandra G; Park, Sungdae; Wang, Jialiang S; Johnson, Kenji K; Lauderdale, James D; Menke, Douglas B

    2015-10-12

    The amniote phallus and limbs differ dramatically in their morphologies but share patterns of signaling and gene expression in early development. Thus far, the extent to which genital and limb transcriptional networks also share cis-regulatory elements has remained unexplored. We show that many limb enhancers are retained in snake genomes, suggesting that these elements may function in non-limb tissues. Consistent with this, our analysis of cis-regulatory activity in mice and Anolis lizards reveals that patterns of enhancer activity in embryonic limbs and genitalia overlap heavily. In mice, deletion of HLEB, an enhancer of Tbx4, produces defects in hindlimbs and genitalia, establishing the importance of this limb-genital enhancer for development of these different appendages. Further analyses demonstrate that the HLEB of snakes has lost hindlimb enhancer function while retaining genital activity. Our findings identify roles for Tbx4 in genital development and highlight deep similarities in cis-regulatory activity between limbs and genitalia.

  10. Multiple cis-regulatory elements are involved in the complex regulation of the sieve element-specific MtSEO-F1 promoter from Medicago truncatula.

    PubMed

    Bucsenez, M; Rüping, B; Behrens, S; Twyman, R M; Noll, G A; Prüfer, D

    2012-09-01

    The sieve element occlusion (SEO) gene family includes several members that are expressed specifically in immature sieve elements (SEs) in the developing phloem of dicotyledonous plants. To determine how this restricted expression profile is achieved, we analysed the SE-specific Medicago truncatula SEO-F1 promoter (PMtSEO-F1) by constructing deletion, substitution and hybrid constructs and testing them in transgenic tobacco plants using green fluorescent protein as a reporter. This revealed four promoter regions, each containing cis-regulatory elements that activate transcription in SEs. One of these segments also contained sufficient information to suppress PMtSEO-F1 transcription in the phloem companion cells (CCs). Subsequent in silico analysis revealed several candidate cis-regulatory elements that PMtSEO-F1 shares with other SEO promoters. These putative sieve element boxes (PSE boxes) are promising candidates for cis-regulatory elements controlling the SE-specific expression of PMtSEO-F1.

  11. Analysis of hundreds of cis-regulatory landscapes at high resolution in a single, high-throughput experiment.

    PubMed

    Hughes, Jim R; Roberts, Nigel; McGowan, Simon; Hay, Deborah; Giannoulatou, Eleni; Lynch, Magnus; De Gobbi, Marco; Taylor, Stephen; Gibbons, Richard; Higgs, Douglas R

    2014-02-01

    Gene expression during development and differentiation is regulated in a cell- and stage-specific manner by complex networks of intergenic and intragenic cis-regulatory elements whose numbers and representation in the genome far exceed those of structural genes. Using chromosome conformation capture, it is now possible to analyze in detail the interaction between enhancers, silencers, boundary elements and promoters at individual loci, but these techniques are not readily scalable. Here we present a high-throughput approach (Capture-C) to analyze cis interactions, interrogating hundreds of specific interactions at high resolution in a single experiment. We show how this approach will facilitate detailed, genome-wide analysis to elucidate the general principles by which cis-acting sequences control gene expression. In addition, we show how Capture-C will expedite identification of the target genes and functional effects of SNPs that are associated with complex diseases, which most frequently lie in intergenic cis-acting regulatory elements.

  12. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

    Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings: A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086

  13. Two RNA-binding motifs in eIF3 direct HCV IRES-dependent translation

    PubMed Central

    Sun, Chaomin; Querol-Audí, Jordi; Mortimer, Stefanie A.; Arias-Palomo, Ernesto; Doudna, Jennifer A.; Nogales, Eva; Cate, Jamie H. D.

    2013-01-01

    The initiation of protein synthesis plays an essential regulatory role in human biology. At the center of the initiation pathway, the 13-subunit eukaryotic translation initiation factor 3 (eIF3) controls access of other initiation factors and mRNA to the ribosome by unknown mechanisms. Using electron microscopy (EM), bioinformatics and biochemical experiments, we identify two highly conserved RNA-binding motifs in eIF3 that direct translation initiation from the hepatitis C virus internal ribosome entry site (HCV IRES) RNA. Mutations in the RNA-binding motif of subunit eIF3a weaken eIF3 binding to the HCV IRES and the 40S ribosomal subunit, thereby suppressing eIF2-dependent recognition of the start codon. Mutations in the eIF3c RNA-binding motif also reduce 40S ribosomal subunit binding to eIF3, and inhibit eIF5B-dependent steps downstream of start codon recognition. These results provide the first connection between the structure of the central translation initiation factor eIF3 and recognition of the HCV genomic RNA start codon, molecular interactions that likely extend to the human transcriptome. PMID:23766293

  14. Using machine learning to predict gene expression and discover sequence motifs

    NASA Astrophysics Data System (ADS)

    Li, Xuejing

    Recently, large amounts of experimental data for complex biological systems have become available. We use tools and algorithms from machine learning to build data-driven predictive models. We first present a novel algorithm to discover gene sequence motifs associated with temporal expression patterns of genes. Our algorithm, which is based on partial least squares (PLS) regression, is able to directly model the flow of information, from gene sequence to gene expression, to learn cis regulatory motifs and characterize associated gene expression patterns. Our algorithm outperforms traditional computational methods, e.g., clustering, in motif discovery. We then present a study that extends a machine learning model of transcriptional regulation, predictive of genetic regulatory response, to Caenorhabditis elegans. We show meaningful results both in terms of prediction accuracy on the test experiments and biological information extracted from the regulatory program. The model discovers DNA binding sites ab initio. We also present a case study where we detect a signal of lineage-specific regulation. Finally, we present a comparative study on learning predictive models for motif discovery, based on different boosting algorithms: Adaptive Boosting (AdaBoost), Linear Programming Boosting (LPBoost) and Totally Corrective Boosting (TotalBoost). We evaluate and compare the performance of the three boosting algorithms via both statistical and biological validation, for hypoxia response in Saccharomyces cerevisiae.

  15. Computation-based discovery of related transcriptional regulatory modules and motifs using an experimentally validated combinatorial model.

    PubMed

    Halfon, Marc S; Grad, Yonatan; Church, George M; Michelson, Alan M

    2002-07-01

    Gene expression is regulated by transcription factors that interact with cis-regulatory elements. Predicting these elements from sequence data has proven difficult. We describe here a successful computational search for elements that direct expression in a particular temporal-spatial pattern in the Drosophila embryo, based on a single well characterized enhancer model. The fly genome was searched to identify sequence elements containing the same combination of transcription factors as those found in the model. Experimental evaluation of the search results demonstrates that our method can correctly predict regulatory elements and highlights the importance of functional testing as a means of identifying false-positive results. We also show that the search results enable the identification of additional relevant sequence motifs whose functions can be empirically validated. This approach, combined with gene expression and phylogenetic sequence data, allows for genome-wide identification of related regulatory elements, an important step toward understanding the genetic regulatory networks involved in development.
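
    Searching a sequence for regions that contain the same combination of binding sites as a model enhancer can be sketched with a sliding window over motif hits. This is a simplified illustration, not the authors' search: the motif patterns, window size and function name are hypothetical, and real binding-site models would typically be degenerate consensus sequences or weight matrices rather than exact words.

```python
import re


def find_clusters(genome, motifs, window=500):
    """Find spans in which every motif has at least one hit.

    motifs: dict mapping a factor name to a regular expression for its site.
    Returns (first_hit_start, last_hit_start) pairs whose hit start positions
    lie within `window` bp of each other and cover all motifs.
    """
    # Collect every hit start, allowing overlaps via a lookahead.
    hits = sorted(
        (m.start(), name)
        for name, pattern in motifs.items()
        for m in re.finditer(f"(?={pattern})", genome)
    )
    counts = {name: 0 for name in motifs}
    spans = []
    left = 0
    for pos, name in hits:
        counts[name] += 1
        # Drop hits that have fallen out of the window on the left.
        while pos - hits[left][0] > window:
            counts[hits[left][1]] -= 1
            left += 1
        if all(counts.values()):
            spans.append((hits[left][0], pos))
    return spans


# Hypothetical toy sequence: sites x and y sit close together once, then x recurs far away.
toy_genome = "TTTT" + "AAGG" + "CCCC" + "GATA" + "T" * 100 + "AAGG"
clusters = find_clusters(toy_genome, {"x": "AAGG", "y": "GATA"}, window=20)
```

    As the record stresses, hits from such a search are only candidates; functional testing is what separates true regulatory elements from false positives.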

  16. Comparative epigenomics in distantly related teleost species identifies conserved cis-regulatory nodes active during the vertebrate phylotypic period

    PubMed Central

    Tena, Juan J.; González-Aguilera, Cristina; Fernández-Miñán, Ana; Vázquez-Marín, Javier; Parra-Acero, Helena; Cross, Joe W.; Rigby, Peter W.J.; Carvajal, Jaime J.; Wittbrodt, Joachim; Gómez-Skarmeta, José L.; Martínez-Morales, Juan R.

    2014-01-01

    The complex relationship between ontogeny and phylogeny has been the subject of attention and controversy since von Baer’s formulations in the 19th century. The classic concept that embryogenesis progresses from clade general features to species-specific characters has often been revisited. It has become accepted that embryos from a clade show maximum morphological similarity at the so-called phylotypic period (i.e., during mid-embryogenesis). According to the hourglass model, body plan conservation would depend on constrained molecular mechanisms operating at this period. More recently, comparative transcriptomic analyses have provided conclusive evidence that such molecular constraints exist. Examining cis-regulatory architecture during the phylotypic period is essential to understand the evolutionary source of body plan stability. Here we compare transcriptomes and key epigenetic marks (H3K4me3 and H3K27ac) from medaka (Oryzias latipes) and zebrafish (Danio rerio), two distantly related teleosts separated by an evolutionary distance of 115–200 Myr. We show that comparison of transcriptome profiles correlates with anatomical similarities and heterochronies observed at the phylotypic stage. Through comparative epigenomics, we uncover a pool of conserved regulatory regions (≈700), which are active during the vertebrate phylotypic period in both species. Moreover, we show that their neighboring genes encode mainly transcription factors with fundamental roles in tissue specification. We postulate that these regulatory regions, active in both teleost genomes, represent key constrained nodes of the gene networks that sustain the vertebrate body plan. PMID:24709821

  17. Comparative epigenomics in distantly related teleost species identifies conserved cis-regulatory nodes active during the vertebrate phylotypic period.

    PubMed

    Tena, Juan J; González-Aguilera, Cristina; Fernández-Miñán, Ana; Vázquez-Marín, Javier; Parra-Acero, Helena; Cross, Joe W; Rigby, Peter W J; Carvajal, Jaime J; Wittbrodt, Joachim; Gómez-Skarmeta, José L; Martínez-Morales, Juan R

    2014-07-01

    The complex relationship between ontogeny and phylogeny has been the subject of attention and controversy since von Baer's formulations in the 19th century. The classic concept that embryogenesis progresses from clade general features to species-specific characters has often been revisited. It has become accepted that embryos from a clade show maximum morphological similarity at the so-called phylotypic period (i.e., during mid-embryogenesis). According to the hourglass model, body plan conservation would depend on constrained molecular mechanisms operating at this period. More recently, comparative transcriptomic analyses have provided conclusive evidence that such molecular constraints exist. Examining cis-regulatory architecture during the phylotypic period is essential to understand the evolutionary source of body plan stability. Here we compare transcriptomes and key epigenetic marks (H3K4me3 and H3K27ac) from medaka (Oryzias latipes) and zebrafish (Danio rerio), two distantly related teleosts separated by an evolutionary distance of 115-200 Myr. We show that comparison of transcriptome profiles correlates with anatomical similarities and heterochronies observed at the phylotypic stage. Through comparative epigenomics, we uncover a pool of conserved regulatory regions (≈700), which are active during the vertebrate phylotypic period in both species. Moreover, we show that their neighboring genes encode mainly transcription factors with fundamental roles in tissue specification. We postulate that these regulatory regions, active in both teleost genomes, represent key constrained nodes of the gene networks that sustain the vertebrate body plan.

  18. A comparative analysis of the evolution, expression, and cis-regulatory element of polygalacturonase genes in grasses and dicots.

    PubMed

    Liang, Ying; Yu, Youjian; Cui, Jinlong; Lyu, Meiling; Xu, Liai; Cao, Jiashu

    2016-11-01

    Cell walls are a distinguishing characteristic of plants essential to their survival. The pectin content of primary cell walls in grasses and dicots is distinctly different. Polygalacturonases (PGs) can degrade pectins and participate in multiple developmental processes of plants. This study comprehensively compared the evolution, expression, and cis-regulatory elements of PGs in grasses and dicots. A total of 577 PGs identified from five grasses and five dicots fell into seven clades. Evolutionary analysis demonstrated distinct differences between grasses and dicots in patterns of gene duplication and loss, and in evolutionary rates. Grasses generally contained far fewer clade C and F members than dicots. We found that this disparity was the result of less duplication and more gene loss in grasses. More duplications occurred in clades D and E, and expression analysis showed that most clade E members were expressed ubiquitously at a high overall level and clade D members were closely related to male reproduction in both grasses and dicots, suggesting their biological functions were highly conserved across species. In addition to the general role in reproductive development, PGs of clades C and F specifically played roles in root development in dicots, shedding light on organ differentiation between the two groups of plants. A regulatory element analysis of clade C and F members implied that possible functions of PGs in specific biological responses contributed to their expansion and preservation. This work can improve the knowledge of PGs in plants generally and in grasses specifically and is beneficial to functional studies.

  19. Functional roles of Aves class-specific cis-regulatory elements on macroevolution of bird-specific features.

    PubMed

    Seki, Ryohei; Li, Cai; Fang, Qi; Hayashi, Shinichi; Egawa, Shiro; Hu, Jiang; Xu, Luohao; Pan, Hailin; Kondo, Mao; Sato, Tomohiko; Matsubara, Haruka; Kamiyama, Namiko; Kitajima, Keiichi; Saito, Daisuke; Liu, Yang; Gilbert, M Thomas P; Zhou, Qi; Xu, Xing; Shiroishi, Toshihiko; Irie, Naoki; Tamura, Koji; Zhang, Guojie

    2017-02-06

    Unlike microevolutionary processes, little is known about the genetic basis of macroevolutionary processes. One of these magnificent examples is the transition from non-avian dinosaurs to birds that has created numerous evolutionary innovations such as self-powered flight and its associated wings with flight feathers. By analysing 48 bird genomes, we identified millions of avian-specific highly conserved elements (ASHCEs) that predominantly (>99%) reside in non-coding regions. Many ASHCEs show differential histone modifications that may participate in regulation of limb development. Comparative embryonic gene expression analyses across tetrapod species suggest ASHCE-associated genes have unique roles in developing avian limbs. In particular, we demonstrate how the ASHCE-driven avian-specific expression of the gene Sim1 may be associated with the evolution and development of flight feathers. Together, these findings demonstrate regulatory roles of ASHCEs in the creation of avian-specific traits, and further highlight the importance of cis-regulatory rewiring during macroevolutionary changes.

  20. Retinal Expression of the Drosophila eyes absent Gene Is Controlled by Several Cooperatively Acting Cis-regulatory Elements

    PubMed Central

    Neuman, Sarah D.; Bashirullah, Arash; Kumar, Justin P.

    2016-01-01

    The eyes absent (eya) gene of the fruit fly, Drosophila melanogaster, is a member of an evolutionarily conserved gene regulatory network that controls eye formation in all seeing animals. The loss of eya leads to the complete elimination of the compound eye while forced expression of eya in non-retinal tissues is sufficient to induce ectopic eye formation. Within the developing retina eya is expressed in a dynamic pattern and is involved in tissue specification/determination, cell proliferation, apoptosis, and cell fate choice. In this report we explore the mechanisms by which eya expression is spatially and temporally governed in the developing eye. We demonstrate that multiple cis-regulatory elements function cooperatively to control eya transcription and that spacing between a pair of enhancer elements is important for maintaining correct gene expression. Lastly, we show that the loss of eya expression in sine oculis (so) mutants is the result of massive cell death and a progressive homeotic transformation of retinal progenitor cells into head epidermis. PMID:27930646

  1. Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space

    PubMed Central

    Karnik, Rahul; Beer, Michael A.

    2015-01-01

    The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than, motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs. PMID:26465884
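
    The abstract above describes classifying sequences with a full PWM and a learned score threshold. The sketch below illustrates that general idea only; it is not the MotifSpec algorithm, and it does not implement the dynamic search space. The function names, the pseudocount, the uniform background, and the toy "TGAC" site counts are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into a log2-odds position weight matrix."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def best_window_score(pwm, sequence):
    """Return (score, offset) of the best-scoring window on the forward strand.

    Assumes the sequence contains only A, C, G, T and is at least as long
    as the PWM.
    """
    width = len(pwm)
    best = (float("-inf"), -1)
    for i in range(len(sequence) - width + 1):
        score = sum(pwm[j][sequence[i + j]] for j in range(width))
        if score > best[0]:
            best = (score, i)
    return best

def learn_threshold(pwm, positives, negatives):
    """Pick the score cutoff that classifies the most training sequences correctly."""
    pos = [best_window_score(pwm, s)[0] for s in positives]
    neg = [best_window_score(pwm, s)[0] for s in negatives]

    def n_correct(t):
        return sum(p >= t for p in pos) + sum(n < t for n in neg)

    return max(sorted(set(pos + neg)), key=n_correct)

def classify(pwm, sequences, threshold):
    """Label each sequence True if its best window reaches the threshold."""
    return [best_window_score(pwm, s)[0] >= threshold for s in sequences]

# Toy data (hypothetical): ten identical "TGAC" sites define the count matrix.
site_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]
pwm = log_odds_pwm(site_counts)
threshold = learn_threshold(pwm, positives=["AATGACAA"], negatives=["AAAAAAAA"])
labels = classify(pwm, ["AATGACAA", "AAAAAAAA"], threshold)
```

    A real discriminative motif finder would also update the PWM itself to improve the classification; here the matrix is fixed and only the cutoff is learned.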

  2. Extensive cis-Regulatory Variation Robust to Environmental Perturbation in Arabidopsis

    PubMed Central

    Cubillos, Francisco A.; Stegle, Oliver; Grondin, Cécile; Canut, Matthieu; Tisné, Sébastien; Gy, Isabelle

    2014-01-01

    cis- and trans-acting factors affect gene expression and responses to environmental conditions. However, for most plant systems, we lack a comprehensive map of these factors and their interaction with environmental variation. Here, we examined allele-specific expression (ASE) in an F1 hybrid to study how alleles from two Arabidopsis thaliana accessions affect gene expression. To investigate the effect of the environment, we used drought stress and developed a variance component model to estimate the combined genetic contributions of cis- and trans-regulatory polymorphisms, environmental factors, and their interactions. We quantified ASE for 11,003 genes, identifying 3318 genes with consistent ASE in control and stress conditions, demonstrating that cis-acting genetic effects are essentially robust to changes in the environment. Moreover, we found 1618 genes with genotype x environment (GxE) interactions, mostly cis x E interactions with magnitude changes in ASE. We found fewer trans x E interactions, but these effects were relatively less robust across conditions, showing more changes in the direction of the effect between environments; this confirms that trans-regulation plays an important role in the response to environmental conditions. Our data provide a detailed map of cis- and trans-regulation and GxE interactions in A. thaliana, laying the ground for mechanistic investigations and studies in other plants and environments. PMID:25428981

  3. Functional characterisation of cis-regulatory elements governing dynamic Eomes expression in the early mouse embryo.

    PubMed

    Simon, Claire S; Downes, Damien J; Gosden, Matthew E; Telenius, Jelena; Higgs, Douglas R; Hughes, Jim R; Costello, Ita; Bikoff, Elizabeth K; Robertson, Elizabeth J

    2017-02-07

    The T-box transcription factor (TF) Eomes is a key regulator of cell fate decisions during early mouse development. The cis-acting regulatory elements that direct expression in the anterior visceral endoderm (AVE), primitive streak (PS) and definitive endoderm (DE) have yet to be defined. Here, we identified three gene-proximal enhancer-like sequences (PSE_a, PSE_b and VPE) that faithfully activate tissue-specific expression in transgenic embryos. However, targeted deletion experiments demonstrate that PSE_a and PSE_b are dispensable and only the VPE is required for optimal Eomes expression in vivo. Embryos lacking this enhancer display variably penetrant defects in anterior-posterior axis orientation and DE formation. Chromosome conformation capture experiments reveal VPE-promoter interactions in embryonic stem cells (ESC), prior to gene activation. The locus resides in a large (500kb) pre-formed compartment in ESC and activation during DE differentiation occurs in the absence of 3D structural changes. ATAC-seq analysis reveals that VPE, PSE_a, and four additional putative enhancers display increased chromatin accessibility in DE associated with Smad2/3 binding coincident with transcriptional activation. In contrast, activation of the Eomes target genes Foxa2 and Lhx1 is associated with higher order chromatin reorganisation. Thus diverse regulatory mechanisms govern activation of lineage specifying TFs during early development.

  4. Extensive cis-regulatory variation robust to environmental perturbation in Arabidopsis.

    PubMed

    Cubillos, Francisco A; Stegle, Oliver; Grondin, Cécile; Canut, Matthieu; Tisné, Sébastien; Gy, Isabelle; Loudet, Olivier

    2014-11-01

    cis- and trans-acting factors affect gene expression and responses to environmental conditions. However, for most plant systems, we lack a comprehensive map of these factors and their interaction with environmental variation. Here, we examined allele-specific expression (ASE) in an F1 hybrid to study how alleles from two Arabidopsis thaliana accessions affect gene expression. To investigate the effect of the environment, we used drought stress and developed a variance component model to estimate the combined genetic contributions of cis- and trans-regulatory polymorphisms, environmental factors, and their interactions. We quantified ASE for 11,003 genes, identifying 3318 genes with consistent ASE in control and stress conditions, demonstrating that cis-acting genetic effects are essentially robust to changes in the environment. Moreover, we found 1618 genes with genotype x environment (GxE) interactions, mostly cis x E interactions with magnitude changes in ASE. We found fewer trans x E interactions, but these effects were relatively less robust across conditions, showing more changes in the direction of the effect between environments; this confirms that trans-regulation plays an important role in the response to environmental conditions. Our data provide a detailed map of cis- and trans-regulation and GxE interactions in A. thaliana, laying the ground for mechanistic investigations and studies in other plants and environments.

  5. Cis-regulatory underpinnings of human GLI3 expression in embryonic craniofacial structures and internal organs.

    PubMed

    Abbasi, Amir A; Minhas, Rashid; Schmidt, Ansgar; Koch, Sabine; Grzeschik, Karl-Heinz

    2013-10-01

    The zinc finger transcription factor Gli3 is an important mediator of Sonic hedgehog (Shh) signaling. During early embryonic development Gli3 participates in patterning and growth of the central nervous system, face, skeleton, limb, tooth and gut. Precise regulation of the temporal and spatial expression of Gli3 is crucial for the proper specification of these structures in mammals and other vertebrates. Previously we reported a set of human intronic cis-regulators controlling almost the entire known repertoire of endogenous Gli3 expression in mouse neural tube and limbs. However, the genetic underpinning of GLI3 expression in other embryonic domains such as craniofacial structures and internal organs remains elusive. Here we demonstrate in a transgenic mouse assay the potential of a subset of human/fish conserved non-coding sequences (CNEs) residing within GLI3 intronic intervals to induce reporter gene expression at known regions of endogenous Gli3 transcription in embryonic domains other than central nervous system (CNS) and limbs. Highly specific reporter expression was observed in craniofacial structures, eye, gut, and genitourinary system. Moreover, the comparison of expression patterns directed by these intronic cis-acting regulatory elements in mouse and zebrafish embryos suggests that in accordance with sequence conservation, the target site specificity of a subset of these elements remains preserved between these two lineages. Taken together with our recent investigations, it is proposed here that during vertebrate evolution the Gli3 expression control acquired multiple, independently acting, intronic enhancers for spatiotemporal patterning of CNS, limbs, craniofacial structures and internal organs.

  6. High constitutive activity of a broad panel of housekeeping and tissue-specific cis-regulatory elements depends on a subset of ETS proteins.

    PubMed

    Curina, Alessia; Termanini, Alberto; Barozzi, Iros; Prosperini, Elena; Simonatto, Marta; Polletti, Sara; Silvola, Alessio; Soldi, Monica; Austenaa, Liv; Bonaldi, Tiziana; Ghisletti, Serena; Natoli, Gioacchino

    2017-02-15

    Enhancers and promoters that control the transcriptional output of terminally differentiated cells include cell type-specific and broadly active housekeeping elements. Whether the high constitutive activity of these two groups of cis-regulatory elements relies on entirely distinct or instead also on shared regulators is unknown. By dissecting the cis-regulatory repertoire of macrophages, we found that the ELF subfamily of ETS proteins selectively bound within 60 base pairs (bp) from the transcription start sites of highly active housekeeping genes. ELFs also bound constitutively active, but not poised, macrophage-specific enhancers and promoters. The role of ELFs in promoting high-level constitutive transcription was suggested by multiple lines of evidence: ELF sites enabled robust transcriptional activation by endogenous and minimal synthetic promoters, ELF recruitment was stabilized by the transcriptional machinery, and ELF proteins mediated recruitment of transcriptional and chromatin regulators to core promoters. These data suggest that the co-optation of a limited number of highly active transcription factors represents a broadly adopted strategy to equip both cell type-specific and housekeeping cis-regulatory elements with the ability to efficiently promote transcription.

  7. Cis-regulatory sequence variation and association with Mycoplasma load in natural populations of the house finch (Carpodacus mexicanus)

    PubMed Central

    Backström, Niclas; Shipilina, Daria; Blom, Mozes P K; Edwards, Scott V

    2013-01-01

    Characterization of the genetic basis of fitness traits in natural populations is important for understanding how organisms adapt to the changing environment and to novel events, such as epizootics. However, candidate fitness-influencing loci, such as regulatory regions, are usually unavailable in nonmodel species. Here, we analyze sequence data from targeted resequencing of the cis-regulatory regions of three candidate genes for disease resistance (CD74, HSP90α, and LCP1) in populations of the house finch (Carpodacus mexicanus) historically exposed (Alabama) and naïve (Arizona) to Mycoplasma gallisepticum. Our study, the first to quantify variation in regulatory regions in wild birds, reveals that the upstream regions of CD74 and HSP90α are GC-rich, with the former exhibiting unusually low sequence variation for this species. We identified two SNPs, located in a GC-rich region immediately upstream of an inferred promoter site in the gene HSP90α, that were significantly associated with Mycoplasma pathogen load in the two populations. The SNPs are closely linked and situated in potential regulatory sequences: one in a binding site for the transcription factor nuclear NFYα and the other in a dinucleotide microsatellite ((GC)6). The genotype associated with pathogen load in the putative NFYα binding site was significantly overrepresented in the Alabama birds. However, we did not see strong effects of selection at this SNP, perhaps because selection has acted on standing genetic variation over an extremely short time in a highly recombining region. Our study is a useful starting point to explore functional relationships between sequence polymorphisms, gene expression, and phenotypic traits, such as pathogen resistance that affect fitness in the wild. PMID:23532859

  8. An arthropod cis-regulatory element functioning in sensory organ precursor development dates back to the Cambrian

    PubMed Central

    2010-01-01

    Background An increasing number of publications demonstrate conservation of function of cis-regulatory elements without sequence similarity. In invertebrates such functional conservation has only been shown for closely related species. Here we demonstrate the existence of an ancient arthropod regulatory element that functions during the selection of neural precursors. The activity of genes of the achaete-scute (ac-sc) family endows cells with neural potential. An essential, conserved characteristic of proneural genes is their ability to restrict their own activity to single or a small number of progenitor cells from their initially broad domains of expression. This is achieved through a process called lateral inhibition. A regulatory element, the sensory organ precursor enhancer (SOPE), is required for this process. First identified in Drosophila, the SOPE contains discrete binding sites for four regulatory factors. The SOPE of the Drosophila asense gene is situated in the 5' UTR. Results Through a manual comparison of consensus binding site sequences we have been able to identify a SOPE in UTR sequences of asense-like genes in species belonging to all four arthropod groups (Crustacea, Myriapoda, Chelicerata and Insecta). The SOPEs of the spider Cupiennius salei and the insect Tribolium castaneum are shown to be functional in transgenic Drosophila. This would place the origin of this regulatory sequence as far back as the last common ancestor of the Arthropoda, that is, in the Cambrian, 550 million years ago. Conclusions The SOPE is not detectable by inter-specific sequence comparison, raising the possibility that other ancient regulatory modules in invertebrates might have escaped detection. PMID:20868489

  9. Kinetics of transcription initiation directed by multiple cis-regulatory elements on the glnAp2 promoter

    PubMed Central

    Wang, Yaolai; Liu, Feng; Wang, Wei

    2016-01-01

    Transcription initiation is orchestrated by dynamic molecular interactions, with kinetic steps difficult to detect. Utilizing a hybrid method, we aim to unravel essential kinetic steps of transcriptional regulation on the glnAp2 promoter, whose regulatory region includes two enhancers (sites I and II) and three low-affinity sequences (sites III-V), to which the transcriptional activator NtrC binds. By structure reconstruction, we analyze all possible organization architectures of the transcription apparatus (TA). The main regulatory mode involves two NtrC hexamers: one at enhancer II transiently associates with site V such that the other at enhancer I can rapidly approach and catalyze the σ54-RNA polymerase holoenzyme. We build a kinetic model characterizing essential steps of the TA operation; with the known kinetics of the holoenzyme interacting with DNA, this model enables the kinetics beyond technical detection to be determined by fitting the input-output function of the wild-type promoter. The model further quantitatively reproduces transcriptional activities of various mutated promoters. These results reveal different roles played by two enhancers and interpret why the low-affinity elements conditionally enhance or repress transcription. This work presents an integrated dynamic picture of regulated transcription initiation and suggests an evolutionarily conserved characteristic guaranteeing reliable transcriptional response to regulatory signals. PMID:27899598

  10. RNA recognition motif 2 directs the recruitment of SF2/ASF to nuclear stress bodies

    PubMed Central

    Chiodi, Ilaria; Corioni, Margherita; Giordano, Manuela; Valgardsdottir, Rut; Ghigna, Claudia; Cobianchi, Fabio; Xu, Rui-Ming; Riva, Silvano; Biamonti, Giuseppe

    2004-01-01

    Heat shock induces the transcriptional activation of large heterochromatic regions of the human genome composed of arrays of satellite III DNA repeats. A number of RNA-processing factors, among them splicing factor SF2/ASF, associate with these transcription factories, giving rise to nuclear stress bodies (nSBs). Here, we show that the recruitment of SF2/ASF to these structures is mediated by its second RNA recognition motif. Amino acid substitutions in the first α-helix of this domain, but not in the β-strand regions, abrogate the association with nSBs. The same mutations drastically affect the in vivo activity of SF2/ASF in the alternative splicing of adenoviral E1A transcripts. Sequence analysis identifies four putative high-affinity binding sites for SF2/ASF in the transcribed strand of the satellite III DNA. We have verified by gel mobility shift assays that the second RNA-binding domain of SF2/ASF binds at least one of these sites. Our analysis suggests that the recruitment of SF2/ASF to nSBs is mediated by a direct interaction with satellite III transcripts and points to the second RNA-binding domain of the protein as the major determinant of this interaction. PMID:15302913

  11. Precise cis-regulatory control of spatial and temporal expression of the alx-1 gene in the skeletogenic lineage of S. purpuratus.

    PubMed

    Damle, Sagar; Davidson, Eric H

    2011-09-15

    Deployment of the gene-regulatory network (GRN) responsible for skeletogenesis in the embryo of the sea urchin Strongylocentrotus purpuratus is restricted to the large micromere lineage by a double negative regulatory gate. The gate consists of a GRN subcircuit composed of the pmar1 and hesC genes, which encode repressors and are wired in tandem, plus a set of target regulatory genes under hesC control. The skeletogenic cell state is specified initially by micromere-specific expression of these regulatory genes, viz. alx1, ets1, tbrain and tel, plus the gene encoding the Notch ligand Delta. Here we use a recently developed high throughput methodology for experimental cis-regulatory analysis to elucidate the genomic regulatory system controlling alx1 expression in time and embryonic space. The results entirely confirm the double negative gate control system at the cis-regulatory level, including definition of the functional HesC target sites, and add the crucial new information that the drivers of alx1 expression are initially Ets1, and then Alx1 itself plus Ets1. Cis-regulatory analysis demonstrates that these inputs quantitatively account for the magnitude of alx1 expression. Furthermore, the Alx1 gene product not only performs an auto-regulatory role, promoting a fast rise in alx1 expression, but also, when at high levels, it behaves as an auto-repressor. A synthetic experiment indicates that this behavior is probably due to dimerization. In summary, the results we report provide the sequence level basis for control of alx1 spatial expression by the double negative gate GRN architecture, and explain the rising, then falling temporal expression profile of the alx1 gene in terms of its auto-regulatory genetic wiring.

  12. Motif analysis in directed ordered networks and applications to food webs

    PubMed Central

    Paulau, Pavel V.; Feenders, Christoph; Blasius, Bernd

    2015-01-01

    The analysis of small recurrent substructures, so-called network motifs, has become a standard tool of complex network science to unveil the design principles underlying the structure of empirical networks. In many natural systems network nodes are associated with an intrinsic property according to which they can be ordered and compared against each other. Here, we expand standard motif analysis to be able to capture the hierarchical structure in such ordered networks. Our new approach is based on the identification of all ordered 3-node substructures and the visualization of their significance profile. We present a technique to calculate the fine-grained motif spectrum by resolving the individual members of isomorphism classes (sets of substructures formed by permuting node-order). We apply this technique to computer generated ensembles of ordered networks and to empirical food web data, demonstrating the importance of considering node order for food-web analysis. Our approach may not only help identify hierarchical patterns in empirical food webs and other natural networks but may also provide the basis for extending motif analysis to other types of multi-layered networks. PMID:26144248
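
    The idea of resolving 3-node substructures by node order can be illustrated with a small census. The sketch below is an assumption-laden illustration, not the authors' implementation: it enumerates every node triple sorted by rank, encodes which of the six possible directed links are present, and counts each weakly connected pattern separately. It does not compute the significance profile, which would additionally need a randomized reference ensemble. The node names, ranks, and the 6-bit encoding are hypothetical.

```python
from collections import Counter
from itertools import combinations

def ordered_triad_census(edges, rank):
    """Count ordered 3-node substructures in a directed network.

    edges: iterable of (source, target) pairs.
    rank:  dict mapping each node to a distinct ordering value.

    Each triple (a, b, c) is taken with rank[a] < rank[b] < rank[c], and its
    pattern is the tuple of link indicators for
    (a->b, b->a, a->c, c->a, b->c, c->b).
    Because the node order is fixed, patterns that are isomorphic as plain
    graphs but differ in node order are counted separately.
    """
    edge_set = set(edges)
    nodes = sorted(rank, key=rank.get)
    census = Counter()
    for a, b, c in combinations(nodes, 3):
        pairs = [(a, b), (b, a), (a, c), (c, a), (b, c), (c, b)]
        code = tuple(int(p in edge_set) for p in pairs)
        # A triple is weakly connected when at least two of its three
        # node pairs are linked in either direction.
        linked_pairs = sum(
            1 for i in (0, 2, 4) if code[i] or code[i + 1]
        )
        if linked_pairs >= 2:
            census[code] += 1
    return census

# Toy food web (hypothetical): links point from resource to consumer,
# and rank is a stand-in for an intrinsic node property such as body size.
rank = {"plant": 0, "herbivore": 1, "carnivore": 2, "top": 3}
edges = [
    ("plant", "herbivore"),
    ("herbivore", "carnivore"),
    ("plant", "carnivore"),
    ("carnivore", "top"),
]
census = ordered_triad_census(edges, rank)
```

    In this toy web the census separates one feed-forward pattern from two chain patterns in which only the lowest-ranked and highest-ranked links are present.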

  13. 'In silico expression analysis', a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences.

    PubMed

    Bolívar, Julio C; Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences makes it a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated 'in silico expression analysis' was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the 'in silico expression analysis' resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the 'in silico expression analysis' predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. DATABASE URL: http://www.pathoplant.de/expression_analysis.php.
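
    The two steps the abstract describes (find genes whose promoter harbours a submitted sequence, then order conditions by how those genes respond) can be sketched as follows. This is a minimal illustration under stated assumptions, not the PathoPlant tool or its API: the gene identifiers, promoter strings, condition names, and expression values are invented toy data, and ranking by mean expression change is a simplification chosen for the example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_all(seq, pattern):
    """Return every start offset of pattern in seq, including overlapping hits."""
    offsets = []
    start = seq.find(pattern)
    while start != -1:
        offsets.append(start)
        start = seq.find(pattern, start + 1)
    return offsets

def genes_with_motif(promoters, motif):
    """Map each gene to its motif hits as (offset, strand) on either strand."""
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = {}
    for gene, seq in promoters.items():
        seq = seq.upper()
        found = [(i, "+") for i in find_all(seq, motif)]
        if rc != motif:  # avoid double counting palindromic motifs
            found += [(i, "-") for i in find_all(seq, rc)]
        if found:
            hits[gene] = sorted(found)
    return hits

def rank_conditions(expression, genes):
    """Order conditions by the mean expression change of the given genes."""
    ranked = []
    for condition, values in expression.items():
        selected = [values[g] for g in genes if g in values]
        if selected:
            ranked.append((condition, sum(selected) / len(selected)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Toy inputs (hypothetical gene IDs, sequences, and expression values).
promoters = {
    "geneA": "TTCGACTTTTGG",   # WT-box on the forward strand
    "geneB": "GGAAAAGTCGTT",   # WT-box on the reverse strand
    "geneC": "ACGTACGTACGT",   # no WT-box
}
expression = {
    "condition_1": {"geneA": 2.0, "geneB": 1.0, "geneC": 5.0},
    "condition_2": {"geneA": 0.5, "geneB": 0.5, "geneC": 0.0},
}
hits = genes_with_motif(promoters, "CGACTTTT")
ranking = rank_conditions(expression, hits)
```

    Searching both strands matters because a cis-element can act in either orientation; geneC is excluded from the ranking because its toy promoter lacks the submitted sequence.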

  14. The autoimmunity-associated BLK haplotype exhibits cis-regulatory effects on mRNA and protein expression that are prominently observed in B cells early in development

    PubMed Central

    Simpfendorfer, Kim R.; Olsson, Lina M.; Manjarrez Orduño, Nataly; Khalili, Houman; Simeone, Alyssa M.; Katz, Matthew S.; Lee, Annette T.; Diamond, Betty; Gregersen, Peter K.

    2012-01-01

    The gene B lymphocyte kinase (BLK) is associated with rheumatoid arthritis, systemic lupus erythematosus and several other autoimmune disorders. The disease risk haplotype is known to be associated with reduced expression of BLK mRNA transcript in human B cell lines; however, little is known about cis-regulation of BLK message or protein levels in native cell types. Here, we show that in primary human B lymphocytes, cis-regulatory effects of disease-associated single nucleotide polymorphisms in BLK are restricted to naïve and transitional B cells. Cis-regulatory effects are not observed in adult B cells in later stages of differentiation. Allelic expression bias was also identified in primary human T cells from adult peripheral and umbilical cord blood (UCB), thymus and tonsil, although mRNA levels were reduced compared with B cells. Allelic regulation of Blk expression at the protein level was confirmed in UCB B cell subsets by intracellular staining and flow cytometry. Blk protein expression in CD4+ and CD8+ T cells was documented by western blot analysis; however, differences in protein expression levels by BLK genotype were not observed in any T cell subset. Blk allele expression differences at the protein level are thus restricted to early B cells, indicating that the involvement of Blk in the risk for autoimmune disease likely acts during the very early stages of B cell development. PMID:22678060

  15. i-cisTarget: an integrative genomics method for the prediction of regulatory features and cis-regulatory modules.

    PubMed

    Herrmann, Carl; Van de Sande, Bram; Potier, Delphine; Aerts, Stein

    2012-08-01

    The field of regulatory genomics today is characterized by the generation of high-throughput data sets that capture genome-wide transcription factor (TF) binding, histone modifications, or DNAseI hypersensitive regions across many cell types and conditions. In this context, a critical question is how to make optimal use of these publicly available datasets when studying transcriptional regulation. Here, we address this question in Drosophila melanogaster for which a large number of high-throughput regulatory datasets are available. We developed i-cisTarget (where the 'i' stands for integrative), for the first time enabling the discovery of different types of enriched 'regulatory features' in a set of co-regulated sequences in one analysis, being either TF motifs or 'in vivo' chromatin features, or combinations thereof. We have validated our approach on 15 co-expressed gene sets, 21 ChIP data sets, 628 curated gene sets and multiple individual case studies, and show that meaningful regulatory features can be confidently discovered; that bona fide enhancers can be identified, both by in vivo events and by TF motifs; and that combinations of in vivo events and TF motifs further increase the performance of enhancer prediction.

  16. T versus D in the MTCXXC motif of copper transport proteins plays a role in directional metal transport.

    PubMed

    Niemiec, Moritz S; Dingeldein, Artur P G; Wittung-Stafshede, Pernilla

    2014-08-01

    To avoid toxicity and control levels of metal ions, organisms have developed specific metal transport systems. In humans, the cytoplasmic Cu chaperone Atox1 delivers Cu to metal-binding domains of ATP7A/B in the Golgi, for incorporation into Cu-dependent proteins. The Cu-binding motif in Atox1, as well as in target Cu-binding domains of ATP7A/B, consists of an MX1CXXC motif where X1 = T. The same motif, with X1 = D, is found in metal-binding domains of bacterial zinc transporters, such as ZntA. The Asp is proposed to stabilize divalent over monovalent metals in the binding site, although metal selectivity in vivo appears predominantly governed by protein-protein interactions. To probe the role of T versus D at the X1 position for Cu transfer in vitro, we created MDCXXC variants of Atox1 and the fourth metal-binding domain of ATP7B, WD4. We find that the mutants bind Cu like the wild-type proteins, but when mixed, in contrast to the wild-type pair, the mutant pair favors Cu-dependent hetero-dimers over directional Cu transport from Atox1 to WD4. Notably, both wild-type and mutant proteins can bind Zn in the absence of competing reducing agents. In the presence of zinc, hetero-complexes are strongly favored for both protein pairs. We propose that T is conserved in this motif of Cu-transport proteins to promote directional metal transfer toward ATP7B, without formation of energetic sinks. The ability of both Atox1 and WD4 to bind zinc ions may not be a problem in vivo due to the presence of specific transport chains for Cu and Zn ions.

  17. Two negative cis-regulatory regions involved in fruit-specific promoter activity from watermelon (Citrullus vulgaris S.).

    PubMed

    Yin, Tao; Wu, Hanying; Zhang, Shanglong; Lu, Hongyu; Zhang, Lingxiao; Xu, Yong; Chen, Daming; Liu, Jingmei

    2009-01-01

    A 1.8 kb 5'-flanking region of the large subunit of ADP-glucose pyrophosphorylase, isolated from watermelon (Citrullus vulgaris S.), has fruit-specific promoter activity in transgenic tomato plants. Two negative regulatory regions, from -986 to -959 and from -472 to -424, were identified in this promoter region by fine deletion analyses. Removal of both regions led to constitutive expression in epidermal cells. Gain-of-function experiments showed that these two regions were sufficient to inhibit RFP (red fluorescent protein) expression in transformed epidermal cells when fused to the cauliflower mosaic virus (CaMV) 35S minimal promoter. Gel mobility shift experiments demonstrated the presence of leaf nuclear factors that interact with these two elements. A TCCAAAA motif was identified in these two regions, as well as one in the reverse orientation, which was confirmed to be a novel specific cis-element. A quantitative beta-glucuronidase (GUS) activity assay of stable transgenic tomato plants showed that the activities of chimeric promoters harbouring only one of the two cis-elements, or both, were approximately 10-fold higher in fruits than in leaves. These data confirm that the TCCAAAA motif functions as a fruit-specific element by inhibiting gene expression in leaves.
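    A motif such as TCCAAAA can be searched for in both orientations with a short scan. The following is a minimal sketch, assuming that "reverse orientation" means the reverse complement on the opposite strand; the function names and the toy sequence are hypothetical and are not taken from the study.

```python
MOTIF = "TCCAAAA"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_motif_sites(sequence, motif=MOTIF):
    """Return (position, strand) for each occurrence of the motif (0-based)."""
    sequence = sequence.upper()
    rc = reverse_complement(motif)
    sites = []
    for i in range(len(sequence) - len(motif) + 1):
        window = sequence[i:i + len(motif)]
        if window == motif:
            sites.append((i, "+"))
        elif window == rc:
            sites.append((i, "-"))
    return sites

# Toy promoter fragment (hypothetical) with one site in each orientation.
print(find_motif_sites("GGTCCAAAACCTTTTGGATT"))
```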

  18. Zebrafish enhancer detection (ZED) vector: a new tool to facilitate transgenesis and the functional analysis of cis-regulatory regions in zebrafish.

    PubMed

    Bessa, José; Tena, Juan J; de la Calle-Mustienes, Elisa; Fernández-Miñán, Ana; Naranjo, Silvia; Fernández, Almudena; Montoliu, Lluis; Akalin, Altuna; Lenhard, Boris; Casares, Fernando; Gómez-Skarmeta, José Luis

    2009-09-01

    The identification and characterization of the regulatory activity of genomic sequences is crucial for understanding how the information contained in genomes is translated into cellular function. The cis-regulatory sequences control when, where, and how much genes are transcribed and can activate (enhancers) or repress (silencers) gene expression. Here, we describe a novel Tol2 transposon-based vector for assessing enhancer activity in the zebrafish (Danio rerio). This Zebrafish Enhancer Detector (ZED) vector harbors several key improvements, among them a sensitive and specific minimal promoter chosen for optimal enhancer activity detection, insulator sequences to shield the minimal promoter from position effects, and a positive control for transgenesis. Additionally, we demonstrate that highly conserved noncoding sequences with enhancer activity that are homologous between humans and zebrafish largely retain their tissue-specific enhancer activity during vertebrate evolution. More strikingly, insulator sequences from mouse and chicken, but not conserved in zebrafish, maintain their insulator capacity when tested in this model.

  19. PreCisIon: PREdiction of CIS-regulatory elements improved by gene’s positION

    PubMed Central

    Elati, Mohamed; Nicolle, Rémy; Junier, Ivan; Fernández, David; Fekih, Rim; Font, Julio; Képès, François

    2013-01-01

    Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases. PMID:23241390

  20. Revealing constitutively expressed resistance genes in Agrostis species using PCR-based motif-directed RNA fingerprinting.

    PubMed

    Budak, Hikmet; Su, Senem; Ergen, Neslihan

    2006-12-01

    Agrostis species are mainly used in athletic fields and golf courses. Their integrity is maintained by fungicides, which makes the development of disease-resistant varieties a high priority. However, there is a lack of knowledge about resistance (R) genes and their use for genetic improvement in Agrostis species. The objective of this study was to identify and clone constitutively expressed cDNAs encoding R gene-like (RGL) sequences from three Agrostis species (colonial bentgrass (A. capillaris L.), creeping bentgrass (A. stolonifera L.) and velvet bentgrass (A. canina L.)) by PCR-based motif-directed RNA fingerprinting towards relatively conserved nucleotide binding site (NBS) domains. Sixty-one constitutively expressed cDNA sequences were identified and characterized. Sequence analysis of ESTs and probable translation products revealed that RGLs are highly conserved among these three Agrostis species. Fifteen of them were shown to share conserved motifs found in other plant disease resistance genes such as MLA13, Xa1, YR6, YR23 and RPP5. The molecular evolutionary forces, analysed using the Ka/Ks ratio, reflected purifying selection both on NBS and leucine-rich repeat (LRR) intervening regions of discovered RGL sequences in these species. This study presents, for the first time, isolation and characterization of constitutively expressed RGL sequences from Agrostis species revealing the presence of TNL (TIR-NBS-LRR) type R genes in monocot plants. The characterized RGLs will further enhance knowledge on the molecular evolution of the R gene family in grasses.

  1. Sex Chromosome-wide Transcriptional Suppression and Compensatory Cis-Regulatory Evolution Mediate Gene Expression in the Drosophila Male Germline

    PubMed Central

    Landeen, Emily L.; Muirhead, Christina A.; Meiklejohn, Colin D.; Presgraves, Daven C.

    2016-01-01

    The evolution of heteromorphic sex chromosomes has repeatedly resulted in the evolution of sex chromosome-specific forms of regulation, including sex chromosome dosage compensation in the soma and meiotic sex chromosome inactivation in the germline. In the male germline of Drosophila melanogaster, a novel but poorly understood form of sex chromosome-specific transcriptional regulation occurs that is distinct from canonical sex chromosome dosage compensation or meiotic inactivation. Previous work shows that expression of reporter genes driven by testis-specific promoters is considerably lower—approximately 3-fold or more—for transgenes inserted into X chromosome versus autosome locations. Here we characterize this transcriptional suppression of X-linked genes in the male germline and its evolutionary consequences. Using transgenes and transpositions, we show that most endogenous X-linked genes, not just testis-specific ones, are transcriptionally suppressed several-fold specifically in the Drosophila male germline. In wild-type testes, this sex chromosome-wide transcriptional suppression is generally undetectable, being effectively compensated by the gene-by-gene evolutionary recruitment of strong promoters on the X chromosome. We identify and experimentally validate a promoter element sequence motif that is enriched upstream of the transcription start sites of hundreds of testis-expressed genes; evolutionarily conserved across species; associated with strong gene expression levels in testes; and overrepresented on the X chromosome. These findings show that the expression of X-linked genes in the Drosophila testes reflects a balance between chromosome-wide epigenetic transcriptional suppression and long-term compensatory adaptation by sex-linked genes. Our results have broad implications for the evolution of gene expression in the Drosophila male germline and for genome evolution. PMID:27404402

  2. Sex Chromosome-wide Transcriptional Suppression and Compensatory Cis-Regulatory Evolution Mediate Gene Expression in the Drosophila Male Germline.

    PubMed

    Landeen, Emily L; Muirhead, Christina A; Wright, Lori; Meiklejohn, Colin D; Presgraves, Daven C

    2016-07-01

    The evolution of heteromorphic sex chromosomes has repeatedly resulted in the evolution of sex chromosome-specific forms of regulation, including sex chromosome dosage compensation in the soma and meiotic sex chromosome inactivation in the germline. In the male germline of Drosophila melanogaster, a novel but poorly understood form of sex chromosome-specific transcriptional regulation occurs that is distinct from canonical sex chromosome dosage compensation or meiotic inactivation. Previous work shows that expression of reporter genes driven by testis-specific promoters is considerably lower (approximately 3-fold or more) for transgenes inserted into X chromosome versus autosome locations. Here we characterize this transcriptional suppression of X-linked genes in the male germline and its evolutionary consequences. Using transgenes and transpositions, we show that most endogenous X-linked genes, not just testis-specific ones, are transcriptionally suppressed several-fold specifically in the Drosophila male germline. In wild-type testes, this sex chromosome-wide transcriptional suppression is generally undetectable, being effectively compensated by the gene-by-gene evolutionary recruitment of strong promoters on the X chromosome. We identify and experimentally validate a promoter element sequence motif that is enriched upstream of the transcription start sites of hundreds of testis-expressed genes; evolutionarily conserved across species; associated with strong gene expression levels in testes; and overrepresented on the X chromosome. These findings show that the expression of X-linked genes in the Drosophila testes reflects a balance between chromosome-wide epigenetic transcriptional suppression and long-term compensatory adaptation by sex-linked genes. Our results have broad implications for the evolution of gene expression in the Drosophila male germline and for genome evolution.

  3. Application of the cis-regulatory region of a heat-shock protein 70 gene to heat-inducible gene expression in the ascidian Ciona intestinalis.

    PubMed

    Kawaguchi, Akane; Utsumi, Nanami; Morita, Maki; Ohya, Aya; Wada, Shuichi

    2015-01-01

    Temporally controlled induction of gene expression is a useful technique for analyzing gene function. To make such a technique possible in Ciona intestinalis embryos, we employed the cis-regulatory region of the heat-shock protein 70 (HSP70) gene Ci-HSPA1/6/7-like for heat-inducible gene expression in C. intestinalis embryos. We showed that Ci-HSPA1/6/7-like becomes heat shock-inducible by the 32-cell stage during embryogenesis. The 5'-upstream region of Ci-HSPA1/6/7-like, which contains heat-shock elements indispensable for heat-inducible gene expression, induces the heat shock-dependent expression of a reporter gene in the whole embryo from the 32-cell to the middle gastrula stages and in progressively restricted areas of embryos in subsequent stages. We assessed the effects of heat-shock treatments in different conditions on the normality of embryos and induction of transgene expression. We evaluated the usefulness of this technique through overexpression experiments on the well-characterized, developmentally relevant gene, Ci-Bra, and showed that this technique is applicable for inferring the gene function in C. intestinalis.

  4. A Hox Transcription Factor Collective Binds a Highly Conserved Distal-less cis-Regulatory Module to Generate Robust Transcriptional Outcomes.

    PubMed

    Uhl, Juli D; Zandvakili, Arya; Gebelein, Brian

    2016-04-01

    cis-regulatory modules (CRMs) generate precise expression patterns by integrating numerous transcription factors (TFs). Surprisingly, CRMs that control essential gene patterns can differ greatly in conservation, suggesting distinct constraints on TF binding sites. Here, we show that a highly conserved Distal-less regulatory element (DCRE) that controls gene expression in leg precursor cells recruits multiple Hox, Extradenticle (Exd) and Homothorax (Hth) complexes to mediate dual outputs: thoracic activation and abdominal repression. Using reporter assays, we found that abdominal repression is particularly robust, as neither individual binding site mutations nor a DNA binding deficient Hth protein abolished cooperative DNA binding and in vivo repression. Moreover, a re-engineered DCRE containing a distinct configuration of Hox, Exd, and Hth sites also mediated abdominal Hox repression. However, the re-engineered DCRE failed to perform additional segment-specific functions such as thoracic activation. These findings are consistent with two emerging concepts in gene regulation: First, the abdominal Hox/Exd/Hth factors utilize protein-protein and protein-DNA interactions to form repression complexes on flexible combinations of sites, consistent with the TF collective model of CRM organization. Second, the conserved DCRE mediates multiple cell-type specific outputs, consistent with recent findings that pleiotropic CRMs are associated with conserved TF binding and added evolutionary constraints.

  5. A Hox Transcription Factor Collective Binds a Highly Conserved Distal-less cis-Regulatory Module to Generate Robust Transcriptional Outcomes

    PubMed Central

    Uhl, Juli D.; Zandvakili, Arya; Gebelein, Brian

    2016-01-01

    cis-regulatory modules (CRMs) generate precise expression patterns by integrating numerous transcription factors (TFs). Surprisingly, CRMs that control essential gene patterns can differ greatly in conservation, suggesting distinct constraints on TF binding sites. Here, we show that a highly conserved Distal-less regulatory element (DCRE) that controls gene expression in leg precursor cells recruits multiple Hox, Extradenticle (Exd) and Homothorax (Hth) complexes to mediate dual outputs: thoracic activation and abdominal repression. Using reporter assays, we found that abdominal repression is particularly robust, as neither individual binding site mutations nor a DNA binding deficient Hth protein abolished cooperative DNA binding and in vivo repression. Moreover, a re-engineered DCRE containing a distinct configuration of Hox, Exd, and Hth sites also mediated abdominal Hox repression. However, the re-engineered DCRE failed to perform additional segment-specific functions such as thoracic activation. These findings are consistent with two emerging concepts in gene regulation: First, the abdominal Hox/Exd/Hth factors utilize protein-protein and protein-DNA interactions to form repression complexes on flexible combinations of sites, consistent with the TF collective model of CRM organization. Second, the conserved DCRE mediates multiple cell-type specific outputs, consistent with recent findings that pleiotropic CRMs are associated with conserved TF binding and added evolutionary constraints. PMID:27058369

  6. Identification and characterization of promoters and cis-regulatory elements of genes involved in secondary metabolites production in hop (Humulus lupulus. L).

    PubMed

    Duraisamy, Ganesh Selvaraj; Mishra, Ajay Kumar; Kocabek, Tomas; Matoušek, Jaroslav

    2016-10-01

    Molecular and biochemical studies have shown that genes contain single or combined cis-acting regulatory elements that actively control the transcriptional regulation of associated genes; the downstream effects of these elements result in the modulation of various biological pathways such as biotic/abiotic stress responses, hormonal responses to growth and development processes and secondary metabolite production. Therefore, the identification of promoters and their cis-regulatory elements is an intriguing area for studying the dynamic, complex regulatory network of gene activities by integrating computational, comparative, structural and functional genomics. Several bioinformatics servers and databases have been established to predict the cis-acting elements present in the promoter region of a target gene and their association with the expression profiles of the TFs. The aim of this study is to predict possible cis-acting regulatory elements that have a putative role in the transcriptional regulation of a dynamic network of metabolite gene activities controlling prenylflavonoid and bitter acid biosynthesis in hop (Humulus lupulus). The recent release of the hop draft genome enabled us to predict the possible cis-acting regulatory elements by extracting 2 kbp of 5' regulatory regions of genes important for lupulin metabolome biosynthesis, using the PlantCARE, PLACE and Genomatix MatInspector professional databases. The results reveal the plausible role of cis-acting regulatory elements in the regulation of gene expression primarily involved in lupulin metabolome biosynthesis, including under various stress conditions.

  7. A cis-regulatory mutation in troponin-I of Drosophila reveals the importance of proper stoichiometry of structural proteins during muscle assembly.

    PubMed

    Firdaus, Hena; Mohan, Jayaram; Naz, Sarwat; Arathi, Prabhashankar; Ramesh, Saraf R; Nongthomba, Upendra

    2015-05-01

    Rapid and high wing-beat frequencies achieved during insect flight are powered by the indirect flight muscles, the largest group of muscles present in the thorax. Any anomaly during the assembly and/or structural impairment of the indirect flight muscles gives rise to a flightless phenotype. Multiple mutagenesis screens in Drosophila melanogaster for defective flight behavior have led to the isolation and characterization of mutations that have been instrumental in the identification of many proteins and residues that are important for muscle assembly, function, and disease. In this article, we present a molecular-genetic characterization of a flightless mutation, flightless-H (fliH), originally designated as heldup-a (hdp-a). We show that fliH is a cis-regulatory mutation of the wings up A (wupA) gene, which codes for the troponin-I protein, one of the troponin complex proteins, involved in regulation of muscle contraction. The mutation leads to reduced levels of troponin-I transcript and protein. In addition to this, there is also coordinated reduction in transcript and protein levels of other structural protein isoforms that are part of the troponin complex. The altered transcript and protein stoichiometry ultimately culminates in unregulated acto-myosin interactions and a hypercontraction muscle phenotype. Our results shed new insights into the importance of maintaining the stoichiometry of structural proteins during muscle assembly for proper function with implications for the identification of mutations and disease phenotypes in other species, including humans.

  8. Computational identification and functional validation of regulatory motifs in cartilage-expressed genes

    PubMed Central

    Davies, Sherri R.; Chang, Li-Wei; Patra, Debabrata; Xing, Xiaoyun; Posey, Karen; Hecht, Jacqueline; Stormo, Gary D.; Sandell, Linda J.

    2007-01-01

    Chondrocyte gene regulation is important for the generation and maintenance of cartilage tissues. Several regulatory factors have been identified that play a role in chondrogenesis, including the positive transacting factors of the SOX family such as SOX9, SOX5, and SOX6, as well as negative transacting factors such as C/EBP and delta EF1. However, a complete understanding of the intricate regulatory network that governs the tissue-specific expression of cartilage genes is not yet available. We have taken a computational approach to identify cis-regulatory, transcription factor (TF) binding motifs in a set of cartilage characteristic genes to better define the transcriptional regulatory networks that regulate chondrogenesis. Our computational methods have identified several TFs, whose binding profiles are available in the TRANSFAC database, as important to chondrogenesis. In addition, a cartilage-specific SOX-binding profile was constructed and used to identify both known, and novel, functional paired SOX-binding motifs in chondrocyte genes. Using DNA pattern-recognition algorithms, we have also identified cis-regulatory elements for unknown TFs. We have validated our computational predictions through mutational analyses in cell transfection experiments. One novel regulatory motif, N1, found at high frequency in the COL2A1 promoter, was found to bind to chondrocyte nuclear proteins. Mutational analyses suggest that this motif binds a repressive factor that regulates basal levels of the COL2A1 promoter. PMID:17785538

  9. Human ADA3 regulates RARα transcriptional activity through direct contact between LxxLL motifs and the receptor coactivator pocket

    PubMed Central

    Li, Chia-Wei; Ai, Ni; Dinh, Gia Khanh; Welsh, William J.; Chen, J. Don

    2010-01-01

    The alternation/deficiency in activation-3 (ADA3) is an essential component of the human p300/CBP-associated factor (PCAF) and yeast Spt-Ada-Gcn5-acetyltransferase (SAGA) histone acetyltransferase complexes. These complexes facilitate transactivation of target genes by association with transcription factors and modification of local chromatin structure. It is known that the yeast ADA3 is required for nuclear receptor (NR)-mediated transactivation in yeast cells; however, the role of mammalian ADA3 in NR signaling remains elusive. In this study, we have investigated how the human (h) ADA3 regulates retinoic acid receptor (RAR) α-mediated transactivation. We show that hADA3 interacts directly with RARα in a hormone-dependent manner and this interaction contributes to RARα transactivation. Intriguingly, this interaction involves classical LxxLL motifs in hADA3, as demonstrated by both ‘loss’ and ‘gain’ of function mutations, as well as a functional coactivator pocket of the receptor. Additionally, we show that hADA3 associates with RARα target gene promoter in a hormone-dependent manner and ADA3 knockdown impairs RARβ2 expression. Furthermore, a structural model was established to illustrate an interaction network within the ADA3/RARα complex. These results suggest that hADA3 is a bona fide transcriptional coactivator for RARα, acting through a conserved mechanism involving direct contacts between NR boxes and the receptor’s co-activator pocket. PMID:20413580

  10. Direct Imaging of Hippocampal Epileptiform Calcium Motifs Following Kainic Acid Administration in Freely Behaving Mice

    PubMed Central

    Berdyyeva, Tamara K.; Frady, E. Paxon; Nassi, Jonathan J.; Aluisio, Leah; Cherkas, Yauheniya; Otte, Stephani; Wyatt, Ryan M.; Dugovic, Christine; Ghosh, Kunal K.; Schnitzer, Mark J.; Lovenberg, Timothy; Bonaventure, Pascal

    2016-01-01

    Prolonged exposure to abnormally high calcium concentrations is thought to be a core mechanism underlying hippocampal damage in epileptic patients; however, no prior study has characterized calcium activity during seizures in the live, intact hippocampus. We have directly investigated this possibility by combining whole-brain electroencephalographic (EEG) measurements with microendoscopic calcium imaging of pyramidal cells in the CA1 hippocampal region of freely behaving mice treated with the pro-convulsant kainic acid (KA). We observed that KA administration led to systematic patterns of epileptiform calcium activity: a series of large-scale, intensifying flashes of increased calcium fluorescence concurrent with a cluster of low-amplitude EEG waveforms. This was accompanied by a steady increase in cellular calcium levels (>5 fold increase relative to the baseline), followed by an intense spreading calcium wave characterized by a 218% increase in global mean intensity of calcium fluorescence (n = 8, range [114–349%], p < 10^-4; t-test). The wave had no consistent EEG phenotype and occurred before the onset of motor convulsions. Similar changes in calcium activity were also observed in animals treated with 2 different proconvulsant agents, N-methyl-D-aspartate (NMDA) and pentylenetetrazol (PTZ), suggesting the measured changes in calcium dynamics are a signature of seizure activity rather than a KA-specific pathology. Additionally, despite reducing the behavioral severity of KA-induced seizures, the anticonvulsant drug valproate (VA, 300 mg/kg) did not modify the observed abnormalities in calcium dynamics. These results confirm the presence of pathological calcium activity preceding convulsive motor seizures and support calcium as a candidate signaling molecule in a pathway connecting seizures to subsequent cellular damage. Integrating in vivo calcium imaging with traditional assessment of seizures could potentially increase translatability of pharmacological

  11. Vertebrate mRNAs with a 5'-terminal pyrimidine tract are candidates for translational repression in quiescent cells: characterization of the translational cis-regulatory element.

    PubMed Central

    Avni, D; Shama, S; Loreni, F; Meyuhas, O

    1994-01-01

    The translation of mammalian ribosomal protein (rp) mRNAs is selectively repressed in nongrowing cells. This response is mediated through a regulatory element residing in the 5' untranslated region of these mRNAs and includes a 5' terminal oligopyrimidine tract (5' TOP). To further characterize the translational cis-regulatory element, we monitored the translational behavior of various endogenous and heterologous mRNAs or hybrid transcripts derived from transfected chimeric genes. The translational efficiency of these mRNAs was assessed in cells that either were growing normally or were growth arrested under various physiological conditions. Our experiments have yielded the following results: (i) the translation of mammalian rp mRNAs is properly regulated in amphibian cells, and likewise, amphibian rp mRNA is regulated in mammalian cells, indicating that all of the elements required for translation control of rp mRNAs are conserved among vertebrate classes; (ii) selective translational control is not confined to rp mRNAs, as mRNAs encoding the naturally occurring ubiquitin-rp fusion protein and elongation factor 1 alpha, which contain a 5' TOP, also conform to this mode of regulation; (iii) rat rpP2 mRNA contains only five pyrimidines in its 5' TOP, yet this mRNA is translationally controlled in the same fashion as other rp mRNAs with a 5' TOP of eight or more pyrimidines; (iv) full manifestation of this mode of regulation seems to require both the 5' TOP and sequences immediately downstream; and (v) an intact translational regulatory element from rpL32 mRNA fails to exert its regulatory properties even when preceded by a single A residue. PMID:8196625
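    The length of a 5' terminal oligopyrimidine tract can be measured by counting the uninterrupted pyrimidines at the very start of a transcript. This is only a sketch: the function name and the toy sequences are assumptions made for illustration, and finding (iv) above indicates that tract length alone would not predict regulation.

```python
PYRIMIDINES = set("CTU")

def top_length(transcript_5prime):
    """Count consecutive pyrimidines starting at the first (cap-site) nucleotide."""
    length = 0
    for base in transcript_5prime.upper():
        if base not in PYRIMIDINES:
            break
        length += 1
    return length

# Toy 5' ends (hypothetical sequences).
print(top_length("CTCTTTCCAGG"))   # tract begins at position 1
print(top_length("ACTCTTTCCAGG"))  # a leading A means the tract is no longer 5'-terminal
```

    The second call mirrors finding (v): once a single A precedes the tract, the pyrimidines no longer sit at the 5' terminus, so the counted length is zero.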

  12. Specific binding of the replication protein of plasmid pPS10 to direct and inverted repeats is mediated by an HTH motif.

    PubMed Central

    García de Viedma, D; Serrano-López, A; Díaz-Orejas, R

    1995-01-01

    The initiator protein of the plasmid pPS10, RepA, has a putative helix-turn-helix (HTH) motif at its C-terminal end. RepA dimers bind to an inverted repeat at the repA promoter (repAP) to autoregulate RepA synthesis [D. García de Viedma et al. (1996) EMBO J., in press]. RepA monomers bind to four direct repeats at the origin of replication (oriV) to initiate pPS10 replication. This report shows that randomly generated mutations in RepA, associated with deficiencies in autoregulation, map either at the putative HTH motif or in its vicinity. These mutant proteins do not promote pPS10 replication and are severely affected in binding to both the repAP and oriV regions in vitro. Revertants of a mutant that map in the vicinity of the HTH motif have been obtained and correspond to a second amino acid substitution far upstream of the motif. However, reversion of mutants that map in the helices of the motif occurs less frequently, at least by an order of magnitude. All these data indicate that the helices of the HTH motif play an essential role in specific RepA-DNA interactions, although additional regions also seem to be involved in DNA binding activity. Some mutations have slightly different effects in replication and autoregulation, suggesting that the role of the HTH motif in the interaction of RepA dimers or monomers with their respective DNA targets (IR or DR) is not the same. PMID:8559664

  13. Epsilon glutathione transferases possess a unique class-conserved subunit interface motif that directly interacts with glutathione in the active site

    PubMed Central

    Wongsantichon, Jantana; Robinson, Robert C.; Ketterman, Albert J.

    2015-01-01

    Epsilon class glutathione transferases (GSTs) have been shown to contribute significantly to insecticide resistance. We report a new Epsilon class protein crystal structure from Drosophila melanogaster for the glutathione transferase DmGSTE6. The structure reveals a novel Epsilon clasp motif that is conserved across hundreds of millions of years of evolution of the insect Diptera order. This histidine-serine motif lies in the subunit interface and appears to contribute to quaternary stability as well as directly connecting the two glutathiones in the active sites of this dimeric enzyme. PMID:26487708

  14. Computation-Based Discovery of Related Transcriptional Regulatory Modules and Motifs Using an Experimentally Validated Combinatorial Model

    PubMed Central

    Halfon, Marc S.; Grad, Yonatan; Church, George M.; Michelson, Alan M.

    2002-01-01

    Gene expression is regulated by transcription factors that interact with cis-regulatory elements. Predicting these elements from sequence data has proven difficult. We describe here a successful computational search for elements that direct expression in a particular temporal-spatial pattern in the Drosophila embryo, based on a single well characterized enhancer model. The fly genome was searched to identify sequence elements containing the same combination of transcription factors as those found in the model. Experimental evaluation of the search results demonstrates that our method can correctly predict regulatory elements and highlights the importance of functional testing as a means of identifying false-positive results. We also show that the search results enable the identification of additional relevant sequence motifs whose functions can be empirically validated. This approach, combined with gene expression and phylogenetic sequence data, allows for genome-wide identification of related regulatory elements, an important step toward understanding the genetic regulatory networks involved in development. [Sequence data reported in this paper have been deposited in GenBank with accession nos. AF513981 (Eve MHE) and AF513982 (Hbr DME). Supplementary material is available online at http://www.genome.org. The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: R. Blackman] PMID:12097338

  15. Subtle Changes in Motif Positioning Cause Tissue-Specific Effects on Robustness of an Enhancer's Activity

    PubMed Central

    Erceg, Jelena; Saunders, Timothy E.; Girardot, Charles; Devos, Damien P.; Hufnagel, Lars; Furlong, Eileen E. M.

    2014-01-01

    Deciphering the specific contribution of individual motifs within cis-regulatory modules (CRMs) is crucial to understanding how gene expression is regulated and how this process is affected by sequence variation. But despite vast improvements in the ability to identify where transcription factors (TFs) bind throughout the genome, we are limited in our ability to relate information on motif occupancy to function from sequence alone. Here, we engineered 63 synthetic CRMs to systematically assess the relationship between variation in the content and spacing of motifs within CRMs to CRM activity during development using Drosophila transgenic embryos. In over half the cases, very simple elements containing only one or two types of TF binding motifs were capable of driving specific spatio-temporal patterns during development. Different motif organizations provide different degrees of robustness to enhancer activity, ranging from binary on-off responses to more subtle effects including embryo-to-embryo and within-embryo variation. By quantifying the effects of subtle changes in motif organization, we were able to model biophysical rules that explain CRM behavior and may contribute to the spatial positioning of CRM activity in vivo. For the same enhancer, the effects of small differences in motif positions varied in developmentally related tissues, suggesting that gene expression may be more susceptible to sequence variation in one tissue compared to another. This result has important implications for human eQTL studies in which many associated mutations are found in cis-regulatory regions, though the mechanism for how they affect tissue-specific gene expression is often not understood. PMID:24391522

  16. Developmental appearance of factors that bind specifically to cis-regulatory sequences of a gene expressed in the sea urchin embryo.

    PubMed

    Calzone, F J; Thézé, N; Thiebaud, P; Hill, R L; Britten, R J; Davidson, E H

    1988-09-01

    Previous gene-transfer experiments have identified a 2500-nucleotide 5' domain of the CyIIIa cytoskeletal actin gene, which contains cis-regulatory sequences that are necessary and sufficient for spatial and temporal control of CyIIIa gene expression during embryogenesis. This gene is activated in late cleavage, exclusively in aboral ectoderm cell lineages. In this study, we focus on interactions demonstrated in vitro between sequences of the regulatory domain and proteins present in crude extracts derived from sea urchin embryo nuclei and from unfertilized eggs. Quantitative gel-shift measurements are utilized to estimate minimum numbers of factor molecules per embryo at 24 hr postfertilization, when the CyIIIa gene is active, at 7 hr, when it is still silent, and in the unfertilized egg. We also estimate the binding affinity preferences (Kr) of the various factors for their respective sites, relative to their affinity for synthetic DNA competitors. At least 14 different specific interactions occur within the regulatory regions, some of which produce multiple DNA-protein complexes. Values of Kr range from approximately 2 × 10^4 to approximately 2 × 10^6 for these factors under the conditions applied. With one exception, the minimum factor prevalences that we measured in the 400-cell 24-hr embryo nuclear extracts fell within the range of 2 × 10^5 to 2 × 10^6 molecules per embryo, i.e., a few hundred to a few thousand molecules per nucleus. Three developmental patterns were observed with respect to factor prevalence: Factors reacting at one site were found in unfertilized egg cytoplasm at about the same level per egg or embryo as in 24-hr embryo nuclei; factors reacting with five other regions of the regulatory domain are not detectable in egg cytoplasm but in 7-hr mid-cleavage-stage embryo nuclei are already at or close to their concentrations in the 24-hr embryo nuclei; and factors reacting with five additional regions are not detectable in egg cytoplasm and

  17. WordSpy: identifying transcription factor binding motifs by building a dictionary and learning a grammar

    PubMed Central

    Wang, Guandong; Yu, Taotao; Zhang, Weixiong

    2005-01-01

    Transcription factor (TF) binding sites or motifs (TFBMs) are functional cis-regulatory DNA sequences that play an essential role in gene transcriptional regulation. Although many experimental and computational methods have been developed, finding TFBMs remains a challenging problem. We propose and develop a novel dictionary based motif finding algorithm, which we call WordSpy. One significant feature of WordSpy is the combination of a word counting method and a statistical model which consists of a dictionary of motifs and a grammar specifying their usage. The algorithm is suitable for genome-wide motif finding; it is capable of discovering hundreds of motifs from a large set of promoters in a single run. We further enhance WordSpy by applying gene expression information to separate true TFBMs from spurious ones, and by incorporating negative sequences to identify discriminative motifs. In addition, we also use randomly selected promoters from the genome to evaluate the significance of the discovered motifs. The output from WordSpy consists of an ordered list of putative motifs and a set of regulatory sequences with motif binding sites highlighted. The web server of WordSpy is available at . PMID:15980501

  18. Exploiting powder X-ray diffraction for direct structure determination in structural biology: the P2X4 receptor trafficking motif YEQGL.

    PubMed

    Fujii, Kotaro; Young, Mark T; Harris, Kenneth D M

    2011-06-01

    We report the crystal structure of the 5-residue peptide acetyl-YEQGL-amide, determined directly from powder X-ray diffraction data recorded on a conventional laboratory X-ray powder diffractometer. The YEQGL motif has a known biological role, as a trafficking motif in the C-terminus of mammalian P2X4 receptors. Comparison of the crystal structure of acetyl-YEQGL-amide determined here and that of a complex formed with the μ2 subunit of the clathrin adaptor protein complex AP2 reported previously, reveals differences in conformational properties, although there are nevertheless similarities concerning aspects of the hydrogen-bonding arrangement and the hydrophobic environment of the leucine sidechain. Our results demonstrate the potential for exploiting modern powder X-ray diffraction methodology to achieve complete structure determination of materials of biological interest that do not crystallize as single crystals of suitable size and quality for single-crystal X-ray diffraction.

  19. The Significance of Multivalent Bonding Motifs and “Bond Order” in DNA-Directed Nanoparticle Crystallization

    SciTech Connect

    Thaner, Ryan V.; Eryazici, Ibrahim; Macfarlane, Robert J.; Brown, Keith A.; Lee, Byeongdu; Nguyen, SonBinh T.; Mirkin, Chad A.

    2016-05-18

    Multivalent oligonucleotide-based bonding elements have been synthesized and studied for the assembly and crystallization of gold nanoparticles. Through the use of organic branching points, divalent and trivalent DNA linkers were readily incorporated into the oligonucleotide shells that define DNA-nanoparticles and compared to monovalent linker systems. These multivalent bonding motifs enable the change of "bond strength" between particles and therefore modulate the effective "bond order." In addition, the improved accessibility of strands between neighboring particles, either due to multivalency or modifications to increase strand flexibility, gives rise to superlattices with less strain in the crystallites compared to traditional designs. Furthermore, the increased availability and number of binding modes also provide a new variable that allows previously unobserved crystal structures to be synthesized, as evidenced by the formation of a thorium phosphide superlattice.

  20. Unique ζ-chain motifs mediate a direct TCR-actin linkage critical for immunological synapse formation and T-cell activation.

    PubMed

    Klieger, Yair; Almogi-Hazan, Osnat; Ish-Shalom, Eliran; Pato, Aviad; Pauker, Maor H; Barda-Saad, Mira; Wang, Lynn; Baniyash, Michal

    2014-01-01

    TCR-mediated activation induces receptor microclusters that evolve to a defined immune synapse (IS). Many studies showed that actin polymerization and remodeling, which create a scaffold critical to IS formation and stabilization, are TCR mediated. However, the mechanisms controlling simultaneous TCR and actin dynamic rearrangement in the IS are not yet fully understood. Herein, we identify two novel TCR ζ-chain motifs, mediating the TCR's direct interaction with actin and inducing actin bundling. While T cells expressing the ζ-chain mutated in these motifs lack cytoskeleton (actin) associated (cska)-TCRs, they express normal levels of non-cska and surface TCRs as cells expressing wild-type ζ-chain. However, such mutant cells are unable to display activation-dependent TCR clustering, IS formation, expression of CD25/CD69 activation markers, or produce/secrete cytokine, effects also seen in the corresponding APCs. We are the first to show a direct TCR-actin linkage, providing the missing link between TCR-mediated Ag recognition, specific cytoskeleton orientation toward the T-cell-APC interacting pole and long-lived IS maintenance.

  1. A novel sorting motif in the glutamate transporter excitatory amino acid transporter 3 directs its targeting in Madin-Darby canine kidney cells and hippocampal neurons.

    PubMed

    Cheng, Chialin; Glover, Greta; Banker, Gary; Amara, Susan G

    2002-12-15

    The glutamate transporter excitatory amino acid transporter 3 (EAAT3) is polarized to the apical surface in epithelial cells and localized to the dendritic compartment in hippocampal neurons, where it is clustered adjacent to postsynaptic sites. In this study, we analyzed the sequences in EAAT3 that are responsible for its polarized localization in Madin-Darby canine kidney (MDCK) cells and neurons. Confocal microscopy and cell surface biotinylation assays demonstrated that deletion of the EAAT3 C terminus or replacement of the C terminus of EAAT3 with the analogous region in EAAT1 eliminated apical localization in MDCK cells. The C terminus of EAAT3 was sufficient to redirect the basolateral-preferring EAAT1 and the nonpolarized EAAT2 to the apical surface. Using alanine substitution mutants, we identified a short peptide motif in the cytoplasmic C-terminal region of EAAT3 that directs its apical localization in MDCK cells. Mutation of this sequence also impairs dendritic targeting of EAAT3 in hippocampal neurons but does not interfere with the clustering of EAAT3 on dendritic spines and filopodia. These data provide the first evidence that an identical cytoplasmic motif can direct apical targeting in epithelia and somatodendritic targeting in neurons. Moreover, our results demonstrate that the two fundamental features of the localization of EAAT3 in neurons, its restriction to the somatodendritic domain and its clustering near postsynaptic sites, are mediated by distinct molecular mechanisms.

  2. A systematic approach to identify functional motifs within vertebrate developmental enhancers

    PubMed Central

    Li, Qiang; Ritter, Deborah; Yang, Nan; Dong, Zhiqiang; Li, Hao; Chuang, Jeffrey H.; Guo, Su

    2012-01-01

    Uncovering the cis-regulatory logic of developmental enhancers is critical to understanding the role of non-coding DNA in development. However, it is cumbersome to identify functional motifs within enhancers, and thus few vertebrate enhancers have their core functional motifs revealed. Here we report a combined experimental and computational approach for discovering regulatory motifs in developmental enhancers. Making use of the zebrafish gene expression database, we computationally identified conserved non-coding elements (CNEs) likely to have a desired tissue-specificity based on the expression of nearby genes. Through a high throughput and robust enhancer assay, we tested the activity of ~100 such CNEs and efficiently uncovered developmental enhancers with desired spatial and temporal expression patterns in the zebrafish brain. Application of de novo motif prediction algorithms on a group of forebrain enhancers identified five top-ranked motifs, all of which were experimentally validated as critical for forebrain enhancer activity. These results demonstrate a systematic approach to discover important regulatory motifs in vertebrate developmental enhancers. Moreover, this dataset provides a useful resource for further dissection of vertebrate brain development and function. PMID:19850031

  3. Carbonyl-carbonyl interactions and amide π-stacking as the directing motifs of the supramolecular assembly of ethyl N-(2-acetylphenyl)oxalamate in a synperiplanar conformation.

    PubMed

    Cabrera-Pérez, Laura C; García-Báez, Efrén V; Franco-Hernández, Marina O; Martínez-Martínez, Francisco J; Padilla-Martínez, Itzia I

    2015-05-01

    The title compound, C12H13NO4, is one of the few examples that exhibits a syn conformation between the amide and ester carbonyl groups of the oxalyl group. This conformation allows the engagement of the amide H atom in an intramolecular three-centred hydrogen-bonding S(6)S(5) motif. The compound is self-assembled by C=O...C=O and amide-π interactions into stacked columns along the b-axis direction. The concurrence of both interactions seems to be responsible for stabilizing the observed syn conformation between the carbonyl groups. The second dimension, along the a-axis direction, is developed by soft C-H...O hydrogen bonding. Density functional theory (DFT) calculations at the B3LYP/6-31G(d,p) level of theory were performed to support the experimental findings.

  4. QGRS-H Predictor: a web server for predicting homologous quadruplex forming G-rich sequence motifs in nucleotide sequences

    PubMed Central

    Menendez, Camille; Frees, Scott; Bagga, Paramjeet S.

    2012-01-01

    Naturally occurring G-quadruplex structural motifs, formed by guanine-rich nucleic acids, have been reported in telomeric, promoter and transcribed regions of mammalian genomes. G-quadruplex structures have received significant attention because of growing evidence for their role in important biological processes, human disease and as therapeutic targets. Lately, there has been much interest in the potential roles of RNA G-quadruplexes as cis-regulatory elements of post-transcriptional gene expression. Large-scale computational genomics studies on G-quadruplexes have difficulty validating their predictions without laborious testing in ‘wet’ labs. We have developed a bioinformatics tool, QGRS-H Predictor that can map and analyze conserved putative Quadruplex forming 'G'-Rich Sequences (QGRS) in mRNAs, ncRNAs and other nucleotide sequences, e.g. promoter, telomeric and gene flanking regions. Identifying conserved regulatory motifs helps validate computations and enhances accuracy of predictions. The QGRS-H Predictor is particularly useful for mapping homologous G-quadruplex forming sequences as cis-regulatory elements in the context of 5′- and 3′-untranslated regions, and CDS sections of aligned mRNA sequences. QGRS-H Predictor features highly interactive graphic representation of the data. It is a unique and user-friendly application that provides many options for defining and studying G-quadruplexes. The QGRS-H Predictor can be freely accessed at: http://quadruplex.ramapo.edu/qgrs/app/start. PMID:22576365
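
    As a rough illustration of what scanning a single sequence for a putative G-quadruplex forming motif can look like, the sketch below uses a regular expression. It is not the QGRS-H Predictor algorithm. The pattern (four G-runs separated by three loops), the function name `find_qgrs`, and the defaults `min_g` and `max_loop` are assumptions chosen for this example, and no homology or conservation step is attempted.

```python
import re

def find_qgrs(seq, min_g=2, max_loop=7):
    """Return (start, matched_text) for non-overlapping candidate motifs.

    Assumed pattern: four runs of at least `min_g` guanines separated by
    three loops of 1..max_loop nucleotides. Both thresholds are
    illustrative defaults, not values taken from the abstract.
    """
    # Treat RNA input like DNA so a single pattern covers both.
    seq = seq.upper().replace("U", "T")
    g_run = "G{%d,}" % min_g
    loop = "[ACGT]{1,%d}" % max_loop
    pattern = re.compile(g_run + (loop + g_run) * 3)
    return [(m.start(), m.group(0)) for m in pattern.finditer(seq)]
```

    Called on four telomeric repeats with `min_g=3`, for example `find_qgrs("TTAGGGTTAGGGTTAGGGTTAGGG", min_g=3)`, the scan reports one candidate beginning at the first G-run.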

  5. Control of recombination directionality by the Listeria phage A118 protein Gp44 and the coiled-coil motif of its serine integrase.

    PubMed

    Mandali, Sridhar; Gupta, Kushol; Dawson, Anthony R; Van Duyne, Gregory D; Johnson, Reid C

    2017-03-13

    The serine integrase of phage A118 catalyzes integrative recombination between attP on the phage and a specific attB locus on the chromosome of Listeria monocytogenes but is unable to promote excisive recombination between the hybrid attL and attR sites found on the integrated prophage without assistance from a Recombination Directionality Factor (RDF). We have identified and characterized the phage-encoded RDF, Gp44, which activates the A118 integrase for excision and inhibits integration. Gp44 binds to the C-terminal DNA binding domain of integrase, and we have localized the primary binding site to be within the mobile coiled-coil (CC) motif but distinct from the distal tip of the CC that is required for recombination. This interaction is sufficient to inhibit integration, but a second interaction involving the N-terminal end of Gp44 is also required to activate excision. We provide evidence that these two contacts modulate the trajectory of the CC motifs as they extend out from the integrase core in a manner dependent upon the identity of the four att sites. Our results support a model whereby Gp44 shapes the Int-bound complexes to control which att sites can synapse and recombine. IMPORTANCE Serine integrases mediate directional recombination between bacteriophage and bacterial chromosomes. These highly regulated site-specific recombination reactions are integral to the life cycle of temperate phage, and in the case of Listeria monocytogenes lysogenized by A118-family phage, are an essential virulence determinant. Serine integrases are also utilized as tools for genetic engineering and synthetic biology because of their exquisite unidirectional control of the DNA exchange reaction. Here we identify and characterize the Recombination Directionality Factor (RDF) that activates excision and inhibits integration reactions by the phage A118 integrase. We provide evidence that the A118 RDF binds to and modulates the trajectory of the long coiled-coil motif that extends

  6. [Personal motif in art].

    PubMed

    Gerevich, József

    2015-01-01

    One of the basic questions of art psychology is whether a personal motif is to be found behind works of art and, if so, how openly or indirectly it appears in the work itself. Analysis of examples and documents from the fine arts and literature allows us to conclude that the personal motif, which the viewer can identify through symbols, at times easily and at others with more difficulty, adds emotional weight to the artistic product. The personal motif may be found in traumatic experiences, in communication with the model or with other emotionally important persons (mourning, disappointment, revenge, hatred, rivalry, revolt etc.), in self-searching, or in self-analysis. The emotions are expressed in artistic activity either directly or indirectly. The intention nourished by the artist's identity (Kunstwollen) may stand in the way of spontaneous self-expression, channelling it into hidden paths. Under the influence of certain circumstances, the artist may arouse in the viewer, consciously or unconsciously, an illusory, misleading image of himself. An examination of the personal motif is one of the important research areas of art therapy.

  7. Redundant ERF-VII Transcription Factors Bind to an Evolutionarily Conserved cis-Motif to Regulate Hypoxia-Responsive Gene Expression in Arabidopsis

    PubMed Central

    Gasch, Philipp; Fundinger, Moritz; Müller, Jana T.; Lee, Travis; Mustroph, Angelika

    2016-01-01

    The response of Arabidopsis thaliana to low-oxygen stress (hypoxia), such as during shoot submergence or root waterlogging, includes increasing the levels of ∼50 hypoxia-responsive gene transcripts, many of which encode enzymes associated with anaerobic metabolism. Upregulation of over half of these mRNAs involves stabilization of five group VII ethylene response factor (ERF-VII) transcription factors, which are routinely degraded via the N-end rule pathway of proteolysis in an oxygen- and nitric oxide-dependent manner. Despite their importance, neither the quantitative contribution of individual ERF-VIIs nor the cis-regulatory elements they govern are well understood. Here, using single- and double-null mutants, the constitutively synthesized ERF-VIIs RELATED TO APETALA2.2 (RAP2.2) and RAP2.12 are shown to act redundantly as principal activators of hypoxia-responsive genes; constitutively expressed RAP2.3 contributes to this redundancy, whereas the hypoxia-induced HYPOXIA RESPONSIVE ERF1 (HRE1) and HRE2 play minor roles. An evolutionarily conserved 12-bp cis-regulatory motif that binds to and is sufficient for activation by RAP2.2 and RAP2.12 is identified through a comparative phylogenetic motif search, promoter dissection, yeast one-hybrid assays, and chromatin immunopurification. This motif, designated the hypoxia-responsive promoter element, is enriched in promoters of hypoxia-responsive genes in multiple species. PMID:26668304

  8. Arabidopsis Flower and Embryo Developmental Genes are Repressed in Seedlings by Different Combinations of Polycomb Group Proteins in Association with Distinct Sets of Cis-regulatory Elements

    PubMed Central

    Liu, Jian; Zhang, Lei; He, Chongsheng; Shen, Wen-Hui; Jin, Hong; Xu, Lin; Zhang, Yijing

    2016-01-01

    Polycomb repressive complexes (PRCs) play crucial roles in transcriptional repression and developmental regulation in both plants and animals. In plants, depletion of different members of PRCs causes both overlapping and unique phenotypic defects. However, the underlying molecular mechanism determining the target specificity and functional diversity is not sufficiently characterized. Here, we quantitatively compared changes of tri-methylation at H3K27 in Arabidopsis mutants deprived of various key PRC components. We show that CURLY LEAF (CLF), a major catalytic subunit of PRC2, coordinates with different members of PRC1 in suppression of distinct plant developmental programs. We found that expression of flower development genes is repressed in seedlings preferentially via non-redundant role of CLF, which specifically associated with LIKE HETEROCHROMATIN PROTEIN1 (LHP1). In contrast, expression of embryo development genes is repressed by PRC1-catalytic core subunits AtBMI1 and AtRING1 in common with PRC2-catalytic enzymes CLF or SWINGER (SWN). This context-dependent role of CLF corresponds well with the change in H3K27me3 profiles, and is remarkably associated with differential co-occupancy of binding motifs of transcription factors (TFs), including MADS box and ABA-related factors. We propose that different combinations of PRC members distinctively regulate different developmental programs, and their target specificity is modulated by specific TFs. PMID:26760036

  9. Arabidopsis Flower and Embryo Developmental Genes are Repressed in Seedlings by Different Combinations of Polycomb Group Proteins in Association with Distinct Sets of Cis-regulatory Elements.

    PubMed

    Wang, Hua; Liu, Chunmei; Cheng, Jingfei; Liu, Jian; Zhang, Lei; He, Chongsheng; Shen, Wen-Hui; Jin, Hong; Xu, Lin; Zhang, Yijing

    2016-01-01

    Polycomb repressive complexes (PRCs) play crucial roles in transcriptional repression and developmental regulation in both plants and animals. In plants, depletion of different members of PRCs causes both overlapping and unique phenotypic defects. However, the underlying molecular mechanism determining the target specificity and functional diversity is not sufficiently characterized. Here, we quantitatively compared changes of tri-methylation at H3K27 in Arabidopsis mutants deprived of various key PRC components. We show that CURLY LEAF (CLF), a major catalytic subunit of PRC2, coordinates with different members of PRC1 in suppression of distinct plant developmental programs. We found that expression of flower development genes is repressed in seedlings preferentially via non-redundant role of CLF, which specifically associated with LIKE HETEROCHROMATIN PROTEIN1 (LHP1). In contrast, expression of embryo development genes is repressed by PRC1-catalytic core subunits AtBMI1 and AtRING1 in common with PRC2-catalytic enzymes CLF or SWINGER (SWN). This context-dependent role of CLF corresponds well with the change in H3K27me3 profiles, and is remarkably associated with differential co-occupancy of binding motifs of transcription factors (TFs), including MADS box and ABA-related factors. We propose that different combinations of PRC members distinctively regulate different developmental programs, and their target specificity is modulated by specific TFs.

  10. Nucleosomes, Linker DNA, and Linker Histone form a Unique Structural Motif that Directs the Higher-Order Folding and Compaction of Chromatin

    NASA Astrophysics Data System (ADS)

    Bednar, Jan; Horowitz, Rachel A.; Grigoryev, Sergei A.; Carruthers, Lenny M.; Hansen, Jeffrey C.; Koster, Abraham J.; Woodcock, Christopher L.

    1998-11-01

    The compaction level of arrays of nucleosomes may be understood in terms of the balance between the self-repulsion of DNA (principally linker DNA) and countering factors including the ionic strength and composition of the medium, the highly basic N termini of the core histones, and linker histones. However, the structural principles that come into play during the transition from a loose chain of nucleosomes to a compact 30-nm chromatin fiber have been difficult to establish, and the arrangement of nucleosomes and linker DNA in condensed chromatin fibers has never been fully resolved. Based on images of the solution conformation of native chromatin and fully defined chromatin arrays obtained by electron cryomicroscopy, we report a linker histone-dependent architectural motif beyond the level of the nucleosome core particle that takes the form of a stem-like organization of the entering and exiting linker DNA segments. DNA completes ≈ 1.7 turns on the histone octamer in the presence and absence of linker histone. When linker histone is present, the two linker DNA segments become juxtaposed ≈ 8 nm from the nucleosome center and remain apposed for 3-5 nm before diverging. We propose that this stem motif directs the arrangement of nucleosomes and linker DNA within the chromatin fiber, establishing a unique three-dimensional zigzag folding pattern that is conserved during compaction. Such an arrangement with peripherally arranged nucleosomes and internal linker DNA segments is fully consistent with observations in intact nuclei and also allows dramatic changes in compaction level to occur without a concomitant change in topology.

  11. Incorporating Motif Analysis into Gene Co-expression Networks Reveals Novel Modular Expression Pattern and New Signaling Pathways

    PubMed Central

    Ma, Shisong; Shah, Smit; Bohnert, Hans J.; Snyder, Michael; Dinesh-Kumar, Savithramma P.

    2013-01-01

    Understanding of gene regulatory networks requires discovery of expression modules within gene co-expression networks and identification of promoter motifs and corresponding transcription factors that regulate their expression. A commonly used method for this purpose is a top-down approach based on clustering the network into a range of densely connected segments, treating these segments as expression modules, and extracting promoter motifs from these modules. Here, we describe a novel bottom-up approach to identify gene expression modules driven by known cis-regulatory motifs in the gene promoters. For a specific motif, genes in the co-expression network are ranked according to their probability of belonging to an expression module regulated by that motif. The ranking is conducted via motif enrichment or motif position bias analysis. Our results indicate that motif position bias analysis is an effective tool for genome-wide motif analysis. Sub-networks containing the top ranked genes are extracted and analyzed for inherent gene expression modules. This approach identified novel expression modules for the G-box, W-box, site II, and MYB motifs from an Arabidopsis thaliana gene co-expression network based on the graphical Gaussian model. The novel expression modules include those involved in house-keeping functions, primary and secondary metabolism, and abiotic and biotic stress responses. In addition to confirmation of previously described modules, we identified modules that include new signaling pathways. To associate transcription factors that regulate genes in these co-expression modules, we developed a novel reporter system. Using this approach, we evaluated MYB transcription factor-promoter interactions within MYB motif modules. PMID:24098147
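
    The abstract mentions motif enrichment without specifying the statistic. One common way to score over-representation of a motif in a gene set is a one-sided hypergeometric test, sketched below. This is a generic illustration rather than the authors' ranking procedure. The function name `hypergeom_enrichment_p` and its parameters are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_total, n_with_motif, n_module, n_module_with_motif):
    """One-sided hypergeometric p-value for motif over-representation.

    Gives the probability of seeing at least `n_module_with_motif`
    motif-bearing genes when `n_module` genes are drawn without
    replacement from `n_total` genes, of which `n_with_motif` carry the
    motif. All argument names are illustrative assumptions.
    """
    # The overlap cannot exceed either the module size or the motif set size.
    upper = min(n_with_motif, n_module)
    tail = sum(
        comb(n_with_motif, k) * comb(n_total - n_with_motif, n_module - k)
        for k in range(n_module_with_motif, upper + 1)
    )
    return tail / comb(n_total, n_module)
```

    A small p-value suggests the motif occurs in the gene set more often than random sampling would predict. When many motifs or gene sets are tested, a multiple-testing correction would normally be applied as well.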

  12. An Intronic cis-Regulatory Element Is Crucial for the Alpha Tubulin Pl-Tuba1a Gene Activation in the Ciliary Band and Animal Pole Neurogenic Domains during Sea Urchin Development

    PubMed Central

    Cuttitta, Angela; Gianguzza, Fabrizio; Ragusa, Maria Antonietta

    2017-01-01

    In sea urchin development, structures derived from neurogenic territory control the swimming and feeding responses of the pluteus as well as the process of metamorphosis. We have previously isolated an alpha tubulin family member of Paracentrotus lividus (Pl-Tuba1a, formerly known as Pl-Talpha2) that is specifically expressed in the ciliary band and animal pole neurogenic domains of the sea urchin embryo. In order to identify cis-regulatory elements controlling its spatio-temporal expression, we conducted gene transfer experiments, transgene deletions and site specific mutagenesis. Thus, a genomic region of about 2.6 Kb of Pl-Tuba1a, containing four Interspecifically Conserved Regions (ICRs), was identified as responsible for proper gene expression. An enhancer role was ascribed to ICR1 and ICR2, while ICR3 exerted a pivotal role in basal expression, restricting Tuba1a expression to the proper territories of the embryo. Additionally, the mutation of the forkhead box consensus sequence binding site in ICR3 prevented Pl-Tuba1a expression. PMID:28141828

  13. Drosophila melanogaster Hox Transcription Factors Access the RNA Polymerase II Machinery through Direct Homeodomain Binding to a Conserved Motif of Mediator Subunit Med19

    PubMed Central

    Boube, Muriel; Hudry, Bruno; Immarigeon, Clément; Carrier, Yannick; Bernat-Fabre, Sandra; Merabet, Samir; Graba, Yacine; Bourbon, Henri-Marc; Cribbs, David L.

    2014-01-01

    Hox genes in species across the metazoa encode transcription factors (TFs) containing highly-conserved homeodomains that bind target DNA sequences to regulate batteries of developmental target genes. DNA-bound Hox proteins, together with other TF partners, induce an appropriate transcriptional response by RNA Polymerase II (PolII) and its associated general transcription factors. How the evolutionarily conserved Hox TFs interface with this general machinery to generate finely regulated transcriptional responses remains obscure. One major component of the PolII machinery, the Mediator (MED) transcription complex, is composed of roughly 30 protein subunits organized in modules that bridge the PolII enzyme to DNA-bound TFs. Here, we investigate the physical and functional interplay between Drosophila melanogaster Hox developmental TFs and MED complex proteins. We find that the Med19 subunit directly binds Hox homeodomains, in vitro and in vivo. Loss-of-function Med19 mutations act as dose-sensitive genetic modifiers that synergistically modulate Hox-directed developmental outcomes. Using clonal analysis, we identify a role for Med19 in Hox-dependent target gene activation. We identify a conserved, animal-specific motif that is required for Med19 homeodomain binding, and for activation of a specific Ultrabithorax target. These results provide the first direct molecular link between Hox homeodomain proteins and the general PolII machinery. They support a role for Med19 as a PolII holoenzyme-embedded “co-factor” that acts together with Hox proteins through their homeodomains in regulated developmental transcription. PMID:24786462

  14. Analyses of fugu hoxa2 genes provide evidence for subfunctionalization of neural crest cell and rhombomere cis-regulatory modules during vertebrate evolution.

    PubMed

    McEllin, Jennifer A; Alexander, Tara B; Tümpel, Stefan; Wiedemann, Leanne M; Krumlauf, Robb

    2016-01-15

    The Hoxa2 gene is a primary player in regulation of craniofacial programs of head development in vertebrates. Here we investigate the evolution of a Hoxa2 neural crest enhancer identified originally in mouse by comparing and contrasting the fugu hoxa2a and hoxa2b genes with their orthologous teleost and mammalian sequences. Using sequence analyses in combination with transgenic regulatory assays in zebrafish and mouse embryos we demonstrate subfunctionalization of regulatory activity for expression in hindbrain segments and neural crest cells between these two fugu co-orthologs. hoxa2a regulatory sequences have retained the ability to mediate expression in neural crest cells while those of hoxa2b include cis-elements that direct expression in rhombomeres. Functional dissection of the neural crest regulatory potential of the fugu hoxa2a and hoxa2b genes identifies the previously unknown cis-element NC5, which is implicated in generating the differential activity of the enhancers from these genes. The NC5 region plays a similar role in the ability of this enhancer to mediate reporter expression in mice, suggesting it is a conserved component involved in control of neural crest expression of Hoxa2 in vertebrate craniofacial development.

  15. Tissue- and stage-specific Wnt target gene expression is controlled subsequent to β-catenin recruitment to cis-regulatory modules

    PubMed Central

    Nakamura, Yukio; de Paiva Alves, Eduardo; Veenstra, Gert Jan C.; Hoppler, Stefan

    2016-01-01

    Key signalling pathways, such as canonical Wnt/β-catenin signalling, operate repeatedly to regulate tissue- and stage-specific transcriptional responses during development. Although recruitment of nuclear β-catenin to target genomic loci serves as the hallmark of canonical Wnt signalling, mechanisms controlling stage- or tissue-specific transcriptional responses remain elusive. Here, a direct comparison of genome-wide occupancy of β-catenin with a stage-matched Wnt-regulated transcriptome reveals that only a subset of β-catenin-bound genomic loci are transcriptionally regulated by Wnt signalling. We demonstrate that Wnt signalling regulates β-catenin binding to Wnt target genes not only when they are transcriptionally regulated, but also in contexts in which their transcription remains unaffected. The transcriptional response to Wnt signalling depends on additional mechanisms, such as BMP or FGF signalling for the particular genes we investigated, which do not influence β-catenin recruitment. Our findings suggest a more general paradigm for Wnt-regulated transcriptional mechanisms, which is relevant for tissue-specific functions of Wnt/β-catenin signalling in embryonic development but also for stem cell-mediated homeostasis and cancer. Chromatin association of β-catenin, even to functional Wnt-response elements, can no longer be considered a proxy for identifying transcriptionally Wnt-regulated genes. Context-dependent mechanisms are crucial for transcriptional activation of Wnt/β-catenin target genes subsequent to β-catenin recruitment. Our conclusions therefore also imply that Wnt-regulated β-catenin binding in one context can mark Wnt-regulated transcriptional target genes for different contexts. PMID:27068107

  16. Selection on cis-Regulatory Variation at B4galnt2 and Its Influence on von Willebrand Factor in House Mice

    PubMed Central

    Johnsen, Jill M.; Teschke, Meike; Pavlidis, Pavlos; McGee, Beth M.; Tautz, Diethard; Baines, John F.

    2009-01-01

    The RIIIS/J inbred mouse strain is a model for type 1 von Willebrand disease (VWD), a common human bleeding disorder. Low von Willebrand factor (VWF) levels in RIIIS/J are due to a regulatory mutation, Mvwf1, which directs a tissue-specific switch in expression of a glycosyltransferase, B4GALNT2, from intestine to blood vessel. We recently found that Mvwf1 lies on a founder allele common among laboratory mouse strains. To investigate the evolutionary forces operating at B4galnt2, we conducted a survey of DNA sequence polymorphism and microsatellite variation spanning the B4galnt2 gene region in natural Mus musculus domesticus populations. Two divergent haplotypes segregate in these natural populations, one of which corresponds to the RIIIS/J sequence. Different local populations display dramatic differences in the frequency of these haplotypes, and reduced microsatellite variability near B4galnt2 within the RIIIS/J haplotype is consistent with the recent action of natural selection. The level and pattern of DNA sequence polymorphism in the 5′ flanking region of the gene significantly deviates from the neutral expectation and suggests that variation in B4galnt2 expression may be under balancing selection and/or arose from a recently introgressed allele that subsequently increased in frequency due to natural selection. However, coalescent simulations indicate that the heterogeneity in divergence between haplotypes is greater than expected under an introgression model. Analysis of a population where the RIIIS/J haplotype is in high frequency reveals an association between this haplotype, the B4galnt2 tissue-specific switch, and a significant decrease in plasma VWF levels. Given these observations, we propose that low VWF levels may represent a fitness cost that is offset by a yet unknown benefit of the B4galnt2 tissue-specific switch. Similar mechanisms may account for the variability in VWF levels and high prevalence of VWD in other mammals, including humans. PMID

  17. Characterization of the human lipoprotein lipase (LPL) promoter: evidence of two cis-regulatory regions, LP-alpha and LP-beta, of importance for the differentiation-linked induction of the LPL gene during adipogenesis.

    PubMed Central

    Enerbäck, S; Ohlsson, B G; Samuelsson, L; Bjursell, G

    1992-01-01

    When preadipocytes differentiate into adipocytes, several differentiation-linked genes are activated. Lipoprotein lipase (LPL) is one of the first genes induced during this process. To investigate early events in adipocyte development, we have focused on the transcriptional activation of the LPL gene. For this purpose, we have cloned and fused different parts of intragenic and flanking sequences with a chloramphenicol acetyltransferase reporter gene. Transient transfection experiments and DNase I hypersensitivity assays indicate that several positive as well as negative elements contribute to transcriptional regulation of the LPL gene. When reporter gene constructs were stably introduced into preadipocytes, we were able to monitor and compare the activation patterns of different promoter deletion mutants at selected time points representing the process of adipocyte development. We could delimit two cis-regulatory elements important for gradual activation of the LPL gene during adipocyte development in vitro. These elements, LP-alpha (-702 to -666) and LP-beta (-468 to -430), contain a striking similarity to a consensus sequence known to bind the transcription factors HNF-3 and fork head. Results of gel mobility shift assays and DNase I and exonuclease III in vitro protection assays indicate that factors with DNA-binding properties similar to those of the HNF-3/fork head family of transcription factors are present in adipocytes and interact with LP-alpha and LP-beta. We also demonstrate that LP-alpha and LP-beta were both capable of conferring a differentiation-linked expression pattern to a heterologous promoter, thus mimicking the expression of the endogenous LPL gene during adipocyte differentiation. These findings indicate that interactions with LP-alpha and LP-beta could be a part of a differentiation switch governing induction of the LPL gene during adipocyte differentiation. PMID:1406652

  18. PhyloGibbs-MP: module prediction and discriminative motif-finding by Gibbs sampling.

    PubMed

    Siddharthan, Rahul

    2008-08-29

    PhyloGibbs, our recent Gibbs-sampling motif-finder, takes phylogeny into account in detecting binding sites for transcription factors in DNA and assigns posterior probabilities to its predictions obtained by sampling the entire configuration space. Here, in an extension called PhyloGibbs-MP, we widen the scope of the program, addressing two major problems in computational regulatory genomics. First, PhyloGibbs-MP can localise predictions to small, undetermined regions of a large input sequence, thus effectively predicting cis-regulatory modules (CRMs) ab initio while simultaneously predicting binding sites in those modules, tasks that are usually done by two separate programs. PhyloGibbs-MP's performance at such ab initio CRM prediction is comparable with or superior to dedicated module-prediction software that uses prior knowledge of previously characterised transcription factors. Second, PhyloGibbs-MP can predict motifs that differentiate between two (or more) different groups of regulatory regions, that is, motifs that occur preferentially in one group over the others. While other "discriminative motif-finders" have been published in the literature, PhyloGibbs-MP's implementation has some unique features and flexibility. Benchmarks on synthetic and actual genomic data show that this algorithm is successful at enhancing predictions of differentiating sites and suppressing predictions of common sites and compares with or outperforms other discriminative motif-finders on actual genomic data. Additional enhancements include significant performance and speed improvements, the ability to use "informative priors" on known transcription factors, and the ability to output annotations in a format that can be visualised with the Generic Genome Browser. In stand-alone motif-finding, PhyloGibbs-MP remains competitive, outperforming PhyloGibbs-1.0 and other programs on benchmark data.

  19. DISCOVER: a feature-based discriminative method for motif search in complex genomes

    PubMed Central

    Fu, Wenjie; Ray, Pradipta; Xing, Eric P.

    2009-01-01

    Motivation: Identifying transcription factor binding sites (TFBSs) encoding complex regulatory signals in metazoan genomes remains a challenging problem in computational genomics. Due to degeneracy of nucleotide content among binding site instances or motifs, and intricate ‘grammatical organization’ of motifs within cis-regulatory modules (CRMs), extant pattern matching-based in silico motif search methods often suffer from impractically high false positive rates, especially in the context of analyzing large genomic datasets, and noisy position weight matrices which characterize binding sites. Here, we try to address this problem by using a framework to maximally utilize the information content of the genomic DNA in the region of query, taking cues from values of various biologically meaningful genetic and epigenetic factors in the query region such as clade-specific evolutionary parameters, presence/absence of nearby coding regions, etc. We present a new method for TFBS prediction in metazoan genomes that utilizes both the CRM architecture of sequences and a variety of features of individual motifs. Our proposed approach is based on a discriminative probabilistic model known as conditional random fields that explicitly optimizes the predictive probability of motif presence in large sequences, based on the joint effect of all such features. Results: This model overcomes weaknesses in earlier methods based on less effective statistical formalisms that are sensitive to spurious signals in the data. We evaluate our method on both simulated CRMs and real Drosophila sequences in comparison with a wide spectrum of existing models, and outperform the state of the art by 22% in F1 score. Availability and Implementation: The code is publicly available at http://www.sailing.cs.cmu.edu/discover.html. Contact: epxing@cs.cmu.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19478006

  20. [Psychopathological study of lie motif in schizophrenia].

    PubMed

    Otsuka, Koichiro; Kato, Satoshi

    2006-01-01

    The theme of a statement is called "lie motif" by the authors when schizophrenic patients say "I have lied to anybody". We tried to analyse the psychopathological characteristics and anthropological meanings of the lie motifs in schizophrenia, which have not been thematically examined until now, based on 4 cases, and contrasting with the lie motif (Lügenmotiv) in depression taken up by A. Kraus (1989). We classified the lie motifs in schizophrenia into the following two types: a) the past directive lie motif: the patients speak about their real lie regarding it as a 'petty fault' in their distant past with self-guilty feeling, b) the present directive lie motif: the patients say repeatedly 'I have lied' (about their present speech and behavior), retreating from their previous commitments. The observed false confessions of innocent fault by the patients seem to belong to the present directive lie motif. In comparison with the lie motif in depression, it is characteristic for the lie motif in schizophrenia that the patients feel themselves to already have been caught out by others before they confess the lie. The lie motif in schizophrenia seems to come into being through the attribution process of taking the others' blame on one's own shoulders, which has been pointed out to be common in the guilt experience in schizophrenia. The others' blame on this occasion is due to "the others' gaze" in the experience of the initial self-centralization (i.e. non delusional self-referential experience) in the early stage of schizophrenia (S. Kato 1999). The others' gaze is supposed to bring about the feeling of amorphous self-revelation which could also be regarded as the guilt feeling without content, to the patients. When the guilt feeling is bound with a past concrete fault, the patients tell the past directive lie motif. On the other hand, when the patients cannot find a past fixed content, and feel their present actions as uncertain and experience them as lies, the

  1. Temporal motifs in time-dependent networks

    NASA Astrophysics Data System (ADS)

    Kovanen, Lauri; Karsai, Márton; Kaski, Kimmo; Kertész, János; Saramäki, Jari

    2011-11-01

    Temporal networks are commonly used to represent systems where connections between elements are active only for restricted periods of time, such as telecommunication, neural signal processing, biochemical reaction and human social interaction networks. We introduce the framework of temporal motifs to study the mesoscale topological-temporal structure of temporal networks in which the events of nodes do not overlap in time. Temporal motifs are classes of similar event sequences, where the similarity refers not only to topology but also to the temporal order of the events. We provide a mapping from event sequences to coloured directed graphs that enables an efficient algorithm for identifying temporal motifs. We discuss some aspects of temporal motifs, including causality and null models, and present basic statistics of temporal motifs in a large mobile call network.

  2. Large Putative PEST-like Sequence Motif at the Carboxyl Tail of Human Calcium Receptor Directs Lysosomal Degradation and Regulates Cell Surface Receptor Level

    PubMed Central

    Zhuang, Xiaolei; Northup, John K.; Ray, Kausik

    2012-01-01

    A deletion between amino acid residues Ser895 and Val1075 in the carboxyl terminus of the human calcium receptor (hCaR), which causes autosomal dominant hypocalcemia, showed enhanced signaling activity and increased cell surface expression in HEK293 cells (Lienhardt, A., Garabédian, M. G., Bai, M., Sinding, C., Zhang, Z., Lagarde, J. P., Boulesteix, J., Rigaud, M., Brown, E. M., and Kottler, M. L. (2000) J. Clin. Endocrinol. Metab. 85, 1695–1702). To identify the underlying mechanism(s) for these increases, we investigated the effects of carboxyl tail truncation and deletion in hCaR mutants using a combination of biochemical and cell imaging approaches to define motifs that participate in regulating cell surface numbers of this G protein-coupled receptor. Our data indicate a rapid constitutive receptor internalization of the cell surface hCaR, accumulating in early (Rab7 positive) and late endosomal (LAMP1 positive) sorting compartments, before targeting to lysosomes for degradation. Recycling of hCaR back to the cell surface was also evident. Truncation and deletion mapping defined a 51-amino acid sequence between residues 920 and 970 that is required for targeting to lysosomes and degradation but not for internalization or recycling of the receptor. No singular sequence motif was identified; instead, the required sequence elements seem to be distributed throughout this entire interval. This interval includes a high proportion of acidic and hydroxylated amino acid residues, suggesting a similarity to a PEST-like degradation motif (PESTfind score of +10), and several glutamine repeats. The results define a novel large PEST-like sequence that participates in the sorting of internalized hCaR routed to the lysosomal/degradation pathway that regulates cell surface receptor numbers. PMID:22158862

  3. A short conserved motif in ALYREF directs cap- and EJC-dependent assembly of export complexes on spliced mRNAs.

    PubMed

    Gromadzka, Agnieszka M; Steckelberg, Anna-Lena; Singh, Kusum K; Hofmann, Kay; Gehring, Niels H

    2016-03-18

    The export of messenger RNAs (mRNAs) is the final of several nuclear posttranscriptional steps of gene expression. The formation of export-competent mRNPs involves the recruitment of export factors that are assumed to facilitate transport of the mature mRNAs. Using in vitro splicing assays, we show that a core set of export factors, including ALYREF, UAP56 and DDX39, readily associate with the spliced RNAs in an EJC (exon junction complex)- and cap-dependent manner. In order to elucidate how ALYREF and other export adaptors mediate mRNA export, we conducted a computational analysis and discovered four short, conserved, linear motifs present in RNA-binding proteins. We show that mutation in one of the new motifs (WxHD) in an unstructured region of ALYREF reduced RNA binding and abolished the interaction with eIF4A3 and CBP80. Additionally, the mutation impaired proper localization to nuclear speckles and export of a spliced reporter mRNA. Our results reveal important details of the orchestrated recruitment of export factors during the formation of export competent mRNPs.

  4. A short conserved motif in ALYREF directs cap- and EJC-dependent assembly of export complexes on spliced mRNAs

    PubMed Central

    Gromadzka, Agnieszka M.; Steckelberg, Anna-Lena; Singh, Kusum K.; Hofmann, Kay; Gehring, Niels H.

    2016-01-01

    The export of messenger RNAs (mRNAs) is the final of several nuclear posttranscriptional steps of gene expression. The formation of export-competent mRNPs involves the recruitment of export factors that are assumed to facilitate transport of the mature mRNAs. Using in vitro splicing assays, we show that a core set of export factors, including ALYREF, UAP56 and DDX39, readily associate with the spliced RNAs in an EJC (exon junction complex)- and cap-dependent manner. In order to elucidate how ALYREF and other export adaptors mediate mRNA export, we conducted a computational analysis and discovered four short, conserved, linear motifs present in RNA-binding proteins. We show that mutation in one of the new motifs (WxHD) in an unstructured region of ALYREF reduced RNA binding and abolished the interaction with eIF4A3 and CBP80. Additionally, the mutation impaired proper localization to nuclear speckles and export of a spliced reporter mRNA. Our results reveal important details of the orchestrated recruitment of export factors during the formation of export competent mRNPs. PMID:26773052

  5. Protospacer recognition motifs

    PubMed Central

    Shah, Shiraz A.; Erdmann, Susanne; Mojica, Francisco J.M.; Garrett, Roger A.

    2013-01-01

    Protospacer adjacent motifs (PAMs) were originally characterized for CRISPR-Cas systems that were classified on the basis of their CRISPR repeat sequences. A few short 2–5 bp sequences were identified adjacent to one end of the protospacers. Experimental and bioinformatical results linked the motif to the excision of protospacers and their insertion into CRISPR loci. Subsequently, evidence accumulated from different virus- and plasmid-targeting assays, suggesting that these motifs were also recognized during DNA interference, at least for the recently classified type I and type II CRISPR-based systems. The two processes, spacer acquisition and protospacer interference, employ different molecular mechanisms, and there is increasing evidence to suggest that the sequence motifs that are recognized, while overlapping, are unlikely to be identical. In this article, we consider the properties of PAM sequences and summarize the evidence for their dual functional roles. It is proposed to use the terms protospacer associated motif (PAM) for the conserved DNA sequence and to employ spacer acquisition motif (SAM) and target interference motif (TIM), respectively, for acquisition and interference recognition sites. PMID:23403393

  6. Transcription factors that directly regulate the expression of CSLA9 encoding mannan synthase in Arabidopsis thaliana.

    PubMed

    Kim, Won-Chan; Reca, Ida-Barbara; Kim, Yongsig; Park, Sunchung; Thomashow, Michael F; Keegstra, Kenneth; Han, Kyung-Hwan

    2014-03-01

    Mannans are hemicellulosic polysaccharides that have a structural role and serve as storage reserves during plant growth and development. Previous studies led to the conclusion that mannan synthase enzymes in several plant species are encoded by members of the cellulose synthase-like A (CSLA) gene family. Arabidopsis has nine members of the CSLA gene family. Earlier work has shown that CSLA9 is responsible for the majority of glucomannan synthesis in both primary and secondary cell walls of Arabidopsis inflorescence stems. Little is known about how expression of the CSLA9 gene is regulated. Sequence analysis of the CSLA9 promoter region revealed the presence of multiple copies of a cis-regulatory motif (M46RE) recognized by transcription factor MYB46, leading to the hypothesis that MYB46 (At5g12870) is a direct regulator of the mannan synthase CSLA9. We obtained several lines of experimental evidence in support of this hypothesis. First, the expression of CSLA9 was substantially upregulated by MYB46 overexpression. Second, electrophoretic mobility shift assay (EMSA) was used to demonstrate the direct binding of MYB46 to the promoter of CSLA9 in vitro. This interaction was further confirmed in vivo by a chromatin immunoprecipitation assay. Finally, over-expression of MYB46 resulted in a significant increase in mannan content. Considering the multifaceted nature of MYB46-mediated transcriptional regulation of secondary wall biosynthesis, we reasoned that additional transcription factors are involved in the CSLA9 regulation. This hypothesis was tested by carrying out yeast one-hybrid screening, which identified ANAC041 and bZIP1 as direct regulators of CSLA9. Transcriptional activation assays and EMSA were used to confirm the yeast one-hybrid results. Taken together, we report that transcription factors ANAC041, bZIP1 and MYB46 directly regulate the expression of CSLA9.

  7. Stochastic motif extraction using hidden Markov model

    SciTech Connect

    Fujiwara, Yukiko; Asogawa, Minoru; Konagaya, Akihiko

    1994-12-31

    In this paper, we study the application of an HMM (hidden Markov model) to the problem of representing protein sequences by a stochastic motif. A stochastic protein motif represents the small segments of protein sequences that have a certain function or structure. The stochastic motif, represented by an HMM, has conditional probabilities to deal with the stochastic nature of the motif. This HMM directive reflects the characteristics of the motif, such as a protein periodical structure or grouping. In order to obtain the optimal HMM, we developed the "iterative duplication method" for HMM topology learning. It starts from a small fully-connected network and iterates the network generation and parameter optimization until it achieves sufficient discrimination accuracy. Using this method, we obtained an HMM for a leucine zipper motif. Compared to the accuracy of a symbolic pattern representation with accuracy of 14.8 percent, an HMM achieved 79.3 percent in prediction. Additionally, the method can obtain an HMM for various types of zinc finger motifs, and it might separate the mixed data. We demonstrated that this approach is applicable to the validation of the protein databases; a constructed HMM has indicated that one protein sequence annotated as "leucine-zipper like sequence" in the database is quite different from other leucine-zipper sequences in terms of likelihood, and we found this discrimination is plausible.

  8. Motif enrichment tool.

    PubMed

    Blatti, Charles; Sinha, Saurabh

    2014-07-01

    The Motif Enrichment Tool (MET) provides an online interface that enables users to find major transcriptional regulators of their gene sets of interest. MET searches the appropriate regulatory region around each gene and identifies which transcription factor DNA-binding specificities (motifs) are statistically overrepresented. Motif enrichment analysis is currently available for many metazoan species including human, mouse, fruit fly, planaria and flowering plants. MET also leverages high-throughput experimental data such as ChIP-seq and DNase-seq from ENCODE and ModENCODE to identify the regulatory targets of a transcription factor with greater precision. The results from MET are produced in real time and are linked to a genome browser for easy follow-up analysis. Use of the web tool is free and open to all, and there is no login requirement. ADDRESS: http://veda.cs.uiuc.edu/MET/.
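As an illustration of what "statistically overrepresented" can mean in this setting, the sketch below computes a one-sided hypergeometric p-value for a motif in a gene set. This is an assumption for illustration only: the abstract does not say which test MET uses, and the function name and parameters here are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(total_genes, genes_with_motif, set_size, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    Hypothetical helper, not MET's actual statistic. X is the number of
    motif-bearing genes in a set of `set_size` genes drawn without
    replacement from `total_genes`, of which `genes_with_motif` carry
    the motif in their regulatory region.
    """
    upper = min(genes_with_motif, set_size)
    favourable = sum(
        comb(genes_with_motif, k)
        * comb(total_genes - genes_with_motif, set_size - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(total_genes, set_size)


# Toy numbers: 10 genes overall, 3 carry the motif, and all 3 genes
# of the set of interest carry it.
p_value = hypergeom_enrichment_pvalue(10, 3, 3, 3)
```

A small p-value suggests the motif occurs in the gene set more often than random sampling would explain; a real analysis would also correct for testing many motifs.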

  9. Cross-disciplinary detection and analysis of network motifs.

    PubMed

    Tran, Ngoc Tam L; DeLuccia, Luke; McDonald, Aidan F; Huang, Chun-Hsi

    2015-01-01

    The detection of network motifs has recently become an important part of network analysis across all disciplines. In this work, we detected and analyzed network motifs from undirected and directed networks of several different disciplines, including biological network, social network, ecological network, as well as other networks such as airlines, power grid, and co-purchase of political books networks. Our analysis revealed that undirected networks are similar at the basic three and four nodes, while the analysis of directed networks revealed the distinction between networks of different disciplines. The study showed that larger motifs contained the three-node motif as a subgraph. Topological analysis revealed that similar networks have similar small motifs, but as the motif size increases, differences arise. Pearson correlation coefficient showed strong positive relationship between some undirected networks but inverse relationship between some directed networks. The study suggests that the three-node motif is a building block of larger motifs. It also suggests that undirected networks share similar low-level structures. Moreover, similar networks share similar small motifs, but larger motifs define the unique structure of individuals. Pearson correlation coefficient suggests that protein structure networks, dolphin social network, and co-authorships in network science belong to a superfamily. In addition, yeast protein-protein interaction network, primary school contact network, Zachary's karate club network, and co-purchase of political books network can be classified into a superfamily.
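The three-node subgraphs discussed in this record can be counted by brute force in a small undirected network. The sketch below is a minimal illustration, not the authors' pipeline: the function name is hypothetical, and it only enumerates subgraphs, without the comparison against randomized networks that motif detection normally involves.

```python
from itertools import combinations


def count_three_node_subgraphs(edges):
    """Count connected three-node subgraphs in an undirected graph.

    Hypothetical helper for illustration. Returns the number of
    triangles (all three edges present) and open triads (exactly two
    of the three edges present). Self-loops are ignored.
    """
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    counts = {"triangle": 0, "open_triad": 0}
    # Brute force over all node triples: O(n^3), fine for small graphs.
    for a, b, c in combinations(list(adjacency), 3):
        present = (b in adjacency[a]) + (c in adjacency[a]) + (c in adjacency[b])
        if present == 3:
            counts["triangle"] += 1
        elif present == 2:
            counts["open_triad"] += 1
    return counts


# Toy graph: a triangle 1-2-3 with a pendant node 4 attached to node 3.
toy_counts = count_three_node_subgraphs([(1, 2), (2, 3), (1, 3), (3, 4)])
```

To decide whether a subgraph is a motif in the usual sense, the same counts would be computed on randomized networks with matching degree sequences and compared, for example via a z-score.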

  10. No tradeoff between versatility and robustness in gene circuit motifs

    NASA Astrophysics Data System (ADS)

    Payne, Joshua L.

    2016-05-01

    Circuit motifs are small directed subgraphs that appear in real-world networks significantly more often than in randomized networks. In the Boolean model of gene circuits, most motifs are realized by multiple circuit genotypes. Each of a motif's constituent circuit genotypes may have one or more functions, which are embodied in the expression patterns the circuit forms in response to specific initial conditions. Recent enumeration of a space of nearly 17 million three-gene circuit genotypes revealed that all circuit motifs have more than one function, with the number of functions per motif ranging from 12 to nearly 30,000. This indicates that some motifs are more functionally versatile than others. However, the individual circuit genotypes that constitute each motif are less robust to mutation if they have many functions, hinting that functionally versatile motifs may be less robust to mutation than motifs with few functions. Here, I explore the relationship between versatility and robustness in circuit motifs, demonstrating that functionally versatile motifs are robust to mutation despite the inherent tradeoff between versatility and robustness at the level of an individual circuit genotype.

  11. Structural Motif-Based Homology Modeling of CYP27A1 and Site-Directed Mutational Analyses Affecting Vitamin D Hydroxylation

    PubMed Central

    Prosser, David E.; Guo, YuDing; Jia, Zongchao; Jones, Glenville

    2006-01-01

    Human CYP27A1 is a mitochondrial cytochrome P450, which is principally found in the liver and plays important roles in the biological activation of vitamin D3 and in the biosynthesis of bile acids. We have applied a systematic analysis of hydrogen bonding patterns in 11 prokaryotic and mammalian CYP crystal structures to construct a homology-based model of CYP27A1. Docking of vitamin D3 structures into the active site of this model identified potential substrate contact residues in the F-helix, the β-3 sheet, and the β-5 sheet. Site-directed mutagenesis and expression in COS-1 cells confirmed that these positions affect enzymatic activity, in some cases shifting metabolism of 1α-hydroxyvitamin D3 to favor 25- or 27-hydroxylation. The results suggest that conserved hydrophobic residues in the β-5 hairpin help define the shape of the substrate binding cavity and that this structure interacts with Phe-248 in the F-helix. Mutations directed toward the β-3a strand suggested a possible heme-binding interaction centered on Asn-403 and a structural role for substrate contact residues Thr-402 and Ser-404. PMID:16500955

  12. Motif-based embedding for graph clustering

    NASA Astrophysics Data System (ADS)

    Lim, Sungsu; Lee, Jae-Gil

    2016-12-01

    Community detection in complex networks is a fundamental problem that has been extensively studied owing to its wide range of applications. However, because community detection methods typically rely on the relations between vertices in networks, they may fail to discover higher-order graph substructures, called the network motifs. In this paper, we propose a novel embedding method for graph clustering that considers higher-order relationships involving multiple vertices. We show that our embedding method, which we call motif-based embedding, is more effective in detecting communities than existing graph embedding methods, spectral embedding and force-directed embedding, both theoretically and experimentally.
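One common way to make higher-order structure visible to a clustering method is to reweight each edge by the number of motif instances it takes part in. The sketch below builds such a weight matrix for the triangle motif. It is an assumption for illustration: the abstract does not specify how the authors' motif-based embedding is constructed, and the function name is hypothetical.

```python
def triangle_motif_adjacency(num_nodes, edges):
    """Triangle-weighted adjacency matrix of an undirected graph.

    Hypothetical helper for illustration. Entry [u][v] is the number of
    triangles containing edge (u, v), which equals the number of
    neighbours that u and v share. Nodes are integers 0..num_nodes-1.
    """
    neighbours = [set() for _ in range(num_nodes)]
    for u, v in edges:
        if u != v:
            neighbours[u].add(v)
            neighbours[v].add(u)

    weights = [[0] * num_nodes for _ in range(num_nodes)]
    for u in range(num_nodes):
        for v in neighbours[u]:
            if u < v:  # visit each undirected edge once
                shared = len(neighbours[u] & neighbours[v])
                weights[u][v] = weights[v][u] = shared
    return weights


# Toy graph: triangle 0-1-2 plus a pendant edge 2-3.
toy_weights = triangle_motif_adjacency(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
```

In the toy graph the pendant edge gets weight 0 because it lies in no triangle, so a spectral or other embedding built on these weights would group nodes 0, 1 and 2 and leave node 3 apart.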

  13. Novel blocking human IgG directed against the pentapeptide repeat motifs of Neisseria meningitidis Lip/H.8 and Laz lipoproteins.

    PubMed

    Ray, Tathagat Dutta; Lewis, Lisa A; Gulati, Sunita; Rice, Peter A; Ram, Sanjay

    2011-04-15

    Ab-initiated, complement-dependent killing contributes to host defenses against invasive meningococcal disease. Sera from nonimmunized individuals vary widely in their bactericidal activity against group B meningococci. We show that IgG isolated from select individuals can block killing of group B meningococci by human sera that are otherwise bactericidal. This IgG also reduced the bactericidal efficacy of Abs directed against the group B meningococcal protein vaccine candidates factor H-binding protein (currently undergoing clinical trials) and Neisserial surface protein A. Immunoblots revealed that the blocking IgG was directed against a meningococcal Ag called H.8. Killing of meningococci in reactions containing bactericidal mAbs and human blocking Abs was restored when binding of blocking Ab to meningococci was inhibited using either synthetic peptides corresponding to H.8 or a nonblocking mAb against H.8. Furthermore, genetic deletion of H.8 from target organisms abrogated blocking. The Fc region of the blocking IgG was required for blocking because F(ab')₂ fragments were ineffective. Blocking required IgG glycosylation because deglycosylation with peptide:N-glycanase eliminated blocking. C4b deposition mediated by an anti-factor H-binding protein mAb was reduced by intact blocking IgG, but not by peptide:N-glycanase-treated blocking IgG, suggesting that blocking resulted from inhibition of the classical pathway of complement. In conclusion, we have identified H.8 as a meningococcal target for novel blocking Abs in human serum. Such blocking Abs may reduce the efficacy of select antigroup B meningococcal protein vaccines. We also propose that outer membrane vesicle-containing meningococcal vaccines may be more efficacious if purged of subversive immunogens such as H.8.

  14. Network motifs modulate druggability of cellular targets

    PubMed Central

    Wu, Fan; Ma, Cong; Tan, Cheemeng

    2016-01-01

    Druggability refers to the capacity of a cellular target to be modulated by a small-molecule drug. To date, druggability is mainly studied by focusing on direct binding interactions between a drug and its target. However, druggability is impacted by cellular networks connected to a drug target. Here, we use computational approaches to reveal basic principles of network motifs that modulate druggability. Through quantitative analysis, we find that inhibiting a self-positive feedback loop is a more robust and effective treatment strategy than inhibiting other regulations, and adding direct regulations to a drug target generally reduces its druggability. The findings are explained through analytical solution of the motifs. Furthermore, we find that a consensus topology of highly druggable motifs consists of a negative feedback loop without any positive feedback loops, and consensus motifs with low druggability have multiple positive direct regulations and positive feedback loops. Based on the discovered principles, we predict potential genetic targets in Escherichia coli that have either high or low druggability based on their network context. Our work establishes the foundation toward identifying and predicting druggable targets based on their network topology. PMID:27824147

  15. Dynamic motifs of strategies in prisoner's dilemma games

    NASA Astrophysics Data System (ADS)

    Kim, Young Jin; Roh, Myungkyoon; Jeong, Seon-Young; Son, Seung-Woo

    2014-12-01

    We investigate the win-lose relations between strategies of iterated prisoner's dilemma games by using a directed network concept to display the replicator dynamics results. In the giant strongly-connected component of the win/lose network, we find win-lose circulations similar to rock-paper-scissors and analyze the fixed point and its stability. Applying the network motif concept, we introduce dynamic motifs, which describe the population dynamics relations among the three strategies. Through exact enumeration, we find 22 dynamic motifs and display their phase portraits. Visualization using directed networks and motif analysis is a useful method to make complex dynamic behavior simple in order to understand it more intuitively. Dynamic motifs can be building blocks for dynamic behavior among strategies when they are applied to other types of games.
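
    The replicator dynamics mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook zero-sum rock-paper-scissors payoff matrix as a stand-in for a three-strategy win-lose circulation; the matrix, the function names, and the Euler step size are illustrative assumptions, not values taken from the paper.

```python
# Replicator dynamics for three strategies: dx_i/dt = x_i * (f_i - mean fitness),
# where f_i = sum_j payoff[i][j] * x_j.
# ASSUMPTION: standard zero-sum rock-paper-scissors payoffs, used only as a
# stand-in for a cyclic win-lose relation among three strategies.
RPS_PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def replicator_rhs(x, payoff):
    """Right-hand side of the replicator equation at population state x."""
    n = len(x)
    fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    mean_fitness = sum(xi * fi for xi, fi in zip(x, fitness))
    return [xi * (fi - mean_fitness) for xi, fi in zip(x, fitness)]


def simulate(x0, payoff, dt=0.01, steps=1000):
    """Integrate with explicit Euler steps; returns the list of visited states."""
    x = list(x0)
    trajectory = [tuple(x)]
    for _ in range(steps):
        dx = replicator_rhs(x, payoff)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        total = sum(x)
        x = [xi / total for xi in x]  # renormalise to counter numerical drift
        trajectory.append(tuple(x))
    return trajectory


if __name__ == "__main__":
    # The uniform mixture is a fixed point: every derivative vanishes there.
    print(replicator_rhs([1 / 3, 1 / 3, 1 / 3], RPS_PAYOFF))
    # Away from the fixed point the frequencies circulate around it.
    print(simulate([0.5, 0.3, 0.2], RPS_PAYOFF)[-1])
```

    Plotting the trajectory on the simplex would give a phase portrait of the kind the abstract describes; explicit Euler is used here only for brevity and slowly distorts closed orbits.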

  16. A G-Box-Like Motif Is Necessary for Transcriptional Regulation by Circadian Pseudo-Response Regulators in Arabidopsis

    PubMed Central

    Newton, Linsey; Liu, Ming-Jung

    2016-01-01

    PSEUDO-RESPONSE REGULATORs (PRRs) play overlapping and distinct roles in maintaining circadian rhythms and regulating diverse biological processes, including the photoperiodic control of flowering, growth, and abiotic stress responses. PRRs act as transcriptional repressors and associate with chromatin via their conserved C-terminal CCT (CONSTANS, CONSTANS-like, and TIMING OF CAB EXPRESSION 1 [TOC1/PRR1]) domains by a still-poorly understood mechanism. Here, we identified genome-wide targets of PRR9 using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) and compared them with PRR7, PRR5, and TOC1/PRR1 ChIP-seq data. We found that PRR binding sites are located within genomic regions of low nucleosome occupancy and high DNase I hypersensitivity. Moreover, conserved noncoding regions among Brassicaceae species are enriched around PRR binding sites, indicating that PRRs associate with functionally relevant cis-regulatory regions. The PRRs shared a significant number of binding regions, and our results indicate that they coordinately restrict the expression of target genes to around dawn. A G-box-like motif was overrepresented at PRR binding regions, and we showed that this motif is necessary for mediating transcriptional regulation of CIRCADIAN CLOCK ASSOCIATED 1 and PRR9 by the PRRs. Our results further our understanding of how PRRs target specific promoters and provide an extensive resource for studying circadian regulatory networks in plants. PMID:26586835

  17. Conserved stem-loop structures in the HIV-1 RNA region containing the A3 3' splice site and its cis-regulatory element: possible involvement in RNA splicing.

    PubMed

    Jacquenet, S; Ropers, D; Bilodeau, P S; Damier, L; Mougin, A; Stoltzfus, C M; Branlant, C

    2001-01-15

    The HIV-1 transcript is alternatively spliced to over 30 different mRNAs. Whether RNA secondary structure can influence HIV-1 RNA alternative splicing has not previously been examined. Here we have determined the secondary structure of the HIV-1/BRU RNA segment, containing the alternative A3, A4a, A4b, A4c and A5 3' splice sites. Site A3, required for tat mRNA production, is contained in the terminal loop of a stem-loop structure (SLS2), which is highly conserved in HIV-1 and related SIVcpz strains. The exon splicing silencer (ESS2) acting on site A3 is located in a long irregular stem-loop structure (SLS3). Two SLS3 domains were protected by nuclear components under splicing condition assays. One contains the A4c branch points and a putative SR protein binding site. The other one is adjacent to ESS2. Unexpectedly, only the 3' A residue of ESS2 was protected. The suboptimal A3 polypyrimidine tract (PPT) is base paired. Using site-directed mutagenesis and transfection of a mini-HIV-1 cDNA into HeLa cells, we found that, in a wild-type PPT context, a mutation of the A3 downstream sequence that reinforced SLS2 stability decreased site A3 utilization. This was not the case with an optimized PPT. Hence, sequence and secondary structure of the PPT may cooperate in limiting site A3 utilization.

  18. Triadic motifs in the dependence networks of virtual societies.

    PubMed

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-06-10

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.

  19. Triadic motifs in the dependence networks of virtual societies

    PubMed Central

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-01-01

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs. PMID:24912755

  20. Motif Yggdrasil: sampling sequence motifs from a tree mixture model.

    PubMed

    Andersson, Samuel A; Lagergren, Jens

    2007-06-01

    In phylogenetic foot-printing, putative regulatory elements are found in upstream regions of orthologous genes by searching for common motifs. Motifs in different upstream sequences are subject to mutations along the edges of the corresponding phylogenetic tree, consequently taking advantage of the tree in the motif search is an appealing idea. We describe the Motif Yggdrasil sampler; the first Gibbs sampler based on a general tree that uses unaligned sequences. Previous tree-based Gibbs samplers have assumed a star-shaped tree or partially aligned upstream regions. We give a probabilistic model (MY model) describing upstream sequences with regulatory elements and build a Gibbs sampler with respect to this model. The model allows toggling, i.e., the restriction of a position to a subset of nucleotides, but does not require aligned sequences nor edge lengths, which may be difficult to come by. We apply the collapsing technique to eliminate the need to sample nuisance parameters, and give a derivation of the predictive update formula. We show that the MY model improves the modeling of difficult motif instances and that the use of the tree achieves a substantial increase in nucleotide level correlation coefficient both for synthetic data and 37 bacterial lexA genes. We investigate the sensitivity to errors in the tree and show that using random trees MY sampler still has a performance similar to the original version.

  1. Cis-regulatory control of corticospinal system development and evolution

    PubMed Central

    Shim, Sungbo; Kwan, Kenneth Y.; Li, Mingfeng; Lefebvre, Veronique; Šestan, Nenad

    2012-01-01

    Summary The co-emergence of a six-layered cerebral neocortex and its corticospinal output system is one of the evolutionary hallmarks of mammals. However, the genetic programs that underlie their development and evolution remain poorly understood. Here we identify a conserved non-exonic element (E4) that acts as a cortex-specific enhancer for the nearby Fezf2, which is required for the specification of corticospinal neuron identity and connectivity. We find that SOX4 and SOX11 functionally compete with the repressor SOX5 in the trans-activation of E4. Cortex-specific double deletion of Sox4 and Sox11 leads to the loss of Fezf2 expression and failed specification of corticospinal neurons and, independent of Fezf2, a reeler-like inversion of layers. We show evidence supporting the emergence of functional SOX binding sites in E4 during tetrapod evolution and their subsequent stabilization in mammals and possibly amniotes. These findings reveal that SOX transcription factors converge onto a cis-acting element of Fezf2 and form critical components of a regulatory network controlling the identity and connectivity of corticospinal neurons. PMID:22678282

  2. Cis-regulatory RNA elements that regulate specialized ribosome activity

    PubMed Central

    Xue, Shifeng; Barna, Maria

    2015-01-01

    Recent evidence has shown that the ribosome itself can play a highly regulatory role in the specialized translation of specific subpools of mRNAs, in particular at the level of ribosomal proteins (RP). However, the mechanism(s) by which this selection takes place has remained poorly understood. In our recent study, we discovered a combination of unique RNA elements in the 5′UTRs of mRNAs that allows for such control by the ribosome. These mRNAs contain a Translation Inhibitory Element (TIE) that inhibits general cap-dependent translation, and an Internal Ribosome Entry Site (IRES) that relies on a specific RP for activation. The unique combination of an inhibitor of general translation and an activator of specialized translation is key to ribosome-mediated control of gene expression. Here we discuss how these RNA regulatory elements provide a new level of control to protein expression and their implications for gene expression, organismal development and evolution. PMID:26327194

  3. [Prediction of Promoter Motifs in Virophages].

    PubMed

    Gong, Chaowen; Zhou, Xuewen; Pan, Yingjie; Wang, Yongjie

    2015-07-01

    Virophages have crucial roles in ecosystems and are the transport vectors of genetic materials. To shed light on regulation and control mechanisms in virophage-host systems as well as evolution between virophages and their hosts, the promoter motifs of virophages were predicted in the regions upstream of start codons using an analytical tool for prediction of promoter motifs: Multiple EM for Motif Elicitation. Seventeen potential promoter motifs were identified based on the E-value, location, number and length of promoters in genomes. Sputnik and zamilon motif 2 with AT-rich regions were distributed widely on genomes, suggesting that these motifs may be associated with regulation of the expression of various genes. Motifs containing the TCTA box were predicted to be late promoter motifs in mavirus; motifs containing the ATCT box were the potential late promoter motifs in the Ace Lake mavirus. AT-rich regions were identified on motif 2 in the Organic Lake virophage, motif 3 in Yellowstone Lake virophage (YSLV)1 and 2, motif 1 in YSLV3, and motif 1 and 2 in YSLV4, respectively. AT-rich regions were distributed widely on the genomes of virophages. All of these motifs may be promoter motifs of virophages. Our results provide insights into further exploration of temporal expression of genes in virophages as well as associations between virophages and giant viruses.

  4. Knowledge discovery of multilevel protein motifs

    SciTech Connect

    Conklin, D.; Glasgow, J.; Fortier, S.

    1994-12-31

    A new category of protein motif is introduced. This type of motif captures, in addition to global structure, the nested structure of its component parts. A dataset of four proteins is represented using this scheme. A structured machine discovery procedure is used to discover recurrent amino acid motifs and this knowledge is utilized for the expression of subsequent protein motif discoveries. Examples of discovered multilevel motifs are presented.

  5. Sequential visibility-graph motifs

    NASA Astrophysics Data System (ADS)

    Iacovacci, Jacopo; Lacasa, Lucas

    2016-04-01

    Visibility algorithms transform time series into graphs and encode dynamical information in their topology, paving the way for graph-theoretical time series analysis as well as building a bridge between nonlinear dynamics and network science. In this work we introduce and study the concept of sequential visibility-graph motifs, smaller substructures of n consecutive nodes that appear with characteristic frequencies. We develop a theory to compute in an exact way the motif profiles associated with general classes of deterministic and stochastic dynamics. We find that this simple property is indeed a highly informative and computationally efficient feature capable of distinguishing among different dynamics and robust against noise contamination. We finally confirm that it can be used in practice to perform unsupervised learning, by extracting motif profiles from experimental heart-rate series and being able, accordingly, to disentangle meditative from other relaxation states. Applications of this general theory include the automatic classification and description of physical, biological, and financial time series.
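
    As a rough illustration of the idea in this abstract, the sketch below builds a horizontal visibility graph (one common visibility algorithm) and tallies the subgraphs spanned by each run of n consecutive nodes. The function names and the choice of the horizontal variant are assumptions made for this example; the abstract does not specify an implementation.

```python
from collections import Counter


def hvg_edges(series):
    """Edges (i, j), i < j, of the horizontal visibility graph of a series.

    Two points see each other when every value strictly between them is
    lower than both; neighbouring points are therefore always linked.
    """
    edges = set()
    n = len(series)
    for i in range(n - 1):
        highest_between = float("-inf")
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                edges.add((i, j))
            highest_between = max(highest_between, series[j])
            if highest_between >= series[i]:
                break  # point i is now blocked from everything further right
    return edges


def motif_profile(series, size=4):
    """Relative frequency of each subgraph on `size` consecutive nodes.

    Visibility inside a window depends only on values inside that window,
    so the window's own graph equals the induced subgraph of the full graph.
    """
    counts = Counter()
    for start in range(len(series) - size + 1):
        window = series[start:start + size]
        counts[frozenset(hvg_edges(window))] += 1
    total = sum(counts.values())
    return {motif: c / total for motif, c in counts.items()}


if __name__ == "__main__":
    data = [0.3, 0.9, 0.1, 0.5, 0.7, 0.2, 0.8, 0.4]
    for motif, freq in motif_profile(data).items():
        print(sorted(motif), round(freq, 3))
```

    The resulting profile is a short frequency vector, which is what makes it usable as a feature for comparing or clustering time series.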

  6. The distribution of RNA motifs in natural sequences.

    PubMed

    Bourdeau, V; Ferbeyre, G; Pageau, M; Paquin, B; Cedergren, R

    1999-11-15

    Functional analysis of genome sequences has largely ignored RNA genes and their structures. We introduce here the notion of 'ribonomics' to describe the search for the distribution of and eventually the determination of the physiological roles of these RNA structures found in the sequence databases. The utility of this approach is illustrated here by the identification in the GenBank database of RNA motifs having known binding or chemical activity. The frequency of these motifs indicates that most have originated from evolutionary drift and are selectively neutral. On the other hand, their distribution among species and their location within genes suggest that the destiny of these motifs may be more elaborate. For example, the hammerhead motif has a skewed organismal presence, is phylogenetically stable and recent work on a schistosome version confirms its in vivo biological activity. The under-representation of the valine-binding motif and the Rev-binding element in GenBank hints at a detrimental effect on cell growth or viability. Data on the presence and the location of these motifs may provide critical guidance in the design of experiments directed towards the understanding and the manipulation of RNA complexes and activities in vivo.

  7. Neural Circuits: Male Mating Motifs.

    PubMed

    Benton, Richard

    2015-09-02

    Characterizing microcircuit motifs in intact nervous systems is essential to relate neural computations to behavior. In this issue of Neuron, Clowney et al. (2015) identify recurring, parallel feedforward excitatory and inhibitory pathways in male Drosophila's courtship circuitry, which might explain decisive mate choice.

  8. Combinatorial Information Theoretical Measurement of the Semantic Significance of Semantic Graph Motifs

    SciTech Connect

    Joslyn, Cliff A.; al-Saffar, Sinan; Haglin, David J.; Holder, Larry

    2011-06-14

    Given an arbitrary semantic graph data set, perhaps one lacking in explicit ontological information, we wish to first identify its significant semantic structures, and then measure the extent of their significance. Casting a semantic graph dataset as an edge-labeled, directed graph, this task can be built on the ability to mine frequent labeled subgraphs in edge-labeled, directed graphs. We begin by considering the fundamentals of the enumerative combinatorics of subgraph motif structures in edge-labeled directed graphs. We identify its frequent labeled, directed subgraph motif patterns, and measure the significance of the resulting motifs by the information gain relative to the expected value of the motif based on the empirical frequency distribution of the link types which compose them, assuming independence. We illustrate the method on a small test graph, and discuss results obtained for small linear motifs (link type bigrams and trigrams) in a larger graph structure.
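
    One plausible reading of the significance measure for link-type bigrams is sketched below: the observed share of each bigram is compared with the share expected if the two link types occurred independently, and the log-ratio is reported in bits. The exact formula used by the authors is not given in the abstract, so the scoring function, the triple format, and the example labels are assumptions.

```python
import math
from collections import Counter


def bigram_significance(edges):
    """Score link-type bigrams in an edge-labeled directed graph.

    `edges` is a list of (source, label, target) triples. A bigram is a path
    source -label1-> middle -label2-> target. The score is
    log2(observed share / expected share), where the expected share is the
    product of the two labels' empirical frequencies (independence).
    ASSUMPTION: this pointwise log-ratio is an illustrative stand-in for the
    information-gain measure described in the abstract.
    """
    label_freq = Counter(label for _, label, _ in edges)
    total_edges = sum(label_freq.values())

    outgoing = {}
    for source, label, target in edges:
        outgoing.setdefault(source, []).append(label)

    bigrams = Counter()
    for _, first_label, middle in edges:
        for second_label in outgoing.get(middle, []):
            bigrams[(first_label, second_label)] += 1

    total_bigrams = sum(bigrams.values())
    scores = {}
    for (first, second), count in bigrams.items():
        observed = count / total_bigrams
        expected = (label_freq[first] / total_edges) * (label_freq[second] / total_edges)
        scores[(first, second)] = math.log2(observed / expected)
    return scores


if __name__ == "__main__":
    toy_graph = [
        ("ann", "knows", "bob"),
        ("bob", "worksAt", "acme"),
        ("cat", "knows", "dan"),
        ("dan", "worksAt", "initech"),
    ]
    print(bigram_significance(toy_graph))
```

    Positive scores mark bigrams that occur more often than their component link types would suggest on their own.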

  9. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, Paulina M.; Ciszak, Ewa M.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits, two catalytic centers, common amino acid sequence, and specific contacts to provide a flip-flop, or alternate site, mechanism of action. Each catalytic center [PP:PYR] is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core [PP:PYR]₂ within these enzymes. Analysis of the structural elements of this catalytic core reveals a novel definition of the common amino acid sequences, which are GXΦX₄(G)ΦXXGQ and GDGX₂₅₋₃₀ within the PP-domain, and EX₄(G)ΦXXGΦ within the PYR-domain, where Φ corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  10. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, P.; Ciszak, E.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits and two catalytic centers. Each catalytic center (PP:PYR) is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core (PP:PYR)₂ within these enzymes. Analysis of the structural elements of this catalytic core reveals a novel definition of the common amino acid sequences, which are GXΦX₄(G)ΦXXGQ and GDGX₂₅₋₃₀NN in the PP-domain, and EX₄(G)ΦXXGΦ in the PYR-domain, where Φ corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  11. NetMODE: Network Motif Detection without Nauty

    PubMed Central

    Wang, Haidong; Deng, Hualiang; Liu, Xiaoguang; Wang, Gang

    2012-01-01

    A motif in a network is a connected graph that occurs significantly more frequently as an induced subgraph than would be expected in a similar randomized network. By virtue of being atypical, it is thought that motifs might play a more important role than arbitrary subgraphs. Recently, a flurry of advances in the study of network motifs has created demand for faster computational means for identifying motifs in increasingly larger networks. Motif detection is typically performed by enumerating subgraphs in an input network and in an ensemble of comparison networks; this poses a significant computational problem. Classifying the subgraphs encountered, for instance, is typically performed using a graph canonical labeling package, such as Nauty, and will typically be called billions of times. In this article, we describe an implementation of a network motif detection package, which we call NetMODE. NetMODE can only perform motif detection for -node subgraphs when , but does so without the use of Nauty. To avoid using Nauty, NetMODE has an initial pretreatment phase, where -node graph data is stored in memory (). For we take a novel approach, which relates to the Reconstruction Conjecture for directed graphs. We find that NetMODE can perform up to around times faster than its predecessors when and up to around times faster when (the exact improvement varies considerably). NetMODE also (a) includes a method for generating comparison graphs uniformly at random, (b) can interface with external packages (e.g. R), and (c) can utilize multi-core architectures. NetMODE is available from netmode.sf.net. PMID:23272055
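
    The definition at the start of this abstract (a subgraph that occurs more often than in comparable randomized networks) can be sketched for one 3-node pattern, the feed-forward loop. This is not NetMODE's algorithm: it uses brute-force enumeration and a simple degree-preserving edge-swap randomization, and all function names and parameters are assumptions chosen for the example.

```python
import itertools
import random
import statistics


def count_ffl(edges):
    """Count induced feed-forward loops (a->b, b->c, a->c, nothing else)."""
    edge_set = set(edges)
    nodes = sorted({v for edge in edge_set for v in edge})
    count = 0
    for trio in itertools.combinations(nodes, 3):
        present = [p for p in itertools.permutations(trio, 2) if p in edge_set]
        if len(present) != 3:
            continue
        out_deg = sorted(sum(1 for u, _ in present if u == n) for n in trio)
        in_deg = sorted(sum(1 for _, v in present if v == n) for n in trio)
        # Among 3-edge patterns only the feed-forward loop has both
        # degree sequences equal to 0, 1, 2.
        if out_deg == [0, 1, 2] and in_deg == [0, 1, 2]:
            count += 1
    return count


def rewire(edges, rng, swaps_per_edge=10):
    """Shuffle a list of distinct edges, keeping every in- and out-degree.

    Repeatedly replaces (a, b), (c, d) by (a, d), (c, b) unless that would
    create a self-loop or a duplicate edge. Needs at least two edges.
    """
    edges = list(edges)
    edge_set = set(edges)
    for _ in range(swaps_per_edge * len(edges)):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def motif_z_score(edges, n_random=100, seed=0):
    """Return (observed count, mean count in random graphs, z-score)."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    null_counts = [count_ffl(rewire(edges, rng)) for _ in range(n_random)]
    mean = statistics.mean(null_counts)
    spread = statistics.pstdev(null_counts)
    z = (observed - mean) / spread if spread > 0 else float("nan")
    return observed, mean, z


if __name__ == "__main__":
    graph = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 0), (1, 4)]
    print(motif_z_score(graph, n_random=50))
```

    Enumerating every node triple scales cubically and classifies subgraphs by hand, which is exactly the cost that dedicated packages try to avoid on larger networks and larger subgraph sizes.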

  12. Comprehensive discovery of DNA motifs in 349 human cells and tissues reveals new features of motifs.

    PubMed

    Zheng, Yiyu; Li, Xiaoman; Hu, Haiyan

    2015-01-01

    Comprehensive motif discovery under experimental conditions is critical for the global understanding of gene regulation. To generate a nearly complete list of human DNA motifs under given conditions, we employed a novel approach to de novo discover significant co-occurring DNA motifs in 349 human DNase I hypersensitive site datasets. We predicted 845 to 1325 motifs in each dataset, for a total of 2684 non-redundant motifs. These 2684 motifs contained 54.02 to 75.95% of the known motifs in seven large collections including TRANSFAC. In each dataset, we also discovered 43 663 to 2 013 288 motif modules, groups of motifs with their binding sites co-occurring in a significant number of short DNA regions. Compared with known interacting transcription factors in eight resources, the predicted motif modules on average included 84.23% of known interacting motifs. We further showed new features of the predicted motifs, such as motifs enriched in proximal regions rarely overlapped with motifs enriched in distal regions, motifs enriched in 5' distal regions were often enriched in 3' distal regions, etc. Finally, we observed that the 2684 predicted motifs classified the cell or tissue types of the datasets with an accuracy of 81.29%. The resources generated in this study are available at http://server.cs.ucf.edu/predrem/.

  13. Exhaustive Search for Over-represented DNA Sequence Motifs with CisFinder

    PubMed Central

    Sharov, Alexei A.; Ko, Minoru S.H.

    2009-01-01

    We present CisFinder software, which generates a comprehensive list of motifs enriched in a set of DNA sequences and describes them with position frequency matrices (PFMs). A new algorithm was designed to estimate PFMs directly from counts of n-mer words with and without gaps; then PFMs are extended over gaps and flanking regions and clustered to generate non-redundant sets of motifs. The algorithm successfully identified binding motifs for 12 transcription factors (TFs) in embryonic stem cells based on published chromatin immunoprecipitation sequencing data. Furthermore, CisFinder successfully identified alternative binding motifs of TFs (e.g. POU5F1, ESRRB, and CTCF) and motifs for known and unknown co-factors of genes associated with the pluripotent state of ES cells. CisFinder also showed robust performance in the identification of motifs that were only slightly enriched in a set of DNA sequences. PMID:19740934
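
    Two ingredients named in this abstract, word counts and position frequency matrices (PFMs), can be sketched as follows. This is a simplified illustration rather than the CisFinder algorithm: it ranks ungapped k-mers by a pseudocount-smoothed test/control frequency ratio and builds a PFM from already aligned sites. The function names, the pseudocount, and the toy sequences are assumptions.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a list of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enriched_words(test_seqs, control_seqs, k, pseudocount=1.0):
    """Rank k-mers seen in the test set by test/control frequency ratio.

    ASSUMPTION: a plain smoothed ratio is used here; CisFinder's own
    statistics are not described in the abstract.
    """
    test = kmer_counts(test_seqs, k)
    control = kmer_counts(control_seqs, k)
    test_total = sum(test.values()) + pseudocount
    control_total = sum(control.values()) + pseudocount
    ranked = []
    for word, count in test.items():
        test_freq = (count + pseudocount) / test_total
        control_freq = (control[word] + pseudocount) / control_total
        ranked.append((word, test_freq / control_freq))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


def position_frequency_matrix(sites):
    """Per-position base counts for aligned, equal-length A/C/G/T sites."""
    length = len(sites[0])
    pfm = [{base: 0 for base in "ACGT"} for _ in range(length)]
    for site in sites:
        for pos, base in enumerate(site.upper()):
            pfm[pos][base] += 1
    return pfm


if __name__ == "__main__":
    test_set = ["TTGACGTT", "GACGAA"]
    control_set = ["TTTTTTTT", "AAAAAA"]
    print(enriched_words(test_set, control_set, k=4)[:3])
    print(position_frequency_matrix(["ACG", "ACT", "AAG"]))
```

    In a fuller pipeline the top-ranked words would seed the alignment of sites from which the PFM is counted, and similar PFMs would then be clustered to remove redundancy.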

  14. Recurring sequence-structure motifs in (βα)8-barrel proteins and experimental optimization of a chimeric protein designed based on such motifs.

    PubMed

    Wang, Jichao; Zhang, Tongchuan; Liu, Ruicun; Song, Meilin; Wang, Juncheng; Hong, Jiong; Chen, Quan; Liu, Haiyan

    2017-02-01

    An interesting way of generating novel artificial proteins is to combine sequence motifs from natural proteins, mimicking the evolutionary path suggested by natural proteins comprising recurring motifs. We analyzed the βα and αβ modules of TIM barrel proteins by structure alignment-based sequence clustering. A number of preferred motifs were identified. A chimeric TIM was designed by using recurring elements as mutually compatible interfaces. The foldability of the designed TIM protein was then significantly improved by six rounds of directed evolution. The melting temperature has been improved by more than 20°C. A variety of characteristics suggested that the resulting protein is well-folded. Our analysis provided a library of peptide motifs that is potentially useful for different protein engineering studies. The protein engineering strategy of using recurring motifs as interfaces to connect partial natural proteins may be applied to other protein folds.

  15. A survey of DNA motif finding algorithms

    PubMed Central

    Das, Modan K; Dai, Ho-Kwok

    2007-01-01

    Background Unraveling the mechanisms that regulate gene expression is a major challenge in biology. An important task in this challenge is to identify regulatory elements, especially the binding sites in deoxyribonucleic acid (DNA) for transcription factors. These binding sites are short DNA segments that are called motifs. Recent advances in genome sequence availability and in high-throughput gene expression analysis technologies have allowed for the development of computational methods for motif finding. As a result, a large number of motif finding algorithms have been implemented and applied to various motif models over the past decade. This survey reviews the latest developments in DNA motif finding algorithms. Results Earlier algorithms use promoter sequences of coregulated genes from single genome and search for statistically overrepresented motifs. Recent algorithms are designed to use phylogenetic footprinting or orthologous sequences and also an integrated approach where promoter sequences of coregulated genes and phylogenetic footprinting are used. All the algorithms studied have been reported to correctly detect the motifs that have been previously detected by laboratory experimental approaches, and some algorithms were able to find novel motifs. However, most of these motif finding algorithms have been shown to work successfully in yeast and other lower organisms, but perform significantly worse in higher organisms. Conclusion Despite considerable efforts to date, DNA motif finding remains a complex challenge for biologists and computer scientists. Researchers have taken many different approaches in developing motif discovery tools and the progress made in this area of research is very encouraging. Performance comparison of different motif finding tools and identification of the best tools have proven to be a difficult task because tools are designed based on algorithms and motif models that are diverse and complex and our incomplete understanding of

  16. The Thiamine-Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Ciszak, Ewa; Dominiak, Paulina

    2004-01-01

    Thiamin pyrophosphate (TPP), a derivative of vitamin B1, is a cofactor for enzymes performing catalysis in pathways of energy production including the well known decarboxylation of α-keto acid dehydrogenases followed by transketolation. TPP-dependent enzymes constitute a structurally and functionally diverse group exhibiting multimeric subunit organization, multiple domains and two chemically equivalent catalytic centers. Annotation of functional TPP-dependent enzymes, therefore, has not been trivial due to low sequence similarity related to this complex organization. Our approach to analysis of structures of known TPP-dependent enzymes reveals for the first time features common to this group, which we have termed the TPP-motif. The TPP-motif consists of specific spatial arrangements of structural elements and their specific contacts to provide for a flip-flop, or alternate site, enzymatic mechanism of action. Analysis of structural elements entrained in the flip-flop action displayed by TPP-dependent enzymes reveals a novel definition of the common amino acid sequences. These sequences allow for annotation of TPP-dependent enzymes, thus advancing functional proteomics. Further details of three-dimensional structures of TPP-dependent enzymes will be discussed.

  17. Combinatorial motif analysis of regulatory gene expression in Mafb deficient macrophages

    PubMed Central

    2011-01-01

    Background Deficiency of the transcription factor MafB, which is normally expressed in macrophages, can underlie cellular dysfunction associated with a range of autoimmune diseases and arteriosclerosis. MafB has important roles in cell differentiation and regulation of target gene expression; however, the mechanisms of this regulation and the identities of other transcription factors with which MafB interacts remain uncertain. Bioinformatics methods provide a valuable approach for elucidating the nature of these interactions with transcriptional regulatory elements from a large number of DNA sequences. In particular, identification of patterns of co-occurrence of regulatory cis-elements (motifs) offers a robust approach. Results Here, the directional relationships among several functional motifs were evaluated using the Log-linear Graphical Model (LGM) after extraction and search for evolutionarily conserved motifs. This analysis highlighted GATA-1 motifs and 5’AT-rich half Maf recognition elements (MAREs) in promoter regions of 18 genes that were down-regulated in Mafb deficient macrophages. GATA-1 motifs and MafB motifs could regulate expression of these genes in a negative and a positive manner, respectively. The validity of this conclusion was tested with data from a luciferase assay that used a C1qa promoter construct carrying both the GATA-1 motifs and MAREs. GATA-1 was found to inhibit the activity of the C1qa promoter with the GATA-1 motifs and MafB motifs. Conclusions These observations suggest that both the GATA-1 motifs and MafB motifs are important for lineage specific expression of C1qa. In addition, these findings show that analysis of combinations of evolutionarily conserved motifs can be successfully used to identify patterns of gene regulation. PMID:22784578

  18. MSDmotif: exploring protein sites and motifs

    PubMed Central

    Golovin, Adel; Henrick, Kim

    2008-01-01

    Background Protein structures have conserved features – motifs, which have a sufficient influence on the protein function. These motifs can be found in sequence as well as in 3D space. Understanding of these fragments is essential for 3D structure prediction, modelling and drug-design. The Protein Data Bank (PDB) is the source of this information; however, present search tools have limited 3D options to integrate protein sequence with its 3D structure. Results We describe here a web application for querying the PDB for ligands, binding sites, small 3D structural and sequence motifs and the underlying database. Novel algorithms for chemical fragments, 3D motifs, ϕ/ψ sequences, super-secondary structure motifs and for small 3D structural motif association searches are incorporated. The interface provides functionality for visualization, search criteria creation, sequence and 3D multiple alignment options. MSDmotif is an integrated system where a results page is also a search form. A set of motif statistics is available for analysis. This set includes molecule and motif binding statistics, distribution of motif sequences, occurrence of an amino-acid within a motif, correlation of amino-acid side-chain charges within a motif and Ramachandran plots for each residue. The binding statistics are presented in association with properties that include a ligand fragment library. Access is also provided through the Distributed Annotation System (DAS) protocol. An additional entry point facilitates XML requests with XML responses. Conclusion MSDmotif is unique in combining chemical, sequence and 3D data in a single search engine with a range of search and visualisation options. It provides multiple views of data found in the PDB archive for exploring protein structures. PMID:18637174

  19. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data.

    PubMed

    Tran, Ngoc Tam L; Huang, Chun-Hsi

    2014-02-20

    ChIP-Seq (chromatin immunoprecipitation sequencing) offers an advantage for motif finding because ChIP-Seq experiments narrow the search down to binding site locations. Recent motif finding tools facilitate motif detection by providing user-friendly Web interfaces. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommend that users apply multiple motif finding Web tools that implement different algorithms to obtain significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provide suggestions for the future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data.

  20. Motif-based construction of a functional map for mammalian olfactory receptors.

    PubMed

    Liu, Agatha H; Zhang, Xinmin; Stolovitzky, Gustavo A; Califano, Andrea; Firestein, Stuart J

    2003-05-01

    We applied an automatic and unsupervised system to a nearly complete database of mammalian odor receptor genes. The generated motifs and gene classification were subjected to extensive and systematic downstream analysis to obtain biological insights. Two major results from this analysis were: (1) a map of sequence motifs that may correlate with function and (2) the corresponding receptor classes in which members of each class are likely to share specific functions. We have discovered motifs that have been implicated in structural integrity and posttranslational modification, as well as motifs very likely to be directly involved in ligand binding. We further propose a combinatorial molecular hypothesis, based on unique combinations of the observed motifs, that provides a foundation for understanding the generation of a large number of ligand binding sites.

  1. Proximity of Radiation Desiccation Response Motif to the core promoter is essential for basal repression as well as gamma radiation-induced gyrB gene expression in Deinococcus radiodurans.

    PubMed

    Anaganti, Narasimha; Basu, Bhakti; Mukhopadhyaya, Rita; Apte, Shree Kumar

    2017-03-02

    The radioresistant D. radiodurans regulates its DNA damage regulon (DDR) through interaction between a 17 bp palindromic cis-regulatory element called the Radiation Desiccation Response Motif (RDRM), the DdrO repressor and a protease IrrE. The role of RDRM in regulation of DDR was dissected by constructing RDRM sequence-, position- or deletion-variants of the Deinococcal gyrB gene (DR0906) promoter and by RDRM insertion in the non-RDRM groESL gene (DR0606) promoter, and monitoring the effect of such modifications on the basal as well as gamma radiation inducible promoter activity by quantifying fluorescence of a GFP reporter. RDRM sequence-variants revealed that the conservation of sequence at the 5th and 13th position and the ends of RDRM is essential for basal repression by interaction with DdrO. RDRM position-variants showed that the sequence acts as a negative regulatory element only when located around the transcription start site (TSS) and within the span of the RNA polymerase (RNAP) binding region. RDRM deletion-variants indicated that the 5' sequence of RDRM possibly possesses an enhancer-like element responsible for higher expression yields upon repressor clearance post-irradiation. The results suggest that RDRM plays both a negative and a positive role in the regulation of DDR in D. radiodurans.
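
    The notion of a palindromic DNA element in the abstract above can be illustrated with a small check for reverse-complement symmetry. This is only a minimal Python sketch: the function name is hypothetical, and the example 17-mer is made up for illustration (it is not the RDRM sequence, which the abstract does not give). For an odd-length site the middle base is ignored, since no base is its own complement.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def palindrome_mismatches(site: str) -> int:
    """Count mirrored position pairs that break reverse-complement symmetry.

    Hypothetical helper for illustration only. The central base of an
    odd-length site is skipped because it has no mirror partner.
    """
    site = site.upper()
    mismatches = 0
    for i in range(len(site) // 2):
        # Position i must be the complement of its mirror position.
        if COMPLEMENT[site[i]] != site[-1 - i]:
            mismatches += 1
    return mismatches


# Made-up 17-mer (assumption, not the real RDRM): perfectly symmetric arms.
example_site = "TTATGTCATTGACATAA"
print(palindrome_mismatches(example_site))
```

    In a 17-mer, position i (counted from 1) mirrors position 18 - i, so the 5th and 13th positions named in the abstract form one mirrored pair around the central 9th base.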

  2. Sampling Motif-Constrained Ensembles of Networks

    NASA Astrophysics Data System (ADS)

    Fischer, Rico; Leitão, Jorge C.; Peixoto, Tiago P.; Altmann, Eduardo G.

    2015-10-01

    The statistical significance of network properties is conditioned on null models which satisfy specified properties but that are otherwise random. Exponential random graph models are a principled theoretical framework to generate such constrained ensembles, but which often fail in practice, either due to model inconsistency or due to the impossibility to sample networks from them. These problems affect the important case of networks with prescribed clustering coefficient or number of small connected subgraphs (motifs). In this Letter we use the Wang-Landau method to obtain a multicanonical sampling that overcomes both these problems. We sample, in polynomial time, networks with arbitrary degree sequences from ensembles with imposed motifs counts. Applying this method to social networks, we investigate the relation between transitivity and homophily, and we quantify the correlation between different types of motifs, finding that single motifs can explain up to 60% of the variation of motif profiles.

  3. The EDLL motif: a potent plant transcriptional activation domain from AP2/ERF transcription factors.

    PubMed

    Tiwari, Shiv B; Belachew, Alemu; Ma, Siu Fong; Young, Melinda; Ade, Jules; Shen, Yu; Marion, Colleen M; Holtan, Hans E; Bailey, Adina; Stone, Jeffrey K; Edwards, Leslie; Wallace, Andreah D; Canales, Roger D; Adam, Luc; Ratcliffe, Oliver J; Repetti, Peter P

    2012-06-01

    In plants, the ERF/EREBP family of transcriptional regulators plays a key role in adaptation to various biotic and abiotic stresses. These proteins contain a conserved AP2 DNA-binding domain and several uncharacterized motifs. Here, we describe a short motif, termed 'EDLL', that is present in AtERF98/TDR1 and other clade members from the same AP2 sub-family. We show that the EDLL motif, which has a unique arrangement of acidic amino acids and hydrophobic leucines, functions as a strong activation domain. The motif is transferable to other proteins, and is active at both proximal and distal positions of target promoters. As such, the EDLL motif is able to partly overcome the repression conferred by the AtHB2 transcription factor, which contains an ERF-associated amphiphilic repression (EAR) motif. We further examined the activation potential of EDLL by analysis of the regulation of flowering time by NF-Y (nuclear factor Y) proteins. Genetic evidence indicates that NF-Y protein complexes potentiate the action of CONSTANS in regulation of flowering in Arabidopsis; we show that the transcriptional activation function of CONSTANS can be substituted by direct fusion of the EDLL activation motif to NF-YB subunits. The EDLL motif represents a potent plant activation domain that can be used as a tool to confer transcriptional activation potential to heterologous DNA-binding proteins.

  4. Efficient motif search in ranked lists and applications to variable gap motifs.

    PubMed

    Leibovich, Limor; Yakhini, Zohar

    2012-07-01

    Sequence elements at all levels (DNA, RNA and protein) play a central role in mediating molecular recognition and thereby molecular regulation and signaling. Studies that focus on measuring and investigating sequence-based recognition make use of statistical and computational tools, including approaches to searching sequence motifs. State-of-the-art motif searching tools are limited in their coverage and ability to address large motif spaces. We develop and present statistical and algorithmic approaches that take as input ranked lists of sequences and return significant motifs. The efficiency of our approach, based on suffix trees, allows searches over motif spaces that are not covered by existing tools. This includes searching variable gap motifs (two half sites with a flexible length gap in between) and searching long motifs over large alphabets. We used our approach to analyze several high-throughput measurement data sets and report some validation results as well as novel suggested motifs and motif refinements. We suggest a refinement of the known estrogen receptor 1 motif in humans, where we observe gaps other than three nucleotides that also serve as significant recognition sites, as well as a variable length motif related to potential tyrosine phosphorylation.

  5. VARUN: discovering extensible motifs under saturation constraints.

    PubMed

    Apostolico, Alberto; Comin, Matteo; Parida, Laxmi

    2010-01-01

    The discovery of motifs in biosequences is frequently torn between the rigidity of the model on one hand and the abundance of candidates on the other hand. In particular, motifs that include wild cards or "don't cares" escalate exponentially with their number, and this gets only worse if a don't care is allowed to stretch up to some prescribed maximum length. In this paper, a notion of extensible motif in a sequence is introduced and studied, which tightly combines the structure of the motif pattern, as described by its syntactic specification, with the statistical measure of its occurrence count. It is shown that a combination of appropriate saturation conditions and the monotonicity of probabilistic scores over regions of constant frequency afford us significant parsimony in the generation and testing of candidate overrepresented motifs. A suite of software programs called Varun is described, implementing the discovery of extensible motifs of the type considered. The merits of the method are then documented by results obtained in a variety of experiments primarily targeting protein sequence families. Of equal importance seems the fact that the sets of all surprising motifs returned in each experiment are extracted faster and come in much more manageable sizes than would be obtained in the absence of saturation constraints.

  6. Efficient motif search in ranked lists and applications to variable gap motifs

    PubMed Central

    Leibovich, Limor; Yakhini, Zohar

    2012-01-01

    Sequence elements, at all levels—DNA, RNA and protein, play a central role in mediating molecular recognition and thereby molecular regulation and signaling. Studies that focus on measuring and investigating sequence-based recognition make use of statistical and computational tools, including approaches to searching sequence motifs. State-of-the-art motif searching tools are limited in their coverage and ability to address large motif spaces. We develop and present statistical and algorithmic approaches that take as input ranked lists of sequences and return significant motifs. The efficiency of our approach, based on suffix trees, allows searches over motif spaces that are not covered by existing tools. This includes searching variable gap motifs—two half sites with a flexible length gap in between—and searching long motifs over large alphabets. We used our approach to analyze several high-throughput measurement data sets and report some validation results as well as novel suggested motifs and motif refinements. We suggest a refinement of the known estrogen receptor 1 motif in humans, where we observe gaps other than three nucleotides that also serve as significant recognition sites, as well as a variable length motif related to potential tyrosine phosphorylation. PMID:22416066
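
    A variable gap motif, as described in the abstract above, is a pair of half sites separated by a spacer of flexible length. The following Python sketch tallies occurrences per gap length with plain regular-expression scanning. It is not the suffix-tree method of the paper and it ignores the ranking of the input list. The function name and the toy sequences are hypothetical; the half sites GGTCA and TGACC follow the commonly cited estrogen response element consensus with a three-nucleotide spacer.

```python
import re
from collections import Counter


def gap_length_counts(sequences, left_half, right_half, min_gap=0, max_gap=6):
    """Count occurrences of left_half + N{gap} + right_half per gap length.

    Hypothetical helper for illustration; a brute-force scan, not the
    suffix-tree search described in the abstract.
    """
    counts = Counter()
    for gap in range(min_gap, max_gap + 1):
        # A lookahead lets overlapping occurrences be counted as well.
        pattern = re.compile(
            "(?=" + re.escape(left_half) + "[ACGT]{%d}" % gap
            + re.escape(right_half) + ")"
        )
        for seq in sequences:
            counts[gap] += len(pattern.findall(seq.upper()))
    return counts


# Toy sequences (made up), each containing the half sites at a different gap.
toy_sequences = ["TTGGTCAAAATGACCTT", "GGTCACTGACC", "AAGGTCATGACCAA"]
counts = gap_length_counts(toy_sequences, "GGTCA", "TGACC")
for gap in sorted(counts):
    print(gap, counts[gap])
```

    Comparing the per-gap counts against a suitable background would be the next step in deciding whether a non-canonical gap is significant; that statistical test is left out of this sketch.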

  7. RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets.

    PubMed

    Thomas-Chollier, Morgane; Herrmann, Carl; Defrance, Matthieu; Sand, Olivier; Thieffry, Denis; van Helden, Jacques

    2012-02-01

    ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restrictions. We present peak-motifs, a computational pipeline that discovers motifs in peak sequences, compares them with databases, exports putative binding sites for visualization in the UCSC genome browser and generates an extensive report suited for both naive and expert users. It relies on time- and memory-efficient algorithms enabling the treatment of several thousand peaks within minutes. Regarding time efficiency, peak-motifs outperforms all comparable tools by several orders of magnitude. We demonstrate its accuracy by analyzing data sets ranging from 4000 to 128,000 peaks for 12 embryonic stem cell-specific transcription factors. In all cases, the program finds the expected motifs and returns additional motifs potentially bound by cofactors. We further apply peak-motifs to discover tissue-specific motifs in peak collections for the p300 transcriptional co-activator. To our knowledge, peak-motifs is the only tool that performs a complete motif analysis and offers a user-friendly web interface without any restriction on sequence size or number of peaks.
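
    The general idea of discovering motifs as overrepresented words in peak sequences can be sketched in a few lines of Python. This is a crude frequency-ratio illustration under stated assumptions, not the algorithm implemented in peak-motifs: the function names, the pseudocount smoothing and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every overlapping k-mer in a list of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enrichment(peaks, background, k=6, pseudocount=1.0):
    """Rank k-mers seen in peaks by smoothed peak/background frequency ratio."""
    fg = kmer_counts(peaks, k)
    bg = kmer_counts(background, k)
    # Additive smoothing over all 4**k possible DNA k-mers.
    fg_denominator = sum(fg.values()) + pseudocount * 4 ** k
    bg_denominator = sum(bg.values()) + pseudocount * 4 ** k
    ratios = {}
    for kmer, n in fg.items():
        fg_freq = (n + pseudocount) / fg_denominator
        bg_freq = (bg[kmer] + pseudocount) / bg_denominator
        ratios[kmer] = fg_freq / bg_freq
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Toy data (made up): GAATTC occurs in every "peak" and never in background.
toy_peaks = ["ACGTGGAATTCC", "TTGAATTCAA", "GAATTCGG"]
toy_background = ["ACGTACGTACGT", "TTTTTTTTTTTT"]
print(enrichment(toy_peaks, toy_background)[0][0])
```

    A real analysis would replace the raw ratio with a significance test and merge overlapping words into position-specific matrices; both steps are omitted here.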

  8. Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas.

    PubMed

    Petrov, Anton I; Zirbel, Craig L; Leontis, Neocles B

    2013-10-01

    The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson-Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access.

  9. Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas

    PubMed Central

    Petrov, Anton I.; Zirbel, Craig L.; Leontis, Neocles B.

    2013-01-01

    The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access. PMID:23970545

  10. Identification of a putative nuclear export signal motif in human NANOG homeobox domain

    SciTech Connect

    Park, Sung-Won; Do, Hyun-Jin; Huh, Sun-Hyung; Sung, Boreum; Uhm, Sang-Jun; Song, Hyuk; Kim, Nam-Hyung; Kim, Jae-Hwan

    2012-05-11

    Highlights: ► We found the putative nuclear export signal motif within human NANOG homeodomain. ► Leucine-rich residues are important for human NANOG homeodomain nuclear export. ► CRM1-specific inhibitor LMB blocked the potent human NANOG NES-mediated nuclear export. Abstract: NANOG is a homeobox-containing transcription factor that plays an important role in pluripotent stem cells and tumorigenic cells. To understand how nuclear localization of human NANOG is regulated, the NANOG sequence was examined and a leucine-rich nuclear export signal (NES) motif (MQELSNILNL, residues 125-134) was found in the homeodomain (HD). To functionally validate the putative NES motif, deletion and site-directed mutants were fused to an EGFP expression vector and transfected into COS-7 cells, and the localization of the proteins was examined. While hNANOG HD exclusively localized to the nucleus, a mutant with both NLSs deleted and only the putative NES motif contained (hNANOG HD-ΔNLSs) was predominantly cytoplasmic, as observed by nucleo/cytoplasmic fractionation and Western blot analysis as well as confocal microscopy. Furthermore, site-directed mutagenesis of the putative NES motif in a partial hNANOG HD only containing either one of the two NLS motifs led to localization in the nucleus, suggesting that the NES motif may play a functional role in nuclear export. Furthermore, CRM1-specific nuclear export inhibitor LMB blocked the hNANOG potent NES-mediated export, suggesting that the leucine-rich motif may function in CRM1-mediated nuclear export of hNANOG. Collectively, a NES motif is present in the hNANOG HD and may be functionally involved in CRM1-mediated nuclear export pathway.

  11. Network motif identification in stochastic networks

    NASA Astrophysics Data System (ADS)

    Jiang, Rui; Tu, Zhidong; Chen, Ting; Sun, Fengzhu

    2006-06-01

    Network motifs have been identified in a wide range of networks across many scientific disciplines and are suggested to be the basic building blocks of most complex networks. Nonetheless, many networks come with intrinsic and/or experimental uncertainties and should be treated as stochastic networks. The building blocks in these networks thus may also have stochastic properties. In this article, we study stochastic network motifs derived from families of mutually similar but not necessarily identical patterns of interconnections. We establish a finite mixture model for stochastic networks and develop an expectation-maximization algorithm for identifying stochastic network motifs. We apply this approach to the transcriptional regulatory networks of Escherichia coli and Saccharomyces cerevisiae, as well as the protein-protein interaction networks of seven species, and identify several stochastic network motifs that are consistent with current biological knowledge. Keywords: expectation-maximization algorithm; mixture model; transcriptional regulatory network; protein-protein interaction network
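
    As background for the abstract above, a network motif is a small pattern of interconnections counted across a larger network. The Python sketch below counts one well-known pattern in directed networks, the feed-forward loop (A regulates B, B regulates C, and A also regulates C). It is a deterministic, brute-force count chosen purely for illustration; it does not implement the mixture model or the expectation-maximization algorithm of the paper, and the function name and toy edge list are hypothetical.

```python
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count ordered node triples (a, b, c) with edges a->b, b->c and a->c.

    Hypothetical helper for illustration; cubic in the number of nodes,
    so only suitable for small networks.
    """
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    count = 0
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set:
            count += 1
    return count


# Toy regulatory network (made up): X -> Y -> Z with a direct X -> Z shortcut.
toy_edges = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
print(count_feed_forward_loops(toy_edges))
```

    To judge whether such a count is unusually high, it is normally compared with counts from randomized networks that preserve basic properties such as node degrees.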

  12. iMotifs: an integrated sequence motif visualization and analysis environment

    PubMed Central

    Piipari, Matias; Down, Thomas A.; Saini, Harpreet; Enright, Anton; Hubbard, Tim J.P.

    2010-01-01

    Motivation: Short sequence motifs are an important class of models in molecular biology, used most commonly for describing transcription factor binding site specificity patterns. High-throughput methods have been recently developed for detecting regulatory factor binding sites in vivo and in vitro and consequently high-quality binding site motif data are becoming available for increasing number of organisms and regulatory factors. Development of intuitive tools for the study of sequence motifs is therefore important. iMotifs is a graphical motif analysis environment that allows visualization of annotated sequence motifs and scored motif hits in sequences. It also offers motif inference with the sensitive NestedMICA algorithm, as well as overrepresentation and pairwise motif matching capabilities. All of the analysis functionality is provided without the need to convert between file formats or learn different command line interfaces. The application includes a bundled and graphically integrated version of the NestedMICA motif inference suite that has no outside dependencies. Problems associated with local deployment of software are therefore avoided. Availability: iMotifs is licensed with the GNU Lesser General Public License v2.0 (LGPL 2.0). The software and its source is available at http://wiki.github.com/mz2/imotifs and can be run on Mac OS X Leopard (Intel/PowerPC). We also provide a cross-platform (Linux, OS X, Windows) LGPL 2.0 licensed library libxms for the Perl, Ruby, R and Objective-C programming languages for input and output of XMS formatted annotated sequence motif set files. Contact: matias.piipari@gmail.com; imotifs@googlegroups.com PMID:20106815

  13. New type of starch-binding domain: the direct repeat motif in the C-terminal region of Bacillus sp. no. 195 alpha-amylase contributes to starch binding and raw starch degrading.

    PubMed Central

    Sumitani, J; Tottori, T; Kawaguchi, T; Arai, M

    2000-01-01

    The alpha-amylase from Bacillus sp. no. 195 (BAA) consists of two domains: one is the catalytic domain similar to alpha-amylases from animals and Streptomyces in the N-terminal region; the other is the functionally unknown domain composed of an approx. 90-residue direct repeat in the C-terminal region. The gene coding for BAA was expressed in Streptomyces lividans TK24. Three active forms of the gene products were found. The pH and thermal profiles of BAAs, and their catalytic activities for p-nitrophenyl maltopentaoside and soluble starch, showed almost the same behaviours. The largest, 69 kDa, form (BAA-alpha) was of the same molecular mass as that of the mature protein estimated from the nucleotide sequence, and had raw-starch-binding and -degrading abilities. The second largest, 60 kDa, form (BAA-beta), whose molecular mass was the same as that of the natural enzyme from Bacillus sp. no. 195, was generated by proteolytic processing between the two repeat sequences in the C-terminal region, and had lower activities for raw starch binding and degrading than those of BAA-alpha. The smallest, 50 kDa, form (BAA-gamma) contained only the N-terminal catalytic domain as a result of removal of the C-terminal repeat sequence, which led to loss of binding and degradation of insoluble starches. Thus the starch adsorption capacity and raw-starch-degrading activity of BAAs depends on the existence of the repeat sequence in the C-terminal region. BAA-alpha was specifically adsorbed on starch or dextran (alpha-1,4 or alpha-1,6 glucan), and specifically desorbed with maltose or beta-cyclodextrin. These observations indicated that the repeat sequence of the enzyme was functional in the starch-binding domain (SBD). We propose the designation of the homologues to the SBD of glucoamylase from Aspergillus niger as family I SBDs, the homologues to that of glucoamylase from Rhizopus oryzae as family II, and the homologues of this repeat sequence of BAA as family III. PMID:10947962

  14. New type of starch-binding domain: the direct repeat motif in the C-terminal region of Bacillus sp. no. 195 alpha-amylase contributes to starch binding and raw starch degrading.

    PubMed

    Sumitani, J; Tottori, T; Kawaguchi, T; Arai, M

    2000-09-01

    The alpha-amylase from Bacillus sp. no. 195 (BAA) consists of two domains: one is the catalytic domain similar to alpha-amylases from animals and Streptomyces in the N-terminal region; the other is the functionally unknown domain composed of an approx. 90-residue direct repeat in the C-terminal region. The gene coding for BAA was expressed in Streptomyces lividans TK24. Three active forms of the gene products were found. The pH and thermal profiles of BAAs, and their catalytic activities for p-nitrophenyl maltopentaoside and soluble starch, showed almost the same behaviours. The largest, 69 kDa, form (BAA-alpha) was of the same molecular mass as that of the mature protein estimated from the nucleotide sequence, and had raw-starch-binding and -degrading abilities. The second largest, 60 kDa, form (BAA-beta), whose molecular mass was the same as that of the natural enzyme from Bacillus sp. no. 195, was generated by proteolytic processing between the two repeat sequences in the C-terminal region, and had lower activities for raw starch binding and degrading than those of BAA-alpha. The smallest, 50 kDa, form (BAA-gamma) contained only the N-terminal catalytic domain as a result of removal of the C-terminal repeat sequence, which led to loss of binding and degradation of insoluble starches. Thus the starch adsorption capacity and raw-starch-degrading activity of BAAs depends on the existence of the repeat sequence in the C-terminal region. BAA-alpha was specifically adsorbed on starch or dextran (alpha-1,4 or alpha-1,6 glucan), and specifically desorbed with maltose or beta-cyclodextrin. These observations indicated that the repeat sequence of the enzyme was functional in the starch-binding domain (SBD). We propose the designation of the homologues to the SBD of glucoamylase from Aspergillus niger as family I SBDs, the homologues to that of glucoamylase from Rhizopus oryzae as family II, and the homologues of this repeat sequence of BAA as family III.

  15. Distal Regions of the Human IFNG Locus Direct Cell Type-Specific Expression

    PubMed Central

    Collins, Patrick L.; Chang, Shaojing; Henderson, Melodie; Soutto, Mohammed; Davis, Georgia M.; McLoed, Allyson G.; Townsend, Michael J.; Glimcher, Laurie H.; Mortlock, Douglas P.; Aune, Thomas M.

    2010-01-01

    Genes, such as IFNG, which are expressed in multiple cell lineages of the immune system, may employ a common set of regulatory elements to direct transcription in multiple cell types or individual regulatory elements to direct expression in individual cell lineages. By employing a bacterial artificial chromosome transgenic system, we demonstrate that IFNG employs unique regulatory elements to achieve lineage-specific transcriptional control. Specifically, a 1-kb element 30 kb upstream of IFNG activates transcription in T cells and NKT cells but not in NK cells. This distal regulatory element is a Runx3 binding site in Th1 cells and is needed for RNA polymerase II recruitment to IFNG, but it is not absolutely required for histone acetylation of the IFNG locus. These results support a model whereby IFNG utilizes cis-regulatory elements with cell type-restricted function. PMID:20574006

  16. Caveats in modeling a common motif in genetic circuits

    NASA Astrophysics Data System (ADS)

    Labavić, Darka; Nagel, Hannes; Janke, Wolfhard; Meyer-Ortmanns, Hildegard

    2013-06-01

    From a coarse-grained perspective, the motif of a self-activating species, activating a second species that acts as its own repressor, is widely found in biological systems, in particular in genetic systems with inherent oscillatory behavior. Here we consider a specific realization of this motif as a genetic circuit, termed the bistable frustrated unit, in which genes are described as directly producing proteins. Upon an improved resolution in time, we focus on the effect that inherent time scales on the underlying scale can have on the bifurcation patterns on a coarser scale. Time scales are set by the binding and unbinding rates of the transcription factors to the promoter regions of the genes. Depending on the ratio of these rates to the decay times of both proteins, the appropriate averaging procedure for obtaining a coarse-grained description changes and leads to sets of deterministic equations, which considerably differ in their bifurcation structure. In particular, the desired intermediate range of regular limit cycles fades away when the binding rates of genes are not fast as compared to the decay time of the proteins. Our analysis illustrates that the common topology of the widely found motif alone does not imply universal features in the dynamics.

  17. Characteristic motifs for families of allergenic proteins

    PubMed Central

    Ivanciuc, Ovidiu; Garcia, Tzintzuni; Torres, Miguel; Schein, Catherine H.; Braun, Werner

    2008-01-01

    The identification of potential allergenic proteins is usually done by scanning a database of allergenic proteins and locating known allergens with a high sequence similarity. However, there is no universally accepted cut-off value for sequence similarity to indicate potential IgE cross-reactivity. Further, overall sequence similarity may be less important than discrete areas of similarity in proteins with homologous structure. To identify such areas, we first classified all allergens and their subdomains in the Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP/) to their closest protein families as defined in Pfam, and identified conserved physicochemical property motifs characteristic of each group of sequences. Allergens populate only a small subset of all known Pfam families, as all allergenic proteins in SDAP could be grouped to only 130 (of 9318 total) Pfams, and 31 families contain more than four allergens. Conserved physicochemical property motifs for the aligned sequences of the most populated Pfam families were identified with the PCPMer program suite and catalogued in the webserver Motif-Mate (http://born.utmb.edu/motifmate/summary.php). We also determined specific motifs for allergenic members of a family that could distinguish them from non-allergenic ones. These allergen specific motifs should be most useful in database searches for potential allergens. We found that sequence motifs unique to the allergens in three families (seed storage proteins, Bet v 1, and tropomyosin) overlap with known IgE epitopes, thus providing evidence that our motif based approach can be used to assess the potential allergenicity of novel proteins. PMID:18951633

  18. Modeling gene regulatory network motifs using statecharts

    PubMed Central

    2012-01-01

    Background Gene regulatory networks are widely used by biologists to describe the interactions among genes, proteins and other components at the intra-cellular level. Recently, a great effort has been devoted to give gene regulatory networks a formal semantics based on existing computational frameworks. For this purpose, we consider Statecharts, which are a modular, hierarchical and executable formal model widely used to represent software systems. We use Statecharts for modeling small and recurring patterns of interactions in gene regulatory networks, called motifs. Results We present an improved method for modeling gene regulatory network motifs using Statecharts and we describe the successful modeling of several motifs, including those which could not be modeled or whose models could not be distinguished using the method of a previous proposal. We model motifs in an easy and intuitive way by taking advantage of the visual features of Statecharts. Our modeling approach is able to simulate some interesting temporal properties of gene regulatory network motifs: the delay in the activation and the deactivation of the "output" gene in the coherent type-1 feedforward loop, the pulse in the incoherent type-1 feedforward loop, the bistability nature of double positive and double negative feedback loops, the oscillatory behavior of the negative feedback loop, and the "lock-in" effect of positive autoregulation. Conclusions We present a Statecharts-based approach for the modeling of gene regulatory network motifs in biological systems. The basic motifs used to build more complex networks (that is, simple regulation, reciprocal regulation, feedback loop, feedforward loop, and autoregulation) can be faithfully described and their temporal dynamics can be analyzed. PMID:22536967

  19. A Combinatorial Code for Splicing Silencing: UAGG and GGGG Motifs

    PubMed Central

    An, Ping; Burge, Christopher B

    2005-01-01

    Alternative pre-mRNA splicing is widely used to regulate gene expression by tuning the levels of tissue-specific mRNA isoforms. Few regulatory mechanisms are understood at the level of combinatorial control despite numerous sequences, distinct from splice sites, that have been shown to play roles in splicing enhancement or silencing. Here we use molecular approaches to identify a ternary combination of exonic UAGG and 5′-splice-site-proximal GGGG motifs that functions cooperatively to silence the brain-region-specific CI cassette exon (exon 19) of the glutamate NMDA R1 receptor (GRIN1) transcript. Disruption of three components of the motif pattern converted the CI cassette into a constitutive exon, while predominant skipping was conferred when the same components were introduced, de novo, into a heterologous constitutive exon. Predominant exon silencing was directed by the motif pattern in the presence of six competing exonic splicing enhancers, and this effect was retained after systematically repositioning the two exonic UAGGs within the CI cassette. In this system, hnRNP A1 was shown to mediate silencing while hnRNP H antagonized silencing. Genome-wide computational analysis combined with RT-PCR testing showed that a class of skipped human and mouse exons can be identified by searches that preserve the sequence and spatial configuration of the UAGG and GGGG motifs. This analysis suggests that the multi-component silencing code may play an important role in the tissue-specific regulation of the CI cassette exon, and that it may serve more generally as a molecular language to allow for intricate adjustments and the coordination of splicing patterns from different genes. PMID:15828859

  20. A motif rich in charged residues determines product specificity in isomaltulose synthase.

    PubMed

    Zhang, Daohai; Li, Nan; Swaminathan, Kunchithapadam; Zhang, Lian Hui

    2003-01-16

    Isomaltulose synthase (PalI) catalyzes hydrolysis of sucrose and formation of alpha-1,6 and alpha-1,1 bonds to produce isomaltulose (alpha-D-glucopyranosyl-1,6-D-fructofuranose) and a small amount of trehalulose (alpha-D-glucopyranosyl-1,1-D-fructofuranose). A potential isomaltulose synthase-specific motif ((325)RLDRD(329)), which contains a 'DxD' motif conserved in many glycosyltransferases, was identified based on sequence comparison with reference to the secondary structural features of PalI and homologs. Site-directed mutagenesis analysis of the motif showed that the four charged amino acid residues (Arg(325), Arg(328), Asp(327) and Asp(329)) influence the enzyme kinetics and determine the product specificity. Mutation of these four residues increased trehalulose formation by 17-61% and decreased isomaltulose by 26-67%. We conclude that the 'RLDRD' motif controls the product specificity of PalI.

  1. WildSpan: mining structured motifs from protein sequences

    PubMed Central

    2011-01-01

    Background Automatic extraction of motifs from biological sequences is an important research problem in the study of molecular biology. For proteins, it is desired to discover sequence motifs containing a large number of wildcard symbols, as the residues associated with functional sites are usually largely separated in sequences. Discovering such patterns is time-consuming because abundant combinations exist when long gaps (a gap consists of one or more successive wildcards) are considered. Mining algorithms often employ constraints to narrow down the search space in order to increase efficiency. However, improper constraint models might degrade the sensitivity and specificity of the motifs discovered by computational methods. We previously proposed a new constraint model to handle large wildcard regions for discovering functional motifs of proteins. The patterns that satisfy the proposed constraint model are called W-patterns. A W-pattern is a structured motif that groups motif symbols into pattern blocks interleaved with large irregular gaps. Considering large gaps reflects the fact that functional residues are not always from a single region of protein sequences, and restricting motif symbols into clusters corresponds to the observation that short motifs are frequently present within protein families. To efficiently discover W-patterns for large-scale sequence annotation and function prediction, this paper first formally introduces the problem to solve and proposes an algorithm named WildSpan (sequential pattern mining across large wildcard regions) that incorporates several pruning strategies to largely reduce the mining cost. Results WildSpan is shown to efficiently find W-patterns containing conserved residues that are far separated in sequences. We conducted experiments with two mining strategies, protein-based and family-based mining, to evaluate the usefulness of W-patterns and performance of WildSpan. The protein-based mining mode of WildSpan is developed for

  2. Calendar motifs on Getashen hydria

    NASA Astrophysics Data System (ADS)

    Vrtanesyan, Garegin

    2015-07-01

    The Getashen hydria was found in tombs of the Middle Bronze Age (the first third of the second millennium B.C.) in Armenia (Lake Sevan). It shows a scene consisting of three friezes. The lower frieze depicts six zoomorphic figures, the middle frieze six waterfowl, and the top frieze graphic signs. The calendar motifs of this composition have a numeric expression: six zoomorphic figures on each of the lower and middle friezes. Division of the annual cycle into two parts is known from the calendars of the ancient Indo-Iranians (the "great summer" and the "great winter"). The animals on the lower frieze mark the second, "winter" road of the Sun, because the most important events ensuring the reproduction of the society's economy occur in this period: the rut of ungulates, both wild (deer) and domestic (goats). Moreover, the rut of goats ends in December, almost coinciding with the onset of the winter solstice. A pair of dogs on the lower frieze marks a version of the myth of the hero imprisoned in the rock, the Sun (Mihr - Artavazd), whose dogs have to chew through his chains, anticipating his exit at the winter solstice. This is indicated by the direction of their movement: the Sun moves from left to right for an observer only when it is on the south side of the sky (i.e., beginning with the autumnal equinox). The most important event of the "summer road" of the Sun is the vernal equinox, which coincides with the arrival of waterfowl (ducks, geese). Their direction on the second frieze (left to right) corresponds to the position of an observer facing north.

  3. CytoKavosh: A Cytoscape Plug-In for Finding Network Motifs in Large Biological Networks

    PubMed Central

    Razaghi Moghadam Kashani, Zahra; Salehzadeh-Yazdi, Ali; Khakabimamaghani, Sahand

    2012-01-01

    Network motifs are small connected sub-graphs that have recently attracted much attention as a means of discovering structural behaviors of large and complex networks. Finding motifs of any size is one of the most important problems in complex and large networks, and it requires fast and reliable algorithms and tools. CytoKavosh is one of the best choices for finding motifs of any given size in any complex network. It relies on a fast algorithm, Kavosh, which makes it faster than other existing tools. The Kavosh algorithm applies some well-known algorithmic features and includes tricky aspects that make it efficient in this field. CytoKavosh is a Cytoscape plug-in that finds motifs of a given size in a network (directed or undirected) that has already been loaded into the Cytoscape workspace. The high performance of CytoKavosh is achieved by dynamically linking highly optimized functions of Kavosh's C++ code to the Cytoscape Java program, which makes this plug-in suitable for analyzing large biological networks. Significant attributes of CytoKavosh are its efficiency in time and memory usage and the absence of any implementation-related limit on motif size. CytoKavosh is implemented in the visual environment Cytoscape, which makes it convenient for users to interact with and to create visual options for analyzing the structural behavior of a network. This plug-in can work on any given network, is very simple to use, and generates graphical results of discovered motifs with any required details. There is no Cytoscape plug-in dedicated to finding network motifs based on the original concept. We have therefore introduced CytoKavosh as the first such plug-in, and we hope that it can be improved to cover other options to make it the best motif-analyzing tool. PMID:22952659

  4. The Verrucomicrobia LexA-Binding Motif: Insights into the Evolutionary Dynamics of the SOS Response

    PubMed Central

    Erill, Ivan; Campoy, Susana; Kılıç, Sefa; Barbé, Jordi

    2016-01-01

    The SOS response is the primary bacterial mechanism to address DNA damage, coordinating multiple cellular processes that include DNA repair, cell division, and translesion synthesis. In contrast to other regulatory systems, the composition of the SOS genetic network and the binding motif of its transcriptional repressor, LexA, have been shown to vary greatly across bacterial clades, making it an ideal system to study the co-evolution of transcription factors and their regulons. Leveraging comparative genomics approaches and prior knowledge on the core SOS regulon, here we define the binding motif of the Verrucomicrobia, a recently described phylum of emerging interest due to its association with eukaryotic hosts. Site directed mutagenesis of the Verrucomicrobium spinosum recA promoter confirms that LexA binds a 14 bp palindromic motif with consensus sequence TGTTC-N4-GAACA. Computational analyses suggest that recognition of this novel motif is determined primarily by changes in base-contacting residues of the third alpha helix of the LexA helix-turn-helix DNA binding motif. In conjunction with comparative genomics analysis of the LexA regulon in the Verrucomicrobia phylum, electrophoretic shift assays reveal that LexA binds to operators in the promoter region of DNA repair genes and a mutagenesis cassette in this organism, and identify previously unreported components of the SOS response. The identification of tandem LexA-binding sites generating instances of other LexA-binding motifs in the lexA gene promoter of Verrucomicrobia species leads us to postulate a novel mechanism for LexA-binding motif evolution. This model, based on gene duplication, successfully addresses outstanding questions in the intricate co-evolution of the LexA protein, its binding motif and the regulatory network it controls. PMID:27489856
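
    The consensus quoted in this abstract, TGTTC-N4-GAACA, can be turned into a simple exact-match scan. The following is a minimal sketch only: the function names and the promoter fragment are hypothetical, and the authors' own analysis is not described here and presumably tolerates mismatches in ways an exact consensus search does not.

```python
import re

# Consensus quoted in the abstract: TGTTC-N4-GAACA (5 + 4 + 5 = 14 bp).
# A lookahead is used so that overlapping candidate sites are all reported.
LEXA_CONSENSUS = re.compile(r"(?=(TGTTC[ACGT]{4}GAACA))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_lexa_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched site) for every exact consensus match."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in LEXA_CONSENSUS.finditer(seq)]


# Hypothetical promoter fragment, made up for illustration only.
promoter = "AATTGTTCATATGAACAGGC"
sites = find_lexa_sites(promoter)
print(sites)
```

    Because the two arms are reverse complements of each other, an exact match on the reverse strand always shows up as a match at the same position on the forward strand, so scanning one strand is sufficient for this particular pattern.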

  5. The Motif of Meeting in Digital Education

    ERIC Educational Resources Information Center

    Sheail, Philippa

    2015-01-01

    This article draws on theoretical work which considers the composition of meetings, in order to think about the form of the meeting in digital environments for higher education. To explore the motif of meeting, I undertake a "compositional interpretation" (Rose, 2012) of the default interface offered by "Collaborate", an…

  6. Motifs and structural blocks retrieval by GHT

    NASA Astrophysics Data System (ADS)

    Cantoni, Virginio; Ferone, Alessio; Petrosino, Alfredo; Polat, Ozlem

    2014-06-01

    The structure of a protein gives more insight on the protein function than its amino acid sequence. Protein structure analysis and comparison are important for understanding the evolutionary relationships among proteins, predicting protein functions, and predicting protein folding. Proteins are formed by two basic regular 3D structural patterns, called Secondary Structures (SSs): helices and sheets. A structural motif is a compact 3D protein block referring to a small specific combination of secondary structural elements, which appears in a variety of molecules. In this paper we compare a few approaches for motif retrieval based on the Generalized Hough Transform (GHT). A primary technique is to adopt the single SS as structural primitives; alternatives are to adopt a SSs pair as primitive structural element, or a SSs triplet, and so on up-to an entire motif. The richer the primitive, the higher the time for pre-analysis and search, and the simpler the inspection process on the parameter space for analyzing the peaks. Performance comparisons, in terms of precision and computation time, are here presented considering the retrieval of motifs composed by three to five SSs for more than 15 million searches. The approach can be easily applied to the retrieval of greater blocks, up to protein domains, or even entire proteins.

  7. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data

    PubMed Central

    2014-01-01

    Abstract ChIP-Seq (chromatin immunoprecipitation sequencing) has provided an advantage for finding motifs, as ChIP-Seq experiments narrow down the motif finding to binding site locations. Recent motif finding tools facilitate motif detection by providing a user-friendly Web interface. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommend that users apply multiple motif finding Web tools that implement different algorithms to obtain significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provide suggestions for the future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data. Reviewers This article was reviewed by Prof. Sandor Pongor, Dr. Yuriy Gusev, and Dr. Shyam Prabhakar (nominated by Prof. Limsoon Wong). PMID:24555784

  8. Discovering interacting domains and motifs in protein-protein interactions.

    PubMed

    Hugo, Willy; Sung, Wing-Kin; Ng, See-Kiong

    2013-01-01

    Many important biological processes, such as the signaling pathways, require protein-protein interactions (PPIs) that are designed for fast response to stimuli. These interactions are usually transient, easily formed, and disrupted, yet specific. Many of these transient interactions involve the binding of a protein domain to a short stretch (3-10) of amino acid residues, which can be characterized by a sequence pattern, i.e., a short linear motif (SLiM). We call these interacting domains and motifs domain-SLiM interactions. Existing methods have focused on discovering SLiMs in the interacting proteins' sequence data. With the recent increase in protein structures, we have a new opportunity to detect SLiMs directly from the proteins' 3D structures instead of their linear sequences. In this chapter, we describe a computational method called SLiMDIet to directly detect SLiMs on domain interfaces extracted from 3D structures of PPIs. SLiMDIet comprises two steps: (1) interaction interfaces belonging to the same domain are extracted and grouped together using structural clustering and (2) the extracted interaction interfaces in each cluster are structurally aligned to extract the corresponding SLiM. Using SLiMDIet, de novo SLiMs interacting with protein domains can be computationally detected from structurally clustered domain-SLiM interactions for PFAM domains which have available 3D structures in the PDB database.

  9. CombiMotif: A new algorithm for network motifs discovery in protein-protein interaction networks

    NASA Astrophysics Data System (ADS)

    Luo, Jiawei; Li, Guanghui; Song, Dan; Liang, Cheng

    2014-12-01

    Discovering motifs in protein-protein interaction networks is becoming a current major challenge in computational biology, since the distribution of the number of network motifs can reveal significant systemic differences among species. However, this task can be computationally expensive because of the involvement of graph isomorphic detection. In this paper, we present a new algorithm (CombiMotif) that incorporates combinatorial techniques to count non-induced occurrences of subgraph topologies in the form of trees. The efficiency of our algorithm is demonstrated by comparing the obtained results with the current state-of-the art subgraph counting algorithms. We also show major differences between unicellular and multicellular organisms. The datasets and source code of CombiMotif are freely available upon request.
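
    To illustrate what counting non-induced occurrences of tree-shaped subgraphs by combinatorial means can look like, the sketch below counts two small trees directly from vertex degrees. It is not the CombiMotif algorithm, whose details are not given in this abstract; the function name and the example graph are hypothetical.

```python
from collections import Counter
from math import comb


def count_small_trees(edges):
    """Count non-induced 3-node paths and 4-node stars in a simple undirected graph.

    A 3-node path is fixed by its centre plus 2 of the centre's neighbours,
    and a 4-node star by its centre plus 3 neighbours, so both counts follow
    from the degree sequence alone. Extra edges among the chosen neighbours
    are ignored, which is what "non-induced" means here.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    paths = sum(comb(d, 2) for d in degree.values())
    stars = sum(comb(d, 3) for d in degree.values())
    return {"path3": paths, "star4": stars}


# Hypothetical toy network: a triangle 0-1-2 with a pendant vertex 3 on vertex 2.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
counts = count_small_trees(toy_edges)
print(counts)
```

    Larger tree topologies need more care, since the same occurrence can be reached from several starting vertices and has to be corrected for.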

  10. A Review of Functional Motifs Utilized by Viruses

    PubMed Central

    Sobhy, Haitham

    2016-01-01

    Short linear motifs (SLiM) are short peptides that facilitate protein function and protein-protein interactions. Viruses utilize these motifs to enter into the host, interact with cellular proteins, or egress from host cells. Studying functional motifs may help to predict protein characteristics, interactions, or the putative cellular role of a protein. In virology, it may reveal aspects of the virus tropism and help find antiviral therapeutics. This review highlights the recent understanding of functional motifs utilized by viruses. Special attention was paid to the function of proteins harboring these motifs, and viruses encoding these proteins. The review highlights motifs involved in (i) immune response and post-translational modifications (e.g., ubiquitylation, SUMOylation or ISGylation); (ii) virus-host cell interactions, including virus attachment, entry, fusion, egress and nuclear trafficking; (iii) virulence and antiviral activities; (iv) virion structure; and (v) low-complexity regions (LCRs) or motifs enriched with residues (Xaa-rich motifs). PMID:28248213

  11. The Assembly Motif of a Bacterial Small Multidrug Resistance Protein*

    PubMed Central

    Poulsen, Bradley E.; Rath, Arianna; Deber, Charles M.

    2009-01-01

    Multidrug transporters such as the small multidrug resistance (SMR) family of bacterial integral membrane proteins are capable of conferring clinically significant resistance to a variety of common therapeutics. As antiporter proteins of ∼100 amino acids, SMRs must self-assemble into homo-oligomeric structures for efflux of drug molecules. Oligomerization centered at transmembrane helix four (TM4) has been implicated in SMR assembly, but the full complement of residues required to mediate its self-interaction remains to be characterized. Here, we use Hsmr, the 110-residue SMR family member of the archaebacterium Halobacterium salinarum, to determine the TM4 residue motif required to mediate drug resistance and SMR self-association. Twelve single point mutants that scan the central portion of the TM4 helix (residues 85–104) were constructed and were tested for their ability to confer resistance to the cytotoxic compound ethidium bromide. Six residues were found to be individually essential for drug resistance activity (Gly90, Leu91, Leu93, Ile94, Gly97, and Val98), defining a minimum activity motif of 90GLXLIXXGV98 within TM4. When the propensity of these mutants to dimerize on SDS-PAGE was examined, replacements of all but Ile resulted in ∼2-fold reduction of dimerization versus the wild-type antiporter. Our work defines a minimum activity motif of 90GLXLIXXGV98 within TM4 and suggests that this sequence mediates TM4-based SMR dimerization along a single helix surface, stabilized by a small residue heptad repeat sequence. These TM4-TM4 interactions likely constitute the highest affinity locus for disruption of SMR function by directly targeting its self-assembly mechanism. PMID:19224913
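
    The minimum activity motif reported above, GLXLIXXGV (residues 90 to 98, with X standing for any residue), maps naturally onto a regular expression. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and the test peptide is invented for illustration and is not the Hsmr sequence.

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif in which 'X' means 'any residue' into a regex."""
    body = "".join("." if ch == "X" else re.escape(ch) for ch in motif)
    # Lookahead so that overlapping occurrences are all reported.
    return re.compile(f"(?=({body}))")


def find_motif(sequence: str, motif: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched segment) for each occurrence."""
    pattern = motif_to_regex(motif)
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


# Invented peptide for illustration only; NOT the real Hsmr sequence.
peptide = "MAAGLALIAAGVLL"
hits = find_motif(peptide, "GLXLIXXGV")
print(hits)
```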

  12. Functional Motifs in Biochemical Reaction Networks

    PubMed Central

    Tyson, John J.; Novák, Béla

    2013-01-01

    The signal-response characteristics of a living cell are determined by complex networks of interacting genes, proteins, and metabolites. Understanding how cells respond to specific challenges, how these responses are contravened in diseased cells, and how to intervene pharmacologically in the decision-making processes of cells requires an accurate theory of the information-processing capabilities of macromolecular regulatory networks. Adopting an engineer’s approach to control systems, we ask whether realistic cellular control networks can be decomposed into simple regulatory motifs that carry out specific functions in a cell. We show that such functional motifs exist and review the experimental evidence that they control cellular responses as expected. PMID:20055671

  13. On the Kernelization Complexity of Colorful Motifs

    NASA Astrophysics Data System (ADS)

    Ambalath, Abhimanyu M.; Balasundaram, Radheshyam; Rao H., Chintan; Koppula, Venkata; Misra, Neeldhara; Philip, Geevarghese; Ramanujan, M. S.

    The Colorful Motif problem asks if, given a vertex-colored graph G, there exists a subset S of vertices of G such that the graph induced by G on S is connected and contains every color in the graph exactly once. The problem is motivated by applications in computational biology and is also well-studied from the theoretical point of view. In particular, it is known to be NP-complete even on trees of maximum degree three [Fellows et al, ICALP 2007]. In their pioneering paper that introduced the color-coding technique, Alon et al. [STOC 1995] show, inter alia, that the problem is FPT on general graphs. More recently, Cygan et al. [WG 2010] showed that Colorful Motif is NP-complete on comb graphs, a special subclass of the set of trees of maximum degree three. They also showed that the problem is not likely to admit polynomial kernels on forests.
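
    The problem statement above can be made concrete with a brute-force check: pick one vertex per color and test whether the induced subgraph is connected. This is only a sketch with hypothetical names and example data. It takes exponential time in the worst case and is unrelated to the color-coding or kernelization techniques the abstract refers to.

```python
from itertools import product


def is_connected(vertices, adj):
    """Check whether the subgraph induced on `vertices` is connected (DFS)."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v in vertices and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices


def colorful_motif(adj, color):
    """Return a connected vertex set using every color exactly once, or None."""
    by_color = {}
    for v, c in color.items():
        by_color.setdefault(c, []).append(v)
    if not by_color:
        return None
    # One vertex per color class: every candidate uses each color exactly once.
    for choice in product(*by_color.values()):
        if is_connected(choice, adj):
            return set(choice)
    return None


# Hypothetical example: the path a-b-c-d with colors red, blue, red, green.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
color = {"a": "red", "b": "blue", "c": "red", "d": "green"}
solution = colorful_motif(adj, color)
print(solution)
```

    Here {a, b, d} is rejected because d is only adjacent to c, while {b, c, d} is connected and colorful.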

  14. Sequential motif profile of natural visibility graphs

    NASA Astrophysics Data System (ADS)

    Iacovacci, Jacopo; Lacasa, Lucas

    2016-11-01

    The concept of sequential visibility graph motifs—subgraphs appearing with characteristic frequencies in the visibility graphs associated to time series—has been advanced recently along with a theoretical framework to compute analytically the motif profiles associated to horizontal visibility graphs (HVGs). Here we develop a theory to compute the profile of sequential visibility graph motifs in the context of natural visibility graphs (VGs). This theory gives exact results for deterministic aperiodic processes with a smooth invariant density or stochastic processes that fulfill the Markov property and have a continuous marginal distribution. The framework also allows for a linear time numerical estimation in the case of empirical time series. A comparison between the HVG and the VG case (including evaluation of their robustness for short series polluted with measurement noise) is also presented.
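
    For readers unfamiliar with visibility graphs, the sketch below builds the natural (VG) and horizontal (HVG) visibility graphs of a short series using the standard visibility criteria, then tallies the simplest sequential motif profile, that of three consecutive nodes. The quadratic-time construction, the function names, the "open"/"closed" labels and the example series are assumptions made for illustration; they are not the analytical theory or the linear-time estimator mentioned in the abstract.

```python
def natural_visibility_edges(y):
    """Edges (a, b), a < b, of the natural visibility graph of series y.

    Points a and b see each other if every intermediate point c lies
    strictly below the straight line joining (a, y[a]) and (b, y[b]).
    """
    n = len(y)
    return {
        (a, b)
        for a in range(n)
        for b in range(a + 1, n)
        if all(y[c] < y[b] + (y[a] - y[b]) * (b - c) / (b - a)
               for c in range(a + 1, b))
    }


def horizontal_visibility_edges(y):
    """Edges of the horizontal visibility graph: all intermediate points
    must lie strictly below both endpoints."""
    n = len(y)
    return {
        (a, b)
        for a in range(n)
        for b in range(a + 1, n)
        if all(y[c] < min(y[a], y[b]) for c in range(a + 1, b))
    }


def size3_motif_counts(n, edges):
    """Count 3-node sequential motifs over windows (i, i+1, i+2).

    Consecutive points are always linked, so the only freedom is whether
    i and i+2 are linked ("closed") or not ("open").
    """
    closed = sum((i, i + 2) in edges for i in range(n - 2))
    return {"open": (n - 2) - closed, "closed": closed}


# Hypothetical toy series.
series = [3, 1, 2, 4, 1]
vg = natural_visibility_edges(series)
hvg = horizontal_visibility_edges(series)
print(size3_motif_counts(len(series), vg), size3_motif_counts(len(series), hvg))
```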

  15. Chiral Alkyl Halides: Underexplored Motifs in Medicine

    PubMed Central

    Gál, Bálint; Bucher, Cyril; Burns, Noah Z.

    2016-01-01

    While alkyl halides are valuable intermediates in synthetic organic chemistry, their use as bioactive motifs in drug discovery and medicinal chemistry is rare in comparison. This is likely attributable to the common misconception that these compounds are merely non-specific alkylators in biological systems. A number of chlorinated compounds in the pharmaceutical and food industries, as well as a growing number of halogenated marine natural products showing unique bioactivity, illustrate the role that chiral alkyl halides can play in drug discovery. Through a series of case studies, we demonstrate in this review that these motifs can indeed be stable under physiological conditions, and that halogenation can enhance bioactivity through both steric and electronic effects. Our hope is that, by placing such compounds in the minds of the chemical community, they may gain more traction in drug discovery and inspire more synthetic chemists to develop methods for selective halogenation. PMID:27827902

  16. Anticipated synchronization in neuronal network motifs

    NASA Astrophysics Data System (ADS)

    Matias, F. S.; Gollo, L. L.; Carelli, P. V.; Copelli, M.; Mirasso, C. R.

    2013-01-01

    Two identical dynamical systems coupled unidirectionally (in a so-called master-slave configuration) exhibit anticipated synchronization (AS) if the one which receives the coupling (the slave) also receives a negative delayed self-feedback. In oscillatory neuronal systems AS is characterized by a phase-locking with negative time delay τ between the spikes of the master and of the slave (slave fires before the master), while in the usual delayed synchronization (DS) regime τ is positive (slave fires after the master). A 3-neuron motif in which the slave self-feedback is replaced by a feedback loop mediated by an interneuron can exhibit both AS and DS regimes. Here we show that AS is robust in the presence of noise in a motif of 3 Hodgkin-Huxley-type neurons. We also show that AS is stable for large values of τ in a chain of connected slaves-interneurons.

  17. Analyzing network reliability using structural motifs.

    PubMed

    Khorramzadeh, Yasamin; Youssef, Mina; Eubank, Stephen; Mowlaei, Shahir

    2015-04-01

    This paper uses the reliability polynomial, introduced by Moore and Shannon in 1956, to analyze the effect of network structure on diffusive dynamics such as the spread of infectious disease. We exhibit a representation for the reliability polynomial in terms of what we call structural motifs that is well suited for reasoning about the effect of a network's structural properties on diffusion across the network. We illustrate by deriving several general results relating graph structure to dynamical phenomena.
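
    As a concrete illustration of a reliability polynomial, the sketch below enumerates all edge subsets of a small graph and counts those that keep it connected, giving R(x) = sum_k N_k x^k (1-x)^(m-k) when each of the m edges is kept independently with probability x. The choice of "all vertices connected" as the reliability criterion, the function names and the triangle example are assumptions; the paper's structural-motif representation is not reproduced here, and the enumeration is only feasible for very small graphs.

```python
from itertools import combinations


def spans_connected(n, edges):
    """True if the given edges connect all n vertices (labelled 0..n-1)."""
    parent = list(range(n))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1


def reliability_counts(n, edges):
    """N_k = number of k-edge subgraphs that keep all n vertices connected."""
    m = len(edges)
    return [
        sum(spans_connected(n, subset) for subset in combinations(edges, k))
        for k in range(m + 1)
    ]


def reliability(n, edges, x):
    """Evaluate R(x) = sum_k N_k * x^k * (1 - x)^(m - k)."""
    m = len(edges)
    return sum(
        count * x**k * (1 - x) ** (m - k)
        for k, count in enumerate(reliability_counts(n, edges))
    )


# Hypothetical example: a triangle, for which R(x) = 3x^2(1 - x) + x^3.
triangle = [(0, 1), (1, 2), (0, 2)]
print(reliability_counts(3, triangle), reliability(3, triangle, 0.5))
```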

  18. Dynamic motifs in socio-economic networks

    NASA Astrophysics Data System (ADS)

    Zhang, Xin; Shao, Shuai; Stanley, H. Eugene; Havlin, Shlomo

    2014-12-01

    Socio-economic networks are of central importance in economic life. We develop a method of identifying and studying motifs in socio-economic networks by focusing on “dynamic motifs,” i.e., evolutionary connection patterns that, because of “node acquaintances” in the network, occur much more frequently than random patterns. We examine two evolving bipartite networks: i) the world-wide commercial ship chartering market and ii) the ship build-to-order market. We find similar dynamic motifs in both bipartite networks, even though they describe different economic activities. We also find that “influence” and “persistence” are strong factors in the interaction behavior of organizations. When two companies are doing business with the same customer, it is highly probable that another customer who currently has a business relationship with only one of these two companies will become a customer of the second in the future. This is the effect of influence. Persistence means that companies with close business ties to customers tend to maintain their relationships over a long period of time.

  19. An E-box/M-CAT hybrid motif and cognate binding protein(s) regulate the basal muscle-specific and cAMP-inducible expression of the rat cardiac alpha-myosin heavy chain gene.

    PubMed

    Gupta, M P; Gupta, M; Zak, R

    1994-11-25

    Expression of the cardiac myosin heavy chain (MHC) genes is regulated developmentally and by numerous epigenetic factors. Here we report the identification of a cis-regulatory element and cognate nuclear binding protein(s) responsible for cAMP-induced expression of the rat cardiac alpha-MHC gene. By Northern blot analysis, we found that, in primary cultures of fetal rat heart myocytes, the elevation of intracellular levels of cAMP results in up-regulation of alpha-MHC and down-regulation of beta-MHC mRNA expression. This effect of cAMP was dependent upon the basal level of expression of both MHC transcripts and was sensitive to cycloheximide. In transient expression analysis employing a series of alpha-MHC/CAT constructs, we identified a 31-base pair fragment located in the immediate upstream region (-71 to -40), which confers both muscle-specific and cAMP-inducible expression of the gene. Within this 31-base pair fragment there are two regions, an AT-rich portion and a hybrid motif which contains overlapping sequences of E-box and M-CAT binding sites (GGCACGTGGAATG). By substitution mutation analysis, both elements were found important for the basal muscle-specific expression; however, the cAMP-inducible expression of the gene is conferred only by the E-box/M-CAT hybrid motif (EM element). Using mobility gel shift competition assay, immunoblotting, and UV-cross-linking analyses, we found that a protein binding to the EM element is indistinguishable from the transcription enhancer factor-1 (TEF-1) in terms of sequence recognition, molecular mass, and immunoreactivity. Methylation interference and point mutation analyses indicate that, besides M-CAT sequences, center CG dinucleotides of the E-box motif CACGTG are essential for protein binding to the EM element and for its functional activity. Furthermore, our data also show that, in addition to TEF-1, another HF-1a-related factor may be recognized by the alpha-MHC gene EM element. These results are first to

  20. Synchronization patterns: from network motifs to hierarchical networks

    NASA Astrophysics Data System (ADS)

    Krishnagopal, Sanjukta; Lehnert, Judith; Poel, Winnie; Zakharova, Anna; Schöll, Eckehard

    2017-03-01

    We investigate complex synchronization patterns such as cluster synchronization and partial amplitude death in networks of coupled Stuart-Landau oscillators with fractal connectivities. The study of fractal or self-similar topology is motivated by the network of neurons in the brain. This fractal property is well represented in hierarchical networks, for which we present three different models. In addition, we introduce an analytical eigensolution method and provide a comprehensive picture of the interplay of network topology and the corresponding network dynamics, thus allowing us to predict the dynamics of arbitrarily large hierarchical networks simply by analysing small network motifs. We also show that oscillation death can be induced in these networks, even if the coupling is symmetric, contrary to previous understanding of oscillation death. Our results show that there is a direct correlation between topology and dynamics: hierarchical networks exhibit the corresponding hierarchical dynamics. This helps bridge the gap between mesoscale motifs and macroscopic networks. This article is part of the themed issue 'Horizons of cybernetical physics'.

  1. Synchronization patterns: from network motifs to hierarchical networks.

    PubMed

    Krishnagopal, Sanjukta; Lehnert, Judith; Poel, Winnie; Zakharova, Anna; Schöll, Eckehard

    2017-03-06

    We investigate complex synchronization patterns such as cluster synchronization and partial amplitude death in networks of coupled Stuart-Landau oscillators with fractal connectivities. The study of fractal or self-similar topology is motivated by the network of neurons in the brain. This fractal property is well represented in hierarchical networks, for which we present three different models. In addition, we introduce an analytical eigensolution method and provide a comprehensive picture of the interplay of network topology and the corresponding network dynamics, thus allowing us to predict the dynamics of arbitrarily large hierarchical networks simply by analysing small network motifs. We also show that oscillation death can be induced in these networks, even if the coupling is symmetric, contrary to previous understanding of oscillation death. Our results show that there is a direct correlation between topology and dynamics: hierarchical networks exhibit the corresponding hierarchical dynamics. This helps bridge the gap between mesoscale motifs and macroscopic networks. This article is part of the themed issue 'Horizons of cybernetical physics'.

  2. Occurrence probability of structured motifs in random sequences.

    PubMed

    Robin, S; Daudin, J-J; Richard, H; Sagot, M-F; Schbath, S

    2002-01-01

    The problem of extracting from a set of nucleic acid sequences motifs which may have biological function is more and more important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrences of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations.
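
    The structured motif described above (two ordered parts separated by a variable distance, with substitutions allowed) can be illustrated with a small scanning sketch. This is not the authors' probability approximation; it only enumerates occurrences. The function names, the per-box substitution limit, and the choice of the E. coli -35/-10 consensus boxes as the example are assumptions made for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_structured_motif(seq, left, right, min_gap, max_gap, max_subs):
    """Return (start, gap) pairs where `left`, a spacer of `gap` bases,
    and `right` occur in order, each box with at most `max_subs` mismatches.
    All parameter names are hypothetical, chosen for this sketch."""
    hits = []
    n = len(seq)
    for i in range(n - len(left) + 1):
        if hamming(seq[i:i + len(left)], left) > max_subs:
            continue
        for gap in range(min_gap, max_gap + 1):
            j = i + len(left) + gap          # start of the right-hand box
            if j + len(right) > n:
                break
            if hamming(seq[j:j + len(right)], right) <= max_subs:
                hits.append((i, gap))
    return hits


# Toy sequence: a -35-like box, a 17-base spacer, then a -10-like box.
toy = "CC" + "TTGACA" + "G" * 17 + "TATAAT" + "CC"
print(find_structured_motif(toy, "TTGACA", "TATAAT", 15, 19, 0))
```

    Counting such occurrences in real sequences is the quantity whose statistical significance the abstract's approximations are meant to assess.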

  3. Alanine substitutions of noncysteine residues in the cysteine-stabilized αβ motif

    PubMed Central

    Yang, Ying-Fang; Cheng, Kuo-Chang; Tsai, Ping-Hsing; Liu, Chung-Cheng; Lee, Tian-Ren; Lyu, Ping-Chiang

    2009-01-01

    The protein scaffold is a peptide framework with a high tolerance of residue modifications. The cysteine-stabilized αβ motif (CSαβ) consists of an α-helix and an antiparallel triple-stranded β-sheet connected by two disulfide bridges. Proteins containing this motif share low sequence identity but high structural similarity, and the motif has been suggested as a good scaffold for protein engineering. The Vigna radiata defensin 1 (VrD1), a plant defensin, serves here as a model protein to probe the amino acid tolerance of the CSαβ motif. A systematic alanine substitution is performed on VrD1. The key residues governing the inhibitory function and structural stability are monitored. Thirty-two of 46 residue positions of VrD1 are altered by site-directed mutagenesis techniques. The circular dichroism spectrum, intrinsic fluorescence spectrum, and chemical denaturation are used to analyze the conformation and structural stability of the proteins. The secondary structures were highly tolerant to the amino acid substitutions; however, protein stability varied from mutant to mutant. Many mutants, although they maintained their conformations, showed significantly altered inhibitory function. In this study, we report the first alanine scan of a plant defensin containing the CSαβ motif. The information is valuable for scaffolds based on the CSαβ motif and for protein engineering. PMID:19533758

  4. DNA motifs determining the efficiency of adaptation into the Escherichia coli CRISPR array.

    PubMed

    Yosef, Ido; Shitrit, Dror; Goren, Moran G; Burstein, David; Pupko, Tal; Qimron, Udi

    2013-08-27

    Clustered regularly interspaced short palindromic repeats (CRISPR) and their associated proteins constitute a recently identified prokaryotic defense system against invading nucleic acids. DNA segments, termed protospacers, are integrated into the CRISPR array in a process called adaptation. Here, we establish a PCR-based assay that enables evaluating the adaptation efficiency of specific spacers into the type I-E Escherichia coli CRISPR array. Using this assay, we provide direct evidence that the protospacer adjacent motif along with the first base of the protospacer (5'-AAG) partially affect the efficiency of spacer acquisition. Remarkably, we identified a unique dinucleotide, 5'-AA, positioned at the 3' end of the spacer, that enhances efficiency of the spacer's acquisition. Insertion of this dinucleotide increased acquisition efficiency of two different spacers. DNA sequencing of newly adapted CRISPR arrays revealed that the position of the newly identified motif with respect to the 5'-AAG is important for affecting acquisition efficiency. Analysis of approximately 1 million spacers showed that this motif is overrepresented in frequently acquired spacers compared with those acquired rarely. Our results represent an example of a short nonprotospacer adjacent motif sequence that affects acquisition efficiency and suggest that other as yet unknown motifs affect acquisition efficiency in other CRISPR systems as well.

  5. CLIMP: Clustering Motifs via Maximal Cliques with Parallel Computing Design

    PubMed Central

    Chen, Yong

    2016-01-01

    A set of conserved binding sites recognized by a transcription factor is called a motif, which can be found by many applications of comparative genomics for identifying over-represented segments. Moreover, when numerous putative motifs are predicted from a collection of genome-wide data, their similarity data can be represented as a large graph, where these motifs are connected to one another. However, an efficient clustering algorithm is desired for clustering the motifs that belong to the same groups and separating the motifs that belong to different groups, or even deleting an amount of spurious ones. In this work, a new motif clustering algorithm, CLIMP, is proposed by using maximal cliques and sped up by parallelizing its program. When a synthetic motif dataset from the database JASPAR, a set of putative motifs from a phylogenetic foot-printing dataset, and a set of putative motifs from a ChIP dataset are used to compare the performances of CLIMP and two other high-performance algorithms, the results demonstrate that CLIMP mostly outperforms the two algorithms on the three datasets for motif clustering, so that it can be a useful complement of the clustering procedures in some genome-wide motif prediction pipelines. CLIMP is available at http://sqzhang.cn/climp.html. PMID:27487245
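
    The idea of representing motif similarity as a graph and grouping motifs via maximal cliques can be sketched as follows. This is not the CLIMP implementation, and it is not parallelized; it is a plain Bron-Kerbosch enumeration without pivoting. The motif identifiers and the edges of the toy similarity graph are made up for illustration.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion.
    `adj` maps each node to the set of its neighbours (undirected graph)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already-explored nodes
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical similarity graph: an edge means two putative motifs are similar.
similar = {
    "m1": {"m2", "m3"},
    "m2": {"m1", "m3"},
    "m3": {"m1", "m2", "m4"},
    "m4": {"m3"},
    "m5": set(),          # a motif similar to nothing, possibly spurious
}
for clique in maximal_cliques(similar):
    print(sorted(clique))
```

    In this toy graph the groups are {m1, m2, m3}, {m3, m4}, and the singleton {m5}; how overlapping cliques are merged or split into final clusters is a design decision the abstract does not spell out.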

  6. RNA structural motif recognition based on least-squares distance.

    PubMed

    Shen, Ying; Wong, Hau-San; Zhang, Shaohong; Zhang, Lin

    2013-09-01

    RNA structural motifs are recurrent structural elements occurring in RNA molecules. RNA structural motif recognition aims to find RNA substructures that are similar to a query motif, and it is important for RNA structure analysis and RNA function prediction. In view of this, we propose a new method known as RNA Structural Motif Recognition based on Least-Squares distance (LS-RSMR) to effectively recognize RNA structural motifs. A test set consisting of five types of RNA structural motifs occurring in Escherichia coli ribosomal RNA is compiled by us. Experiments are conducted for recognizing these five types of motifs. The experimental results fully reveal the superiority of the proposed LS-RSMR compared with four other state-of-the-art methods.

  7. Potential Direct Regulators of the Drosophila yellow Gene Identified by Yeast One-Hybrid and RNAi Screens

    PubMed Central

    Kalay, Gizem; Lusk, Richard; Dome, Mackenzie; Hens, Korneel; Deplancke, Bart; Wittkopp, Patricia J.

    2016-01-01

    The regulation of gene expression controls development, and changes in this regulation often contribute to phenotypic evolution. Drosophila pigmentation is a model system for studying evolutionary changes in gene regulation, with differences in expression of pigmentation genes such as yellow that correlate with divergent pigment patterns among species shown to be caused by changes in cis- and trans-regulation. Currently, much more is known about the cis-regulatory component of divergent yellow expression than the trans-regulatory component, in part because very few trans-acting regulators of yellow expression have been identified. This study aims to improve our understanding of the trans-acting control of yellow expression by combining yeast-one-hybrid and RNAi screens for transcription factors binding to yellow cis-regulatory sequences and affecting abdominal pigmentation in adults, respectively. Of the 670 transcription factors included in the yeast-one-hybrid screen, 45 showed evidence of binding to one or more sequence fragments tested from the 5′ intergenic and intronic yellow sequences from D. melanogaster, D. pseudoobscura, and D. willistoni, suggesting that they might be direct regulators of yellow expression. Of the 670 transcription factors included in the yeast-one-hybrid screen, plus another TF previously shown to be genetically upstream of yellow, 125 were also tested using RNAi, and 32 showed altered abdominal pigmentation. Nine transcription factors were identified in both screens, including four nuclear receptors related to ecdysone signaling (Hr78, Hr38, Hr46, and Eip78C). This finding suggests that yellow expression might be directly controlled by nuclear receptors influenced by ecdysone during early pupal development when adult pigmentation is forming. PMID:27527791

  8. Chaotic motif sampler: detecting motifs from biological sequences by using chaotic neurodynamics

    NASA Astrophysics Data System (ADS)

    Matsuura, Takafumi; Ikeguchi, Tohru

    Identifying a region in biological sequences, the motif extraction problem (MEP), is a task addressed in bioinformatics. However, the MEP is an NP-hard problem. Therefore, it is almost impossible to obtain an optimal solution within a reasonable time frame. To find near-optimal solutions for NP-hard combinatorial optimization problems such as traveling salesman problems, quadratic assignment problems, and vehicle routing problems, chaotic search, which is one of the deterministic approaches, has been proposed and exhibits better performance than stochastic approaches. In this paper, we propose a new alignment method that employs chaotic dynamics to solve the MEP. It is called the Chaotic Motif Sampler. We show that the performance of the Chaotic Motif Sampler is considerably better than that of conventional methods such as the Gibbs Site Sampler and the Neighborhood Optimization for Multiple Alignment Discovery.

  9. EAR motif-mediated transcriptional repression in plants: an underlying mechanism for epigenetic regulation of gene expression.

    PubMed

    Kagale, Sateesh; Rozwadowski, Kevin

    2011-02-01

    Ethylene-responsive element binding factor-associated Amphiphilic Repression (EAR) motif-mediated transcriptional repression is emerging as one of the principal mechanisms of plant gene regulation. The EAR motif, defined by the consensus sequence patterns of either LxLxL or DLNxxP, is the most predominant form of transcriptional repression motif so far identified in plants. Additionally, this active repression motif is highly conserved in transcriptional regulators known to function as negative regulators in a broad range of developmental and physiological processes across evolutionarily diverse plant species. Recent discoveries of co-repressors interacting with EAR motifs, such as TOPLESS (TPL) and AtSAP18, have begun to unravel the mechanisms of EAR motif-mediated repression. The demonstration of genetic interaction between mutants of TPL and AtHDA19, co-complex formation between TPL-related 1 (TPR1) and AtHDA19, as well as direct physical interaction between AtSAP18 and AtHDA19 support a model where EAR repressors, via recruitment of chromatin remodeling factors, facilitate epigenetic regulation of gene expression. Here, we discuss the biological significance of EAR-mediated gene regulation in the broader context of plant biology and present literature evidence in support of a model for EAR motif-mediated repression via the recruitment and action of chromatin modifiers. Additionally, we discuss the possible influences of phosphorylation and ubiquitination on the function and turnover of EAR repressors.
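
    The two consensus patterns named above, LxLxL and DLNxxP, translate directly into a regular expression. The sketch below scans a protein sequence for either pattern, including overlapping hits; the function name and the toy sequence are hypothetical, and a match by itself is only a candidate site, not evidence of repressor activity.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
# "x" in the consensus means any residue, written here as [A-Z].
EAR_PATTERN = re.compile(r"(?=(L[A-Z]L[A-Z]L|DLN[A-Z][A-Z]P))")


def find_ear_motifs(protein):
    """Return (0-based start, matched text) for every LxLxL or DLNxxP hit."""
    return [(m.start(), m.group(1)) for m in EAR_PATTERN.finditer(protein)]


# Made-up sequence containing one site of each type.
print(find_ear_motifs("MSLELDLKAADLNFPPGS"))
```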

  10. Bases of motifs for generating repeated patterns with wild cards.

    PubMed

    Pisanti, Nadia; Crochemore, Maxime; Grossi, Roberto; Sagot, Marie-France

    2005-01-01

    Motif inference represents one of the most important areas of research in computational biology, and one of its oldest ones. Despite this, the problem remains very much open in the sense that no existing definition is fully satisfying, either in formal terms, or in relation to the biological questions that involve finding such motifs. Two main types of motifs have been considered in the literature: matrices (of letter frequency per position in the motif) and patterns. There is no conclusive evidence in favor of either, and recent work has attempted to integrate the two types into a single model. In this paper, we address the formal issue in relation to motifs as patterns. This is essential to get at a better understanding of motifs in general. In particular, we consider a promising idea that was recently proposed, which attempted to avoid the combinatorial explosion in the number of motifs by means of a generator set for the motifs. Instead of exhibiting a complete list of motifs satisfying some input constraints, what is produced is a basis of such motifs from which all the other ones can be generated. We study the computational cost of determining such a basis of repeated motifs with wild cards in a sequence. We give new upper and lower bounds on such a cost, introducing a notion of basis that is provably contained in (and, thus, smaller) than previously defined ones. Our basis can be computed in less time and space, and is still able to generate the same set of motifs. We also prove that the number of motifs in all bases defined so far grows exponentially with the quorum, that is, with the minimal number of times a motif must appear in a sequence, something unnoticed in previous work. We show that there is no hope to efficiently compute such bases unless the quorum is fixed.
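
    To make the terms "wild card" and "quorum" concrete, the following sketch lists the occurrences of a pattern with wild cards in a sequence and checks whether it meets a quorum. It does not compute a basis of motifs, which is the subject of the paper; the wild-card symbol "." and the function names are assumptions made for this example.

```python
def occurrences(pattern, seq, wildcard="."):
    """Start positions where `pattern` matches `seq`; `wildcard` matches any letter."""
    k = len(pattern)
    return [
        i
        for i in range(len(seq) - k + 1)
        if all(p == wildcard or p == c for p, c in zip(pattern, seq[i:i + k]))
    ]


def satisfies_quorum(pattern, seq, quorum):
    """A motif is kept only if it appears at least `quorum` times."""
    return len(occurrences(pattern, seq)) >= quorum


seq = "ACGTACCTAGGT"
print(occurrences("AC.T", seq))          # positions of ACGT and ACCT
print(satisfies_quorum("AC.T", seq, 2))
```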

  11. MINER: software for phylogenetic motif identification.

    PubMed

    La, David; Livesay, Dennis R

    2005-07-01

    MINER is web-based software for phylogenetic motif (PM) identification. PMs are sequence regions (fragments) that conserve the overall familial phylogeny. PMs have been shown to correspond to a wide variety of catalytic regions, substrate-binding sites and protein interfaces, making them ideal functional site predictions. The MINER output provides an intuitive interface for interactive PM sequence analysis and structural visualization. The web implementation of MINER is freely available at http://www.pmap.csupomona.edu/MINER/. Source code is available to the academic community on request.

  12. Signature motifs of GDP polyribonucleotidyltransferase, a non-segmented negative strand RNA viral mRNA capping enzyme, domain in the L protein are required for covalent enzyme-pRNA intermediate formation.

    PubMed

    Neubauer, Julie; Ogino, Minako; Green, Todd J; Ogino, Tomoaki

    2016-01-08

    The unconventional mRNA capping enzyme (GDP polyribonucleotidyltransferase, PRNTase; block V) domain in RNA polymerase L proteins of non-segmented negative strand (NNS) RNA viruses (e.g. rabies, measles, Ebola) contains five collinear sequence elements, Rx(3)Wx(3-8)ΦxGxζx(P/A) (motif A; Φ, hydrophobic; ζ, hydrophilic), (Y/W)ΦGSxT (motif B), W (motif C), HR (motif D) and ζxxΦx(F/Y)QxxΦ (motif E). We performed site-directed mutagenesis of the L protein of vesicular stomatitis virus (VSV, a prototypic NNS RNA virus) to examine participation of these motifs in mRNA capping. Similar to the catalytic residues in motif D, G1100 in motif A, T1157 in motif B, W1188 in motif C, and F1269 and Q1270 in motif E were found to be essential or important for the PRNTase activity in the step of the covalent L-pRNA intermediate formation, but not for the GTPase activity that generates GDP (pRNA acceptor). Cap defective mutations in these residues induced termination of mRNA synthesis at position +40 followed by aberrant stop-start transcription, and abolished virus gene expression in host cells. These results suggest that the conserved motifs constitute the active site of the PRNTase domain and the L-pRNA intermediate formation followed by the cap formation is essential for successful synthesis of full-length mRNAs.

  13. Signature motifs of GDP polyribonucleotidyltransferase, a non-segmented negative strand RNA viral mRNA capping enzyme, domain in the L protein are required for covalent enzyme–pRNA intermediate formation

    PubMed Central

    Neubauer, Julie; Ogino, Minako; Green, Todd J.; Ogino, Tomoaki

    2016-01-01

    The unconventional mRNA capping enzyme (GDP polyribonucleotidyltransferase, PRNTase; block V) domain in RNA polymerase L proteins of non-segmented negative strand (NNS) RNA viruses (e.g. rabies, measles, Ebola) contains five collinear sequence elements, Rx(3)Wx(3–8)ΦxGxζx(P/A) (motif A; Φ, hydrophobic; ζ, hydrophilic), (Y/W)ΦGSxT (motif B), W (motif C), HR (motif D) and ζxxΦx(F/Y)QxxΦ (motif E). We performed site-directed mutagenesis of the L protein of vesicular stomatitis virus (VSV, a prototypic NNS RNA virus) to examine participation of these motifs in mRNA capping. Similar to the catalytic residues in motif D, G1100 in motif A, T1157 in motif B, W1188 in motif C, and F1269 and Q1270 in motif E were found to be essential or important for the PRNTase activity in the step of the covalent L-pRNA intermediate formation, but not for the GTPase activity that generates GDP (pRNA acceptor). Cap defective mutations in these residues induced termination of mRNA synthesis at position +40 followed by aberrant stop–start transcription, and abolished virus gene expression in host cells. These results suggest that the conserved motifs constitute the active site of the PRNTase domain and the L-pRNA intermediate formation followed by the cap formation is essential for successful synthesis of full-length mRNAs. PMID:26602696

  14. Transcription factor motif quality assessment requires systematic comparative analysis

    PubMed Central

    Kibet, Caleb Kipkurui; Machanick, Philip

    2016-01-01

    Transcription factor (TF) binding site prediction remains a challenge in gene regulatory research due to degeneracy and potential variability in binding sites in the genome. Dozens of algorithms designed to learn binding models (motifs) have generated many motifs available in research papers with a subset making it to databases like JASPAR, UniPROBE and Transfac. The presence of many versions of motifs from the various databases for a single TF and the lack of a standardized assessment technique makes it difficult for biologists to make an appropriate choice of binding model and for algorithm developers to benchmark, test and improve on their models. In this study, we review and evaluate the approaches in use, highlight differences and demonstrate the difficulty of defining a standardized motif assessment approach. We review scoring functions, motif length, test data and the type of performance metrics used in prior studies as some of the factors that influence the outcome of a motif assessment. We show that the scoring functions and statistics used in motif assessment influence ranking of motifs in a TF-specific manner. We also show that TF binding specificity can vary by source of genomic binding data. We also demonstrate that information content of a motif is not in isolation a measure of motif quality but is influenced by TF binding behaviour. We conclude that there is a need for an easy-to-use tool that presents all available evidence for a comparative analysis. PMID:27092243
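
    Information content, mentioned above as one candidate measure of motif quality, is commonly computed per column as the relative entropy against a background distribution and then summed over columns. The sketch below assumes a uniform background of 0.25 per base and a motif given as a list of per-position probability dictionaries; both are assumptions for illustration, not choices taken from the study.

```python
import math


def information_content(columns, background=0.25):
    """Sum over positions of sum_b p_b * log2(p_b / background), in bits.
    `columns` is a list of dicts mapping base -> probability at that position."""
    total = 0.0
    for col in columns:
        for p in col.values():
            if p > 0:                      # 0 * log(0) is taken as 0
                total += p * math.log2(p / background)
    return total


toy_motif = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},      # fully conserved: 2 bits
    {"A": 0.5, "C": 0.5, "G": 0.0, "T": 0.0},      # two equally likely bases: 1 bit
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},  # uninformative: 0 bits
]
print(information_content(toy_motif))
```

    A high total says the model is specific, but, as the abstract argues, that alone does not show the model predicts binding well for a given TF.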

  15. Promoter Motifs in NCLDVs: An Evolutionary Perspective.

    PubMed

    Oliveira, Graziele Pereira; Andrade, Ana Cláudia Dos Santos Pereira; Rodrigues, Rodrigo Araújo Lima; Arantes, Thalita Souza; Boratto, Paulo Victor Miranda; Silva, Ludmila Karen Dos Santos; Dornas, Fábio Pio; Trindade, Giliane de Souza; Drumond, Betânia Paiva; La Scola, Bernard; Kroon, Erna Geessien; Abrahão, Jônatas Santos

    2017-01-20

    For many years, gene expression in the three cellular domains has been studied in an attempt to discover sequences associated with the regulation of the transcription process. Some specific transcriptional features were described in viruses, although few studies have been devoted to understanding the evolutionary aspects related to the spread of promoter motifs through related viral families. The discovery of giant viruses and the proposition of the new viral order Megavirales that comprise a monophyletic group, named nucleo-cytoplasmic large DNA viruses (NCLDV), raised new questions in the field. Some putative promoter sequences have already been described for some NCLDV members, bringing new insights into the evolutionary history of these complex microorganisms. In this review, we summarize the main aspects of the transcription regulation process in the three domains of life, followed by a systematic description of what is currently known about promoter regions in several NCLDVs. We also discuss how the analysis of the promoter sequences could bring new ideas about the giant viruses' evolution. Finally, considering a possible common ancestor for the NCLDV group, we discussed possible promoters' evolutionary scenarios and propose the term "MEGA-box" to designate an ancestor promoter motif ('TATATAAAATTGA') that could be evolved gradually by nucleotides' gain and loss and point mutations.

  16. Promoter Motifs in NCLDVs: An Evolutionary Perspective

    PubMed Central

    Oliveira, Graziele Pereira; Andrade, Ana Cláudia dos Santos Pereira; Rodrigues, Rodrigo Araújo Lima; Arantes, Thalita Souza; Boratto, Paulo Victor Miranda; Silva, Ludmila Karen dos Santos; Dornas, Fábio Pio; Trindade, Giliane de Souza; Drumond, Betânia Paiva; La Scola, Bernard; Kroon, Erna Geessien; Abrahão, Jônatas Santos

    2017-01-01

    For many years, gene expression in the three cellular domains has been studied in an attempt to discover sequences associated with the regulation of the transcription process. Some specific transcriptional features were described in viruses, although few studies have been devoted to understanding the evolutionary aspects related to the spread of promoter motifs through related viral families. The discovery of giant viruses and the proposition of the new viral order Megavirales that comprise a monophyletic group, named nucleo-cytoplasmic large DNA viruses (NCLDV), raised new questions in the field. Some putative promoter sequences have already been described for some NCLDV members, bringing new insights into the evolutionary history of these complex microorganisms. In this review, we summarize the main aspects of the transcription regulation process in the three domains of life, followed by a systematic description of what is currently known about promoter regions in several NCLDVs. We also discuss how the analysis of the promoter sequences could bring new ideas about the giant viruses’ evolution. Finally, considering a possible common ancestor for the NCLDV group, we discussed possible promoters’ evolutionary scenarios and propose the term “MEGA-box” to designate an ancestor promoter motif (‘TATATAAAATTGA’) that could be evolved gradually by nucleotides’ gain and loss and point mutations. PMID:28117683

  17. Accurate Quantification of microRNA via Single Strand Displacement Reaction on DNA Origami Motif

    PubMed Central

    Lou, Jingyu; Li, Weidong; Li, Sheng; Zhu, Hongxin; Yang, Lun; Zhang, Aiping; He, Lin; Li, Can

    2013-01-01

    DNA origami is an emerging technology that assembles hundreds of staple strands and one single-stranded DNA into a given nanopattern. It has been widely used in various fields including detection of biological molecules such as DNA, RNA and proteins. MicroRNAs (miRNAs) play important roles in post-transcriptional gene repression as well as many other biological processes such as cell growth and differentiation. Alterations of miRNA expression contribute to many human diseases. However, it is still a challenge to quantitatively detect miRNAs by origami technology. In this study, we developed a novel approach based on a single strand displacement reaction, labeled with a streptavidin and quantum dots binding complex (STV-QDs), on DNA origami to quantitatively detect the concentration of miRNAs. We illustrated a linear relationship between the concentration of an exemplary miRNA, miRNA-133, and the STV-QDs hybridization efficiency; the results demonstrated that it is an accurate nano-scale miRNA quantifier motif. In addition, both a symmetrical rectangular motif and an asymmetrical China-map motif were tested. With significant linearity in both motifs, our experiments suggested that a DNA origami motif of arbitrary shape can be utilized in this method. Because the DNA origami-based method we developed is simple, saves time and material, can potentially test multiple targets in one motif, and is relatively accurate for samples containing certain impurities, as signals are counted directly by atomic force microscopy rather than by fluorescence signal detection, it may be widely used in quantification of miRNAs. PMID:23990889

  18. Probabilistic models for semisupervised discriminative motif discovery in DNA sequences.

    PubMed

    Kim, Jong Kyoung; Choi, Seungjin

    2011-01-01

    Methods for discriminative motif discovery in DNA sequences identify transcription factor binding sites (TFBSs), searching only for patterns that differentiate two sets (positive and negative sets) of sequences. On one hand, discriminative methods increase the sensitivity and specificity of motif discovery, compared to generative models. On the other hand, generative models can easily exploit unlabeled sequences to better detect functional motifs when labeled training samples are limited. In this paper, we develop a hybrid generative/discriminative model which enables us to make use of unlabeled sequences in the framework of discriminative motif discovery, leading to semisupervised discriminative motif discovery. Numerical experiments on yeast ChIP-chip data for discovering DNA motifs demonstrate that the best performance is obtained between the purely generative and the purely discriminative models, and that semisupervised learning improves performance when labeled sequences are limited.

  19. An Affinity Propagation-Based DNA Motif Discovery Algorithm.

    PubMed

    Sun, Chunxiao; Huo, Hongwei; Yu, Qiang; Guo, Haitao; Sun, Zhigang

    2015-01-01

    The planted (l, d) motif search (PMS) is one of the fundamental problems in bioinformatics, which plays an important role in locating transcription factor binding sites (TFBSs) in DNA sequences. Nowadays, identifying weak motifs and reducing the effect of local optimum are still important but challenging tasks for motif discovery. To solve the tasks, we propose a new algorithm, APMotif, which first applies the Affinity Propagation (AP) clustering in DNA sequences to produce informative and good candidate motifs and then employs Expectation Maximization (EM) refinement to obtain the optimal motifs from the candidate motifs. Experimental results both on simulated data sets and real biological data sets show that APMotif usually outperforms four other widely used algorithms in terms of high prediction accuracy.
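
    The planted (l, d) motif search problem itself can be stated in a few lines of code: find every string of length l that occurs in each input sequence with at most d mismatches. The sketch below is an exhaustive baseline that enumerates all 4^l candidates, so it is only practical for small l; it is not APMotif and uses neither Affinity Propagation nor EM. Function names and the toy sequences are assumptions.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def planted_motif_search(seqs, l, d):
    """Return all l-mers that appear in every sequence with <= d mismatches."""
    found = []
    for letters in product("ACGT", repeat=l):
        candidate = "".join(letters)
        if all(
            any(hamming(candidate, s[i:i + l]) <= d for i in range(len(s) - l + 1))
            for s in seqs
        ):
            found.append(candidate)
    return found


# ACGT planted exactly in the first sequence and with one mismatch in the others.
seqs = ["TTACGTTT", "GGACCTGG", "CCAAGTCC"]
print("ACGT" in planted_motif_search(seqs, 4, 1))
```

    The exponential candidate space is the reason heuristic approaches such as clustering followed by refinement are attractive for weak motifs.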

  20. Network Motifs: Simple Building Blocks of Complex Networks

    NASA Astrophysics Data System (ADS)

    Milo, R.; Shen-Orr, S.; Itzkovitz, S.; Kashtan, N.; Chklovskii, D.; Alon, U.

    2002-10-01

    Complex networks are studied across many fields of science. To uncover their structural design principles, we defined "network motifs," patterns of interconnections occurring in complex networks at numbers that are significantly higher than those in randomized networks. We found such motifs in networks from biochemistry, neurobiology, ecology, and engineering. The motifs shared by ecological food webs were distinct from the motifs shared by the genetic networks of Escherichia coli and Saccharomyces cerevisiae or from those found in the World Wide Web. Similar motifs were found in networks that perform information processing, even though they describe elements as different as biomolecules within a cell and synaptic connections between neurons in Caenorhabditis elegans. Motifs may thus define universal classes of networks. This approach may uncover the basic building blocks of most networks.
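
    As a concrete instance of a pattern of interconnections, the sketch below counts feed-forward loops (X regulates Y, Y regulates Z, and X also regulates Z) in a directed graph given as an edge list. The feed-forward loop is chosen here as a familiar three-node example; the function name and the toy edges are assumptions, and the comparison against randomized networks that defines a motif in the abstract is deliberately left out.

```python
def count_feed_forward_loops(edges):
    """Count ordered triples (x, y, z) of distinct nodes with x->y, y->z and x->z."""
    edge_set = set(edges)
    successors = {}
    for u, v in edge_set:
        successors.setdefault(u, set()).add(v)

    count = 0
    for x, ys in successors.items():
        for y in ys:
            if y == x:
                continue                      # ignore self-loops
            for z in successors.get(y, ()):
                if z != x and z != y and (x, z) in edge_set:
                    count += 1
    return count


toy_edges = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
print(count_feed_forward_loops(toy_edges))
```

    To decide whether the pattern is a motif in the abstract's sense, the same count would be repeated on many randomized networks and the real count compared with that distribution.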

  1. Detecting DNA regulatory motifs by incorporating positional trends in information content

    SciTech Connect

    Kechris, Katherina J.; van Zwet, Erik; Bickel, Peter J.; Eisen, Michael B.

    2004-05-04

    On the basis of the observation that conserved positions in transcription factor binding sites are often clustered together, we propose a simple extension to the model-based motif discovery methods. We assign position-specific prior distributions to the frequency parameters of the model, penalizing deviations from a specified conservation profile. Examples with both simulated and real data show that this extension helps discover motifs as the data become noisier or when there is a competing false motif.

  2. Gibbs motif sampling: detection of bacterial outer membrane protein repeats.

    PubMed Central

    Neuwald, A. F.; Liu, J. S.; Lawrence, C. E.

    1995-01-01

    The detection and alignment of locally conserved regions (motifs) in multiple sequences can provide insight into protein structure, function, and evolution. A new Gibbs sampling algorithm is described that detects motif-encoding regions in sequences and optimally partitions them into distinct motif models; this is illustrated using a set of immunoglobulin fold proteins. When applied to sequences sharing a single motif, the sampler can be used to classify motif regions into related submodels, as is illustrated using helix-turn-helix DNA-binding proteins. Other statistically based procedures are described for searching a database for sequences matching motifs found by the sampler. When applied to a set of 32 very distantly related bacterial integral outer membrane proteins, the sampler revealed that they share a subtle, repetitive motif. Although BLAST (Altschul SF et al., 1990, J Mol Biol 215:403-410) fails to detect significant pairwise similarity between any of the sequences, the repeats present in these outer membrane proteins, taken as a whole, are highly significant (based on a generally applicable statistical test for motifs described here). Analysis of bacterial porins with known trimeric beta-barrel structure and related proteins reveals a similar repetitive motif corresponding to alternating membrane-spanning beta-strands. These beta-strands occur on the membrane interface (as opposed to the trimeric interface) of the beta-barrel. The broad conservation and structural location of these repeats suggests that they play important functional roles. PMID:8520488

  3. Discriminative motif analysis of high-throughput dataset

    PubMed Central

    Yao, Zizhen; MacQuarrie, Kyle L.; Fong, Abraham P.; Tapscott, Stephen J.; Ruzzo, Walter L.; Gentleman, Robert C.

    2014-01-01

    Motivation: High-throughput ChIP-seq studies typically identify thousands of peaks for a single transcription factor (TF). It is common for traditional motif discovery tools to predict motifs that are statistically significant against a naïve background distribution but are of questionable biological relevance. Results: We describe a simple yet effective algorithm for discovering differential motifs between two sequence datasets that is effective in eliminating systematic biases and scalable to large datasets. Tested on 207 ENCODE ChIP-seq datasets, our method identifies correct motifs in 78% of the datasets with known motifs, demonstrating improvement in both accuracy and efficiency compared with DREME, another state-of-the-art discriminative motif discovery tool. More interestingly, on the remaining more challenging datasets, we identify common technical or biological factors that compromise the motif search results and use advanced features of our tool to control for these factors. We also present case studies demonstrating the ability of our method to detect single base pair differences in DNA specificity of two similar TFs. Lastly, we demonstrate discovery of key TF motifs involved in tissue specification by examination of high-throughput DNase accessibility data. Availability: The motifRG package is publicly available via the Bioconductor repository. Contact: yzizhen@fhcrc.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24162561
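
    The notion of a differential motif between two sequence sets can be illustrated with a simple k-mer comparison. The sketch below is not the motifRG algorithm; it only ranks k-mers by a log-odds of presence in a foreground set versus a background set. The function names and the pseudocount of 0.5 are assumptions made for this example.

```python
import math
from collections import Counter


def kmer_presence(seqs, k):
    """Count, for each k-mer, how many sequences contain it at least once."""
    counts = Counter()
    for s in seqs:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def enriched_kmers(foreground, background, k, pseudo=0.5):
    """Rank foreground k-mers by log2 ratio of presence frequencies (fg vs. bg)."""
    fg_counts = kmer_presence(foreground, k)
    bg_counts = kmer_presence(background, k)
    scores = {}
    for kmer, n_fg in fg_counts.items():
        # Pseudocounts keep the ratio finite when a k-mer is absent from the background.
        f = (n_fg + pseudo) / (len(foreground) + 2 * pseudo)
        b = (bg_counts[kmer] + pseudo) / (len(background) + 2 * pseudo)
        scores[kmer] = math.log2(f / b)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

    Scoring presence per sequence rather than raw occurrence counts is one simple way to avoid rewarding low-complexity repeats; a real tool would also need a significance test and motif refinement.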

  4. Plasticity of the RNA Kink Turn Structural Motif

    SciTech Connect

    Antonioli, A.; Cochrane, J; Lipchock, S; Strobel, S

    2010-01-01

    The kink turn (K-turn) is an RNA structural motif found in many biologically significant RNAs. While most examples of the K-turn have a similar fold, the crystal structure of the Azoarcus group I intron revealed a novel RNA conformation, a reverse kink turn bent in the direction opposite that of a consensus K-turn. The reverse K-turn is bent toward the major grooves rather than the minor grooves of the flanking helices, yet the sequence differs from the K-turn consensus by only a single nucleotide. Here we demonstrate that the reverse bend direction is not solely defined by internal sequence elements, but is instead affected by structural elements external to the K-turn. It bends toward the major groove under the direction of a tetraloop-tetraloop receptor. The ability of one sequence to form two distinct structures demonstrates the inherent plasticity of the K-turn sequence. Such plasticity suggests that the K-turn is not a primary element in RNA folding, but instead is shaped by other structural elements within the RNA or ribonucleoprotein assembly.

  5. Polyproline and triple helix motifs in host-pathogen recognition.

    PubMed

    Berisio, Rita; Vitagliano, Luigi

    2012-12-01

    Secondary structure elements often mediate protein-protein interactions. Despite their low abundance in folded proteins, polyproline II (PPII) and its variant, the triple helix, are frequently involved in protein-protein interactions, likely due to their peculiar propensity to be solvent-exposed. We here review the role of PPII and triple helix in mediating host-pathogen interactions, with a particular emphasis on the structural aspects of these processes. After a brief description of the basic structural features of these elements, examples of host-pathogen interactions involving these motifs are illustrated. Literature data suggest that the role played by PPII motif in these processes is twofold. Indeed, PPII regions may directly mediate interactions between proteins of the host and the pathogen. Alternatively, PPII may act as structural spacers needed for the correct positioning of the elements needed for adhesion and infectivity. Recent investigations have highlighted that collagen triple helix is also a common target for bacterial adhesins. Although structural data on complexes between adhesins and collagen models are rather limited, experimental and theoretical studies have unveiled some interesting clues of the recognition process. Interestingly, very recent data show that not only is the triple helix used by pathogens as a target in the host-pathogen interaction but it may also act as a bait in these processes since bacterial proteins containing triple helix regions have been shown to interact with host proteins. As both PPII and triple helix expose several main chain non-satisfied hydrogen bond acceptors and donors, both elements are highly solvated. The preservation of the solvation state of both PPII and triple helix upon protein-protein interaction is an emerging aspect that will be here thoroughly discussed.

  6. An RNA motif that binds ATP

    NASA Technical Reports Server (NTRS)

    Sassanfar, M.; Szostak, J. W.

    1993-01-01

    RNAs that contain specific high-affinity binding sites for small molecule ligands immobilized on a solid support are present at a frequency of roughly one in 10^10 to 10^11 in pools of random sequence RNA molecules. Here we describe a new in vitro selection procedure designed to ensure the isolation of RNAs that bind the ligand of interest in solution as well as on a solid support. We have used this method to isolate a remarkably small RNA motif that binds ATP, a substrate in numerous biological reactions and the universal biological high-energy intermediate. The selected ATP-binding RNAs contain a consensus sequence, embedded in a common secondary structure. The binding properties of ATP analogues and modified RNAs show that the binding interaction is characterized by a large number of close contacts between the ATP and RNA, and by a change in the conformation of the RNA.

  7. Complex lasso: new entangled motifs in proteins

    NASA Astrophysics Data System (ADS)

    Niemyska, Wanda; Dabrowski-Tumanski, Pawel; Kadlof, Michal; Haglund, Ellinor; Sułkowski, Piotr; Sulkowska, Joanna I.

    2016-11-01

    We identify new entangled motifs in proteins that we call complex lassos. Lassos arise in proteins with disulfide bridges (or in proteins with amide linkages), when termini of a protein backbone pierce through an auxiliary surface of minimal area, spanned on a covalent loop. We find that as many as 18% of all proteins with disulfide bridges in a non-redundant subset of PDB form complex lassos, and classify them into six distinct geometric classes, one of which resembles supercoiling known from DNA. Based on biological classification of proteins we find that lassos are much more common in viruses, plants and fungi than in other kingdoms of life. We also discuss how changes in the oxidation/reduction potential may affect the function of proteins with lassos. Lassos and their associated surfaces of minimal area provide new and interesting geometric characteristics, with many potential applications, not only of proteins but also of other biomolecules.

  8. Complex lasso: new entangled motifs in proteins

    PubMed Central

    Niemyska, Wanda; Dabrowski-Tumanski, Pawel; Kadlof, Michal; Haglund, Ellinor; Sułkowski, Piotr; Sulkowska, Joanna I.

    2016-01-01

    We identify new entangled motifs in proteins that we call complex lassos. Lassos arise in proteins with disulfide bridges (or in proteins with amide linkages), when termini of a protein backbone pierce through an auxiliary surface of minimal area, spanned on a covalent loop. We find that as many as 18% of all proteins with disulfide bridges in a non-redundant subset of PDB form complex lassos, and classify them into six distinct geometric classes, one of which resembles supercoiling known from DNA. Based on biological classification of proteins we find that lassos are much more common in viruses, plants and fungi than in other kingdoms of life. We also discuss how changes in the oxidation/reduction potential may affect the function of proteins with lassos. Lassos and their associated surfaces of minimal area provide new and interesting geometric characteristics, with many potential applications, not only of proteins but also of other biomolecules. PMID:27874096

  9. The Q Motif Is Involved in DNA Binding but Not ATP Binding in ChlR1 Helicase

    PubMed Central

    Ding, Hao; Guo, Manhong; Vidhyasagar, Venkatasubramanian; Talwar, Tanu; Wu, Yuliang

    2015-01-01

    Helicases are molecular motors that couple the energy of ATP hydrolysis to the unwinding of structured DNA or RNA and chromatin remodeling. The conversion of energy derived from ATP hydrolysis into unwinding and remodeling is coordinated by seven sequence motifs (I, Ia, II, III, IV, V, and VI). The Q motif, consisting of nine amino acids (GFXXPXPIQ) with an invariant glutamine (Q) residue, has been identified in some, but not all helicases. Compared to the seven well-recognized conserved helicase motifs, the role of the Q motif is less acknowledged. Mutations in the human ChlR1 (DDX11) gene are associated with a unique genetic disorder known as Warsaw Breakage Syndrome, which is characterized by cellular defects in genome maintenance. To examine the roles of the Q motif in ChlR1 helicase, we performed site directed mutagenesis of glutamine to alanine at residue 23 in the Q motif of ChlR1. ChlR1 recombinant protein was overexpressed and purified from HEK293T cells. ChlR1-Q23A mutant abolished the helicase activity of ChlR1 and displayed reduced DNA binding ability. The mutant showed impaired ATPase activity but normal ATP binding. A thermal shift assay revealed that ChlR1-Q23A has a melting point value similar to ChlR1-WT. Partial proteolysis mapping demonstrated that ChlR1-WT and Q23A have a similar globular structure, although some subtle conformational differences in these two proteins are evident. Finally, we found ChlR1 exists and functions as a monomer in solution, which is different from FANCJ, in which the Q motif is involved in protein dimerization. Taken together, our results suggest that the Q motif is involved in DNA binding but not ATP binding in ChlR1 helicase. PMID:26474416
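
    A sequence pattern such as GFXXPXPIQ, where X stands for any amino acid, maps directly onto a regular expression. The sketch below scans a protein string for that pattern; the function name is hypothetical and the example sequence in the usage note is made up, not the ChlR1 sequence.

```python
import re

# GFXXPXPIQ with X = any residue; the lookahead also reports overlapping hits.
Q_MOTIF = re.compile(r"(?=(GF..P.PIQ))")


def find_q_motif(protein):
    """Return (0-based start, matched residues) for every Q-motif-like match."""
    return [(m.start(), m.group(1)) for m in Q_MOTIF.finditer(protein.upper())]
```

    For the invented sequence "MSTGFAAPKPIQLLE", find_q_motif returns [(3, "GFAAPKPIQ")]. A pattern match of this kind only flags candidates; it says nothing about function.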

  10. Cis-regulatory programs in the development and evolution of vertebrate paired appendages

    PubMed Central

    Gehrke, Andrew R.; Shubin, Neil H.

    2017-01-01

    Differential gene expression is the core of development, mediating the genetic changes necessary for determining cell identity. The regulation of gene activity by cis-acting elements (e.g., enhancers) is a crucial mechanism for determining differential gene activity by precise control of gene expression in embryonic space and time. Modifications to regulatory regions can have profound impacts on phenotype, and therefore developmental and evolutionary biologists have increasingly focused on elucidating the transcriptional control of genes that build and pattern body plans. Here, we trace the evolutionary history of transcriptional control of three loci key to vertebrate appendage development (Fgf8, Shh, and HoxD/A). Within and across these regulatory modules, we find both complex and flexible regulation in contrast with more fixed enhancers that appear unchanged over vast timescales of vertebrate evolution. The transcriptional control of vertebrate appendage development was likely already incredibly complex in the common ancestor of fish, implying that subtle changes to regulatory networks were more likely responsible for alterations in phenotype rather than the de novo addition of whole regulatory domains. Finally, we discuss the dangers of relying on inter-species transgenesis when testing enhancer function, and call for more controlled regulatory swap experiments when inferring the evolutionary history of enhancer elements. PMID:26783722

  11. Long-range evolutionary constraints reveal cis-regulatory interactions on the human X chromosome

    PubMed Central

    Naville, Magali; Ishibashi, Minaka; Ferg, Marco; Bengani, Hemant; Rinkwitz, Silke; Krecsmarik, Monika; Hawkins, Thomas A.; Wilson, Stephen W.; Manning, Elizabeth; Chilamakuri, Chandra S. R.; Wilson, David I.; Louis, Alexandra; Lucy Raymond, F.; Rastegar, Sepand; Strähle, Uwe; Lenhard, Boris; Bally-Cuif, Laure; van Heyningen, Veronica; FitzPatrick, David R.; Becker, Thomas S.; Roest Crollius, Hugues

    2015-01-01

    Enhancers can regulate the transcription of genes over long genomic distances. This is thought to lead to selection against genomic rearrangements within such regions that may disrupt this functional linkage. Here we test this concept experimentally using the human X chromosome. We describe a scoring method to identify evolutionary maintenance of linkage between conserved noncoding elements and neighbouring genes. Chromatin marks associated with enhancer function are strongly correlated with this linkage score. We test >1,000 putative enhancers by transgenesis assays in zebrafish to ascertain the identity of the target gene. The majority of active enhancers drive a transgenic expression in a pattern consistent with the known expression of a linked gene. These results show that evolutionary maintenance of linkage is a reliable predictor of an enhancer's function, and provide new information to discover the genetic basis of diseases caused by the mis-regulation of gene expression. PMID:25908307

  12. The Interplay of cis-Regulatory Elements Rules Circadian Rhythms in Mouse Liver

    PubMed Central

    Korenčič, Anja; Bordyugov, Grigory; Košir, Rok; Rozman, Damjana; Goličnik, Marko; Herzel, Hanspeter

    2012-01-01

    The mammalian circadian clock is driven by cell-autonomous transcriptional feedback loops that involve E-boxes, D-boxes, and ROR-elements. In peripheral organs, circadian rhythms are additionally affected by systemic factors. We show that intrinsic combinatorial gene regulation governs the liver clock. With a temporal resolution of 2 h, we measured the expression of 21 clock genes in mouse liver under constant darkness and equinoctial light-dark cycles. Based on these data and known transcription factor binding sites, we develop a six-variable gene regulatory network. The transcriptional feedback loops are represented by equations with time-delayed variables, which substantially simplifies modelling of intermediate protein dynamics. Our model accurately reproduces measured phases, amplitudes, and waveforms of clock genes. Analysis of the network reveals properties of the clock: overcritical delays generate oscillations; synergy of inhibition and activation enhances amplitudes; and combinatorial modulation of transcription controls the phases. The agreement of measurements and simulations suggests that the intrinsic gene regulatory network primarily determines the circadian clock in liver, whereas systemic cues such as light-dark cycles serve to fine-tune the rhythms. PMID:23144788
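
    The statement that overcritical delays generate oscillations can be illustrated with a toy model. The sketch below is not the six-variable network of the record above; it integrates a single-variable delayed negative-feedback equation, dx/dt = 1/(1 + x(t - tau)^n) - x, with a fixed-step Euler scheme. The equation, the Hill exponent n = 10, the step size, and the function names are all assumptions chosen for illustration.

```python
def simulate_delayed_feedback(tau, hill=10, t_end=200.0, dt=0.01, x0=0.1):
    """Euler integration of dx/dt = 1 / (1 + x(t - tau)**hill) - x(t)."""
    lag = int(round(tau / dt))
    xs = [x0] * (lag + 1)  # constant history on the interval [-tau, 0]
    for _ in range(int(round(t_end / dt))):
        x_now = xs[-1]
        x_delayed = xs[-1 - lag]  # value of x one delay earlier
        dx = 1.0 / (1.0 + x_delayed ** hill) - x_now
        xs.append(x_now + dt * dx)
    return xs[lag:]  # drop the artificial history, keep t >= 0


def late_amplitude(xs, fraction=0.25):
    """Peak-to-trough range over the final part of a trajectory."""
    tail = xs[int(len(xs) * (1 - fraction)):]
    return max(tail) - min(tail)
```

    Under these assumed parameters, a linear stability analysis puts the critical delay near 1.9 time units: late_amplitude(simulate_delayed_feedback(0.5)) settles to essentially zero, while a delay of 4.0 gives sustained oscillations.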

  13. A cis-Regulatory Mutation of PDSS2 Causes Silky-Feather in Chickens

    PubMed Central

    Feng, Chungang; Gao, Yu; Dorshorst, Ben; Song, Chi; Gu, Xiaorong; Li, Qingyuan; Li, Jinxiu; Liu, Tongxin; Rubin, Carl-Johan; Zhao, Yiqiang; Wang, Yanqiang; Fei, Jing; Li, Huifang; Chen, Kuanwei; Qu, Hao; Shu, Dingming; Ashwell, Chris; Da, Yang; Andersson, Leif; Hu, Xiaoxiang; Li, Ning

    2014-01-01

    Silky-feather has been selected and fixed in some breeds due to its unique appearance. This phenotype is caused by a single recessive gene (hookless, h). Here we map the silky-feather locus to chromosome 3 by linkage analysis and subsequently fine-map it to an 18.9 kb interval using the identical by descent (IBD) method. Further analysis reveals that a C to G transversion located upstream of the prenyl (decaprenyl) diphosphate synthase, subunit 2 (PDSS2) gene is causing silky-feather. All silky-feather birds are homozygous for the G allele. The silky-feather mutation significantly decreases the expression of PDSS2 during feather development in vivo. Consistent with the regulatory effect, the C to G transversion is shown to remarkably reduce PDSS2 promoter activity in vitro. We report a new example of feather structure variation associated with a spontaneous mutation and provide new insight into the PDSS2 function. PMID:25166907

  14. Exaptation of Transposable Elements into Novel Cis-Regulatory Elements: Is the Evidence Always Strong?

    PubMed Central

    de Souza, Flávio S.J.; Franchini, Lucía F.; Rubinstein, Marcelo

    2013-01-01

    Transposable elements (TEs) are mobile genetic sequences that can jump around the genome from one location to another, behaving as genomic parasites. TEs have been particularly effective in colonizing mammalian genomes, and such heavy TE load is expected to have conditioned genome evolution. Indeed, studies conducted both at the gene and genome levels have uncovered TE insertions that seem to have been co-opted—or exapted—by providing transcription factor binding sites (TFBSs) that serve as promoters and enhancers, leading to the hypothesis that TE exaptation is a major factor in the evolution of gene regulation. Here, we critically review the evidence for exaptation of TE-derived sequences as TFBSs, promoters, enhancers, and silencers/insulators both at the gene and genome levels. We classify the functional impact attributed to TE insertions into four categories of increasing complexity and argue that so far very few studies have conclusively demonstrated exaptation of TEs as transcriptional regulatory regions. We also contend that many genome-wide studies dealing with TE exaptation in recent lineages of mammals are still inconclusive and that the hypothesis of rapid transcriptional regulatory rewiring mediated by TE mobilization must be taken with caution. Finally, we suggest experimental approaches that may help attributing higher-order functions to candidate exapted TEs. PMID:23486611

  15. Annotation of cis-regulatory elements by identification, subclassification, and functional assessment of multispecies conserved sequences

    PubMed Central

    Hughes, Jim R.; Cheng, Jan-Fang; Ventress, Nicki; Prabhakar, Shyam; Clark, Kevin; Anguita, Eduardo; De Gobbi, Marco; de Jong, Pieter; Rubin, Eddy; Higgs, Douglas R.

    2005-01-01

    An important step toward improving the annotation of the human genome is to identify cis-acting regulatory elements from primary DNA sequence. One approach is to compare sequences from multiple, divergent species. This approach distinguishes multispecies conserved sequences (MCS) in noncoding regions from more rapidly evolving neutral DNA. Here, we have analyzed a region of ≈238kb containing the human α globin cluster that was sequenced and/or annotated across the syntenic region in 22 species spanning 500 million years of evolution. Using a variety of bioinformatic approaches and correlating the results with many aspects of chromosome structure and function in this region, we were able to identify and evaluate the importance of 24 individual MCSs. This approach sensitively and accurately identified previously characterized regulatory elements but also discovered unidentified promoters, exons, splicing, and transcriptional regulatory elements. Together, these studies demonstrate an integrated approach by which to identify, subclassify, and predict the potential importance of MCSs. PMID:15998734

  16. Mapping cis-Regulatory Domains in the Human Genome Using Multi-Species Conservation of Synteny

    SciTech Connect

    Ahituv, Nadav; Prabhakar, Shyam; Poulin, Francis; Rubin, Edward M.; Couronne, Olivier

    2005-06-13

    Our inability to associate distant regulatory elements with the genes that they regulate has largely precluded their examination for sequence alterations contributing to human disease. One major obstacle is the large genomic space surrounding targeted genes in which such elements could potentially reside. In order to delineate gene regulatory boundaries we used whole-genome human-mouse-chicken (HMC) and human-mouse-frog (HMF) multiple alignments to compile conserved blocks of synteny (CBS), under the hypothesis that these blocks have been kept intact throughout evolution at least in part by the requirement of regulatory elements to stay linked to the genes that they regulate. A total of 2,116 and 1,942 CBS >200 kb were assembled for HMC and HMF respectively, encompassing 1.53 and 0.86 Gb of human sequence. To support the existence of complex long-range regulatory domains within these CBS we analyzed the prevalence and distribution of chromosomal aberrations leading to position effects (disruption of a gene's regulatory environment), observing a clear bias not only for mapping onto CBS but also for longer CBS size. Our results provide a genome-wide data set characterizing the regulatory domains of genes and the conserved regulatory elements within them.

  17. Evolving New Skeletal Traits by cis-Regulatory Changes in Bone Morphogenetic Proteins

    PubMed Central

    Indjeian, Vahan B.; Kingman, Garrett A.; Jones, Felicity C.; Guenther, Catherine A.; Grimwood, Jane; Schmutz, Jeremy; Myers, Richard M.; Kingsley, David M.

    2016-01-01

    SUMMARY Changes in bone size and shape are defining features of many vertebrates. Here we use genetic crosses and comparative genomics to identify specific regulatory DNA alterations controlling skeletal evolution. Armor bone size differences in sticklebacks map to a major effect locus overlapping BMP family member GDF6. Freshwater fish express more GDF6 due in part to a transposon insertion, and transgenic overexpression of GDF6 phenocopies evolutionary changes in armor plate size. The human GDF6 locus also has undergone distinctive regulatory evolution, including complete loss of an enhancer that is otherwise highly conserved between chimps and other mammals. Functional tests show that the ancestral enhancer drives expression in hindlimbs but not forelimbs, in locations that have been specifically modified during the human transition to bipedalism. Both gain and loss of regulatory elements can localize BMP changes to specific anatomical locations, providing a flexible regulatory basis for evolving species-specific changes in skeletal form. PMID:26774823

  18. Characterization of "cis"-regulatory elements ("c"RE) associated with mammary gland function

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Bos taurus genome assembly has propelled dairy science into a new era; still, most of the information encoded in the genome has not yet been decoded. The human Encyclopedia of DNA Elements (ENCODE) project has spearheaded the identification and annotation of functional genomic elements in the hu...

  19. New cis-regulatory elements in the Rht-D1b locus region of wheat

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Fifteen gene-containing BACs with accumulated length of 1.82-Mb from the Rht-D1b locus region were sequenced and compared in detail with the orthologous regions of rice, sorghum, and maize. Our results show that Rht-D1b represents a conserved genomic region as implied by high gene sequence identity...

  20. Cellular microRNAs up-regulate transcription via interaction with promoter TATA-box motifs.

    PubMed

    Zhang, Yijun; Fan, Miaomiao; Zhang, Xue; Huang, Feng; Wu, Kang; Zhang, Junsong; Liu, Jun; Huang, Zhuoqiong; Luo, Haihua; Tao, Liang; Zhang, Hui

    2014-12-01

    The TATA box represents one of the most prevalent core promoters where the pre-initiation complexes (PICs) for gene transcription are assembled. This assembly is crucial for transcription initiation and well regulated. Here we show that some cellular microRNAs (miRNAs) are associated with RNA polymerase II (Pol II) and TATA box-binding protein (TBP) in human peripheral blood mononuclear cells (PBMCs). Among them, let-7i sequence specifically binds to the TATA-box motif of interleukin-2 (IL-2) gene and elevates IL-2 mRNA and protein production in CD4+ T-lymphocytes in vitro and in vivo. Through direct interaction with the TATA-box motif, let-7i facilitates the PIC assembly and transcription initiation of IL-2 promoter. Several other cellular miRNAs, such as mir-138, mir-92a or mir-181d, also enhance the promoter activities via binding to the TATA-box motifs of insulin, calcitonin or c-myc, respectively. In agreement with the finding that an HIV-1-encoded miRNA could enhance viral replication through targeting the viral promoter TATA-box motif, our data demonstrate that the interaction with core transcription machinery is a novel mechanism for miRNAs to regulate gene expression.

  1. Crystal structure of SEL1L: Insight into the roles of SLR motifs in ERAD pathway

    PubMed Central

    Jeong, Hanbin; Sim, Hyo Jung; Song, Eun Kyung; Lee, Hakbong; Ha, Sung Chul; Jun, Youngsoo; Park, Tae Joo; Lee, Changwook

    2016-01-01

    Terminally misfolded proteins are selectively recognized and cleared by the endoplasmic reticulum-associated degradation (ERAD) pathway. SEL1L, a component of the ERAD machinery, plays an important role in selecting and transporting ERAD substrates for degradation. We have determined the crystal structure of the mouse SEL1L central domain comprising five Sel1-Like Repeats (SLR motifs 5 to 9; hereafter called SEL1Lcent). Strikingly, SEL1Lcent forms a homodimer with two-fold symmetry in a head-to-tail manner. Particularly, the SLR motif 9 plays an important role in dimer formation by adopting a domain-swapped structure and providing an extensive dimeric interface. We identified that the full-length SEL1L forms a self-oligomer through the SEL1Lcent domain in mammalian cells. Furthermore, we discovered that the SLR-C, comprising SLR motifs 10 and 11, of SEL1L directly interacts with the N-terminus luminal loops of HRD1. Therefore, we propose that certain SLR motifs of SEL1L play a unique role in membrane bound ERAD machinery. PMID:27064360

  2. MADMX: a strategy for maximal dense motif extraction.

    PubMed

    Grossi, Roberto; Pietracaprina, Andrea; Pisanti, Nadia; Pucci, Geppino; Upfal, Eli; Vandin, Fabio

    2011-04-01

    We develop, analyze, and experiment with a new tool, called MADMX, which extracts frequent motifs from biological sequences. We introduce the notion of density to single out the "significant" motifs. The density is a simple and flexible measure for bounding the number of don't cares in a motif, defined as the fraction of solid (i.e., different from don't care) characters in the motif. A maximal dense motif has density above a certain threshold, and any further specialization of a don't care symbol in it or any extension of its boundaries decreases its number of occurrences in the input sequence. By extracting only maximal dense motifs, MADMX reduces the output size and improves performance, while enhancing the quality of the discoveries. The efficiency of our approach relies on a newly defined combining operation, dubbed fusion, which allows for the construction of maximal dense motifs in a bottom-up fashion, while avoiding the generation of nonmaximal ones. We provide experimental evidence of the efficiency and the quality of the motifs returned by MADMX.
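
    The density measure defined above (the fraction of solid, i.e. non-don't-care, characters in a motif) is easy to make concrete. The sketch below computes it and counts occurrences of a motif with don't cares in a sequence. It does not implement MADMX or its fusion operation; the "." symbol for don't care, the function names, and the use of ">=" for the threshold test are assumptions for illustration.

```python
def density(motif, dont_care="."):
    """Fraction of solid (non-don't-care) characters in the motif."""
    solid = sum(1 for c in motif if c != dont_care)
    return solid / len(motif)


def is_dense(motif, threshold, dont_care="."):
    """True if the motif's density reaches the given threshold (assumed inclusive)."""
    return density(motif, dont_care) >= threshold


def count_occurrences(motif, sequence, dont_care="."):
    """Count positions where the motif matches; a don't care matches any character."""
    m = len(motif)
    hits = 0
    for i in range(len(sequence) - m + 1):
        window = sequence[i:i + m]
        if all(p == dont_care or p == c for p, c in zip(motif, window)):
            hits += 1
    return hits
```

    For example, density("AC.T") is 0.75, and count_occurrences("A.G", "ACGATGAAG") is 3 (matches at positions 0, 3, and 6).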

  3. DETAIL VIEW, MAIN ENTRANCE GATES, SHOWING A WINGED HOURGLASS MOTIF, ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    DETAIL VIEW, MAIN ENTRANCE GATES, SHOWING A WINGED HOURGLASS MOTIF, WHICH REFERS TO THE QUICK PASSAGE OF TIME AND THE SHORTNESS OF HUMAN LIFE. USE OF THIS MOTIF WAS A CARRYOVER FROM THE MCARTHUR GATES. - Woodlands Cemetery, 4000 Woodlands Avenue, Philadelphia, Philadelphia County, PA

  4. Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

    SciTech Connect

    Hong, R. L., Hamaguchi, L., Busch, M. A., and Weigel, D.

    2003-06-01

    OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.

  5. De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes

    PubMed Central

    Zolotarov, Yevgen; Strömvik, Martina

    2015-01-01

    Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved. PMID:26114291

  6. Single promoters as regulatory network motifs

    NASA Astrophysics Data System (ADS)

    Zopf, Christopher; Maheshri, Narendra

    2012-02-01

    At eukaryotic promoters, chromatin can influence the relationship between a gene's expression and transcription factor (TF) activity. This additional complexity might allow single promoters to exhibit dynamical behavior commonly attributed to regulatory motifs involving multiple genes. We investigate the role of promoter chromatin architecture in the kinetics of gene activation using a previously described set of promoter variants based on the phosphate-regulated PHO5 promoter in S. cerevisiae. Accurate quantitative measurement of transcription activation kinetics is facilitated by a controllable and observable TF input to a promoter of interest leading to an observable expression output in single cells. We find the particular architecture of these promoters can result in a significant delay in activation, filtering of noisy TF signals, and a memory of previous activation -- dynamical behaviors reminiscent of a feed-forward loop but only requiring a single promoter. We suggest this is a consequence of chromatin transactions at the promoter, likely passing through a long-lived "primed" state between its inactive and competent states. Finally, we show our experimental setup can be generalized as a "gene oscilloscope" to probe the kinetics of heterologous promoter architectures.

  7. The TRTGn motif stabilizes the transcription initiation open complex.

    PubMed

    Voskuil, Martin I; Chambliss, Glenn H

    2002-09-20

    The effect on transcription initiation by the extended -10 motif (5'-TRTG(n)-3'), positioned upstream of the -10 region, was investigated using a series of base substitution mutations in the alpha-amylase promoter (amyP). The extended -10 motif, previously referred to as the -16 region, is found frequently in Gram-positive bacterial promoters and several extended -10 promoters from Escherichia coli. The inhibitory effects of the non-productive promoter site (amyP2), which overlaps the upstream region of amyP, were eliminated by mutagenesis of the -35 region and the TRTG motif of amyP2. Removal by mutagenesis of the competitive effects of amyP2 resulted in a reduced dependence of amyP on the TRTG motif. In the absence of the second promoter, mutations in the TRTG motif of amyP destabilized the open complex and prevented the maintenance of open complexes at low temperatures. The open complex half-life was up to 26-fold shorter in the mutant TRTG motif promoters than in the wild-type promoter. We demonstrate that the amyP TRTG motif dramatically stabilizes the open complex intermediate during transcription initiation. Even though the open complex is less stable in the mutant promoters, the region of melted DNA is the same in the wild-type and mutant promoters. However, upon addition of the first three nucleotides, which trap RNAP (RNA polymerase) in a stable initiating complex, the melted DNA region contracts at the 5'-end in a TRTG motif promoter mutant but not at the wild-type promoter, indicating that the motif contributes to maintaining DNA-strand separation.
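
    The consensus 5'-TRTG(n)-3' is written in IUPAC nucleotide codes, where R stands for a purine (A or G) and n for any base. As a minimal sketch of how such a degenerate consensus can be located in a sequence, the code below translates it to a regular expression. The function names and the example sequence are hypothetical and are not taken from the study.

```python
import re

# IUPAC degenerate codes needed for this consensus:
# R = purine (A or G), N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}


def iupac_to_regex(consensus: str) -> str:
    """Translate an IUPAC consensus string into a regex fragment."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_motif(sequence: str, consensus: str = "TRTGN") -> list:
    """Return 0-based start positions of all matches, overlapping ones included."""
    # A lookahead lets matches overlap, because the regex consumes no characters.
    pattern = re.compile("(?=" + iupac_to_regex(consensus) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Made-up sequence for illustration only (not the amyP promoter).
example = "CCTATGCAAATGTGTTATAAT"
print(find_motif(example))
```

    A real promoter scan would also constrain the distance between a match and the -10 hexamer; that step is omitted here.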

  8. Automated motif extraction and classification in RNA tertiary structures

    PubMed Central

    Djelloul, Mahassine; Denise, Alain

    2008-01-01

    We used a novel graph-based approach to extract RNA tertiary motifs. We cataloged them all and clustered them using an innovative graph similarity measure. We applied our method to three widely studied structures: Haloarcula marismortui 50S (H.m 50S), Escherichia coli 50S (E. coli 50S), and Thermus thermophilus 16S (T.th 16S) RNAs. We identified 10 known motifs without any prior knowledge of their shapes or positions. We additionally identified four putative new motifs. PMID:18957493

  9. Coherent feedforward transcriptional regulatory motifs enhance drug resistance

    NASA Astrophysics Data System (ADS)

    Charlebois, Daniel A.; Balázsi, Gábor; Kærn, Mads

    2014-05-01

    Fluctuations in gene expression give identical cells access to a spectrum of phenotypes that can serve as a transient, nongenetic basis for natural selection by temporarily increasing drug resistance. In this study, we demonstrate using mathematical modeling and simulation that certain gene regulatory network motifs, specifically coherent feedforward loop motifs, can facilitate the development of nongenetic resistance by increasing cell-to-cell variability and the time scale at which beneficial phenotypic states can be maintained. Our results highlight how regulatory network motifs enabling transient, nongenetic inheritance play an important role in defining reproductive fitness in adverse environments and provide a selective advantage subject to evolutionary pressure.

  10. Seeing the B-A-C-H motif

    NASA Astrophysics Data System (ADS)

    Catravas, Palmyra

    2005-09-01

    Musical compositions can be thought of as complex, multidimensional data sets. Compositions based on the B-A-C-H motif (a four-note motif of the pitches of the last name of Johann Sebastian Bach) span several centuries of evolving compositional styles and provide an intriguing set for analysis since they contain a common feature, the motif, buried in dissimilar contexts. We will present analyses which highlight the content of this unusual set of pieces, with emphasis on visual display of information.

  11. Targeting functional motifs of a protein family

    NASA Astrophysics Data System (ADS)

    Bhadola, Pradeep; Deo, Nivedita

    2016-10-01

    The structural organization of a protein family is investigated by devising a method based on the random matrix theory (RMT), which uses the physiochemical properties of the amino acid with multiple sequence alignment. A graphical method to represent protein sequences using physiochemical properties is devised that gives a fast, easy, and informative way of comparing the evolutionary distances between protein sequences. A correlation matrix associated with each property is calculated, where the noise reduction and information filtering is done using RMT involving an ensemble of Wishart matrices. The analysis of the eigenvalue statistics of the correlation matrix for the β-lactamase family shows the universal features as observed in the Gaussian orthogonal ensemble (GOE). The property-based approach captures the short- as well as the long-range correlation (approximately following GOE) between the eigenvalues, whereas the previous approach (treating amino acids as characters) gives the usual short-range correlations, while the long-range correlations are the same as that of an uncorrelated series. The distribution of the eigenvector components for the eigenvalues outside the bulk (RMT bound) deviates significantly from RMT observations and contains important information about the system. The information content of each eigenvector of the correlation matrix is quantified by introducing an entropic estimate, which shows that for the β-lactamase family the smallest eigenvectors (low eigenmodes) are highly localized as well as informative. These small eigenvectors, when processed, give clusters involving positions that have well-defined biological and structural importance matching with experiments. The approach is crucial for the recognition of structural motifs as shown in β-lactamase (and other families) and selectively identifies the important positions for targets to deactivate (activate) the enzymatic actions.

  12. Targeting functional motifs of a protein family.

    PubMed

    Bhadola, Pradeep; Deo, Nivedita

    2016-10-01

    The structural organization of a protein family is investigated by devising a method based on the random matrix theory (RMT), which uses the physiochemical properties of the amino acid with multiple sequence alignment. A graphical method to represent protein sequences using physiochemical properties is devised that gives a fast, easy, and informative way of comparing the evolutionary distances between protein sequences. A correlation matrix associated with each property is calculated, where the noise reduction and information filtering is done using RMT involving an ensemble of Wishart matrices. The analysis of the eigenvalue statistics of the correlation matrix for the β-lactamase family shows the universal features as observed in the Gaussian orthogonal ensemble (GOE). The property-based approach captures the short- as well as the long-range correlation (approximately following GOE) between the eigenvalues, whereas the previous approach (treating amino acids as characters) gives the usual short-range correlations, while the long-range correlations are the same as that of an uncorrelated series. The distribution of the eigenvector components for the eigenvalues outside the bulk (RMT bound) deviates significantly from RMT observations and contains important information about the system. The information content of each eigenvector of the correlation matrix is quantified by introducing an entropic estimate, which shows that for the β-lactamase family the smallest eigenvectors (low eigenmodes) are highly localized as well as informative. These small eigenvectors, when processed, give clusters involving positions that have well-defined biological and structural importance matching with experiments. The approach is crucial for the recognition of structural motifs as shown in β-lactamase (and other families) and selectively identifies the important positions for targets to deactivate (activate) the enzymatic actions.

  13. Crammed signaling motifs in the T-cell receptor.

    PubMed

    Borroto, Aldo; Abia, David; Alarcón, Balbino

    2014-09-01

    Although the T cell antigen receptor (TCR) is long known to contain multiple signaling subunits (CD3γ, CD3δ, CD3ɛ and CD3ζ), their role in signal transduction is still not well understood. The presence of at least one immunoreceptor tyrosine-based activation motif (ITAM) in each CD3 subunit has led to the idea that the multiplication of such elements essentially serves to amplify signals. However, the evolutionary conservation of non-ITAM sequences suggests that each CD3 subunit is likely to have specific non-redundant roles at some stage of development or in mature T cell function. The CD3ɛ subunit is paradigmatic because in a relatively short cytoplasmic sequence (∼55 amino acids) it contains several docking sites for proteins involved in intracellular trafficking and signaling, proteins whose relevance in T cell activation is slowly starting to be revealed. In this review we will summarize our current knowledge on the signaling effectors that bind directly to the TCR and we will propose a hierarchy in their response to TCR triggering.

  14. Emergence of connectivity motifs in networks of model neurons with short- and long-term plastic synapses.

    PubMed

    Vasilaki, Eleni; Giugliano, Michele

    2014-01-01

    Recent experimental data from the rodent cerebral cortex and olfactory bulb indicate that specific connectivity motifs are correlated with short-term dynamics of excitatory synaptic transmission. It was observed that neurons with short-term facilitating synapses form predominantly reciprocal pairwise connections, while neurons with short-term depressing synapses form predominantly unidirectional pairwise connections. The cause of these structural differences in excitatory synaptic microcircuits is unknown. We show that these connectivity motifs emerge in networks of model neurons, from the interactions between short-term synaptic dynamics (SD) and long-term spike-timing dependent plasticity (STDP). While the impact of STDP on SD was shown in simultaneous neuronal pair recordings in vitro, the mutual interactions between STDP and SD in large networks are still the subject of intense research. Our approach combines an SD phenomenological model with an STDP model that faithfully captures long-term plasticity dependence on both spike times and frequency. As a proof of concept, we first simulate and analyze recurrent networks of spiking neurons with random initial connection efficacies and where synapses are either all short-term facilitating or all depressing. For identical external inputs to the network, and as a direct consequence of internally generated activity, we find that networks with depressing synapses evolve unidirectional connectivity motifs, while networks with facilitating synapses evolve reciprocal connectivity motifs. We then show that the same results hold for heterogeneous networks, including both facilitating and depressing synapses. This does not contradict a recent theory that proposes that motifs are shaped by external inputs, but rather complements it by examining the role of both the external inputs and the internally generated network activity. Our study highlights the conditions under which SD-STDP might explain the correlation between

  15. Native characterization of nucleic acid motif thermodynamics via non-covalent catalysis

    PubMed Central

    Wang, Chunyan; Bae, Jin H.; Zhang, David Yu

    2016-01-01

    DNA hybridization thermodynamics is critical for accurate design of oligonucleotides for biotechnology and nanotechnology applications, but parameters currently in use are inaccurately extrapolated based on limited quantitative understanding of thermal behaviours. Here, we present a method to measure the ΔG° of DNA motifs at temperatures and buffer conditions of interest, with significantly better accuracy (6- to 14-fold lower s.e.) than prior methods. The equilibrium constant of a reaction with thermodynamics closely approximating that of a desired motif is numerically calculated from directly observed reactant and product equilibrium concentrations; a DNA catalyst is designed to accelerate equilibration. We measured the ΔG° of terminal fluorophores, single-nucleotide dangles and multinucleotide dangles, in temperatures ranging from 10 to 45 °C. PMID:26782977
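
    The abstract derives ΔG° from an equilibrium constant that is computed from observed equilibrium concentrations. The standard relation behind that step is ΔG° = -RT ln K. The sketch below applies it to a generic exchange reaction in which the number of reactant species equals the number of product species, so K is dimensionless. The reaction form, the function names, the example concentrations and the choice of kcal/mol are assumptions for illustration, not the authors' protocol.

```python
import math

R_KCAL = 1.98720425864083e-3  # molar gas constant in kcal/(mol*K)


def k_eq_from_concentrations(reactants, products):
    """Equilibrium constant K = prod(products) / prod(reactants).

    Assumes equal numbers of reactant and product species, so the
    concentration units cancel and K is dimensionless.
    """
    return math.prod(products) / math.prod(reactants)


def delta_g_standard(k_eq, temp_celsius):
    """Standard free energy change in kcal/mol: dG = -R * T * ln(K)."""
    temp_kelvin = temp_celsius + 273.15
    return -R_KCAL * temp_kelvin * math.log(k_eq)


# Hypothetical equilibrium concentrations in mol/L, for illustration only.
k = k_eq_from_concentrations(reactants=[2e-9, 2e-9], products=[4e-9, 4e-9])
for temp in (10, 25, 45):  # span of the temperature range given in the abstract
    print(temp, round(delta_g_standard(k, temp), 3))
```

    Because K > 1 in this made-up example, every printed ΔG° is negative, and its magnitude grows with temperature at fixed K.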

  16. A Common Structural Motif in the Binding of Virulence Factors to Bacterial Secretion Chaperones

    SciTech Connect

    Lilic, M.; Vujanac, M.; Stebbins, C.

    2006-01-01

    Salmonella invasion protein A (SipA) is translocated into host cells by a type III secretion system (T3SS) and comprises two regions: one domain binds its cognate type III secretion chaperone, InvB, in the bacterium to facilitate translocation, while a second domain functions in the host cell, contributing to bacterial uptake by polymerizing actin. We present here the crystal structures of the SipA chaperone binding domain (CBD) alone and in complex with InvB. The SipA CBD is found to consist of a nonglobular polypeptide as well as a large globular domain, both of which are necessary for binding to InvB. We also identify a structural motif that may direct virulence factors to their cognate chaperones in a diverse range of pathogenic bacteria. Disruption of this structural motif leads to a destabilization of several chaperone-substrate complexes from different species, as well as an impairment of secretion in Salmonella.

  17. Native characterization of nucleic acid motif thermodynamics via non-covalent catalysis

    NASA Astrophysics Data System (ADS)

    Wang, Chunyan; Bae, Jin H.; Zhang, David Yu

    2016-01-01

    DNA hybridization thermodynamics is critical for accurate design of oligonucleotides for biotechnology and nanotechnology applications, but parameters currently in use are inaccurately extrapolated based on limited quantitative understanding of thermal behaviours. Here, we present a method to measure the ΔG° of DNA motifs at temperatures and buffer conditions of interest, with significantly better accuracy (6- to 14-fold lower s.e.) than prior methods. The equilibrium constant of a reaction with thermodynamics closely approximating that of a desired motif is numerically calculated from directly observed reactant and product equilibrium concentrations; a DNA catalyst is designed to accelerate equilibration. We measured the ΔG° of terminal fluorophores, single-nucleotide dangles and multinucleotide dangles, in temperatures ranging from 10 to 45 °C.

  18. A Convex Atomic-Norm Approach to Multiple Sequence Alignment and Motif Discovery

    PubMed Central

    Yen, Ian E. H.; Lin, Xin; Zhang, Jiong; Ravikumar, Pradeep; Dhillon, Inderjit S.

    2016-01-01

    Multiple Sequence Alignment and Motif Discovery, known as NP-hard problems, are two fundamental tasks in Bioinformatics. Existing approaches to these two problems are based on either local search methods such as Expectation Maximization (EM), Gibbs Sampling or greedy heuristic methods. In this work, we develop a convex relaxation approach to both problems based on the recent concept of atomic norm and develop a new algorithm, termed Greedy Direction Method of Multiplier, for solving the convex relaxation with two convex atomic constraints. Experiments show that our convex relaxation approach produces solutions of higher quality than those standard tools widely-used in Bioinformatics community on the Multiple Sequence Alignment and Motif Discovery problems. PMID:27559428

  19. Native characterization of nucleic acid motif thermodynamics via non-covalent catalysis.

    PubMed

    Wang, Chunyan; Bae, Jin H; Zhang, David Yu

    2016-01-19

    DNA hybridization thermodynamics is critical for accurate design of oligonucleotides for biotechnology and nanotechnology applications, but parameters currently in use are inaccurately extrapolated based on limited quantitative understanding of thermal behaviours. Here, we present a method to measure the ΔG° of DNA motifs at temperatures and buffer conditions of interest, with significantly better accuracy (6- to 14-fold lower s.e.) than prior methods. The equilibrium constant of a reaction with thermodynamics closely approximating that of a desired motif is numerically calculated from directly observed reactant and product equilibrium concentrations; a DNA catalyst is designed to accelerate equilibration. We measured the ΔG° of terminal fluorophores, single-nucleotide dangles and multinucleotide dangles, in temperatures ranging from 10 to 45 °C.

  20. A million peptide motifs for the molecular biologist.

    PubMed

    Tompa, Peter; Davey, Norman E; Gibson, Toby J; Babu, M Madan

    2014-07-17

    A molecular description of functional modules in the cell is the focus of many high-throughput studies in the postgenomic era. A large portion of biomolecular interactions in virtually all cellular processes is mediated by compact interaction modules, referred to as peptide motifs. Such motifs are typically less than ten residues in length, occur within intrinsically disordered regions, and are recognized and/or posttranslationally modified by structured domains of the interacting partner. In this review, we suggest that there might be over a million instances of peptide motifs in the human proteome. While this staggering number suggests that peptide motifs are numerous and the most understudied functional module in the cell, it also holds great opportunities for new discoveries.

  1. DETAIL OF CORNICE MOULDING WITH RAM'S HEAD MOTIF. EIGHT SHADES ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    DETAIL OF CORNICE MOULDING WITH RAM'S HEAD MOTIF. EIGHT SHADES OF GOLD LEAF AND BURNISHED GOLD LEAF WERE USED FOR THE INTERIOR FINISHES. - Anaconda Historic District, Washoe Theater, 305 Main Street, Anaconda, Deer Lodge County, MT

  2. 10. DETAIL OF CORNICE MOULDING WITH RAM'S HEAD MOTIF. EIGHT ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    10. DETAIL OF CORNICE MOULDING WITH RAM'S HEAD MOTIF. EIGHT SHADES OF GOLD LEAF AND BURNISHED GOLD LEAF WERE USED FOR THE INTERIOR FINISHES - Anaconda Historic District, Washoe Theater, 305 Main Street, Anaconda, Deer Lodge County, MT

  3. Motif-Synchronization: A new method for analysis of dynamic brain networks with EEG

    NASA Astrophysics Data System (ADS)

    Rosário, R. S.; Cardoso, P. T.; Muñoz, M. A.; Montoya, P.; Miranda, J. G. V.

    2015-12-01

    The major aim of this work was to propose a new association method known as Motif-Synchronization. This method was developed to provide information about the synchronization degree and direction between two nodes of a network by counting the number of occurrences of some patterns between any two time series. The second objective of this work was to present a new methodology for the analysis of dynamic brain networks, by combining the Time-Varying Graph (TVG) method with a directional association method. We further applied the new algorithms to a set of human electroencephalogram (EEG) signals to perform a dynamic analysis of the brain functional networks (BFN).

  4. Three-Dimensional DNA Nanostructures Assembled from DNA Star Motifs.

    PubMed

    Tian, Cheng; Zhang, Chuan

    2017-01-01

    Tile-based DNA self-assembly is a promising method in DNA nanotechnology and has produced a wide range of nanostructures by using a small set of unique DNA strands. The DNA star motif, one type of DNA tile, has been employed to assemble a variety of symmetric one-, two-, and three-dimensional (1D, 2D, 3D) DNA nanostructures. Herein, we describe the design principles, assembly methods, and characterization methods of 3D DNA nanostructures assembled from DNA star motifs.

  5. Insight into the role of histidine in RNR motif of protein component of RNase P of M. tuberculosis in catalysis.

    PubMed

    Singh, Alla; Ramteke, Anup K; Afroz, Tariq; Batra, Janendra K

    2016-03-01

    RNase P, a ribonucleoprotein endoribonuclease, is involved in the 5' end processing of pre-tRNAs, with its RNA component being the catalytic subunit. It is an essential enzyme. All bacterial RNase Ps have one RNA and one protein component. A conserved RNR motif in bacterial RNase P protein components is involved in their interaction with the RNA component. In this work, we have reconstituted the RNase P of M. tuberculosis in vitro and investigated the role of a histidine in the RNR motif in its catalysis. We expressed the protein and RNA components of mycobacterial RNase P in E. coli, purified them, and reconstituted the holoenzyme in vitro. The histidine in RNR motif was mutated to alanine and asparagine by site-directed mutagenesis. The RNA component alone showed activity on pre-tRNA(ala) substrate at high magnesium concentrations. The RNA and protein components associated together to manifest catalytic activity at low magnesium concentrations. The histidine 67 in the RNR motif of M. tuberculosis RNase P protein component was found to be important for the catalytic activity and stability of the enzyme. Generally, the RNase P of M. tuberculosis functions like other bacterial enzymes. The histidine in the RNR motif of M. tuberculosis appears to be able to substitute optimally for asparagine found in the majority of the protein components of other bacterial RNase P enzymes.

  6. Feature extraction using gray-level co-occurrence matrix of wavelet coefficients and texture matching for batik motif recognition

    NASA Astrophysics Data System (ADS)

    Suciati, Nanik; Herumurti, Darlis; Wijaya, Arya Yudhi

    2017-02-01

    Batik is one of Indonesia's traditional cloths. A motif or pattern drawn on a piece of batik fabric has a specific name and philosophy. Although batik cloths are widely used in everyday life, only a few people understand their motifs and philosophy. This research is intended to develop a batik motif recognition system that can identify the motif of a batik image automatically. First, a batik image is decomposed into sub-images using the wavelet transform. Six texture descriptors, i.e. max probability, correlation, contrast, uniformity, homogeneity and entropy, are extracted from the gray-level co-occurrence matrix of each sub-image. The texture features are then matched to the template features using the Canberra distance. The experiment is performed on the Batik Dataset, consisting of 1088 batik images grouped into seven motifs. The best recognition rate, 92.1%, is achieved using a feature extraction process with 5-level wavelet decomposition and a 4-directional gray-level co-occurrence matrix.
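
    The feature pipeline described above can be sketched in a few lines. The code below builds a normalized gray-level co-occurrence matrix for one pixel offset, derives the six named descriptors, and compares two feature vectors with the Canberra distance. It is a pure-Python illustration under stated assumptions: the wavelet decomposition step is omitted, homogeneity uses the 1/(1+|i-j|) weighting (definitions vary), and the tiny images and all function names are hypothetical rather than the authors' implementation.

```python
import math


def glcm(image, dx=1, dy=0, levels=4):
    """Normalized co-occurrence matrix for pixel pairs separated by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def texture_features(p):
    """Max probability, correlation, contrast, uniformity, homogeneity, entropy."""
    n = range(len(p))
    cells = [(i, j, p[i][j]) for i in n for j in n]
    mean_i = sum(i * v for i, _, v in cells)
    mean_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mean_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mean_j) ** 2 * v for _, j, v in cells)
    denom = math.sqrt(var_i * var_j)
    cov = sum((i - mean_i) * (j - mean_j) * v for i, j, v in cells)
    return [
        max(v for _, _, v in cells),                        # max probability
        cov / denom if denom else 0.0,                      # correlation
        sum((i - j) ** 2 * v for i, j, v in cells),         # contrast
        sum(v * v for _, _, v in cells),                    # uniformity
        sum(v / (1 + abs(i - j)) for i, j, v in cells),     # homogeneity
        -sum(v * math.log2(v) for _, _, v in cells if v),   # entropy
    ]


def canberra(a, b):
    """Canberra distance; terms where both values are zero contribute nothing."""
    return sum(abs(x - y) / (abs(x) + abs(y)) for x, y in zip(a, b) if x or y)


# Two made-up 4x4 images with 4 gray levels, for illustration only.
query = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 2, 2, 2], [2, 2, 3, 3]]
template = [[0, 1, 0, 1], [1, 0, 1, 0], [2, 3, 2, 3], [3, 2, 3, 2]]
print(canberra(texture_features(glcm(query)), texture_features(glcm(template))))
```

    In a recognition setting, the query would be assigned the motif of the template with the smallest distance. A four-direction variant would repeat `glcm` for the offsets (1, 0), (1, 1), (0, 1) and (-1, 1) and concatenate the resulting features.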

  7. Finding specific RNA motifs: Function in a zeptomole world?

    PubMed Central

    KNIGHT, ROB; YARUS, MICHAEL

    2003-01-01

    We have developed a new method for estimating the abundance of any modular (piecewise) RNA motif within a longer random region. We have used this method to estimate the size of the active motifs available to modern SELEX experiments (picomoles of unique sequences) and to a plausible RNA World (zeptomoles of unique sequences: 1 zmole = 602 sequences). Unexpectedly, activities such as specific isoleucine binding are almost certainly present in zeptomoles of molecules, and even ribozymes such as self-cleavage motifs may appear (depending on assumptions about the minimal structures). The number of specified nucleotides is not the only important determinant of a motif’s rarity: The number of modules into which it is divided, and the details of this division, are also crucial. We propose three maxims for easily isolated motifs: the Maxim of Minimization, the Maxim of Multiplicity, and the Maxim of the Median. These maxims together state that selected motifs should be small and composed of as many separate, equally sized modules as possible. For evenly divided motifs with four modules, the largest accessible activity in picomole scale (1–1000 pmole) pools of length 100 is about 34 nucleotides; while for zeptomole scale (1–1000 zmole) pools it is about 20 specific nucleotides (50% probability of occurrence). This latter figure includes some ribozymes and aptamers. Consequently, an RNA metabolism apparently could have begun with only zeptomoles of RNA molecules. PMID:12554865
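
    The conversion "1 zmole = 602 sequences" follows from multiplying the amount in moles by the Avogadro constant. A short check of that arithmetic, with hypothetical names:

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
ZEPTO = 1e-21             # SI prefix zepto
PICO = 1e-12              # SI prefix pico


def molecules(amount_mol: float) -> float:
    """Number of molecules in the given amount of substance (in moles)."""
    return amount_mol * AVOGADRO


print(round(molecules(1 * ZEPTO)))  # 602 molecules in one zeptomole
print(f"{molecules(1 * PICO):.3e}")  # molecules in one picomole
```

    One picomole therefore holds about 6 x 10^11 molecules, nine orders of magnitude more than one zeptomole, which is the gap between the two pool scales the abstract compares.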

  8. Selection of peptide entry motifs by bacterial surface display.

    PubMed Central

    Taschner, Sabine; Meinke, Andreas; von Gabain, Alexander; Boyd, Aoife P

    2002-01-01

    Surface display technologies have been established previously to select peptides and polypeptides that interact with purified immobilized ligands. In the present study, we designed and implemented a surface display-based technique to identify novel peptide motifs that mediate entry into eukaryotic cells. An Escherichia coli library expressing surface-displayed peptides was combined with eukaryotic cells and the gentamicin protection assay was performed to select recombinant E. coli, which were internalized into eukaryotic cells by virtue of the displayed peptides. To establish the proof of principle of this approach, the fibronectin-binding motifs of the fibronectin-binding protein A of Staphylococcus aureus were inserted into the E. coli FhuA protein. Surface expression of the fusion proteins was demonstrated by functional assays and by FACS analysis. The fibronectin-binding motifs were shown to mediate entry of the bacteria into non-phagocytic eukaryotic cells and brought about the preferential selection of these bacteria over E. coli expressing parental FhuA, with an enrichment of 100000-fold. Four entry sequences were selected and identified using an S. aureus library of peptides displayed in the FhuA protein on the surface of E. coli. These sequences included novel entry motifs as well as integrin-binding Arg-Gly-Asp (RGD) motifs and promoted a high degree of bacterial entry. Bacterial surface display is thus a powerful tool to effectively select and identify entry peptide motifs. PMID:12144529

  9. Discovering Multidimensional Motifs in Physiological Signals for Personalized Healthcare.

    PubMed

    Balasubramanian, Arvind; Wang, Jun; Prabhakaran, Balakrishnan

    2016-08-01

    Personalized diagnosis and therapy requires monitoring patient activity using various body sensors. Sensor data generated during personalized exercises or tasks may be too specific or inadequate to be evaluated using supervised methods such as classification. We propose multidimensional motif (MDM) discovery as a means for patient activity monitoring, since such motifs can capture repeating patterns across multiple dimensions of the data, and can serve as conformance indicators. Previous studies pertaining to mining MDMs have proposed approaches that lack the capability of concurrently processing multiple dimensions, thus limiting their utility in online scenarios. In this paper, we propose an efficient real-time approach to MDM discovery in body sensor generated time series data for monitoring performance of patients during therapy. We present two alternative models for MDMs based on motif co-occurrences and temporal ordering among motifs across multiple dimensions, with detailed formulation of the concepts proposed. The proposed method uses an efficient hashing based record to enable speedy update and retrieval of motif sets, and identification of MDMs. Performance evaluation using synthetic and real body sensor data in unsupervised motif discovery tasks shows that the approach is effective for (a) concurrent processing of multidimensional time series information suitable for real-time applications, (b) finding unknown naturally occurring patterns with minimal delay, and

  10. cWINNOWER Algorithm for Finding Fuzzy DNA Motifs

    NASA Technical Reports Server (NTRS)

    Liang, Shoudan

    2003-01-01

    The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if multiple mutated copies of the motif (i.e., the signals) are present in the DNA sequence in sufficient abundance. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum number of detectable motifs qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc, by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12000 for (l,d) = (15,4).
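
    The definition of a signal used above, a pattern of length l that differs from the motif in at most d positions, is a Hamming-distance test. The sketch below implements only that test against an already known motif. It is not the cWINNOWER search itself, which must find the motif without knowing it, and the function names and sequences are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_signals(sequence: str, motif: str, d: int) -> list:
    """0-based starts of all windows within d mismatches of the motif."""
    l = len(motif)
    return [
        i
        for i in range(len(sequence) - l + 1)
        if hamming(sequence[i:i + l], motif) <= d
    ]


# Made-up example with (l, d) = (4, 1).
print(find_signals("AAACGTAAACCTAAA", "ACGT", d=1))  # [2, 8]
```

    Two signals derived from the same motif can differ from each other in at most 2d positions, which is why clique-based methods of this family connect l-mers whose pairwise Hamming distance is no greater than 2d.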

  11. Transcriptional Network Growing Models Using Motif-Based Preferential Attachment.

    PubMed

    Abdelzaher, Ahmed F; Al-Musawi, Ahmad F; Ghosh, Preetam; Mayo, Michael L; Perkins, Edward J

    2015-01-01

    Understanding relationships between architectural properties of gene-regulatory networks (GRNs) has been one of the major goals in systems biology and bioinformatics, as it can provide insights into, e.g., disease dynamics and drug development. Such GRNs are characterized by their scale-free degree distributions and existence of network motifs - i.e., small-node subgraphs that occur more abundantly in GRNs than expected from chance alone. Because these transcriptional modules represent "building blocks" of complex networks and exhibit a wide range of functional and dynamical properties, they may contribute to the remarkable robustness and dynamical stability associated with the whole of GRNs. Here, we developed network-construction models to better understand this relationship, which produce randomized GRNs by using transcriptional motifs as the fundamental growth unit in contrast to other methods that construct similar networks on a node-by-node basis. Because this model produces networks with a prescribed lower bound on the number of choice transcriptional motifs (e.g., downlinks, feed-forward loops), its fidelity to the motif distributions observed in model organisms represents an improvement over existing methods, which we validated by contrasting their resultant motif and degree distributions against existing network-growth models and data from the model organism of the bacterium Escherichia coli. These models may therefore serve as novel testbeds for further elucidating relationships between the topology of transcriptional motifs and network-wide dynamical properties.
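
    A feed-forward loop, one of the transcriptional motifs named above, is a triple of nodes in which X regulates Y and both X and Y regulate Z. As a minimal sketch of how such motifs can be counted in a directed network, the code below enumerates them from an edge list. The function name and the toy network are hypothetical, and this is plain motif counting, not the authors' network-growth model.

```python
def count_feed_forward_loops(edges) -> int:
    """Count triples (X, Y, Z) with edges X->Y, X->Z and Y->Z.

    Self-loops are ignored and duplicate edges are counted once.
    """
    successors = {}
    for source, target in set(edges):
        if source != target:
            successors.setdefault(source, set()).add(target)

    count = 0
    for x, x_targets in successors.items():
        for y in x_targets:
            # Every Z regulated by both X and Y closes one loop.
            # Z cannot equal X or Y because self-loops were dropped.
            count += len(x_targets & successors.get(y, set()))
    return count


# Toy regulatory network, for illustration only.
network = [("X", "Y"), ("X", "Z"), ("Y", "Z"), ("Y", "W"), ("Z", "W")]
print(count_feed_forward_loops(network))  # 2: (X, Y, Z) and (Y, Z, W)
```

    Comparing such counts with those from randomized networks of the same degree sequence is the usual way to judge whether a motif is overrepresented.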

  12. The 3'-5' exonuclease site of DNA polymerase III from gram-positive bacteria: definition of a novel motif structure.

    PubMed

    Barnes, M H; Spacciapoli, P; Li, D H; Brown, N C

    1995-11-07

    The primary structure of the 3'-5' exonuclease (Exo) site of the Gram+ bacterial DNA polymerase III (Pol III) was examined by site-directed mutagenesis of Bacillus subtilis Pol III (BsPol III). It was found to differ significantly from the conventional three-motif substructure established for the Exo site of DNA polymerase I of Escherichia coli (EcPol I) and the majority of other DNA polymerase-exonucleases. Motifs I and II were conventionally organized and anchored functionally by the predicted carboxylate residues. However, the conventional downstream motif, motif III, was replaced by motif III epsilon, a novel 55-amino-acid (aa) segment incorporating three essential aa (His565, Asp533 and Asp570) which are strictly conserved in three Gram+ Pol III and in the Ec Exo epsilon (epsilon). Despite its unique substructure, the Gram+ Pol III-specific Exo site was conventionally independent of Pol, the site of 2'-deoxyribonucleoside 5'-triphosphate (dNTP) binding and polymerization. The entire Exo site, including motif III epsilon, could be deleted without profoundly affecting the enzyme's capacity to polymerize dNTPs. Conversely, Pol and all other sequences downstream of the Exo site could be deleted with little apparent effect on Exo activity. Whether the three essential aa within the unique motif III epsilon substructure participate in the conventional two-metal-ion mechanism elucidated for the model Exo site of EcPol I, remains to be established.

  13. Motif types, motif locations and base composition patterns around the RNA polyadenylation site in microorganisms, plants and animals

    PubMed Central

    2014-01-01

    Background The polyadenylation of RNA is critical for gene functioning, but the conserved sequence motifs (often called signal or signature motifs), motif locations and abundances, and base composition patterns around mRNA polyadenylation [poly(A)] sites are still uncharacterized in most species. The evolutionary tendency for poly(A) site selection is still largely unknown. Results We analyzed the poly(A) site regions of 31 species or phyla. Different groups of species showed different poly(A) signal motifs: UUACUU at the poly(A) site in the parasite Trypanosoma cruzi; UGUAAC (approximately 13 bases upstream of the site) in the alga Chlamydomonas reinhardtii; UGUUUG (or UGUUUGUU) at mainly the fourth base downstream of the poly(A) site in the parasite Blastocystis hominis; and AAUAAA at approximately 16 bases and approximately 19 bases upstream of the poly(A) site in animals and plants, respectively. Polyadenylation signal motifs are usually several hundred times more abundant around poly(A) sites than in whole genomes. These predominant motifs usually had very specific locations, whether upstream of, at, or downstream of poly(A) sites, depending on the species or phylum. The poly(A) site was usually an adenosine (A) in all analyzed species except for B. hominis, and there was weak A predominance in C. reinhardtii. Fungi, animals, plants, and the protist Phytophthora infestans shared a general base abundance pattern (or base composition pattern) of “U-rich—A-rich—U-rich—Poly(A) site—U-rich regions”, or U-A-U-A-U for short, with some variation for each kingdom or subkingdom. Conclusion This study identified the poly(A) signal motifs, motif locations, and base composition patterns around mRNA poly(A) sites in protists, fungi, plants, and animals and provided insight into poly(A) site evolution. PMID:25052519
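
    The positional statements in this abstract (for example, AAUAAA at roughly 16 bases upstream of the poly(A) site in animals) can be checked on a sequence with a sketch like the one below. The function name, the window size, and the convention that the offset is measured from the first base of the signal to the poly(A) site are assumptions made for illustration; the example sequence is made up.

```python
def signal_offsets(rna, polya_site, signal="AAUAAA", window=40):
    """Return distances from each upstream signal start to the poly(A) site.

    `polya_site` is a 0-based index into `rna`; only the `window` bases
    immediately upstream of the site are searched.
    """
    start = max(0, polya_site - window)
    upstream = rna[start:polya_site]
    offsets = []
    pos = upstream.find(signal)
    while pos != -1:
        offsets.append(polya_site - (start + pos))
        pos = upstream.find(signal, pos + 1)  # allow overlapping hits
    return offsets

# Made-up transcript: signal starts at index 20, poly(A) site at index 36.
example_rna = "G" * 20 + "AAUAAA" + "C" * 10 + "A" + "U" * 5
offsets = signal_offsets(example_rna, polya_site=36)
```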

  14. Ubiquitous presence of the hammerhead ribozyme motif along the tree of life

    PubMed Central

    de la Peña, Marcos; García-Robles, Inmaculada

    2010-01-01

    Examples of small self-cleaving RNAs embedded in noncoding regions already have been found to be involved in the control of gene expression, although their origin remains uncertain. In this work, we show the widespread occurrence of the hammerhead ribozyme (HHR) motif among genomes from the Bacteria, Chromalveolata, Plantae, and Metazoa kingdoms. Intergenic HHRs were detected in three different bacterial genomes, whereas metagenomic data from Galapagos Islands showed the occurrence of similar ribozymes that could be regarded as direct relics from the RNA world. Among eukaryotes, HHRs were detected in the genomes of three water molds as well as 20 plant species, ranging from unicellular algae to vascular plants. These HHRs were very similar to those previously described in small RNA plant pathogens and, in some cases, appeared as close tandem repetitions. A parallel situation of tandemly repeated HHR motifs was also detected in the genomes of lower metazoans from cnidarians to invertebrates, with special emphasis among hematophagous and parasitic organisms. Altogether, these findings unveil the HHR as a widespread motif in DNA genomes, which would be involved in new forms of retrotransposable elements. PMID:20705646

  15. Mechanisms of Zero-Lag Synchronization in Cortical Motifs

    PubMed Central

    Gollo, Leonardo L.; Mirasso, Claudio; Sporns, Olaf; Breakspear, Michael

    2014-01-01

    Zero-lag synchronization between distant cortical areas has been observed in a diversity of experimental data sets and between many different regions of the brain. Several computational mechanisms have been proposed to account for such isochronous synchronization in the presence of long conduction delays: Of these, the phenomenon of “dynamical relaying” – a mechanism that relies on a specific network motif – has proven to be the most robust with respect to parameter mismatch and system noise. Surprisingly, despite a contrary belief in the community, the common driving motif is an unreliable means of establishing zero-lag synchrony. Although dynamical relaying has been validated in empirical and computational studies, the deeper dynamical mechanisms and comparison to dynamics on other motifs is lacking. By systematically comparing synchronization on a variety of small motifs, we establish that the presence of a single reciprocally connected pair – a “resonance pair” – plays a crucial role in disambiguating those motifs that foster zero-lag synchrony in the presence of conduction delays (such as dynamical relaying) from those that do not (such as the common driving triad). Remarkably, minor structural changes to the common driving motif that incorporate a reciprocal pair recover robust zero-lag synchrony. The findings are observed in computational models of spiking neurons, populations of spiking neurons and neural mass models, and arise whether the oscillatory systems are periodic, chaotic, noise-free or driven by stochastic inputs. The influence of the resonance pair is also robust to parameter mismatch and asymmetrical time delays amongst the elements of the motif. We call this manner of facilitating zero-lag synchrony resonance-induced synchronization, outline the conditions for its occurrence, and propose that it may be a general mechanism to promote zero-lag synchrony in the brain. PMID:24763382

  16. Systems chemistry: logic gates, arithmetic units, and network motifs in small networks.

    PubMed

    Wagner, Nathaniel; Ashkenasy, Gonen

    2009-01-01

    A mixture of molecules can be regarded as a network if all the molecular components participate in some kind of interaction with other molecules--either physical or functional interactions. Template-assisted ligation reactions that direct replication processes can serve as the functional elements that connect two members of a chemical network. In such a process, the template does not necessarily catalyze its own formation, but rather the formation of another molecule, which in turn can operate as a template for reactions within the network medium. It was postulated that even networks made up of small numbers of molecules possess a wealth of molecular information sufficient to perform rather complex behavior. To probe this assumption, we have constructed virtual arrays consisting of three replicating molecules, in which dimer templates are capable of catalyzing reactants to form additional templates. By using realistic parameters from peptides or DNA replication experiments, we simulate the construction of various functional motifs within the networks. Specifically, we have designed and implemented each of the three-element Boolean logic gates, and show how these networks are assembled from four basic "building blocks". We also show how the catalytic pathways can be wired together to perform more complex arithmetic units and network motifs, such as the half adder and half subtractor computational modules, and the coherent feed-forward loop network motifs under different sets of parameters. As in previous studies of chemical networks, some of the systems described display behavior that would be difficult to predict without the numerical simulations. Furthermore, the simulations reveal trends and characteristics that should be useful as "recipes" for future design of experimental functional motifs and for potential integration into modular circuits and molecular computation devices.

  17. Identification of a common hyaluronan binding motif in the hyaluronan binding proteins RHAMM, CD44 and link protein.

    PubMed Central

    Yang, B; Yang, B L; Savani, R C; Turley, E A

    1994-01-01

    We have previously identified two hyaluronan (HA) binding domains in the HA receptor, RHAMM, that occur near the carboxyl-terminus of this protein. We show here that these two HA binding domains are the only HA binding regions in RHAMM, and that they contribute approximately equally to the HA binding ability of this receptor. Mutation of domain II using recombinant polypeptides of RHAMM demonstrates that K423 and R431, spaced seven amino acids apart, are critical for HA binding activity. Domain I contains two sets of two basic amino acids, each spaced seven residues apart, and mutation of these basic amino acids reduced their binding to HA--Sepharose. These results predict that two basic amino acids flanking a seven amino acid stretch [hereafter called B(X7)B] are minimally required for HA binding activity. To assess whether this motif predicts HA binding in the intact RHAMM protein, we mutated all basic amino acids in domains I and II that form part of these motifs using site-directed mutagenesis and prepared fusion protein from the mutated cDNA. The altered RHAMM protein did not bind HA, confirming that the basic amino acids and their spacing are critical for binding. A specific requirement for arginine or lysine residues was identified since mutation of K430, R431 and K432 to histidine residues abolished binding. Clustering of basic amino acids either within or at either end of the motif enhanced HA binding activity while the occurrence of acidic residues between the basic amino acids reduced binding. The B(X7)B motif, in which B is either R or K and X7 contains no acidic residues and at least one basic amino acid, was found in all HA binding proteins molecularly characterized to date. Recombinant techniques were used to generate chimeric proteins containing either the B(X7)B motifs present in CD44 or link protein, with the amino-terminus of RHAMM (amino acids 1-238) that does not bind HA. 
    All chimeric proteins containing the motif bound HA in transblot analyses
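
    The B(X7)B definition given in this abstract (B is R or K; the seven intervening residues contain no acidic residue and at least one basic residue) translates directly into a sequence scan. The sketch below is a minimal illustration; the function name and test peptides are hypothetical, and acidic residues are taken to be D and E.

```python
BASIC = set("RK")    # B in the abstract: arginine or lysine
ACIDIC = set("DE")   # assumed meaning of "acidic residues"

def find_bx7b(protein):
    """Return (start, 9-mer) for every window matching the B(X7)B rule."""
    hits = []
    for i in range(len(protein) - 8):
        window = protein[i:i + 9]
        inner = set(window[1:8])
        if (window[0] in BASIC and window[8] in BASIC
                and not (inner & ACIDIC)   # no acidic residue in X7
                and inner & BASIC):        # at least one basic residue in X7
            hits.append((i, window))
    return hits

# Made-up peptide, not a sequence from RHAMM, CD44 or link protein.
hits = find_bx7b("GGKALSRGAVRGG")
```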

  18. Specific regulatory motifs predict glucocorticoid responsiveness of hippocampal gene expression.

    PubMed

    Datson, N A; Polman, J A E; de Jonge, R T; van Boheemen, P T M; van Maanen, E M T; Welten, J; McEwen, B S; Meiland, H C; Meijer, O C

    2011-10-01

    The glucocorticoid receptor (GR) is a ubiquitously expressed ligand-activated transcription factor that mediates effects of cortisol in relation to adaptation to stress. In the brain, GR affects the hippocampus to modulate memory processes through direct binding to glucocorticoid response elements (GREs) in the DNA. However, its effects are to a high degree cell specific, and its target genes in different cell types as well as the mechanisms conferring this specificity are largely unknown. To gain insight into hippocampal GR signaling, we characterized the GREs to which GR binds in the rat hippocampus. Using a position-specific scoring matrix, we identified evolutionarily conserved putative GREs from a microarray-based set of hippocampal target genes. Using chromatin immunoprecipitation, we were able to confirm GR binding to 15 out of a selection of 32 predicted sites (47%). The majority of these 15 GREs are previously undescribed and thus represent novel GREs that bind GR and therefore may be functional in the rat hippocampus. GRE nucleotide composition was not predictive for binding of GR to a GRE. A search for conserved flanking sequences that may predict GR-GRE interaction resulted in the identification of GC-box associated motifs, such as Myc-associated zinc finger protein 1, within 2 kb of GREs with GR binding in the hippocampus. This enrichment was not present around nonbinding GRE sequences nor around proven GR-binding sites from a mesenchymal stem-like cell dataset that we analyzed. GC-binding transcription factors therefore may be unique partners for DNA-bound GR and may in part explain cell-specific transcriptional regulation by glucocorticoids in the context of the hippocampus.

  19. Elongated polyproline motifs facilitate enamel evolution through matrix subunit compaction.

    PubMed

    Jin, Tianquan; Ito, Yoshihiro; Luan, Xianghong; Dangaria, Smit; Walker, Cameron; Allen, Michael; Kulkarni, Ashok; Gibson, Carolyn; Braatz, Richard; Liao, Xiubei; Diekwisch, Thomas G H

    2009-12-01

    Vertebrate body designs rely on hydroxyapatite as the principal mineral component of relatively light-weight, articulated endoskeletons and sophisticated tooth-bearing jaws, facilitating rapid movement and efficient predation. Biological mineralization and skeletal growth are frequently accomplished through proteins containing polyproline repeat elements. Through their well-defined yet mobile and flexible structure polyproline-rich proteins control mineral shape and contribute many other biological functions including Alzheimer's amyloid aggregation and prolamine plant storage. In the present study we have hypothesized that polyproline repeat proteins exert their control over biological events such as mineral growth, plaque aggregation, or viscous adhesion by altering the length of their central repeat domain, resulting in dramatic changes in supramolecular assembly dimensions. In order to test our hypothesis, we have used the vertebrate mineralization protein amelogenin as an exemplar and determined the biological effect of the four-fold increased polyproline tandem repeat length in the amphibian/mammalian transition. To study the effect of polyproline repeat length on matrix assembly, protein structure, and apatite crystal growth, we have measured supramolecular assembly dimensions in various vertebrates using atomic force microscopy, tested the effect of protein assemblies on crystal growth by electron microscopy, generated a transgenic mouse model to examine the effect of an abbreviated polyproline sequence on crystal growth, and determined the structure of polyproline repeat elements using 3D NMR. Our study shows that an increase in PXX/PXQ tandem repeat motif length results (i) in a compaction of protein matrix subunit dimensions, (ii) reduced conformational variability, (iii) an increase in polyproline II helices, and (iv) promotion of apatite crystal length. 
    Together, these findings establish a direct relationship between polyproline tandem repeat fragment

  20. Elongated Polyproline Motifs Facilitate Enamel Evolution through Matrix Subunit Compaction

    PubMed Central

    Luan, Xianghong; Dangaria, Smit; Walker, Cameron; Allen, Michael; Kulkarni, Ashok; Gibson, Carolyn; Braatz, Richard; Liao, Xiubei; Diekwisch, Thomas G. H.

    2009-01-01

    Vertebrate body designs rely on hydroxyapatite as the principal mineral component of relatively light-weight, articulated endoskeletons and sophisticated tooth-bearing jaws, facilitating rapid movement and efficient predation. Biological mineralization and skeletal growth are frequently accomplished through proteins containing polyproline repeat elements. Through their well-defined yet mobile and flexible structure polyproline-rich proteins control mineral shape and contribute many other biological functions including Alzheimer's amyloid aggregation and prolamine plant storage. In the present study we have hypothesized that polyproline repeat proteins exert their control over biological events such as mineral growth, plaque aggregation, or viscous adhesion by altering the length of their central repeat domain, resulting in dramatic changes in supramolecular assembly dimensions. In order to test our hypothesis, we have used the vertebrate mineralization protein amelogenin as an exemplar and determined the biological effect of the four-fold increased polyproline tandem repeat length in the amphibian/mammalian transition. To study the effect of polyproline repeat length on matrix assembly, protein structure, and apatite crystal growth, we have measured supramolecular assembly dimensions in various vertebrates using atomic force microscopy, tested the effect of protein assemblies on crystal growth by electron microscopy, generated a transgenic mouse model to examine the effect of an abbreviated polyproline sequence on crystal growth, and determined the structure of polyproline repeat elements using 3D NMR. Our study shows that an increase in PXX/PXQ tandem repeat motif length results (i) in a compaction of protein matrix subunit dimensions, (ii) reduced conformational variability, (iii) an increase in polyproline II helices, and (iv) promotion of apatite crystal length. 
    Together, these findings establish a direct relationship between polyproline tandem repeat fragment

  1. IQ-motif peptides as novel anti-microbial agents.

    PubMed

    McLean, Denise T F; Lundy, Fionnuala T; Timson, David J

    2013-04-01

    The IQ-motif is an amphipathic, often positively charged, α-helical, calmodulin binding sequence found in a number of eukaryote signalling, transport and cytoskeletal proteins. IQ-motifs share common biophysical characteristics with established, cationic α-helical antimicrobial peptides, such as the human cathelicidin LL-37. Therefore, we tested eight peptides encoding the sequences of IQ-motifs derived from the human cytoskeletal scaffolding proteins IQGAP2 and IQGAP3. Some of these peptides were able to inhibit the growth of Escherichia coli and Staphylococcus aureus with minimal inhibitory concentrations (MIC) comparable to LL-37. In addition, some IQ-motifs had activity against the fungus Candida albicans. This antimicrobial activity is combined with low haemolytic activity (comparable to, or lower than, that of LL-37). Those IQ-motifs with anti-microbial activity tended to be able to bind to lipopolysaccharide. Some of these were also able to permeabilise the cell membranes of both Gram positive and Gram negative bacteria. These results demonstrate that IQ-motifs are viable lead sequences for the identification and optimisation of novel anti-microbial peptides. Thus, further investigation of the anti-microbial properties of this diverse group of sequences is merited.

  2. Interconnected Network Motifs Control Podocyte Morphology and Kidney Function

    PubMed Central

    Azeloglu, Evren U.; Hardy, Simon V.; Eungdamrong, Narat John; Chen, Yibang; Jayaraman, Gomathi; Chuang, Peter Y.; Fang, Wei; Xiong, Huabao; Neves, Susana R.; Jain, Mohit R.; Li, Hong; Ma’ayan, Avi; Gordon, Ronald E.; He, John Cijiang; Iyengar, Ravi

    2014-01-01

    Podocytes are kidney cells with specialized morphology that is required for glomerular filtration. Diseases, such as diabetes, or drug exposure that causes disruption of the podocyte foot process morphology results in kidney pathophysiology. Proteomic analysis of glomeruli isolated from rats with puromycin-induced kidney disease and control rats indicated that protein kinase A (PKA), which is activated by adenosine 3′,5′-monophosphate (cAMP), is a key regulator of podocyte morphology and function. In podocytes, cAMP signaling activates cAMP response element–binding protein (CREB) to enhance expression of the gene encoding a differentiation marker, synaptopodin, a protein that associates with actin and promotes its bundling. We constructed and experimentally verified a β-adrenergic receptor–driven network with multiple feedback and feedforward motifs that controls CREB activity. To determine how the motifs interacted to regulate gene expression, we mapped multicompartment dynamical models, including information about protein subcellular localization, onto the network topology using Petri net formalisms. These computational analyses indicated that the juxtaposition of multiple feedback and feedforward motifs enabled the prolonged CREB activation necessary for synaptopodin expression and actin bundling. Drug-induced modulation of these motifs in diseased rats led to recovery of normal morphology and physiological function in vivo. Thus, analysis of regulatory motifs using network dynamics can provide insights into pathophysiology that enable predictions for drug intervention strategies to treat kidney disease. PMID:24497609

  3. cWINNOWER algorithm for finding fuzzy DNA motifs

    NASA Technical Reports Server (NTRS)

    Liang, S.; Samanta, M. P.; Biegel, B. A.

    2004-01-01

    The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if a clique consisting of a sufficiently large number of mutated copies of the motif (i.e., the signals) is present in the DNA sequence. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum detectable clique size qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12,000 for (l, d) = (15, 4). Copyright Imperial College Press.
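
    The (l, d) signal definition in this abstract can be made concrete with a short sketch. It shows only the definition of a signal (an l-mer within d mutations of the motif) and the pairwise condition that follows from the triangle inequality (two signals of the same motif differ in at most 2d positions); it is not the cWINNOWER algorithm itself, and the function names and example words are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    """Number of mismatching positions between two equal-length words."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))

def is_signal(word, motif, d):
    """A signal is a word with at most d mutations relative to the motif."""
    return hamming(word, motif) <= d

def similarity_edges(words, d):
    """Index pairs of words that could be signals of one common motif."""
    return [(i, j) for i, j in combinations(range(len(words)), 2)
            if hamming(words[i], words[j]) <= 2 * d]

# Made-up example with l = 8 and d = 1.
motif = "ACGTACGT"
words = ["ACGTACGA", "TCGTACGT", "GGGGGGGG"]
edges = similarity_edges(words, d=1)
```

    A clique search would then operate on the graph defined by these edges.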

  4. Fitting a mixture model by expectation maximization to discover motifs in biopolymers

    SciTech Connect

    Bailey, T.L.; Elkan, C.

    1994-12-31

    The algorithm described in this paper discovers one or more motifs in a collection of DNA or protein sequences by using the technique of expectation maximization to fit a two-component finite mixture model to the set of sequences. Multiple motifs are found by fitting a mixture model to the data, probabilistically erasing the occurrences of the motif thus found, and repeating the process to find successive motifs. The algorithm requires only a set of unaligned sequences and a number specifying the width of the motifs as input. It returns a model of each motif and a threshold which together can be used as a Bayes-optimal classifier for searching for occurrences of the motif in other databases. The algorithm estimates how many times each motif occurs in each sequence in the dataset and outputs an alignment of the occurrences of the motif. The algorithm is capable of discovering several different motifs with differing numbers of occurrences in a single dataset.
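
    The core idea of this abstract, fitting a two-component mixture (motif versus background) to all width-W subsequences by expectation maximization, can be sketched as follows. This is a simplified illustration under stated assumptions rather than the published algorithm: the background is fixed to overall letter frequencies, the motif model is seeded from the first window, sequences contain only A, C, G and T, and the erasing step for finding further motifs is omitted. All names are hypothetical.

```python
def fit_motif_mixture(seqs, width, n_iter=30, pseudo=0.1):
    """EM for a motif/background mixture over all windows of length `width`."""
    alphabet = "ACGT"
    windows = [s[i:i + width] for s in seqs for i in range(len(s) - width + 1)]
    letters = "".join(seqs)
    # Background component: fixed zero-order letter frequencies (assumption).
    bg = {a: (letters.count(a) + pseudo) / (len(letters) + 4 * pseudo)
          for a in alphabet}
    # Motif component: position weight matrix seeded from the first window.
    seed = windows[0]
    pwm = [{a: 0.7 if a == seed[j] else 0.1 for a in alphabet}
           for j in range(width)]
    lam = 0.1  # prior probability that a window is a motif occurrence
    z = []
    for _ in range(n_iter):
        # E-step: posterior probability that each window came from the motif.
        z = []
        for w in windows:
            p_motif, p_bg = lam, 1.0 - lam
            for j, c in enumerate(w):
                p_motif *= pwm[j][c]
                p_bg *= bg[c]
            z.append(p_motif / (p_motif + p_bg))
        # M-step: re-estimate the mixing weight and the matrix columns.
        total = sum(z)
        lam = total / len(windows)
        pwm = [{a: (sum(zi for zi, w in zip(z, windows) if w[j] == a) + pseudo)
                   / (total + 4 * pseudo) for a in alphabet}
               for j in range(width)]
    return pwm, lam, z

# Made-up input sequences for illustration.
toy_seqs = ["ACGTTGACGT", "TTGACGAAAC", "CCTTGACGGA"]
pwm, lam, z = fit_motif_mixture(toy_seqs, width=6)
```

    Like any EM procedure, the result depends on the starting point, so a practical implementation would try several seeds.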

  5. A role for specific collagen motifs during wound healing and inflammatory response of fibroblasts in the teleost fish gilthead seabream.

    PubMed

    Castillo-Briceño, Patricia; Bihan, Dominique; Nilges, Michael; Hamaia, Samir; Meseguer, José; García-Ayala, Alfonsa; Farndale, Richard W; Mulero, Victoriano

    2011-03-01

    Specific sites and sequences in collagen to which cells can attach, either directly or through protein intermediaries, were identified using Toolkits of 63-amino acid triple-helical peptides and specific shorter GXX'GEX″ motifs, which have different intrinsic affinity for integrins that mediate cell adhesion and migration. We have previously reported that collagen type I (COL-I) was able to prime in vitro the respiratory burst and induce a specific set of immune- and extracellular matrix-related molecules in phagocytes of the teleost fish gilthead seabream (Sparus aurata L.). It was also suggested that COL-I would provide an intermediate signal during the early inflammatory response in gilthead seabream. Since fibroblasts are highly involved in the initiation of wound repair and regeneration processes, in the present study SAF-1 cells (gilthead seabream fibroblasts) were used to identify the binding motifs in collagen by end-point and real-time cell adhesion assays using the collagen peptides and Toolkits. We identified the collagen motifs involved in the early magnesium-dependent adhesion of these cells. Furthermore, we found that peptides containing the GFOGER and GLOGEN motifs (where O is hydroxyproline) present high affinity for SAF-1 adhesion, expressed as both cell number and surface covering, while in cell suspensions, these motifs were also able to induce the expression of the genes encoding the proinflammatory molecules interleukin-1β and cyclooxygenase-2. These data suggest that specific collagen motifs are involved in the regulation of the inflammatory and healing responses of teleost fish.

  6. Structural and functional insights into the regulation of Helicobacter pylori arginase activity by an evolutionary nonconserved motif.

    PubMed

    Srivastava, Abhishek; Meena, Shiv Kumar; Alam, Mashkoor; Nayeem, Shahid M; Deep, Shashank; Sau, Apurba Kumar

    2013-01-22

    Urea producing bimetallic arginases are essential for the synthesis of polyamine, DNA, and RNA. Despite conservation of the signature motifs in all arginases, a nonconserved ¹⁵³ESEEKAWQKLCSL¹⁶⁵ motif is found in the Helicobacter pylori enzyme, whose role is yet unknown. Using site-directed mutagenesis, kinetic assays, metal analyses, circular dichroism, heat-induced denaturation, molecular dynamics simulations and truncation studies, we report here the significance of this motif in catalytic function, metal retention, structural integrity, and stability of the protein. The enzyme did not exhibit detectable activity upon deletion of the motif as well as on individual mutation of Glu155 and Trp159 while Cys163Ala displayed significant decrease in the activity. Trp159Ala and Glu155Ala show severe loss of thermostability (14-17°) by a decrease in the α-helical structure. The role of Trp159 in stabilization of the structure with the surrounding aromatic residues is confirmed when Trp159Phe restored the structure and stability substantially compared to Trp159Ala. The simulation studies support the above results and show that the motif, which was previously solvent exposed, displays a loop-cum-small helix structure (Lys161-Cys163) and is located near the active-site through a novel Trp159-Asp126 interaction. This is consistent with the mutational analyses, where Trp159 and Asp126 are individually critical for retaining a bimetallic center and thereby for function. Furthermore, Cys163 of the helix is primarily important for dimerization, which is crucial for stimulation of the activity. Thus, these findings not only provide insights into the role of this motif but also offer a possibility to engineer it in human arginases for therapeutics against a number of carcinomas.

  7. The Membrane-Bound NAC Transcription Factor ANAC013 Functions in Mitochondrial Retrograde Regulation of the Oxidative Stress Response in Arabidopsis

    PubMed Central

    De Clercq, Inge; Vermeirssen, Vanessa; Van Aken, Olivier; Vandepoele, Klaas; Murcha, Monika W.; Law, Simon R.; Inzé, Annelies; Ng, Sophia; Ivanova, Aneta; Rombaut, Debbie; van de Cotte, Brigitte; Jaspers, Pinja; Van de Peer, Yves; Kangasjärvi, Jaakko; Whelan, James; Van Breusegem, Frank

    2013-01-01

    Upon disturbance of their function by stress, mitochondria can signal to the nucleus to steer the expression of responsive genes. This mitochondria-to-nucleus communication is often referred to as mitochondrial retrograde regulation (MRR). Although reactive oxygen species and calcium are likely candidate signaling molecules for MRR, the protein signaling components in plants remain largely unknown. Through meta-analysis of transcriptome data, we detected a set of genes that are common and robust targets of MRR and used them as a bait to identify its transcriptional regulators. In the upstream regions of these mitochondrial dysfunction stimulon (MDS) genes, we found a cis-regulatory element, the mitochondrial dysfunction motif (MDM), which is necessary and sufficient for gene expression under various mitochondrial perturbation conditions. Yeast one-hybrid analysis and electrophoretic mobility shift assays revealed that the transmembrane domain–containing NO APICAL MERISTEM/ARABIDOPSIS TRANSCRIPTION ACTIVATION FACTOR/CUP-SHAPED COTYLEDON transcription factors (ANAC013, ANAC016, ANAC017, ANAC053, and ANAC078) bound to the MDM cis-regulatory element. We demonstrate that ANAC013 mediates MRR-induced expression of the MDS genes by direct interaction with the MDM cis-regulatory element and triggers increased oxidative stress tolerance. In conclusion, we characterized ANAC013 as a regulator of MRR upon stress in Arabidopsis thaliana. PMID:24045019

  8. DNA consensus sequence motif for binding response regulator PhoP, a virulence regulator of Mycobacterium tuberculosis.

    PubMed

    He, Xiaoyuan; Wang, Shuishu

    2014-12-30

    Tuberculosis has reemerged as a serious threat to human health because of the increasing prevalence of drug-resistant strains and synergetic infection with HIV, prompting an urgent need for new and more efficient treatments. The PhoP-PhoR two-component system of Mycobacterium tuberculosis plays an important role in the virulence of the pathogen and thus represents a potential drug target. To study the mechanism of gene transcription regulation by response regulator PhoP, we identified a high-affinity DNA sequence for PhoP binding using systematic evolution of ligands by exponential enrichment. The sequence contains a direct repeat of two 7 bp motifs separated by a 4 bp spacer, TCACAGC(N4)TCACAGC. The specificity of the direct-repeat sequence for PhoP binding was confirmed by isothermal titration calorimetry and electrophoretic mobility shift assays. PhoP binds to the direct repeat as a dimer in a highly cooperative manner. We found many genes previously identified to be regulated by PhoP that contain the direct-repeat motif in their promoter sequences. Synthetic DNA fragments at the putative promoter-binding sites bind PhoP with variable affinity, which is related to the number of mismatches in the 7 bp motifs, the positions of the mismatches, and the spacer and flanking sequences. Phosphorylation of PhoP increases the affinity but does not change the specificity of DNA binding. Overall, our results confirm the direct-repeat sequence as the consensus motif for PhoP binding and thus pave the way for identification of PhoP directly regulated genes in different mycobacterial genomes.
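
    The consensus reported in this abstract, TCACAGC(N4)TCACAGC, lends itself to a simple promoter scan that tolerates a chosen number of mismatches in the two 7 bp half-sites. The sketch below is illustrative only: the function name, the mismatch threshold, and the example sequences are assumptions, and only the forward strand is scanned.

```python
HALF_SITE = "TCACAGC"   # 7 bp motif from the abstract
SPACER = 4              # N4 spacer between the two repeats

def scan_direct_repeat(dna, max_mismatches=0):
    """Return (start, mismatches) for windows matching the direct repeat."""
    dna = dna.upper()
    half = len(HALF_SITE)
    total_len = 2 * half + SPACER
    hits = []
    for i in range(len(dna) - total_len + 1):
        first = dna[i:i + half]
        second = dna[i + half + SPACER:i + total_len]
        # Spacer bases are ignored; only the two half-sites are compared.
        mismatches = sum(a != b for a, b in zip(first + second, HALF_SITE * 2))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits

# Made-up sequences, not promoters from the paper.
perfect = "GG" + "TCACAGC" + "AAAA" + "TCACAGC" + "TT"
one_off = "GG" + "TCACAGC" + "AAAA" + "TCAGAGC" + "TT"
perfect_hits = scan_direct_repeat(perfect)
```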

  9. A General RNA Motif for Cellular Transfection

    PubMed Central

    Magalhães, Maria LB; Byrom, Michelle; Yan, Amy; Kelly, Linsley; Li, Na; Furtado, Raquel; Palliser, Deborah; Ellington, Andrew D; Levy, Matthew

    2012-01-01

    We have developed a selection scheme to generate nucleic acid sequences that recognize and directly internalize into mammalian cells without the aid of conventional delivery methods. To demonstrate the generality of the technology, two independent selections with different starting pools were performed against distinct target cells. Each selection yielded a single highly functional sequence, both of which folded into a common core structure. This internalization signal can be adapted for use as a general purpose reagent for transfection into a wide variety of cell types including primary cells. PMID:22233578

  10. Selection against spurious promoter motifs correlates with translational efficiency across bacteria

    SciTech Connect

    Froula, Jeffrey L.; Francino, M. Pilar

    2007-05-01

    Because binding of RNAP to misplaced sites could compromise the efficiency of transcription, natural selection for the optimization of gene expression should regulate the distribution of DNA motifs capable of RNAP-binding across the genome. Here we analyze the distribution of the -10 promoter motifs that bind the σ70 subunit of RNAP in 42 bacterial genomes. We show that selection on these motifs operates across the genome, maintaining an over-representation of -10 motifs in regulatory sequences while eliminating them from the nonfunctional and, in most cases, from the protein coding regions. In some genomes, however, -10 sites are over-represented in the coding sequences; these sites could induce pauses effecting regulatory roles throughout the length of a transcriptional unit. For nonfunctional sequences, the extent of motif under-representation varies across genomes in a manner that broadly correlates with the number of tRNA genes, a good indicator of translational speed and growth rate. This suggests that minimizing the time invested in gene transcription is an important selective pressure against spurious binding. However, selection against spurious binding is detectable in the reduced genomes of host-restricted bacteria that grow at slow rates, indicating that components of efficiency other than speed may also be important. Minimizing the number of RNAP molecules per cell required for transcription, and the corresponding energetic expense, may be most relevant in slow growers. These results indicate that genome-level properties affecting the efficiency of transcription and translation can respond in an integrated manner to optimize gene expression. The detection of selection against promoter motifs in nonfunctional regions also implies that no sequence may evolve free of selective constraints, at least in the relatively small and unstructured genomes of bacteria.

  11. [Specific motifs in the genomes of the family Chlamydiaceae].

    PubMed

    Demkin, V V; Kirillova, N V

    2012-01-01

    Specific motifs in the genomes of the family Chlamydiaceae were discussed. The search for genetic markers for bacterial identification and typing is an urgent problem. Progress in sequencing technology has resulted in the compilation of a database of bacterial genomic nucleotide sequences. This raised the problem of searching for and selecting genetic targets for identification and typing in bacterial genes, based on comparative analysis of complete genomic sequences. The goal of this work was to carry out a comparative genetic analysis of different species of the family Chlamydiaceae. The analysis focused on detecting specific motifs capable of serving as genetic markers of this family. Consensus domains were detected using Visual Basic for Applications software for MS Excel. Complete coincidence of segments 25 nucleotides long was used as the criterion for consensus domain selection. One complete genomic sequence for each of 8 bacterial species was used in the experiment. The experimental sample did not contain a complete sequence of C. suis, because at the time of this research this species was absent from the GenBank database. Comparative analysis of the sequences of C. trachomatis and other representatives of the family Chlamydiaceae revealed 41 motifs common to the 8 Chlamydiaceae species tested in this work. The maximal number of consensus motifs was observed in the genes of ribosomal RNA and tRNA. In addition to the rRNA and tRNA genes, consensus motifs were observed in 5 genes and 6 intergenic segments. The consensus motifs detected in this work in the genes CTL0299, CTLO800, dagA, and hctA can be regarded as identification domains of the family Chlamydiaceae.
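The selection criterion in this record (exact coincidence of 25-nucleotide segments across all genomes) can be illustrated with a minimal Python sketch. This is not the authors' Visual Basic implementation: the function names and the toy sequences are hypothetical, the window is shortened to 5 so the example stays readable, and reverse-complement matches are deliberately ignored.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_segments(genomes, k=25):
    """Return the k-mers that occur, as exact matches, in every genome."""
    sets = [kmers(g.upper(), k) for g in genomes]
    return set.intersection(*sets) if sets else set()


# Toy sequences (hypothetical, not real Chlamydiaceae data); k shortened to 5.
toy = ["ACGTACGGTTAC", "TTACGGTTAGCA", "GGACGGTTACCA"]
common = shared_segments(toy, k=5)
print(sorted(common))
```

Overlapping hits such as these would normally be merged into one longer shared region; that merging step is left out of the sketch.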

  12. Specific RNA self-assembly with minimal paranemic motifs.

    PubMed

    Afonin, Kirill A; Cieply, Dennis J; Leontis, Neocles B

    2008-01-09

    The paranemic crossover (PX) is a motif for assembling two nucleic acid molecules using Watson-Crick (WC) basepairing without unfolding preformed secondary structure in the individual molecules. Once formed, the paranemic assembly motif comprises adjacent parallel double helices that crossover at every possible point over the length of the motif. The interaction is reversible as it does not require denaturation of basepairs internal to each interacting molecular unit. Paranemic assembly has been demonstrated for DNA but not for RNA and only for motifs with four or more crossover points and lengths of five or more helical half-turns. Here we report the design of RNA molecules that paranemically assemble with the minimum number of two crossovers spanning the major groove to form paranemic motifs with a length of three half turns (3HT). Dissociation constants (Kd's) were measured for a series of molecules in which the number of basepairs between the crossover points was varied from five to eight basepairs. The paranemic 3HT complex with six basepairs (3HT_6M) was found to be the most stable with Kd = 1 × 10^-8 M. The half-time for kinetic exchange of the 3HT_6M complex was determined to be approximately 100 min, from which we calculated association and dissociation rate constants ka = 5.11 × 10^3 M^-1 s^-1 and kd = 5.11 × 10^-5 s^-1. RNA paranemic assembly of 3HT and 5HT complexes is blocked by single-base substitutions that disrupt individual intermolecular Watson-Crick basepairs and is restored by compensatory substitutions that restore those basepairs. The 3HT motif appears suitable for specific, programmable, and reversible tecto-RNA self-assembly for constructing artificial RNA molecular machines.
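As a quick consistency check on the numbers in this record, the equilibrium dissociation constant should equal the ratio of the two rate constants, Kd = kd / ka. The sketch below takes the reported values as ka = 5.11 × 10^3 M^-1 s^-1 and kd = 5.11 × 10^-5 s^-1; the variable names are illustrative only.

```python
import math

k_a = 5.11e3    # association rate constant, M^-1 s^-1 (value from the abstract)
k_d = 5.11e-5   # dissociation rate constant, s^-1 (value from the abstract)

K_d = k_d / k_a  # equilibrium dissociation constant, M
print(f"K_d = {K_d:.2e} M")
```

The ratio reproduces the reported Kd of 1 × 10^-8 M for the 3HT_6M complex.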

  13. Probabilistic generation of random networks taking into account information on motifs occurrence.

    PubMed

    Bois, Frederic Y; Gayraud, Ghislaine

    2015-01-01

    Because of the huge number of graphs possible even with a small number of nodes, inference on network structure is known to be a challenging problem. Generating large random directed graphs with prescribed probabilities of occurrences of some meaningful patterns (motifs) is also difficult. We show how to generate such random graphs according to a formal probabilistic representation, using fast Markov chain Monte Carlo methods to sample them. As an illustration, we generate realistic graphs with several hundred nodes mimicking a gene transcription interaction network in Escherichia coli.
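To make the idea of sampling random directed graphs with a bias toward a chosen motif concrete, here is a minimal Metropolis sketch in Python. It is not the authors' probabilistic representation: the target distribution (probability proportional to exp(theta × number of feed-forward loops)), the function names, and all parameter values are assumptions chosen for illustration, and the motif count is recomputed from scratch at each step, which is only practical for very small graphs.

```python
import math
import random


def count_ffl(adj):
    """Count feed-forward loops: ordered triples with a->b, b->c and a->c."""
    n = len(adj)
    return sum(
        1
        for a in range(n) for b in range(n) for c in range(n)
        if len({a, b, c}) == 3 and adj[a][b] and adj[b][c] and adj[a][c]
    )


def sample_graph(n, theta, steps, seed=0):
    """Metropolis sampler targeting P(G) proportional to exp(theta * FFL(G))."""
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    current = count_ffl(adj)
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)   # two distinct nodes, so no self-loops
        adj[i][j] ^= 1                   # propose toggling the edge i -> j
        proposed = count_ffl(adj)
        # Symmetric proposal: accept with probability min(1, exp(theta * delta)).
        if rng.random() < math.exp(min(0.0, theta * (proposed - current))):
            current = proposed           # accept the toggle
        else:
            adj[i][j] ^= 1               # reject: undo the toggle
    return adj


dense = sample_graph(n=6, theta=1.0, steps=2000, seed=1)
sparse = sample_graph(n=6, theta=-1.0, steps=2000, seed=1)
print(count_ffl(dense), count_ffl(sparse))
```

A positive theta favours graphs rich in the motif and a negative theta suppresses it; scaling to several hundred nodes would need an incremental update of the motif count rather than a full recount.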

  14. Characterizing regulatory path motifs in integrated networks using perturbational data

    PubMed Central

    2010-01-01

    We introduce Pathicular (http://bioinformatics.psb.ugent.be/software/details/Pathicular), a Cytoscape plugin for studying the cellular response to perturbations of transcription factors by integrating perturbational expression data with transcriptional, protein-protein and phosphorylation networks. Pathicular searches for 'regulatory path motifs', short paths in the integrated physical networks which occur significantly more often than expected between transcription factors and their targets in the perturbational data. A case study in Saccharomyces cerevisiae identifies eight regulatory path motifs and demonstrates their biological significance. PMID:20230615

  15. A Command Editor Tool for X and Motif

    DTIC Science & Technology

    1993-07-01

    marble X/Motif Design Document for Contract # DAAH01-93-C-R013 minimal implementation...ing of modified system widgets, provides to the developer the full source...user has just

  16. Application of Synthetic Peptide Arrays To Uncover Cyclic Di-GMP Binding Motifs

    PubMed Central

    Düvel, Juliane; Bense, Sarina; Möller, Stefan; Bertinetti, Daniela; Schwede, Frank; Morr, Michael; Eckweiler, Denitsa; Genieser, Hans-Gottfried; Jänsch, Lothar; Herberg, Friedrich W.; Frank, Ronald

    2015-01-01

    ABSTRACT High levels of the universal bacterial second messenger cyclic di-GMP (c-di-GMP) promote the establishment of surface-attached growth in many bacteria. Not only can c-di-GMP bind to nucleic acids and directly control gene expression, but it also binds to a diverse array of proteins of specialized functions and orchestrates their activity. Since its development in the early 1990s, the synthetic peptide array technique has become a powerful tool for high-throughput approaches and was successfully applied to investigate the binding specificity of protein-ligand interactions. In this study, we used peptide arrays to uncover the c-di-GMP binding site of a Pseudomonas aeruginosa protein (PA3740) that was isolated in a chemical proteomics approach. PA3740 was shown to bind c-di-GMP with a high affinity, and peptide arrays uncovered LKKALKKQTNLR to be a putative c-di-GMP binding motif. Most interestingly, different from the previously identified c-di-GMP binding motif of the PilZ domain (RXXXR) or the I site of diguanylate cyclases (RXXD), two leucine residues and a glutamine residue and not the charged amino acids provided the key residues of the binding sequence. Those three amino acids are highly conserved across PA3740 homologs, and their singular exchange to alanine reduced c-di-GMP binding within the full-length protein. IMPORTANCE In many bacterial pathogens the universal bacterial second messenger c-di-GMP governs the switch from the planktonic, motile mode of growth to the sessile, biofilm mode of growth. Bacteria adapt their intracellular c-di-GMP levels to a variety of environmental challenges. Several classes of c-di-GMP binding proteins have been structurally characterized, and diverse c-di-GMP binding domains have been identified. Nevertheless, for several c-di-GMP receptors, the binding motif remains to be determined. Here we show that the use of a synthetic peptide array allowed the identification of a c-di-GMP binding motif of a putative c

  17. Novel DNA motif binding activity observed in vivo with an estrogen receptor α mutant mouse.

    PubMed

    Hewitt, Sylvia C; Li, Leping; Grimm, Sara A; Winuthayanon, Wipawee; Hamilton, Katherine J; Pockette, Brianna; Rubel, Cory A; Pedersen, Lars C; Fargo, David; Lanz, Rainer B; DeMayo, Francesco J; Schütz, Günther; Korach, Kenneth S

    2014-06-01

    Estrogen receptor α (ERα) interacts with DNA directly or indirectly via other transcription factors, referred to as "tethering." Evidence for tethering is based on in vitro studies and a widely used "KIKO" mouse model containing mutations that prevent direct estrogen response element DNA-binding. KIKO mice are infertile, due in part to the inability of estradiol (E2) to induce uterine epithelial proliferation. To elucidate the molecular events that prevent KIKO uterine growth, regulation of the pro-proliferative E2 target gene Klf4 and of Klf15, a progesterone (P4) target gene that opposes the pro-proliferative activity of KLF4, was evaluated. Klf4 induction was impaired in KIKO uteri; however, Klf15 was induced by E2 rather than by P4. Whole uterine chromatin immunoprecipitation-sequencing revealed enrichment of KIKO ERα binding to hormone response element (HRE) motifs. KIKO binding to HRE motifs was verified using reporter gene and DNA-binding assays. Because the KIKO ERα has HRE DNA-binding activity, we evaluated the "EAAE" ERα, which has more severe DNA-binding domain mutations, and demonstrated a lack of estrogen response element or HRE reporter gene induction or DNA-binding. The EAAE mouse has an ERα null-like phenotype, with impaired uterine growth and transcriptional activity. Our findings demonstrate that the KIKO mouse model, which has been used by numerous investigators, cannot be used to establish biological functions for ERα tethering, because KIKO ERα effectively stimulates transcription using HRE motifs. The EAAE-ERα DNA-binding domain mutant mouse demonstrates that ERα DNA-binding is crucial for biological and transcriptional processes in reproductive tissues and that ERα tethering may not contribute to estrogen responsiveness in vivo.

  18. Novel DNA Motif Binding Activity Observed In Vivo With an Estrogen Receptor α Mutant Mouse

    PubMed Central

    Li, Leping; Grimm, Sara A.; Winuthayanon, Wipawee; Hamilton, Katherine J.; Pockette, Brianna; Rubel, Cory A.; Pedersen, Lars C.; Fargo, David; Lanz, Rainer B.; DeMayo, Francesco J.; Schütz, Günther; Korach, Kenneth S.

    2014-01-01

    Estrogen receptor α (ERα) interacts with DNA directly or indirectly via other transcription factors, referred to as “tethering.” Evidence for tethering is based on in vitro studies and a widely used “KIKO” mouse model containing mutations that prevent direct estrogen response element DNA-binding. KIKO mice are infertile, due in part to the inability of estradiol (E2) to induce uterine epithelial proliferation. To elucidate the molecular events that prevent KIKO uterine growth, regulation of the pro-proliferative E2 target gene Klf4 and of Klf15, a progesterone (P4) target gene that opposes the pro-proliferative activity of KLF4, was evaluated. Klf4 induction was impaired in KIKO uteri; however, Klf15 was induced by E2 rather than by P4. Whole uterine chromatin immunoprecipitation-sequencing revealed enrichment of KIKO ERα binding to hormone response element (HRE) motifs. KIKO binding to HRE motifs was verified using reporter gene and DNA-binding assays. Because the KIKO ERα has HRE DNA-binding activity, we evaluated the “EAAE” ERα, which has more severe DNA-binding domain mutations, and demonstrated a lack of estrogen response element or HRE reporter gene induction or DNA-binding. The EAAE mouse has an ERα null-like phenotype, with impaired uterine growth and transcriptional activity. Our findings demonstrate that the KIKO mouse model, which has been used by numerous investigators, cannot be used to establish biological functions for ERα tethering, because KIKO ERα effectively stimulates transcription using HRE motifs. The EAAE-ERα DNA-binding domain mutant mouse demonstrates that ERα DNA-binding is crucial for biological and transcriptional processes in reproductive tissues and that ERα tethering may not contribute to estrogen responsiveness in vivo. PMID:24713037

  19. Identification and characterization of a selenoprotein family containing a diselenide bond in a redox motif.

    PubMed

    Shchedrina, Valentina A; Novoselov, Sergey V; Malinouski, Mikalai Yu; Gladyshev, Vadim N

    2007-08-28

    Selenocysteine (Sec, U) insertion into proteins is directed by translational recoding of specific UGA codons located upstream of a stem-loop structure known as Sec insertion sequence (SECIS) element. Selenoproteins with known functions are oxidoreductases containing a single redox-active Sec in their active sites. In this work, we identified a family of selenoproteins, designated SelL, containing two Sec separated by two other residues to form a UxxU motif. SelL proteins show an unusual occurrence, being present in diverse aquatic organisms, including fish, invertebrates, and marine bacteria. Both eukaryotic and bacterial SelL genes use single SECIS elements for insertion of two Sec. In eukaryotes, the SECIS is located in the 3' UTR, whereas the bacterial SelL SECIS is within a coding region and positioned at a distance that supports the insertion of either of the two Sec or both of these residues. SelL proteins possess a thioredoxin-like fold wherein the UxxU motif corresponds to the catalytic CxxC motif in thioredoxins, suggesting a redox function of SelL proteins. Distantly related SelL-like proteins were also identified in a variety of organisms that had either one or both Sec replaced with Cys. Danio rerio SelL, transiently expressed in mammalian cells, incorporated two Sec and localized to the cytosol. In these cells, it occurred in an oxidized form and was not reducible by DTT. In a bacterial expression system, we directly demonstrated the formation of a diselenide bond between the two Sec, establishing it as the first diselenide bond found in a natural protein.

  20. Functional tissue units and their primary tissue motifs in multi-scale physiology

    PubMed Central

    2013-01-01

    Background Histology information management relies on complex knowledge derived from morphological tissue analyses. These approaches have not significantly facilitated the general integration of tissue- and molecular-level knowledge across the board in support of a systematic classification of tissue function, as well as the coherent multi-scale study of physiology. Our work aims to support directly these integrative goals. Results We describe, for the first time, the precise biophysical and topological characteristics of functional units of tissue. Such a unit consists of a three-dimensional block of cells centred around a capillary, such that each cell in this block is within diffusion distance from any other cell in the same block. We refer to this block as a functional tissue unit. As a means of simplifying the knowledge representation of this unit, and rendering this knowledge more amenable to automated reasoning and classification, we developed a simple descriptor of its cellular content and anatomical location, which we refer to as a primary tissue motif. In particular, a primary motif captures the set of cellular participants of diffusion-mediated interactions brokered by secreted products to create a tissue-level molecular network. Conclusions Multi-organ communication, therefore, may be interpreted in terms of interactions between molecular networks housed by interconnected functional tissue units. By extension, a functional picture of an organ, or its tissue components, may be rationally assembled using a collection of these functional tissue units as building blocks. In our work, we outline the biophysical rationale for a rigorous definition of a unit of functional tissue organization, and demonstrate the application of primary motifs in tissue classification. In so doing, we acknowledge (i) the fundamental role of capillaries in directing and radically informing tissue architecture, as well as (ii) the importance of taking into full account the

  1. Nephila clavipes Flagelliform Silk-like GGX Motifs Contribute to Extensibility and Spacer Motifs Contribute to Strength in Synthetic Spider Silk Fibers

    PubMed Central

    Adrianos, Sherry L.; Teulé, Florence; Hinman, Michael B.; Jones, Justin A.; Weber, Warner S.; Yarger, Jeffery L.; Lewis, Randolph V.

    2013-01-01

    Flagelliform spider silk is the most extensible silk fiber produced by orb weaver spiders, though not as strong as the dragline silk of the spider. The motifs found in the core of the Nephila clavipes flagelliform Flag protein are: GGX, spacer, and GPGGX. Flag does not contain the polyalanine motif known to provide the strength of dragline silk. To investigate the source of flagelliform fiber strength, four recombinant proteins were produced containing variations of the three core motifs of the Nephila clavipes flagelliform Flag protein that produces this type of fiber. The as-spun fibers were processed in 80% aqueous isopropanol using a standardized process for all four fiber types, which produced improved mechanical properties. Mechanical testing of the recombinant proteins determined that the GGX motif contributes extensibility and the spacer motif contributes strength to the recombinant fibers. Recombinant protein fibers containing the spacer motif were stronger than the proteins constructed without the spacer that contained only the GGX motif or the combination of the GGX and GPGGX motifs. The mechanical and structural X-ray diffraction analysis of the recombinant fibers provide data that suggests a functional role of the spacer motif that produces tensile strength though the spacer motif is not clearly defined structurally. These results indicate that the spacer is likely a primary contributor of strength with the GGX motif supplying mobility to the protein network of native N. clavipes flagelliform silk fibers. PMID:23646825

  2. Nephila clavipes Flagelliform silk-like GGX motifs contribute to extensibility and spacer motifs contribute to strength in synthetic spider silk fibers.

    PubMed

    Adrianos, Sherry L; Teulé, Florence; Hinman, Michael B; Jones, Justin A; Weber, Warner S; Yarger, Jeffery L; Lewis, Randolph V

    2013-06-10

    Flagelliform spider silk is the most extensible silk fiber produced by orb weaver spiders, though not as strong as the dragline silk of the spider. The motifs found in the core of the Nephila clavipes flagelliform Flag protein are GGX, spacer, and GPGGX. Flag does not contain the polyalanine motif known to provide the strength of dragline silk. To investigate the source of flagelliform fiber strength, four recombinant proteins were produced containing variations of the three core motifs of the Nephila clavipes flagelliform Flag protein that produces this type of fiber. The as-spun fibers were processed in 80% aqueous isopropanol using a standardized process for all four fiber types, which produced improved mechanical properties. Mechanical testing of the recombinant proteins determined that the GGX motif contributes extensibility and the spacer motif contributes strength to the recombinant fibers. Recombinant protein fibers containing the spacer motif were stronger than the proteins constructed without the spacer that contained only the GGX motif or the combination of the GGX and GPGGX motifs. The mechanical and structural X-ray diffraction analysis of the recombinant fibers provide data that suggests a functional role of the spacer motif that produces tensile strength, though the spacer motif is not clearly defined structurally. These results indicate that the spacer is likely a primary contributor of strength, with the GGX motif supplying mobility to the protein network of native N. clavipes flagelliform silk fibers.

  3. Mutations in Two Putative Phosphorylation Motifs in the Tomato Pollen Receptor Kinase LePRK2 Show Antagonistic Effects on Pollen Tube Length*

    PubMed Central

    Salem, Tamara; Mazzella, Agustina; Barberini, María Laura; Wengier, Diego; Motillo, Viviana; Parisi, Gustavo; Muschietti, Jorge

    2011-01-01

    The tip-growing pollen tube is a useful model for studying polarized cell growth in plants. We previously characterized LePRK2, a pollen-specific receptor-like kinase from tomato (1). Here, we showed that LePRK2 is present as multiple phosphorylated isoforms in mature pollen membranes. Using comparative sequence analysis and phosphorylation site prediction programs, we identified two putative phosphorylation motifs in the cytoplasmic juxtamembrane (JM) domain. Site-directed mutagenesis in these motifs, followed by transient overexpression in tobacco pollen, showed that both motifs have opposite effects in regulating pollen tube length. Relative to LePRK2-eGFP pollen tubes, alanine substitutions in residues of motif I, Ser277/Ser279/Ser282, resulted in longer pollen tubes, but alanine substitutions in motif II, Ser304/Ser307/Thr308, resulted in shorter tubes. In contrast, phosphomimicking aspartic substitutions at these residues gave reciprocal results, that is, shorter tubes with mutations in motif I and longer tubes with mutations in motif II. We conclude that the length of pollen tubes can be negatively and positively regulated by phosphorylation of residues in motif I and II respectively. We also showed that LePRK2-eGFP significantly decreased pollen tube length and increased pollen tube tip width, relative to eGFP tubes. The kinase activity of LePRK2 was relevant for this phenotype because tubes that expressed a mutation in a lysine essential for kinase activity showed the same length and width as the eGFP control. Taken together, these results suggest that LePRK2 may have a central role in pollen tube growth through regulation of its own phosphorylation status. PMID:21131355

  4. Core signalling motif displaying multistability through multi-state enzymes

    PubMed Central

    Feng, Song; Sáez, Meritxell; Wiuf, Carsten; Feliu, Elisenda

    2016-01-01

    Bistability, and more generally multistability, is a key system dynamics feature enabling decision-making and memory in cells. Deciphering the molecular determinants of multistability is thus crucial for a better understanding of cellular pathways and their (re)engineering in synthetic biology. Here, we show that a key motif found predominantly in eukaryotic signalling systems, namely a futile signalling cycle, can display bistability when featuring a two-state kinase. We provide necessary and sufficient mathematical conditions on the kinetic parameters of this motif that guarantee the existence of multiple steady states. These conditions foster the intuition that bistability arises as a consequence of competition between the two states of the kinase. Extending from this result, we find that increasing the number of kinase states linearly translates into an increase in the number of steady states in the system. These findings reveal, to our knowledge, a new mechanism for the generation of bistability and multistability in cellular signalling systems. Further, the futile cycle featuring a two-state kinase is among the smallest bistable signalling motifs. We show that multi-state kinases and the described competition-based motif are part of several natural signalling systems and thereby could enable them to implement complex information processing through multistability. These results indicate that multi-state kinases in signalling systems are readily exploited by natural evolution and could equally be used by synthetic approaches for the generation of multistable information processing systems at the cellular level. PMID:27733693

  5. Conditional graphical models for protein structural motif recognition.

    PubMed

    Liu, Yan; Carbonell, Jaime; Gopalakrishnan, Vanathi; Weigele, Peter

    2009-05-01

    Determining protein structures is crucial to understanding the mechanisms of infection and designing drugs. However, the elucidation of protein folds by crystallographic experiments can be a bottleneck in the development process. In this article, we present a probabilistic graphical model framework, conditional graphical models, for predicting protein structural motifs. It represents the structural characteristics of a structural motif using a graph, where the nodes denote the secondary structure elements, and the edges indicate the side-chain interactions between the components either within one protein chain or between chains. The model then defines the optimal segmentation of a protein sequence against the graph by maximizing its "conditional" probability, so that it can take advantage of the discriminative training approach. Efficient approximate inference algorithms using the reversible jump Markov Chain Monte Carlo (MCMC) algorithm are developed to handle the resulting complex graphical models. We test our algorithm on four important structural motifs, and our method outperforms other state-of-the-art algorithms for motif recognition. We also hypothesize potential membership proteins of target folds from Swiss-Prot, which further supports the evolutionary hypothesis about viral folds.

  6. Motifs in triadic random graphs based on Steiner triple systems

    NASA Astrophysics Data System (ADS)

    Winkler, Marco; Reichardt, Jörg

    2013-08-01

    Conventionally, pairwise relationships between nodes are considered to be the fundamental building blocks of complex networks. However, over the last decade, the overabundance of certain subnetwork patterns, i.e., the so-called motifs, has attracted much attention. It has been hypothesized that these motifs, instead of links, serve as the building blocks of network structures. Although the relation between a network's topology and the general properties of the system, such as its function, its robustness against perturbations, or its efficiency in spreading information, is the central theme of network science, there is still a lack of sound generative models needed for testing the functional role of subgraph motifs. Our work aims to overcome this limitation. We employ the framework of exponential random graph models (ERGMs) to define models based on triadic substructures. The fact that only a small portion of triads can actually be set independently poses a challenge for the formulation of such models. To overcome this obstacle, we use Steiner triple systems (STSs). These are partitions of sets of nodes into pair-disjoint triads, which thus can be specified independently. Combining the concepts of ERGMs and STSs, we suggest generative models capable of generating ensembles of networks with nontrivial triadic Z-score profiles. Further, we discover inevitable correlations between the abundance of triad patterns, which occur solely for statistical reasons and need to be taken into account when discussing the functional implications of motif statistics. Moreover, we calculate the degree distributions of our triadic random graphs analytically.
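The defining property of a Steiner triple system used in this record, that every pair of nodes lies in exactly one triad so the triads are pair-disjoint, can be checked directly on the smallest nontrivial case. The Python sketch below builds the 7-node system (the Fano plane) by cyclically shifting the base block {0, 1, 3}; the function names are illustrative, and the sketch covers only the combinatorial construction, not the ERGM built on top of it.

```python
from itertools import combinations


def cyclic_sts7():
    """Build the Steiner triple system on 7 nodes from the base block {0, 1, 3}."""
    return [frozenset(((0 + s) % 7, (1 + s) % 7, (3 + s) % 7)) for s in range(7)]


def is_steiner_triple_system(n, triples):
    """Return True if every unordered pair of the n nodes lies in exactly one triple."""
    seen = {}
    for t in triples:
        if len(t) != 3:
            return False
        for pair in combinations(sorted(t), 2):
            seen[pair] = seen.get(pair, 0) + 1
    return all(seen.get(p, 0) == 1 for p in combinations(range(n), 2))


triples = cyclic_sts7()
print(is_steiner_triple_system(7, triples))  # True
```

Because no two triads share a pair of nodes, the edges inside one triad can be assigned without constraining any other triad, which is what allows the triadic substructures to be specified independently.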

  7. Forward and Back: Motifs of Inhibition in Olfactory Processing

    PubMed Central

    Bazhenov, Maxim; Stopfer, Mark

    2016-01-01

    The remarkable performance of the olfactory system in classifying and categorizing the complex olfactory environment is built upon several basic neural circuit motifs. These include forms of inhibition that may play comparable roles in widely divergent species. In this issue of Neuron, a new study by Stokes and Isaacson sheds light on how elementary types of inhibition dynamically interact. PMID:20696373

  8. Insights into the motif preference of APOBEC3 enzymes.

    PubMed

    Ebrahimi, Diako; Alinejad-Rokny, Hamid; Davenport, Miles P

    2014-01-01

    We used a multivariate data analysis approach to identify motifs associated with HIV hypermutation by different APOBEC3 enzymes. The analysis showed that APOBEC3G targets G mainly within GG, TG, TGG, GGG, TGGG and also GGGT. The G nucleotides flanked by a C at the 3' end (in +1 and +2 positions) were indicated as disfavoured targets by APOBEC3G. The G nucleotides within GGGG were found to be targeted at a frequency much lower than expected. We found that the infrequent G-to-A mutation within GGGG is not limited to the inaccessibility, to APOBEC3, of poly Gs in the central and 3' polypurine tracts (PPTs) which remain double stranded during the HIV reverse transcription. GGGG motifs outside the PPTs were also disfavoured. The motifs GGAG and GAGG were also found to be disfavoured targets for APOBEC3. The motif-dependent mutation of G within the HIV genome by members of the APOBEC3 family other than APOBEC3G was limited to GA→AA changes. The results did not show evidence of other types of context-dependent G-to-A changes in the HIV genome.
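The kind of context tally that underlies statements such as "GG is favoured, GC is disfavoured" can be sketched in a few lines of Python. This is a simplified illustration, not the multivariate analysis used in the study: the function name and the toy sequences are hypothetical, the two sequences are assumed to be pre-aligned and gap-free, and only the 3' neighbours of the target G are considered (5' contexts such as TG are ignored).

```python
from collections import Counter


def g_to_a_by_context(reference, mutated, width=2):
    """Tally G->A changes keyed by the reference context that starts at the G."""
    if len(reference) != len(mutated):
        raise ValueError("sequences must be aligned and of equal length")
    sites, hits = Counter(), Counter()
    for i in range(len(reference) - width + 1):
        if reference[i] != "G":
            continue
        context = reference[i:i + width]   # the target G plus its 3' neighbours
        sites[context] += 1                # every G seen in this context
        if mutated[i] == "A":
            hits[context] += 1             # G->A change at this position
    return sites, hits


# Toy aligned pair (hypothetical, not HIV data).
ref = "AGGTGCAGA"
mut = "AAGTGCAAA"
sites, hits = g_to_a_by_context(ref, mut)
for context in sorted(sites):
    print(context, hits[context], "of", sites[context])
```

Dividing hits by sites for each context gives a per-context mutation frequency, which is the quantity one would compare across GG, GA, GC and GT.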

  9. Insights into the Motif Preference of APOBEC3 Enzymes

    PubMed Central

    Ebrahimi, Diako; Alinejad-Rokny, Hamid; Davenport, Miles P.

    2014-01-01

    We used a multivariate data analysis approach to identify motifs associated with HIV hypermutation by different APOBEC3 enzymes. The analysis showed that APOBEC3G targets G mainly within GG, TG, TGG, GGG, TGGG and also GGGT. The G nucleotides flanked by a C at the 3′ end (in +1 and +2 positions) were indicated as disfavoured targets by APOBEC3G. The G nucleotides within GGGG were found to be targeted at a frequency much lower than expected. We found that the infrequent G-to-A mutation within GGGG is not limited to the inaccessibility, to APOBEC3, of poly Gs in the central and 3′ polypurine tracts (PPTs) which remain double stranded during the HIV reverse transcription. GGGG motifs outside the PPTs were also disfavoured. The motifs GGAG and GAGG were also found to be disfavoured targets for APOBEC3. The motif-dependent mutation of G within the HIV genome by members of the APOBEC3 family other than APOBEC3G was limited to GA→AA changes. The results did not show evidence of other types of context-dependent G-to-A changes in the HIV genome. PMID:24498164

  10. 5. DETAIL VIEW OF THE EGYPTIAN MOTIF DECORATIVE ELEMENTS OF ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    5. DETAIL VIEW OF THE EGYPTIAN MOTIF DECORATIVE ELEMENTS OF BUILDING 1'S MAIN ENTRY TOWER (INCLUDING THE ENGAGED COLUMN CAPITALS, PILASTERS & CAPITALS, CORNICES, AND TERRA COTTA EAGLES); LOOKING SW FROM THE E WING ROOF. (Ryan) - Veterans Administration Medical Center, Building No. 1, Old State Route 13 West, Marion, Williamson County, IL

  11. DNA containing CpG motifs induces angiogenesis

    NASA Astrophysics Data System (ADS)

    Zheng, Mei; Klinman, Dennis M.; Gierynska, Malgorzata; Rouse, Barry T.

    2002-06-01

    New blood vessel formation in the cornea is an essential step in the pathogenesis of a blinding immunoinflammatory reaction caused by ocular infection with herpes simplex virus (HSV). By using a murine corneal micropocket assay, we found that HSV DNA (which contains a significant excess of potentially bioactive "CpG" motifs when compared with mammalian DNA) induces angiogenesis. Moreover, synthetic oligodeoxynucleotides containing CpG motifs attract inflammatory cells and stimulate the release of vascular endothelial growth factor (VEGF), which in turn triggers new blood vessel formation. In vitro, CpG DNA induces the J774A.1 murine macrophage cell line to produce VEGF. In vivo CpG-induced angiogenesis was blocked by the administration of anti-mVEGF Ab or the inclusion of "neutralizing" oligodeoxynucleotides that specifically oppose the stimulatory activity of CpG DNA. These findings establish that DNA containing bioactive CpG motifs induces angiogenesis, and suggest that CpG motifs in HSV DNA may contribute to the blinding lesions of stromal keratitis.

  12. Conserved motifs II to VI of DNA helicase II from Escherichia coli are all required for biological activity.

    PubMed Central

    Zhang, G; Deng, E; Baugh, L R; Hamilton, C M; Maples, V F; Kushner, S R

    1997-01-01

    There are seven conserved motifs (IA, IB, and II to VI) in DNA helicase II of Escherichia coli that have high homology among a large family of proteins involved in DNA metabolism. To address the functional importance of motifs II to VI, we employed site-directed mutagenesis to replace the charged amino acid residues in each motif with alanines. Cells carrying these mutant alleles exhibited higher UV and methyl methanesulfonate sensitivity, increased rates of spontaneous mutagenesis, and elevated levels of homologous recombination, indicating defects in both the excision repair and mismatch repair pathways. In addition, we also changed the highly conserved tyrosine(600) in motif VI to phenylalanine (uvrD309, Y600F). This mutant displayed a moderate increase in UV sensitivity but a decrease in spontaneous mutation rate, suggesting that DNA helicase II may have different functions in the two DNA repair pathways. Furthermore, a mutation in domain IV (uvrD307, R284A) significantly reduced the viability of some E. coli K-12 strains at 30 degrees C but not at 37 degrees C. The implications of these observations are discussed. PMID:9393722

  13. JAZ8 lacks a canonical degron and has an EAR motif that mediates transcriptional repression of jasmonate responses in Arabidopsis.

    PubMed

    Shyu, Christine; Figueroa, Pablo; Depew, Cody L; Cooke, Thomas F; Sheard, Laura B; Moreno, Javier E; Katsir, Leron; Zheng, Ning; Browse, John; Howe, Gregg A

    2012-02-01

    The lipid-derived hormone jasmonoyl-L-Ile (JA-Ile) initiates large-scale changes in gene expression by stabilizing the interaction of JASMONATE ZIM domain (JAZ) repressors with the F-box protein CORONATINE INSENSITIVE1 (COI1), which results in JAZ degradation by the ubiquitin-proteasome pathway. Recent structural studies show that the JAZ1 degradation signal (degron) includes a short conserved LPIAR motif that seals JA-Ile in its binding pocket at the COI1-JAZ interface. Here, we show that Arabidopsis thaliana JAZ8 lacks this motif and thus is unable to associate strongly with COI1 in the presence of JA-Ile. As a consequence, JAZ8 is stabilized against jasmonate (JA)-mediated degradation and, when ectopically expressed in Arabidopsis, represses JA-regulated growth and defense responses. These findings indicate that sequence variation in a hypervariable region of the degron affects JAZ stability and JA-regulated physiological responses. We also show that JAZ8-mediated repression depends on an LxLxL-type EAR (for ERF-associated amphiphilic repression) motif at the JAZ8 N terminus that binds the corepressor TOPLESS and represses transcriptional activation. JAZ8-mediated repression does not require the ZIM domain, which, in other JAZ proteins, recruits TOPLESS through the EAR motif-containing adaptor protein NINJA. These findings show that EAR repression domains in a subgroup of JAZ proteins repress gene expression through direct recruitment of corepressors to cognate transcription factors.

  14. Direct interaction of the Polycomb protein with Antennapedia regulatory sequences in polytene chromosomes of Drosophila melanogaster.

    PubMed Central

    Zink, B; Engström, Y; Gehring, W J; Paro, R

    1991-01-01

    The Polycomb (Pc) gene is responsible for the elaboration and maintenance of the expression pattern of the homeotic genes during development of Drosophila. In mutant Pc- embryos, homeotic transcripts are ectopically expressed, leading to abdominal transformations in all segments. From this it was suggested that Pc+ acts as a repressor of homeotic gene transcription. We have mapped the cis-acting control sequences of the homeotic Antennapedia (Antp) gene regulated by Pc. Using Antp P1 and P2 promoter fragments linked to the E. coli lacZ reporter gene we show different expression patterns of beta-galactosidase (beta-gal) in transformed Pc+ and Pc- embryos. In addition we are able to visualize by immunocytochemical techniques on polytene chromosomes the direct binding of the Pc protein to the transposed cis-regulatory promoter fragments. However, short Antp P1 promoter constructs which are, due to position effects, ectopically activated in salivary glands do not reveal a Pc binding signal. PMID:1671215

  15. Motivated Proteins: A web application for studying small three-dimensional protein motifs

    PubMed Central

    Leader, David P; Milner-White, E James

    2009-01-01

    Background Small loop-shaped motifs are common constituents of the three-dimensional structure of proteins. Typically they comprise between three and seven amino acid residues, and are defined by a combination of dihedral angles and hydrogen bonding partners. The most abundant of these are αβ-motifs, asx-motifs, asx-turns, β-bulges, β-bulge loops, β-turns, nests, niches, Schellmann loops, ST-motifs, ST-staples and ST-turns. We have constructed a database of such motifs from a range of high-quality protein structures and built a web application as a visual interface to this. Description The web application, Motivated Proteins, provides access to these 12 motifs (with 48 sub-categories) in a database of over 400 representative proteins. Queries can be made for specific categories or sub-categories of motif, motifs in the vicinity of ligands, motifs which include part of an enzyme active site, overlapping motifs, or motifs which include a particular amino acid sequence. Individual proteins can be specified, or, where appropriate, motifs for all proteins listed. The results of queries are presented in textual form as an (X)HTML table, and may be saved as parsable plain text or XML. Motifs can be viewed and manipulated either individually or in the context of the protein in the Jmol applet structural viewer. Cartoons of the motifs imposed on a linear representation of protein secondary structure are also provided. Summary information for the motifs is available, as are histograms of amino acid distribution, and graphs of dihedral angles at individual positions in the motifs. Conclusion Motivated Proteins is a publicly and freely accessible web application that enables protein scientists to study small three-dimensional motifs without requiring knowledge of either Structured Query Language or the underlying database schema. PMID:19210785

  16. Definition of an extended MHC class II-peptide binding motif for the autoimmune disease-associated Lewis rat RT1.BL molecule.

    PubMed

    Wauben, M H; van der Kraan, M; Grosfeld-Stulemeyer, M C; Joosten, I

    1997-02-01

    The Lewis rat, an inbred rat strain susceptible to several well-characterized experimental autoimmune diseases, provides a good model to study peptide-mediated immunotherapy. Peptide immunotherapy focussing on the modulation of T cell responses by interfering with TCR-peptide-MHC complex formation requires the elucidation of the molecular basis of TCR-peptide-MHC interactions for an efficient design of modulatory peptides. In the Lewis rat most autoimmune-associated CD4+ T cell responses are MHC class II RT1.BL restricted. In this study, the characteristics of RT1.BL-peptide interactions were explored. A series of substitution analogs of two Lewis rat T cell epitopes was examined in a direct peptide-MHC binding assay on isolated RT1.BL molecules. Furthermore, other autoimmune-related as well as non-disease-related T cell epitopes were tested in the binding assay. This has led to the definition of an extended RT1.BL-peptide binding motif. The RT1.BL-peptide binding motif established in this study is the first described rat MHC-peptide binding motif based on direct MHC-peptide binding experiments. To predict good or intermediate RT1.BL binding peptides, T cell epitope search profiles were deduced from this motif. The motif and search profiles will greatly facilitate the prediction of modulatory peptides based on autoimmune-associated T cell epitopes and the identification of target structures in experimental autoimmune diseases in Lewis rats.

  17. A conserved motif in Tetrahymena thermophila telomerase reverse transcriptase is proximal to the RNA template and is essential for boundary definition.

    PubMed

    Akiyama, Benjamin M; Gomez, Anastassia; Stone, Michael D

    2013-07-26

    The ends of linear chromosomes are extended by telomerase, a ribonucleoprotein complex minimally consisting of a protein subunit called telomerase reverse transcriptase (TERT) and the telomerase RNA (TER). TERT functions by reverse transcribing a short template region of TER into telomeric DNA. Proper assembly of TERT and TER is essential for telomerase activity; however, a detailed understanding of how TERT interacts with TER is lacking. Previous studies have identified an RNA binding domain (RBD) within TERT, which includes three evolutionarily conserved sequence motifs: CP2, CP, and T. Here, we used site-directed hydroxyl radical probing to directly identify sites of interaction between the TERT RBD and TER, revealing that the CP2 motif is in close proximity to a conserved region of TER known as the template boundary element (TBE). Gel shift assays on CP2 mutants confirmed that the CP2 motif is an RNA binding determinant. Our results explain previous work that established that mutations to the CP2 motif of TERT and to the TBE of TER both permit misincorporation of nucleotides into the growing DNA strand beyond the canonical template. Taken together, these results suggest a model in which the CP2 motif binds the TBE to strictly define which TER nucleotides can be reverse transcribed.

  18. The OB-fold domain 1 of human POT1 recognizes both telomeric and non-telomeric DNA motifs

    PubMed Central

    Kolar, Carol; Yan, Ying; Borgstahl, Gloria E.O.; Ouellette, Michel M.

    2015-01-01

    The POT1 protein plays a critical role in telomere protection and telomerase regulation. POT1 binds single-stranded 5’-TTAGGGTTAG-3’ and forms a dimer with the TPP1 protein. The dimer is recruited to telomeres, either directly or as part of the Shelterin complex. Human POT1 contains two Oligonucleotide/Oligosaccharide Binding (OB) fold domains, OB1 and OB2, which make physical contact with the DNA. OB1 recognizes 5’-TTAGGG whereas OB2 binds to the downstream TTAG-3’. Studies of POT1 proteins from other species have shown that some of these proteins are able to recognize a broader variety of DNA ligands than expected. To explore this possibility in humans, we have used SELEX to reexamine the sequence-specificity of the protein. Using human POT1 as a selection matrix, high-affinity DNA ligands were selected from a pool of randomized single-stranded oligonucleotides. After six successive rounds of selection, two classes of high-affinity targets were obtained. The first class was composed of oligonucleotides containing a cognate POT1 binding site (5’-TTAGGGTTAG-3’). The second and more abundant class was made of molecules that carried a novel non-telomeric consensus: 5’-TNCANNAGKKKTTAGG-3’ (where K=G/T and N=any base). Binding studies showed that these non-telomeric sites were made of an OB1-binding motif (TTAGG) and a non-telomeric motif (NT motif), with the two motifs recognized by distinct regions of the OB1 domain. POT1 interacted with these non-telomeric binding sites with high affinity and specificity, even when bound to its dimerization partner TPP1. This intrinsic ability of POT1 to recognize NT motifs raises the possibility that the protein may fulfill additional functions at certain non-telomeric locations of the genome, perhaps in gene transcription, replication, or repair. PMID:25934589
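    The degenerate consensus quoted above (K=G/T, N=any base) can be searched for with an ordinary regular expression once each IUPAC code is expanded into a character class. The sketch below is only an illustration of that expansion: the helper names `consensus_to_regex` and `find_sites` and the example sequence are hypothetical, and an exact-match scan like this says nothing about binding affinity.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_sites(consensus, sequence):
    """Return (start, matched_text) for every site, overlaps included."""
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Consensus from the abstract; the sequence is a made-up example.
NT_CONSENSUS = "TNCANNAGKKKTTAGG"
example = "CCTACAGGAGGTTTTAGGCC"
sites = find_sites(NT_CONSENSUS, example)
print(sites)
```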

  19. The RNA recognition motif domains of RBM5 are required for RNA binding and cancer cell proliferation inhibition

    SciTech Connect

    Zhang, Lei; Zhang, Qing; Yang, Yu; Wu, Chuanfang

    2014-02-14

    Highlights: • RNA recognition motif domains of RBM5 are essential for cell proliferation inhibition. • RNA recognition motif domains of RBM5 are essential for apoptosis induction. • RNA recognition motif domains of RBM5 are essential for RNA binding. • RNA recognition motif domains of RBM5 are essential for caspase-2 alternative splicing. - Abstract: RBM5 is a known putative tumor suppressor gene that has been shown to function in cell growth inhibition by modulating apoptosis. RBM5 also plays a critical role in alternative splicing as an RNA binding protein. However, it is still unclear which domains of RBM5 are required for RNA binding and related functional activities. We hypothesized that the two putative RNA recognition motif (RRM) domains of RBM5, spanning amino acids 98–178 and 231–315, are essential for RBM5-mediated cell growth inhibition, apoptosis regulation, and RNA binding. To investigate this hypothesis, we evaluated the activities of wild-type and mutant RBM5 gene transfer in low-RBM5 expressing A549 cells. We found that, unlike wild-type RBM5 (RBM5-wt), an RBM5 mutant lacking the two RRM domains (RBM5-ΔRRM) is unable to bind RNA, has compromised caspase-2 alternative splicing activity, and lacks cell proliferation inhibition and apoptosis induction function in A549 cells. These data provide direct evidence that the two RRM domains of RBM5 are required for RNA binding and that the RNA binding activity of RBM5 contributes to its function in apoptosis induction and cell growth inhibition.

  20. Tomato Pto encodes a functional N-myristoylation motif that is required for signal transduction in Nicotiana benthamiana.

    PubMed

    de Vries, Jeroen S; Andriotis, Vasilios M E; Wu, Ai-Jiuan; Rathjen, John P

    2006-01-01

    Pto kinase of tomato (Lycopersicon esculentum) confers resistance to bacterial speck disease caused by Pseudomonas syringae pv. tomato expressing avrPto or avrPtoB. Pto interacts directly with these type-III secreted effectors, leading to induction of defence responses including the hypersensitive response (HR). Signalling by Pto requires the nucleotide-binding site-leucine-rich repeat (NBS-LRR) protein Prf. Little is known of how Pto is controlled prior to or during stimulation, although kinase activity is required for Avr-dependent activation. Here we demonstrate a role for the N-terminus in signalling by Pto. N-terminal residues outside the kinase domain were required for induction of the HR in Nicotiana benthamiana. The N-terminus also contributed to both AvrPto-binding and phosphorylation abilities. Pto residues 1-10 comprise a consensus motif for covalent attachment of myristate, a hydrophobic 14-carbon saturated fatty acid, to the Gly-2 residue. Several lines of evidence indicate that this motif is important for Pto function. A heterologous N-myristoylation motif complemented N-terminal deletion mutants of Pto for Prf-dependent signalling. Signalling by wild-type and mutant forms of Pto was strictly dependent on the Gly-2 residue. The N-myristoylation motif of Pto complemented the cognate motif of AvrPto for avirulence function and membrane association. Furthermore, Pto was myristoylated in vivo dependent on the presence of Gly-2. The subcellular localization of Pto was independent of N-myristoylation, indicating that N-myristoylation is required for some function other than membrane affinity. Consistent with this idea, AvrPtoB was also found to be a soluble protein. The data indicate an important role(s) for the myristoylated N-terminus in Pto signalling.

  1. Oligonucleotide motifs that disappear during the evolution of influenza virus in humans increase alpha interferon secretion by plasmacytoid dendritic cells.

    PubMed

    Jimenez-Baranda, Sonia; Greenbaum, Benjamin; Manches, Olivier; Handler, Jesse; Rabadán, Raúl; Levine, Arnold; Bhardwaj, Nina

    2011-04-01

    CpG motifs in an A/U context have been preferentially eliminated from classical H1N1 influenza virus genomes during virus evolution in humans. The hypothesis of the current work is that CpG motifs in a uracil context represent sequence patterns with the capacity to induce an immune response, and the avoidance of this immunostimulatory signal is the reason for the observed preferential decline. To analyze the immunogenicity of these domains, we used plasmacytoid dendritic cells (pDCs). pDCs express pattern recognition receptors, including Toll-like receptor 7 (TLR7), which recognizes guanosine- and uridine-rich viral single-stranded RNA (ssRNA), including influenza virus ssRNA. The signaling through TLR7 results in the induction of inflammatory cytokines and type I interferon (IFN-I), an essential process for the induction of specific adaptive immune responses and for mounting a robust antiviral response mediated by IFN-α. Secretion of IFN-α is also linked to the activation of other immune cells, potentially amplifying the effect of an initial IFN-α secretion. We therefore also examined the role of IFN-α-driven activation of NK cells as another source of selective pressure on the viral genome. We found direct evidence that CpG RNA motifs in a U-rich context control pDC activation and IFN-α-driven activation of NK cells, likely through TLR7. These data provide a potential explanation for the loss of CpG motifs from avian influenza viruses as they adapt to mammalian hosts. The selective decrease of CpG motifs surrounded by U/A may be a viral strategy to avoid immune recognition, a strategy likely shared by highly expressed human immune genes.
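    Counting CpG dinucleotides in an A/U context is straightforward to sketch. The example below assumes that "A/U context" means an A or U immediately on both sides of the CpG; the abstract does not give the exact definition used in the study, so this flanking rule, the function name, and the toy sequence are illustrative assumptions only.

```python
def count_cpg_in_au_context(rna):
    """Return (total CpG count, CpG count flanked by A/U on both sides).

    Assumption: the context is one base on each side. DNA input is
    accepted and treated as RNA by mapping T to U.
    """
    seq = rna.upper().replace("T", "U")
    total = 0
    in_context = 0
    for i in range(len(seq) - 1):
        if seq[i:i + 2] != "CG":
            continue
        total += 1
        # A CpG at either end of the sequence has no flank on that side.
        left = seq[i - 1] if i > 0 else ""
        right = seq[i + 2] if i + 2 < len(seq) else ""
        if left in ("A", "U") and right in ("A", "U"):
            in_context += 1
    return total, in_context


# Toy sequence (hypothetical): three CpGs, two of them in A/U context.
total, in_context = count_cpg_in_au_context("AUCGUAGCGCCACGA")
print(total, in_context)
```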

  2. Type VIa β-turn-fused helix N-termini: A novel helix N-cap motif containing cis proline.

    PubMed

    Dasgupta, Rubin; Ganguly, Himal K; Modugula, E K; Basu, Gautam

    2017-01-01

    Helix N-capping motifs often form hydrogen bonds with terminal amide groups which otherwise would be free. Also, without an amide hydrogen, proline (trans) is over-represented at helix N-termini (N1 position) because this naturally removes the need to hydrogen bond one terminal amide. However, the preference of cisPro, vis-à-vis helix N-termini, is not known. We show that cisPro (αR or PPII) often appears at the N-cap position (N0) of helices. The N-cap cisPro(αR) is associated with a six-residue sequence motif, X(-2)-X(-1)-cisPro-X(1)-X(2)-X(3), with preference for Glu/Gln at X(-1), Phe/Tyr/Trp at X(1) and Ser/Thr at X(3). The motif, formed by the fusion of a helix and a type VIa β-turn, contains a hydrogen bond between the side chain of X(-1) and the side chain/backbone of X(3), an α-helical hydrogen bond between X(-2) and X(2) and a stacking interaction between cisPro and an aromatic residue at X(1). NMR experiments on peptides containing the motif and its variants showed that local interactions associated with the motif, as found in folded proteins, were not enough to significantly tilt the cis/trans equilibrium towards cisPro. This suggests that some other evolutionary pressure must select the cisPro motif (over transPro) at helix N-termini. Database analysis showed that >C=O of the pre-cisPro(αR) residue at the helix N-cap, directed opposite to the N→C helical axis, participates in long-range interactions. We hypothesize that the cisPro(αR) motif is preferred at helix N-termini because it allows the helix to participate in long-range interactions that may be structurally and functionally important.

  3. A Bioinformatics Approach for Detecting Repetitive Nested Motifs using Pattern Matching

    PubMed Central

    Romero, José R.; Carballido, Jessica A.; Garbus, Ingrid; Echenique, Viviana C.; Ponzoni, Ignacio

    2016-01-01

    The identification of nested motifs in genomic sequences is a complex computational problem. The detection of these patterns is important to allow the discovery of transposable element (TE) insertions, incomplete reverse transcripts, deletions, and/or mutations. In this study, a de novo strategy for detecting patterns that represent nested motifs was designed based on exhaustive searches for pairs of motifs and combinatorial pattern analysis. These patterns can be grouped into three categories, motifs within other motifs, motifs flanked by other motifs, and motifs of large size. The methodology used in this study, applied to genomic sequences from the plant species Aegilops tauschii and Oryza sativa, revealed that it is possible to identify putative nested TEs by detecting these three types of patterns. The results were validated through BLAST alignments, which revealed the efficacy and usefulness of the new method, which is called Mamushka. PMID:27812277

  4. FPGA implementation of motifs-based neuronal network and synchronization analysis

    NASA Astrophysics Data System (ADS)

    Deng, Bin; Zhu, Zechen; Yang, Shuangming; Wei, Xile; Wang, Jiang; Yu, Haitao

    2016-06-01

    Motifs in complex networks play a crucial role in determining brain function. In this paper, 13 kinds of motifs are implemented on a Field Programmable Gate Array (FPGA) to investigate the relationship between network properties and motif properties. We use a discretization method and a pipelined architecture to construct various motifs with the Hindmarsh-Rose (HR) neuron as the node model. We also build a small-world network based on these motifs and conduct the synchronization analysis of the motifs as well as the constructed network. We find that the synchronization properties of a motif determine those of the motif-based small-world network, which demonstrates the effectiveness of our proposed hardware simulation platform. By imitating some vital nuclei in the brain to generate normal discharges, our proposed FPGA-based artificial neuronal networks have the potential to replace injured nuclei and restore brain function in the treatment of Parkinson's disease and epilepsy.
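    To make the "discretization method" concrete, the sketch below steps a single Hindmarsh-Rose neuron with a forward Euler update. The abstract does not say which discretization scheme, step size, or parameter values the authors used; forward Euler, `dt=0.01`, the commonly quoted parameter set (a=1, b=3, c=1, d=5, r=0.006, s=4, rest potential -1.6) and the input current are all assumptions here, and the function name is hypothetical.

```python
def simulate_hindmarsh_rose(steps, dt=0.01, current=3.25,
                            a=1.0, b=3.0, c=1.0, d=5.0,
                            r=0.006, s=4.0, x_rest=-1.6):
    """Forward-Euler simulation of one Hindmarsh-Rose neuron.

    x is the membrane potential, y the fast recovery variable and
    z the slow adaptation variable. Returns the trace of x.
    """
    x, y, z = 0.0, 0.0, 0.0  # arbitrary initial state
    trace = []
    for _ in range(steps):
        dx = y - a * x**3 + b * x**2 - z + current
        dy = c - d * x**2 - y
        dz = r * (s * (x - x_rest) - z)
        # All three variables are updated from the same old state.
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        trace.append(x)
    return trace


trace = simulate_hindmarsh_rose(steps=1000)
print(min(trace), max(trace))
```

    A fixed-step update of this kind uses only multiplications and additions per step, which is one reason such schemes are a natural fit for pipelined hardware; a motif would couple several such neurons through an additional coupling term that is not shown here.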

  5. Convergent evolution and mimicry of protein linear motifs in host-pathogen interactions.

    PubMed

    Chemes, Lucía Beatriz; de Prat-Gay, Gonzalo; Sánchez, Ignacio Enrique

    2015-06-01

    Pathogen linear motif mimics are highly evolvable elements that facilitate rewiring of host protein interaction networks. Host linear motifs and pathogen mimics differ in sequence, leading to thermodynamic and structural differences in the resulting protein-protein interactions. Moreover, the functional output of a mimic depends on the motif and domain repertoire of the pathogen protein. Regulatory evolution mediated by linear motifs can be understood by measuring evolutionary rates, quantifying positive and negative selection and performing phylogenetic reconstructions of linear motif natural history. Convergent evolution of linear motif mimics is widespread among unrelated proteins from viral, prokaryotic and eukaryotic pathogens and can also take place within individual protein phylogenies. Statistics, biochemistry and laboratory models of infection link pathogen linear motifs to phenotypic traits such as tropism, virulence and oncogenicity. In vitro evolution experiments and analysis of natural sequences suggest that changes in linear motif composition underlie pathogen adaptation to a changing environment.

  6. A Bioinformatics Approach for Detecting Repetitive Nested Motifs using Pattern Matching.

    PubMed

    Romero, José R; Carballido, Jessica A; Garbus, Ingrid; Echenique, Viviana C; Ponzoni, Ignacio

    2016-01-01

    The identification of nested motifs in genomic sequences is a complex computational problem. The detection of these patterns is important to allow the discovery of transposable element (TE) insertions, incomplete reverse transcripts, deletions, and/or mutations. In this study, a de novo strategy for detecting patterns that represent nested motifs was designed based on exhaustive searches for pairs of motifs and combinatorial pattern analysis. These patterns can be grouped into three categories, motifs within other motifs, motifs flanked by other motifs, and motifs of large size. The methodology used in this study, applied to genomic sequences from the plant species Aegilops tauschii and Oryza sativa, revealed that it is possible to identify putative nested TEs by detecting these three types of patterns. The results were validated through BLAST alignments, which revealed the efficacy and usefulness of the new method, which is called Mamushka.

  7. Disordered amyloidogenic peptides may insert into the membrane and assemble into common cyclic structural motifs

    PubMed Central

    Jang, Hyunbum; Arce, Fernando Teran; Ramachandran, Srinivasan; Kagan, Bruce L.; Lal, Ratnesh; Nussinov, Ruth

    2014-01-01

    Aggregation of disordered amyloidogenic peptides into oligomers is the causative agent of amyloid-related diseases. In solution, disordered protein states are characterized by heterogeneous ensembles. Among these, β-rich conformers self-assemble via a conformational selection mechanism to form energetically-favored cross-β structures, regardless of their precise sequences. These disordered peptides can also penetrate the membrane, and electrophysiological data indicate that they form ion-conducting channels. Based on these and additional data, including imaging and molecular dynamics simulations of a range of amyloid peptides, Alzheimer’s amyloid-β (Aβ) peptide, its disease-related variants with point mutations and N-terminal truncated species, other amyloidogenic peptides, as well as a cytolytic peptide and a synthetic gel-forming peptide, we suggest that disordered amyloidogenic peptides can also present a common motif in the membrane. The motif consists of curved, moon-like β-rich oligomers associated into annular organizations. The motif is favored in the lipid bilayer since it permits hydrophobic side chains to face and interact with the membrane and the charged/polar residues to face the solvated channel pores. Such channels are toxic since their pores allow uncontrolled leakage of ions into/out of the cell, destabilizing cellular ionic homeostasis. Here we detail Aβ, whose aggregation is associated with Alzheimer’s disease (AD) and for which there are the most abundant data. AD is a protein misfolding disease characterized by a build-up of Aβ peptide as senile plaques, neurodegeneration, and memory loss. Excessively produced Aβ peptides may directly induce cellular toxicity, even without the involvement of membrane receptors, through Aβ peptide-plasma membrane interactions. PMID:24566672

  8. Characterization of human TCR Vbeta gene promoter. Role of the dodecamer motif in promoter activity.

    PubMed

    Deng, X; Sun, G R; Zheng, Q; Li, Y

    1998-09-11

    During T-lymphocyte development, the T-cell antigen receptor (TCR) gene expression is controlled by its promoter and enhancer elements and regulated in tissue- and development stage-specific manner. To uncover the promoter function and to define positive and negative regulatory elements in TCR gene promoters, the promoter activities from 13 human TCR Vbeta genes were determined by the transient transfection system and luciferase reporter assay. Although most of the TCR Vbeta gene promoters that we tested are inactive by themselves, some promoters were found to be constitutively strong. Among them, Vbeta6.7 is the strongest. 5'-Deletion and fragmentation experiments have narrowed the full promoter activity of Vbeta6.7 to a fragment of 147 base pairs immediately 5' to the transcription initiation site. A decanucleotide motif with the consensus sequence AGTGAYRTCA has been found to be conserved in most TCR Vbeta gene promoters. There are three such decamer motifs in the promoter region of Vbeta6.7, and the contribution of each such motif to the promoter activity has been examined. Further site-directed mutagenesis analyses showed that: 1) when two Ts in the decamer were mutated, the promoter activity was totally abolished; 2) when two additional nucleotides 3' to the end of decamer were mutated, the promoter activity was decreased to two-thirds of the full level; and 3) when the element with the sequence AGTGATGTCACT was inserted into other promoters, the original weak promoters become very strong. Taken together, our data suggest that the positive regulatory element in Vbeta6.7 should be considered a dodecamer rather than a decamer and that it confers strong basal transcriptional activity on TCR Vbeta genes.

  9. Transcription factor binding site positioning in yeast: proximal promoter motifs characterize TATA-less promoters.

    PubMed

    Erb, Ionas; van Nimwegen, Erik

    2011-01-01

    The availability of sequence specificities for a substantial fraction of yeast's transcription factors and comparative genomic algorithms for binding site prediction has made it possible to comprehensively annotate transcription factor binding sites genome-wide. Here we use such a genome-wide annotation for comprehensively studying promoter architecture in yeast, focusing on the distribution of transcription factor binding sites relative to transcription start sites, and the architecture of TATA and TATA-less promoters. For most transcription factors, binding sites are positioned further upstream and vary over a wider range in TATA promoters than in TATA-less promoters. In contrast, a group of 6 'proximal promoter motifs' (GAT1/GLN3/DAL80, FKH1/2, PBF1/2, RPN4, NDT80, and ROX1) occur preferentially in TATA-less promoters and show a strong preference for binding close to the transcription start site in these promoters. We provide evidence that suggests that pre-initiation complexes are recruited at TATA sites in TATA promoters and at the sites of the other proximal promoter motifs in TATA-less promoters. TATA-less promoters can generally be classified by the proximal promoter motif they contain, with different classes of TATA-less promoters showing different patterns of transcription factor binding site positioning and nucleosome coverage. These observations suggest that different modes of regulation of transcription initiation may be operating in the different promoter classes. In addition we show that, across all promoter classes, there is a close match between nucleosome free regions and regions of highest transcription factor binding site density. This close agreement between transcription factor binding site density and nucleosome depletion suggests a direct and general competition between transcription factors and nucleosomes for binding to promoters.

  10. Site-directed mutagenesis and saturation mutagenesis for the functional study of transcription factors involved in plant secondary metabolite biosynthesis.

    PubMed

    Pattanaik, Sitakanta; Werkman, Joshua R; Kong, Que; Yuan, Ling

    2010-01-01

    Regulation of gene expression is largely coordinated by a complex network of interactions between transcription factors (TFs), co-factors, and their cognate cis-regulatory elements in the genome. TFs are multidomain proteins that arise evolutionarily through protein domain shuffling. The modular nature of TFs has led to the idea that specific modules of TFs can be re-designed to regulate desired gene(s) through protein engineering. Utilization of designer TFs for the control of metabolic pathways has emerged as an effective approach for metabolic engineering. We are interested in engineering the basic helix-loop-helix (bHLH, Myc-type) transcription factors. Using site-directed and saturation mutagenesis, in combination with efficient and high-throughput screening systems, we have identified and characterized several amino acid residues critical for higher transactivation activity of a Myc-like bHLH transcription factor involved in anthocyanin biosynthetic pathway in plants. Site-directed and saturation mutagenesis should be generally applicable to engineering of all TFs.

  11. Association of branched oligonucleotides into the i-motif.

    PubMed

    Robidoux, S; Klinck, R; Gehring, K; Damha, M J

    1997-12-01

    The unique architecture of branched oligonucleotides mimicking lariat RNA introns [Wallace and Edmonds, Proc. Natl. Acad. Sci. USA 80, 950-954 (1983)] was exploited to study compounds that associate as two parallel duplexes with intercalating C/C+ base pairs (i-motif DNA) [Gehring et al. Nature 363, 561-565 (1993)]. The formation of a branched cytosine tetrad was induced by joining the 5'-ends of a pair of pentadeoxycytidine strands with a branching riboadenosine (rA) linker. This arrangement causes the orientation of the dC strands to be parallel, and forces the formation of a C/C+ duplex that self-associates into i-DNA. Presence of the i-motif in this structure is supported by thermal denaturation, native gel electrophoresis, CD, and NMR spectroscopy.

  12. Factoring local sequence composition in motif significance analysis.

    PubMed

    Ng, Patrick; Keich, Uri

    2008-01-01

    We recently introduced a biologically realistic and reliable significance analysis of the output of a popular class of motif finders. In this paper we further improve our significance analysis by incorporating local base composition information. Relying on realistic biological data simulation, as well as on FDR analysis applied to real data, we show that our method is significantly better than the increasingly popular practice of using the normal approximation to estimate the significance of a finder's output. Finally we turn to leveraging our reliable significance analysis to improve the actual motif finding task. Specifically, endowing a variant of the Gibbs Sampler with our improved significance analysis we demonstrate that de novo finders can perform better than has been perceived. Significantly, our new variant outperforms all the finders reviewed in a recently published comprehensive analysis of the Harbison genome-wide binding location data. Interestingly, many of these finders incorporate additional information such as nucleosome positioning and the significance of binding data.

  13. Finding sequence motifs in groups of functionally related proteins.

    PubMed

    Smith, H O; Annau, T M; Chandrasegaran, S

    1990-01-01

    We have developed a method for rapidly finding patterns of conserved amino acid residues (motifs) in groups of functionally related proteins. All 3-amino acid patterns in a group of proteins of the type aa1 d1 aa2 d2 aa3, where d1 and d2 are distances that can be varied in a range up to 24 residues, are accumulated into an array. Segments of the proteins containing those patterns that occur most frequently are aligned on each other by a scoring method that obtains an average relatedness value for all the amino acids in each column of the aligned sequence block based on the Dayhoff relatedness odds matrix. The automated method successfully finds and displays nearly all of the sequence motifs that have been previously reported to occur in 33 reverse transcriptases, 18 DNA integrases, and 30 DNA methyltransferases.
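
    The accumulation step described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `count_spaced_triplets`, the reading of d1 and d2 as index offsets (1 means adjacent residues), and the choice to tally each pattern once per sequence are all assumptions, and the alignment and Dayhoff-based scoring steps are not shown.

```python
from collections import Counter

def count_spaced_triplets(sequences, max_gap=24):
    """Tally patterns (aa1, d1, aa2, d2, aa3) across a group of proteins.

    Assumptions (not from the abstract): d1 and d2 are index offsets in
    1..max_gap, and a pattern is counted at most once per sequence.
    """
    counts = Counter()
    for seq in sequences:
        seen = set()  # patterns found in this sequence
        n = len(seq)
        for i in range(n):
            for d1 in range(1, max_gap + 1):
                j = i + d1
                if j >= n:
                    break
                for d2 in range(1, max_gap + 1):
                    k = j + d2
                    if k >= n:
                        break
                    seen.add((seq[i], d1, seq[j], d2, seq[k]))
        counts.update(seen)
    return counts

# Toy sequences (hypothetical), chosen so that K-D-x-G is shared by all three.
toy = ["MKDAGTW", "AKDSGQW", "KDLGHH"]
pattern, hits = count_spaced_triplets(toy).most_common(1)[0]
print(pattern, hits)  # ('K', 1, 'D', 2, 'G') 3
```

    The most frequent patterns would then seed the alignment of the protein segments that contain them.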

  14. Graph animals, subgraph sampling, and motif search in large networks.

    PubMed

    Baskerville, Kim; Grassberger, Peter; Paczuski, Maya

    2007-09-01

    We generalize a sampling algorithm for lattice animals (connected clusters on a regular lattice) to a Monte Carlo algorithm for "graph animals," i.e., connected subgraphs in arbitrary networks. As with the algorithm in [N. Kashtan et al., Bioinformatics 20, 1746 (2004)], it provides a weighted sample, but the computation of the weights is much faster (linear in the size of subgraphs, instead of superexponential). This allows subgraphs with up to ten or more nodes to be sampled with very high statistics, from arbitrarily large networks. Using this together with a heuristic algorithm for rapidly classifying isomorphic graphs, we present results for two protein interaction networks obtained using the tandem affinity purification (TAP) method: one of Escherichia coli with 230 nodes and 695 links, and one for yeast (Saccharomyces cerevisiae) with roughly ten times more nodes and links. We find in both cases that most connected subgraphs are strong motifs (Z scores >10) or antimotifs (Z scores <-10) when the null model is the ensemble of networks with fixed degree sequence. Strong differences appear between the two networks, with dominant motifs in E. coli being (nearly) bipartite graphs and having many pairs of nodes that connect to the same neighbors, while dominant motifs in yeast tend towards completeness or contain large cliques. We also explore a number of methods that do not rely on measurements of Z scores or comparisons with null models. For instance, we discuss the influence of specific complexes like the 26S proteasome in yeast, where a small number of complexes dominate the k cores with large k and have a decisive effect on the strongest motifs with 6-8 nodes. We also present Zipf plots of counts versus rank. They show broad distributions that are not power laws, in contrast to the case when disconnected subgraphs are included.
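
    The Z scores mentioned above compare an observed subgraph count with its distribution in a null ensemble of networks that keep the same degree sequence. The sketch below is a brute-force illustration restricted to triangles; it does not implement the graph-animal sampling algorithm of the abstract. The helper names, the use of double-edge swaps to approximate the fixed-degree ensemble, and the population standard deviation are assumptions made for illustration.

```python
import random
from statistics import mean, pstdev

def count_triangles(edges):
    """Count triangles in a simple undirected graph given as unique edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is found once from each of its three edges.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3

def rewire(edges, n_swaps, rng):
    """Degree-preserving randomisation by double-edge swaps."""
    edges = [tuple(sorted(e)) for e in edges]
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        # Reject swaps that would create self-loops or duplicate edges.
        if a == d or c == b or new1 == new2 or new1 in present or new2 in present:
            continue
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges

def z_score(observed, null_counts):
    """Z = (observed - mean of null) / standard deviation of null."""
    sd = pstdev(null_counts)
    return float("nan") if sd == 0 else (observed - mean(null_counts)) / sd

# Toy network (hypothetical): two triangles joined by a short path.
toy = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
rng = random.Random(0)
null = [count_triangles(rewire(toy, 100, rng)) for _ in range(200)]
print(z_score(count_triangles(toy), null))
```

    Exhaustive counting of this kind does not scale to larger subgraphs, which is the gap that sampling approaches such as the one described in the abstract aim to fill.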

  15. A survey of motif discovery methods in an integrated framework

    PubMed Central

    Sandve, Geir Kjetil; Drabløs, Finn

    2006-01-01

    Background There has been a growing interest in computational discovery of regulatory elements, and a multitude of motif discovery methods have been proposed. Computational motif discovery has been used with some success in simple organisms like yeast. However, as we move to higher organisms with more complex genomes, more sensitive methods are needed. Several recent methods try to integrate additional sources of information, including microarray experiments (gene expression and ChIP-chip). There is also a growing awareness that regulatory elements work in combination, and that this combinatorial behavior must be modeled for successful motif discovery. However, the multitude of methods and approaches makes it difficult to get a good understanding of the current status of the field. Results This paper presents a survey of methods for motif discovery in DNA, based on a structured and well defined framework that integrates all relevant elements. Existing methods are discussed according to this framework. Conclusion The survey shows that although no single method takes all relevant elements into consideration, a very large number of different models treating the various elements separately have been tried. Very often the choices that have been made are not explicitly stated, making it difficult to compare different implementations. Also, the tests that have been used are often not comparable. Therefore, a stringent framework and improved test methods are needed to evaluate the different approaches in order to conclude which ones are most promising. Reviewers: This article was reviewed by Eugene V. Koonin, Philipp Bucher (nominated by Mikhail Gelfand) and Frank Eisenhaber. PMID:16600018

  16. Structural assessment of glycyl mutations in invariantly conserved motifs.

    PubMed

    Prakash, Tulika; Sandhu, Kuljeet Singh; Singh, Nitin Kumar; Bhasin, Yasha; Ramakrishnan, C; Brahmachari, Samir K

    2007-11-15

    Motifs that are evolutionarily conserved in proteins are crucial to their structure and function. In one of our earlier studies, we demonstrated that the conserved motifs occurring invariantly across several organisms could act as structural determinants of the proteins. We observed the abundance of glycyl residues in these invariantly conserved motifs. The role of glycyl residues in highly conserved motifs has not been studied extensively. Thus, it would be interesting to examine the structural perturbations induced by mutation in these conserved glycyl sites. In this work, we selected a representative set of invariant signature (IS) peptides for which both the PDB structure and mutation information were available. We thoroughly analyzed the conformational features of the glycyl sites and their local interactions with the surrounding residues. Using Ramachandran angles, we showed that the glycyl residues occurring in these IS peptides, which have undergone mutation, occurred more often in the L-disallowed as compared with the L-allowed region of the Ramachandran plot. Short range contacts around the mutation site were analyzed to study the steric effects. With the results obtained from our analysis, we hypothesize that any change of activity arising because of such mutations must be attributed to the long-range interaction(s) of the new residue if the glycyl residue in the IS peptide occurred in the L-allowed region of the Ramachandran plot. However, the mutation of those conserved glycyl residues that occurred in the L-disallowed region of the Ramachandran plot might lead to an altered activity of the protein as a result of an altered conformation of the backbone in the immediate vicinity of the glycyl residue, in addition to long range effects arising from the long side chains of the new residue. Thus, the loss of activity because of mutation in the conserved glycyl site might either relate to long range interactions or to local perturbations around the site.

  17. Graph animals, subgraph sampling, and motif search in large networks

    NASA Astrophysics Data System (ADS)

    Baskerville, Kim; Grassberger, Peter; Paczuski, Maya

    2007-09-01

    We generalize a sampling algorithm for lattice animals (connected clusters on a regular lattice) to a Monte Carlo algorithm for “graph animals,” i.e., connected subgraphs in arbitrary networks. As with the algorithm in [N. Kashtan et al., Bioinformatics 20, 1746 (2004)], it provides a weighted sample, but the computation of the weights is much faster (linear in the size of subgraphs, instead of superexponential). This allows subgraphs with up to ten or more nodes to be sampled with very high statistics, from arbitrarily large networks. Using this together with a heuristic algorithm for rapidly classifying isomorphic graphs, we present results for two protein interaction networks obtained using the tandem affinity purification (TAP) method: one of Escherichia coli with 230 nodes and 695 links, and one for yeast (Saccharomyces cerevisiae) with roughly ten times more nodes and links. We find in both cases that most connected subgraphs are strong motifs (Z scores >10) or antimotifs (Z scores <-10) when the null model is the ensemble of networks with fixed degree sequence. Strong differences appear between the two networks, with dominant motifs in E. coli being (nearly) bipartite graphs and having many pairs of nodes that connect to the same neighbors, while dominant motifs in yeast tend towards completeness or contain large cliques. We also explore a number of methods that do not rely on measurements of Z scores or comparisons with null models. For instance, we discuss the influence of specific complexes like the 26S proteasome in yeast, where a small number of complexes dominate the k cores with large k and have a decisive effect on the strongest motifs with 6-8 nodes. We also present Zipf plots of counts versus rank. They show broad distributions that are not power laws, in contrast to the case when disconnected subgraphs are included.

  18. Motif, the basics: an overview of the widget set

    SciTech Connect

    McClurg, F.R.

    1992-10-01

    The Motif library provides programmers with a rich set of tools for building a graphical user interface with a three-dimensional appearance and a consistent method of interaction for controlling a Unix application. This Xt-based, high-level library presents an "object-oriented" approach to program design for programmers and allows end-users the flexibility to modify attributes of the interface.

  19. Motif, the basics: an overview of the widget set

    SciTech Connect

    McClurg, F.R.

    1992-10-01

    The Motif library provides programmers with a rich set of tools for building a graphical user interface with a three-dimensional appearance and a consistent method of interaction for controlling a Unix application. This Xt-based, high-level library presents an "object-oriented" approach to program design for programmers and allows end-users the flexibility to modify attributes of the interface.

  20. Structure and ubiquitin binding of the ubiquitin-interacting motif

    SciTech Connect

    Fisher, R.; Wang, B.; Alam, S.; Higginson, D.; Robinson, H.; Sundquist, C.; Hill, C.

    2003-01-01

    Ubiquitylation is used to target proteins into a large number of different biological processes including proteasomal degradation, endocytosis, virus budding, and vacuolar protein sorting (Vps). Ubiquitylated proteins are typically recognized using one of several different conserved ubiquitin binding modules. Here, we report the crystal structure and ubiquitin binding properties of one such module, the ubiquitin-interacting motif (UIM). We found that UIM peptides from several proteins involved in endocytosis and vacuolar protein sorting including Hrs, Vps27p, Stam1, and Eps15 bound specifically, but with modest affinity (K_d = 0.1-1 mM), to free ubiquitin. Full affinity ubiquitin binding required the presence of conserved acidic patches at the N and C terminus of the UIM, as well as highly conserved central alanine and serine residues. NMR chemical shift perturbation mapping experiments demonstrated that all of these UIM peptides bind to the I44 surface of ubiquitin. The 1.45 Å resolution crystal structure of the second yeast Vps27p UIM (Vps27p-2) revealed that the ubiquitin-interacting motif forms an amphipathic helix. Although Vps27p-2 is monomeric in solution, the motif unexpectedly crystallized as an antiparallel four-helix bundle, and the potential biological implications of UIM oligomerization are therefore discussed.

  1. Maximum likelihood density modification by pattern recognition of structural motifs

    DOEpatents

    Terwilliger, Thomas C.

    2004-04-13

    An electron density for a crystallographic structure having protein regions and solvent regions is improved by maximizing the log likelihood of a set of structure factors $\{F_h\}$ using a local log-likelihood function: $\ln[\,p(\rho(x)\mid \mathrm{PROT})\,p_{\mathrm{PROT}}(x) + p(\rho(x)\mid \mathrm{SOLV})\,p_{\mathrm{SOLV}}(x) + p(\rho(x)\mid H)\,p_{H}(x)\,]$, where $p_{\mathrm{PROT}}(x)$ is the probability that $x$ is in the protein region, $p(\rho(x)\mid \mathrm{PROT})$ is the conditional probability for $\rho(x)$ given that $x$ is in the protein region, and $p_{\mathrm{SOLV}}(x)$ and $p(\rho(x)\mid \mathrm{SOLV})$ are the corresponding quantities for the solvent region; $p_{H}(x)$ refers to the probability that there is a structural motif at a known location, with a known orientation, in the vicinity of the point $x$; and $p(\rho(x)\mid H)$ is the probability distribution for electron density at this point given that the structural motif actually is present. One appropriate structural motif is a helical structure within the crystallographic structure.

  2. Retroviruses integrate into a shared, non-palindromic DNA motif.

    PubMed

    Kirk, Paul D W; Huvet, Maxime; Melamed, Anat; Maertens, Goedele N; Bangham, Charles R M

    2016-11-14

    Many DNA-binding factors, such as transcription factors, form oligomeric complexes with structural symmetry that bind to palindromic DNA sequences(1). Palindromic consensus nucleotide sequences are also found at the genomic integration sites of retroviruses(2-6) and other transposable elements(7-9), and it has been suggested that this palindromic consensus arises as a consequence of the structural symmetry in the integrase complex(2,3). However, we show here that the palindromic consensus sequence is not present in individual integration sites of human T-cell lymphotropic virus type 1 (HTLV-1) and human immunodeficiency virus type 1 (HIV-1), but arises in the population average as a consequence of the existence of a non-palindromic nucleotide motif that occurs in approximately equal proportions on the plus strand and the minus strand of the host genome. We develop a generally applicable algorithm to sort the individual integration site sequences into plus-strand and minus-strand subpopulations, and use this to identify the integration site nucleotide motifs of five retroviruses of different genera: HTLV-1, HIV-1, murine leukaemia virus (MLV), avian sarcoma leucosis virus (ASLV) and prototype foamy virus (PFV). The results reveal a non-palindromic motif that is shared between these retroviruses.
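
    The averaging effect described above, where a non-palindromic motif present in equal proportions on both strands yields a palindromic population consensus, can be illustrated with a short sketch. The sequences and helper names below are hypothetical, and the sketch shows only the pooling effect; it does not implement the authors' algorithm for sorting sites into plus-strand and minus-strand subpopulations.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Return the sequence as read from the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def position_counts(seqs):
    """Per-position base counts for equal-length sequences."""
    return [Counter(column) for column in zip(*seqs)]

def is_palindromic_profile(profile):
    """True if counts at position i mirror complemented counts at n-1-i."""
    n = len(profile)
    return all(
        profile[i][base] == profile[n - 1 - i][COMPLEMENT[base]]
        for i in range(n)
        for base in "ACGT"
    )

# Hypothetical sites that all carry the same non-palindromic motif.
plus_strand_sites = ["ACGGTA", "ACGATA", "ACGGTC"]
# The same motif observed on the minus strand in equal proportion.
pooled = plus_strand_sites + [reverse_complement(s) for s in plus_strand_sites]

print(is_palindromic_profile(position_counts(plus_strand_sites)))  # False
print(is_palindromic_profile(position_counts(pooled)))             # True
```

    The pooled profile is symmetric by construction, even though no individual site is palindromic, which is why the consensus alone cannot distinguish the two explanations.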

  3. STEME: efficient EM to find motifs in large data sets.

    PubMed

    Reid, John E; Wernisch, Lorenz

    2011-10-01

    MEME and many other popular motif finders use the expectation-maximization (EM) algorithm to optimize their parameters. Unfortunately, the running time of EM is linear in the length of the input sequences. This can prohibit its application to data sets of the size commonly generated by high-throughput biological techniques. A suffix tree is a data structure that can efficiently index a set of sequences. We describe an algorithm, Suffix Tree EM for Motif Elicitation (STEME), that approximates EM using suffix trees. To the best of our knowledge, this is the first application of suffix trees to EM. We provide an analysis of the expected running time of the algorithm and demonstrate that STEME runs an order of magnitude more quickly than the implementation of EM used by MEME. We give theoretical bounds for the quality of the approximation and show that, in practice, the approximation has a negligible effect on the outcome. We provide an open source implementation of the algorithm that we hope will be used to speed up existing and future motif search algorithms.

  4. An update on cell surface proteins containing extensin-motifs.

    PubMed

    Borassi, Cecilia; Sede, Ana R; Mecchia, Martin A; Salgado Salter, Juan D; Marzol, Eliana; Muschietti, Jorge P; Estevez, Jose M

    2016-01-01

    In recent years it has become clear that there are several molecular links that interconnect the plant cell surface continuum, which is highly important in many biological processes such as plant growth, development, and interaction with the environment. The plant cell surface continuum can be defined as the space that contains and interlinks the cell wall, plasma membrane and cytoskeleton compartments. In this review, we provide an updated view of cell surface proteins that include modular domains with an extensin (EXT)-motif followed by a cytoplasmic kinase-like domain, known as PERKs (for proline-rich extensin-like receptor kinases); with an EXT-motif and an actin binding domain, known as formins; and with extracellular hybrid-EXTs. We focus our attention on the EXT-motifs with the short sequence Ser-Pro(3-5), which is found in several different protein contexts within the same extracellular space, highlighting a putative conserved structural and functional role. A closer understanding of the dynamic regulation of plant cell surface continuum and its relationship with the downstream signalling cascade is a crucial forthcoming challenge.

  5. The bioactive acidic serine- and aspartate-rich motif peptide.

    PubMed

    Minamizaki, Tomoko; Yoshiko, Yuji

    2015-01-01

    The organic component of the bone matrix comprises 40% dry weight of bone. The organic component is mostly composed of type I collagen and small amounts of non-collagenous proteins (NCPs) (10-15% of the total bone protein content). The small integrin-binding ligand N-linked glycoprotein (SIBLING) family, a group of NCPs, is considered to play a key role in bone mineralization. The SIBLING family of proteins shares common structural features, including the arginine-glycine-aspartic acid (RGD) motif and the acidic serine- and aspartic acid-rich motif (ASARM). Clinical manifestations of gene mutations and/or genetically modified mice indicate that SIBLINGs play diverse roles in bone and extraskeletal tissues. ASARM peptides might not be primarily responsible for the functional diversity of SIBLINGs, but this motif is suggested to be a key domain of SIBLINGs. However, the exact function of ASARM peptides is poorly understood. In this article, we discuss the considerable progress made in understanding the role of ASARM as a bioactive peptide.

  6. QuateXelero: An Accelerated Exact Network Motif Detection Algorithm

    PubMed Central

    Khakabimamaghani, Sahand; Sharafuddin, Iman; Dichter, Norbert; Koch, Ina; Masoudi-Nejad, Ali

    2013-01-01

    Finding motifs in biological, social, technological, and other types of networks has become a widespread method to gain more knowledge about these networks’ structure and function. However, this task is very computationally demanding, because it is closely tied to graph isomorphism, which is an NP problem (not yet known to belong to P or to the NP-complete subset). Accordingly, this research endeavors to decrease the need to call the NAUTY isomorphism detection method, which is the most time-consuming step in many existing algorithms. The work provides an extremely fast motif detection algorithm called QuateXelero, which has a Quaternary Tree data structure at its heart. The proposed algorithm is based on the well-known ESU (FANMOD) motif detection algorithm. The results of experiments on some standard model networks confirm the overall superiority of the proposed algorithm, QuateXelero, compared with two of the fastest existing algorithms, G-Tries and Kavosh. QuateXelero is especially fast in constructing the central data structure of the algorithm from scratch based on the input network. PMID:23874498

  7. A novel swarm intelligence algorithm for finding DNA motifs.

    PubMed

    Lei, Chengwei; Ruan, Jianhua

    2009-01-01

    Discovering DNA motifs from co-expressed or co-regulated genes is an important step towards deciphering complex gene regulatory networks and understanding gene functions. Despite significant improvement in the last decade, it still remains one of the most challenging problems in computational molecular biology. In this work, we propose a novel motif finding algorithm that finds consensus patterns using a population-based stochastic optimisation technique called Particle Swarm Optimisation (PSO), which has been shown to be effective in optimising difficult multidimensional problems in continuous domains. We propose to use a word dissimilarity graph to remap the neighborhood structure of the solution space of DNA motifs, and propose a modification of the naive PSO algorithm to accommodate discrete variables. In order to improve efficiency, we also propose several strategies for escaping from local optima and for automatically determining the termination criteria. Experimental results on simulated challenge problems show that our method is both more efficient and more accurate than several existing algorithms. Applications to several sets of real promoter sequences also show that our approach is able to detect known transcription factor binding sites, and outperforms two of the most popular existing algorithms.

  8. MAR characteristic motifs mediate episomal vector in CHO cells.

    PubMed

    Lin, Yan; Li, Zhaoxi; Wang, Tianyun; Wang, Xiaoyin; Wang, Li; Dong, Weihua; Jing, Changqin; Yang, Xianjun

    2015-04-01

    An ideal gene therapy vector should enable persistent transgene expression without limitations in safety and reproducibility. Recent insight into the ability of chromosomal matrix attachment regions (MARs) to mediate episomal maintenance of genetic elements allowed the development of a circular episomal vector. Although a MAR-mediated engineered vector has been developed, little is known about which motifs of MAR confer this function during interaction with the host genome. Here, we report an artificially synthesized DNA fragment containing only characteristic motif sequences that served as an alternative to the human beta-interferon matrix attachment region sequence. The potential of the vector to mediate gene transfer in CHO cells was investigated. The short synthetic MAR motifs were found to mediate episomal vector maintenance at a low copy number for many generations without integration into the host genome. Higher transgene expression was maintained for at least 4 months. In addition, MAR was maintained episomally and conferred sustained EGFP expression even in nonselective CHO cells. All the results demonstrated that a MAR characteristic sequence-based vector can function as a stable episome in CHO cells, supporting long-term and effective transgene expression.

  9. Event Networks and the Identification of Crime Pattern Motifs

    PubMed Central

    2015-01-01

    In this paper we demonstrate the use of network analysis to characterise patterns of clustering in spatio-temporal events. Such clustering is of both theoretical and practical importance in the study of crime, and forms the basis for a number of preventative strategies. However, existing analytical methods show only that clustering is present in data, while offering little insight into the nature of the patterns present. Here, we show how the classification of pairs of events as close in space and time can be used to define a network, thereby generalising previous approaches. The application of graph-theoretic techniques to these networks can then offer significantly deeper insight into the structure of the data than previously possible. In particular, we focus on the identification of network motifs, which have clear interpretation in terms of spatio-temporal behaviour. Statistical analysis is complicated by the nature of the underlying data, and we provide a method by which appropriate randomised graphs can be generated. Two datasets are used as case studies: maritime piracy at the global scale, and residential burglary in an urban area. In both cases, the same significant 3-vertex motif is found; this result suggests that incidents tend to occur not just in pairs, but in fact in larger groups within a restricted spatio-temporal domain. In the 4-vertex case, different motifs are found to be significant in each case, suggesting that this technique is capable of discriminating between clustering patterns at a finer granularity than previously possible. PMID:26605544
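
    The network construction described above, in which pairs of events that are close in both space and time are linked, can be sketched as follows. This is a minimal illustration and not the authors' code: the function name, the Euclidean distance, and the thresholds `max_dist` and `max_dt` are assumptions, and the randomised-graph significance test is not shown.

```python
from itertools import combinations
from math import hypot

def build_event_network(events, max_dist, max_dt):
    """Link events that are close in both space and time.

    events: list of (x, y, t) tuples.
    Returns a set of index pairs (i, j) with i < j.
    """
    edges = set()
    for (i, (x1, y1, t1)), (j, (x2, y2, t2)) in combinations(enumerate(events), 2):
        close_in_space = hypot(x1 - x2, y1 - y2) <= max_dist
        close_in_time = abs(t1 - t2) <= max_dt
        if close_in_space and close_in_time:
            edges.add((i, j))
    return edges

# Hypothetical events: three clustered incidents and one distant incident.
events = [(0, 0, 0), (1, 0, 1), (0, 1, 2), (10, 10, 1)]
print(sorted(build_event_network(events, max_dist=2, max_dt=3)))
# [(0, 1), (0, 2), (1, 2)]
```

    In this toy case the three clustered events form a fully connected 3-vertex subgraph, the kind of structure whose frequency would then be compared against suitably randomised networks.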

  10. Event Networks and the Identification of Crime Pattern Motifs.

    PubMed

    Davies, Toby; Marchione, Elio

    2015-01-01

    In this paper we demonstrate the use of network analysis to characterise patterns of clustering in spatio-temporal events. Such clustering is of both theoretical and practical importance in the study of crime, and forms the basis for a number of preventative strategies. However, existing analytical methods show only that clustering is present in data, while offering little insight into the nature of the patterns present. Here, we show how the classification of pairs of events as close in space and time can be used to define a network, thereby generalising previous approaches. The application of graph-theoretic techniques to these networks can then offer significantly deeper insight into the structure of the data than previously possible. In particular, we focus on the identification of network motifs, which have clear interpretation in terms of spatio-temporal behaviour. Statistical analysis is complicated by the nature of the underlying data, and we provide a method by which appropriate randomised graphs can be generated. Two datasets are used as case studies: maritime piracy at the global scale, and residential burglary in an urban area. In both cases, the same significant 3-vertex motif is found; this result suggests that incidents tend to occur not just in pairs, but in fact in larger groups within a restricted spatio-temporal domain. In the 4-vertex case, different motifs are found to be significant in each case, suggesting that this technique is capable of discriminating between clustering patterns at a finer granularity than previously possible.

  11. Automatic Network Fingerprinting through Single-Node Motifs

    PubMed Central

    Echtermeyer, Christoph; da Fontoura Costa, Luciano; Rodrigues, Francisco A.; Kaiser, Marcus

    2011-01-01

    Complex networks have been characterised by their specific connectivity patterns (network motifs), but their building blocks can also be identified and described by node-motifs—a combination of local network features. One technique to identify single node-motifs has been presented by Costa et al. (L. D. F. Costa, F. A. Rodrigues, C. C. Hilgetag, and M. Kaiser, Europhys. Lett., 87, 1, 2009). Here, we first suggest improvements to the method including how its parameters can be determined automatically. Such automatic routines make high-throughput studies of many networks feasible. Second, the new routines are validated in different network-series. Third, we provide an example of how the method can be used to analyse network time-series. In conclusion, we provide a robust method for systematically discovering and classifying characteristic nodes of a network. In contrast to classical motif analysis, our approach can identify individual components (here: nodes) that are specific to a network. Such special nodes, as hubs before, might be found to play critical roles in real-world networks. PMID:21297963

  12. The mammalian heterochromatin protein 1 binds diverse nuclear proteins through a common motif that targets the chromoshadow domain

    SciTech Connect

    Lechner, Mark S.; Schultz, David C.; Negorev, Dmitri; Maul, Gerd G.; Rauscher, Frank J.

    2005-06-17

    The HP1 proteins regulate epigenetic gene silencing by promoting and maintaining chromatin condensation. The HP1 chromodomain binds to methylated histone H3. More enigmatic is the chromoshadow domain (CSD), which mediates dimerization, transcription repression, and interaction with multiple nuclear proteins. Here we show that KAP-1, CAF-1 p150, and NIPBL carry a canonical amino acid motif, PxVxL, which binds directly to the CSD with high affinity. We also define a new class of variant PxVxL CSD-binding motifs in Sp100A, LBR, and ATRX. Both canonical and variant motifs recognize a similar surface of the CSD dimer as demonstrated by a panel of CSD mutants. These in vitro binding results were confirmed by the analysis of polypeptides found associated with nuclear HP1 complexes, and we provide the first evidence of the NIPBL/delangin protein in human cells, a protein recently implicated in the developmental disorder Cornelia de Lange syndrome. NIPBL is related to Nipped-B, a factor participating in gene activation by remote enhancers in Drosophila melanogaster. Thus, this spectrum of direct binding partners suggests an expanded role for HP1 as a factor participating in promoter-enhancer communication, chromatin remodeling/assembly, and sub-nuclear compartmentalization.

  13. SH3 domains of Grb2 adaptor bind to PXpsiPXR motifs within the Sos1 nucleotide exchange factor in a discriminate manner.

    PubMed

    McDonald, Caleb B; Seldeen, Kenneth L; Deegan, Brian J; Farooq, Amjad

    2009-05-19

    Ubiquitously encountered in a wide variety of cellular processes, the Grb2-Sos1 interaction is mediated through the combinatorial binding of nSH3 and cSH3 domains of Grb2 to various sites containing PXpsiPXR motifs within Sos1. Here, using isothermal titration calorimetry, we demonstrate that while the nSH3 domain binds with affinities in the physiological range to all four sites containing PXpsiPXR motifs, designated S1, S2, S3, and S4, the cSH3 domain can only do so at the S1 site. Further scrutiny of these sites yields rationale for the recognition of various PXpsiPXR motifs by the SH3 domains in a discriminate manner. Unlike the PXpsiPXR motifs at S2, S3, and S4 sites, the PXpsiPXR motif at the S1 site is flanked at its C-terminus with two additional arginine residues that are absolutely required for high-affinity binding of the cSH3 domain. In striking contrast, these two additional arginine residues augment the binding of the nSH3 domain to the S1 site, but their role is not critical for the recognition of S2, S3, and S4 sites. Site-directed mutagenesis suggests that the two additional arginine residues flanking the PXpsiPXR motif at the S1 site contribute to free energy of binding via the formation of salt bridges with specific acidic residues in SH3 domains. Molecular modeling is employed to project these novel findings into the 3D structures of SH3 domains in complex with a peptide containing the PXpsiPXR motif and flanking arginine residues at the S1 site. Taken together, this study furthers our understanding of the assembly of a key signaling complex central to cellular machinery.

  14. The in vivo role of androgen receptor SUMOylation as revealed by androgen insensitivity syndrome and prostate cancer mutations targeting the proline/glycine residues of synergy control motifs.

    PubMed

    Mukherjee, Sarmistha; Cruz-Rodríguez, Osvaldo; Bolton, Eric; Iñiguez-Lluhí, Jorge A

    2012-09-07

    The androgen receptor (AR) mediates the effects of male sexual hormones on development and physiology. Alterations in AR function are central to reproductive disorders, prostate cancer, and Kennedy disease. AR activity is influenced by post-translational modifications, but their role in AR-based diseases is poorly understood. Conjugation by small ubiquitin-like modifier (SUMO) proteins at two synergy control (SC) motifs in AR exerts a promoter context-dependent inhibitory role. SC motifs are composed of a four-amino acid core that is often preceded and/or followed by nearby proline or glycine residues. The function of these flanking residues, however, has not been examined directly. Remarkably, several AR mutations associated with oligospermia and androgen insensitivity syndrome map to Pro-390, the conserved proline downstream of the first SC motif in AR. Similarly, mutations at Gly-524, downstream of the second SC motif, were recovered in recurrent prostate cancer samples. We now provide evidence that these clinically isolated substitutions lead to a partial loss of SC motif function and AR SUMOylation that affects multiple endogenous genes. Consistent with a structural role as terminators of secondary structure elements, substitution of Pro-390 by Gly fully supports both SC motif function and SUMOylation. As predicted from the functional properties of SC motifs, the clinically isolated mutations preferentially enhance transcription driven by genomic regions harboring multiple AR binding sites. The data support the view that alterations in AR SUMOylation play significant roles in AR-based diseases and offer novel SUMO-based therapeutic opportunities.

  15. SH3 Domains of Grb2 Adaptor Bind to PXψPXR Motifs Within the Sos1 Nucleotide Exchange Factor in a Discriminate Manner

    PubMed Central

    McDonald, Caleb B.; Seldeen, Kenneth L.; Deegan, Brian J.; Farooq, Amjad

    2009-01-01

    Ubiquitously encountered in a wide variety of cellular processes, the Grb2-Sos1 interaction is mediated through the combinatorial binding of nSH3 and cSH3 domains of Grb2 to various sites containing PXψPXR motifs within Sos1. Here, using isothermal titration calorimetry, we demonstrate that while the nSH3 domain binds with affinities in the physiological range to all four sites containing PXψPXR motifs, designated S1, S2, S3 and S4, the cSH3 domain can only do so at S1 site. Further scrutiny of these sites yields rationale for the recognition of various PXψPXR motifs by the SH3 domains in a discriminate manner. Unlike the PXψPXR motifs at S2, S3 and S4 sites, the PXψPXR motif at S1 site is flanked at its C-terminus with two additional arginine residues that are absolutely required for high-affinity binding of cSH3 domain. In striking contrast, these two additional arginine residues augment the binding of nSH3 domain to S1 site but their role is not critical for the recognition of S2, S3 and S4 sites. Site-directed mutagenesis suggests that the two additional arginine residues flanking the PXψPXR motif at S1 site contribute to free energy of binding via the formation of salt bridges with specific acidic residues in SH3 domains. Molecular modeling is employed to project these novel findings into the 3D structures of SH3 domains in complex with a peptide containing the PXψPXR motif and flanking arginine residues at S1 site. Taken together, this study furthers our understanding of the assembly of a key signaling complex central to cellular machinery. PMID:19323566

  16. Long inverted repeats are an at-risk motif for recombination in mammalian cells.

    PubMed

    Waldman, A S; Tran, H; Goldsmith, E C; Resnick, M A

    1999-12-01

    Certain DNA sequence motifs and structures can promote genomic instability. We have explored instability induced in mouse cells by long inverted repeats (LIRs). A cassette was constructed containing a herpes simplex virus thymidine kinase (tk) gene into which was inserted an LIR composed of two inverted copies of a 1.1-kb yeast URA3 gene sequence separated by a 200-bp spacer sequence. The tk gene was introduced into the genome of mouse Ltk⁻ fibroblasts either by itself or in conjunction with a closely linked tk gene that was disrupted by an 8-bp XhoI linker insertion; rates of intrachromosomal homologous recombination between the markers were determined. Recombination between the two tk alleles was stimulated 5-fold by the LIR, as compared to a long direct repeat (LDR) insert, resulting in nearly 10⁻⁵ events per cell per generation. Of the tk⁺ segregants recovered from LIR-containing cell lines, 14% arose from gene conversions that eliminated the LIR, as compared to 3% of the tk⁺ segregants from LDR cell lines, corresponding to a >20-fold increase in deletions at the LIR hotspot. Thus, an LIR, which is a common motif in mammalian genomes, is at risk for the stimulation of homologous recombination and possibly other genetic rearrangements.

  17. Structure of a (Cys3His) zinc ribbon, a ubiquitous motif in archaeal and eucaryal transcription.

    PubMed Central

    Chen, H. T.; Legault, P.; Glushka, J.; Omichinski, J. G.; Scott, R. A.

    2000-01-01

    Transcription factor IIB (TFIIB) is an essential component in the formation of the transcription initiation complex in eucaryal and archaeal transcription. TFIIB interacts with a promoter complex containing the TATA-binding protein (TBP) to facilitate interaction with RNA polymerase II (RNA pol II) and the associated transcription factor IIF (TFIIF). TFIIB contains a zinc-binding motif near the N-terminus that is directly involved in the interaction with RNA pol II/TFIIF and plays a crucial role in selecting the transcription initiation site. The solution structure of the N-terminal residues 2-59 of human TFIIB was determined by multidimensional NMR spectroscopy. The structure consists of a nearly tetrahedral Zn(Cys)3(His)1 site confined by type I and "rubredoxin" turns, three antiparallel beta-strands, and disordered loops. The structure is similar to the reported zinc-ribbon motifs in several transcription-related proteins from archaea and eucarya, including Pyrococcus furiosus transcription factor B (PfTFB), human and yeast transcription factor IIS (TFIIS), and Thermococcus celer RNA polymerase II subunit M (TcRPOM). The zinc-ribbon structure of TFIIB, in conjunction with the biochemical analyses, suggests that residues on the beta-sheet are involved in the interaction with RNA pol II/TFIIF, while the zinc-binding site may increase the stability of the beta-sheet. PMID:11045620

  18. An isoprenylation and palmitoylation motif promotes intraluminal vesicle delivery of proteins in cells from distant species.

    PubMed

    Oeste, Clara L; Pinar, Mario; Schink, Kay O; Martínez-Turrión, Javier; Stenmark, Harald; Peñalva, Miguel A; Pérez-Sala, Dolores

    2014-01-01

    The C-terminal ends of small GTPases contain hypervariable sequences which may be posttranslationally modified by defined lipid moieties. The diverse structural motifs generated direct proteins towards specific cellular membranes or organelles. However, knowledge on the factors that determine these selective associations is limited. Here we show, using advanced microscopy, that the isoprenylation and palmitoylation motif of human RhoB (-CINCCKVL) targets chimeric proteins to intraluminal vesicles of endolysosomes in human cells, displaying preferential co-localization with components of the late endocytic pathway. Moreover, this distribution is conserved in distant species, including cells from amphibians, insects and fungi. Blocking lipidic modifications results in accumulation of CINCCKVL chimeras in the cytosol, from where they can reach endolysosomes upon release of this block. Remarkably, CINCCKVL constructs are sorted to intraluminal vesicles in a cholesterol-dependent process. In the lower species, neither the C-terminal sequence of RhoB, nor the endosomal distribution of its homologs are conserved; in spite of this, CINCCKVL constructs also reach endolysosomes in Xenopus laevis and insect cells. Strikingly, this behavior is prominent in the filamentous ascomycete fungus Aspergillus nidulans, in which GFP-CINCCKVL is sorted into endosomes and vacuoles in a lipidation-dependent manner and allows monitoring endosomal movement in live fungi. In summary, the isoprenylated and palmitoylated CINCCKVL sequence constitutes a specific structure which delineates an endolysosomal sorting strategy operative in phylogenetically diverse organisms.

  19. C-terminal motif of human neuropeptide Y4 receptor determines internalization and arrestin recruitment.

    PubMed

    Wanka, Lizzy; Babilon, Stefanie; Burkert, Kerstin; Mörl, Karin; Gurevich, Vsevolod V; Beck-Sickinger, Annette G

    2017-01-01

    The human neuropeptide Y4 receptor is a rhodopsin-like G protein-coupled receptor (GPCR), which contributes to anorexigenic signals. Thus, this receptor is a highly interesting target for metabolic diseases. As GPCR internalization and trafficking affect receptor signaling and vice versa, we aimed to investigate the molecular mechanism of hY4R desensitization and endocytosis. The role of distinct segments of the hY4R carboxyl terminus was investigated by fluorescence microscopy, binding assays, inositol turnover experiments and bioluminescence resonance energy transfer assays to examine the internalization behavior of hY4R and its interaction with arrestin-3. Based on results of C-terminal deletion mutants and substitution of single amino acids, the motif (7.78)EESEHLPLSTVHTEVSKGS(7.96) was identified, with glutamate, threonine and serine residues playing key roles, based on site-directed mutagenesis. Thus, we identified the internalization motif for the human neuropeptide Y4 receptor, which regulates arrestin-3 recruitment and receptor endocytosis.

  20. An Isoprenylation and Palmitoylation Motif Promotes Intraluminal Vesicle Delivery of Proteins in Cells from Distant Species

    PubMed Central

    Oeste, Clara L.; Pinar, Mario; Schink, Kay O.; Martínez-Turrión, Javier; Stenmark, Harald; Peñalva, Miguel A.; Pérez-Sala, Dolores

    2014-01-01

    The C-terminal ends of small GTPases contain hypervariable sequences which may be posttranslationally modified by defined lipid moieties. The diverse structural motifs generated direct proteins towards specific cellular membranes or organelles. However, knowledge on the factors that determine these selective associations is limited. Here we show, using advanced microscopy, that the isoprenylation and palmitoylation motif of human RhoB (–CINCCKVL) targets chimeric proteins to intraluminal vesicles of endolysosomes in human cells, displaying preferential co-localization with components of the late endocytic pathway. Moreover, this distribution is conserved in distant species, including cells from amphibians, insects and fungi. Blocking lipidic modifications results in accumulation of CINCCKVL chimeras in the cytosol, from where they can reach endolysosomes upon release of this block. Remarkably, CINCCKVL constructs are sorted to intraluminal vesicles in a cholesterol-dependent process. In the lower species, neither the C-terminal sequence of RhoB, nor the endosomal distribution of its homologs are conserved; in spite of this, CINCCKVL constructs also reach endolysosomes in Xenopus laevis and insect cells. Strikingly, this behavior is prominent in the filamentous ascomycete fungus Aspergillus nidulans, in which GFP-CINCCKVL is sorted into endosomes and vacuoles in a lipidation-dependent manner and allows monitoring endosomal movement in live fungi. In summary, the isoprenylated and palmitoylated CINCCKVL sequence constitutes a specific structure which delineates an endolysosomal sorting strategy operative in phylogenetically diverse organisms. PMID:25207810

  1. Binding Mode of Acetylated Histones to Bromodomains: Variations on a Common Motif.

    PubMed

    Marchand, Jean-Rémy; Caflisch, Amedeo

    2015-08-01

    Bromodomains, epigenetic readers that recognize acetylated lysine residues in histone tails, are potential drug targets in cancer and inflammation. Herein we review the crystal structures of human bromodomains in complex with histone tails and analyze the main interaction motifs. The histone backbone is extended and occupies, in one of the two possible orientations, the bromodomain surface groove lined by the ZA and BC loops. The acetyl-lysine side chain is buried in the cavity between the four helices of the bromodomain, and its oxygen atom accepts hydrogen bonds from a structural water molecule and a conserved asparagine residue in the BC loop. In stark contrast to this common binding motif, a large variety of ancillary interactions emerge from our analysis. In 10 of 26 structures, a basic side chain (up to five residues up- or downstream in sequence with respect to the acetyl-lysine) interacts with the carbonyl groups of the C-terminal turn of helix αB. Furthermore, the complexes reveal many heterogeneous backbone hydrogen bonds (direct or water-bridged). These interactions contribute unselectively to the binding of acetylated histone tails to bromodomains, which provides further evidence that specific recognition is modulated by combinations of multiple histone modifications and multiple modules of the proteins involved in transcription.

  2. A Novel Bayesian DNA Motif Comparison Method for Clustering and Retrieval

    PubMed Central

    Margalit, Hanah; Friedman, Nir

    2008-01-01

    Characterizing the DNA-binding specificities of transcription factors is a key problem in computational biology that has been addressed by multiple algorithms. These usually take as input sequences that are putatively bound by the same factor and output one or more DNA motifs. A common practice is to apply several such algorithms simultaneously to improve coverage at the price of redundancy. In interpreting such results, two tasks are crucial: clustering of redundant motifs, and attributing the motifs to transcription factors by retrieval of similar motifs from previously characterized motif libraries. Both tasks inherently involve motif comparison. Here we present a novel method for comparing and merging motifs, based on Bayesian probabilistic principles. This method takes into account both the similarity in positional nucleotide distributions of the two motifs and their dissimilarity to the background distribution. We demonstrate the use of the new comparison method as a basis for motif clustering and retrieval procedures, and compare it to several commonly used alternatives. Our results show that the new method outperforms other available methods in accuracy and sensitivity. We incorporated the resulting motif clustering and retrieval procedures in a large-scale automated pipeline for analyzing DNA motifs. This pipeline integrates the results of various DNA motif discovery algorithms and automatically merges redundant motifs from multiple training sets into a coherent annotated library of motifs. Application of this pipeline to recent genome-wide transcription factor location data in S. cerevisiae successfully identified DNA motifs in a manner that is as good as semi-automated analysis reported in the literature. Moreover, we show how this analysis elucidates the mechanisms of condition-specific preferences of transcription factors. PMID:18463706

  3. Decreased RNA-binding motif 5 expression is associated with tumor progression in gastric cancer.

    PubMed

    Kobayashi, Takahiko; Ishida, Junich; Shimizu, Yuichi; Kawakami, Hiroshi; Suda, Goki; Muranaka, Tetsuhito; Komatsu, Yoshito; Asaka, Masahiro; Sakamoto, Naoya

    2017-03-01

    RNA-binding motif 5 is a putative tumor suppressor gene that modulates cell cycle arrest and apoptosis. We recently demonstrated that RNA-binding motif 5 inhibits cell growth through the p53 pathway. This study evaluated the clinical significance of RNA-binding motif 5 expression in gastric cancer and the effects of altered RNA-binding motif 5 expression on cancer biology in gastric cancer cells. RNA-binding motif 5 protein expression was evaluated by immunohistochemistry using the surgical specimens of 106 patients with gastric cancer. We analyzed the relationships of RNA-binding motif 5 expression with clinicopathological parameters and patient prognosis. We further explored the effects of RNA-binding motif 5 downregulation with short hairpin RNA on cell growth and p53 signaling in MKN45 gastric cancer cells. Immunohistochemistry revealed that RNA-binding motif 5 expression was decreased in 29 of 106 (27.4%) gastric cancer specimens. Decreased RNA-binding motif 5 expression was correlated with histological differentiation, depth of tumor infiltration, nodal metastasis, tumor-node-metastasis stage, and prognosis. RNA-binding motif 5 silencing enhanced gastric cancer cell proliferation and decreased p53 transcriptional activity in reporter gene assays. Conversely, restoration of RNA-binding motif 5 expression suppressed cell growth and recovered p53 transactivation in RNA-binding motif 5-silenced cells. Furthermore, RNA-binding motif 5 silencing reduced the messenger RNA and protein expression of the p53 target gene p21. Our results suggest that RNA-binding motif 5 downregulation is involved in gastric cancer progression and that RNA-binding motif 5 behaves as a tumor suppressor gene in gastric cancer.

  4. Identification of cancer-related genes and motifs in the human gene regulatory network.

    PubMed

    Carson, Matthew B; Gu, Jianlei; Yu, Guangjun; Lu, Hui

    2015-08-01

    The authors investigated the regulatory network motifs and corresponding motif positions of cancer-related genes. First, they mapped disease-related genes to a transcription factor regulatory network. Next, they calculated statistically significant motifs and subsequently identified positions within these motifs that were enriched in cancer-related genes. Potential mechanisms of these motifs and positions are discussed. These results could be used to identify other disease- and cancer-related genes and could also suggest mechanisms for how these genes relate to co-occurring diseases.

  5. A colorimetric strategy based on a water-soluble conjugated polymer for sensing pH-driven conformational conversion of DNA i-motif structure.

    PubMed

    Wang, Lihua; Liu, Xingfen; Yang, Qing; Fan, Quli; Song, Shiping; Fan, Chunhai; Huang, Wei

    2010-03-15

    Using a water-soluble conjugated polymer (CP) as a sensing probe, we developed a rapid colorimetric detection strategy for the pH-driven conformational conversion of the DNA i-motif structure. Two sensing configurations were designed: one used the CP alone to detect the conversion between the i-motif and the random-coil state of a C-rich single-stranded DNA; the other used the CP and a complementary single-stranded DNA to investigate the duplex to i-motif equilibrium. All of these conversions lead to a color change that can be observed directly with the naked eye within a few minutes. The limit of detection (LOD) is as low as 40 nM. More importantly, reversible conformational conversions induced by adjusting the pH of the system could also be detected.

  6. A conserved sequence extending motif III of the motor domain in the Snf2-family DNA translocase Rad54 is critical for ATPase activity.

    PubMed

    Zhang, Xiao-Ping; Janke, Ryan; Kingsley, James; Luo, Jerry; Fasching, Clare; Ehmsen, Kirk T; Heyer, Wolf-Dietrich

    2013-01-01

    Rad54 is a dsDNA-dependent ATPase that translocates on duplex DNA. Its ATPase function is essential for homologous recombination, a pathway critical for meiotic chromosome segregation, repair of complex DNA damage, and recovery of stalled or broken replication forks. In recombination, Rad54 cooperates with Rad51 protein and is required to dissociate Rad51 from heteroduplex DNA to allow access by DNA polymerases for recombination-associated DNA synthesis. Sequence analysis revealed that Rad54 contains a perfect match to the consensus PIP box sequence, a widely spread PCNA interaction motif. Indeed, Rad54 interacts directly with PCNA, but this interaction is not mediated by the Rad54 PIP box-like sequence. This sequence is located as an extension of motif III of the Rad54 motor domain and is essential for full Rad54 ATPase activity. Mutations in this motif render Rad54 non-functional in vivo and severely compromise its activities in vitro. Further analysis demonstrated that such mutations affect dsDNA binding, consistent with the location of this sequence motif on the surface of the cleft formed by two RecA-like domains, which likely forms the dsDNA binding site of Rad54. Our study identified a novel sequence motif critical for Rad54 function and showed that even perfect matches to the PIP box consensus may not necessarily identify PCNA interaction sites.

  7. Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools

    PubMed Central

    Cer, Regina Z.; Donohue, Duncan E.; Mudunuri, Uma S.; Temiz, Nuri A.; Loss, Michael A.; Starner, Nathan J.; Halusa, Goran N.; Volfovsky, Natalia; Yi, Ming; Luke, Brian T.; Bacolla, Albino; Collins, Jack R.; Stephens, Robert M.

    2013-01-01

    The non-B DB, available at http://nonb.abcc.ncifcrf.gov, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance. PMID:23125372
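
    As an illustration of what predicting one of the motif classes listed above involves, the following is a minimal Python sketch that scans a sequence for inverted repeats (an arm followed, after a short spacer, by its reverse complement). The function names and the `arm_len` and `max_spacer` defaults are hypothetical choices for this example; they are not the definitions or thresholds used by non-B DB, which the abstract does not state.

```python
# Illustrative inverted-repeat scan; parameters are assumptions, not non-B DB's criteria.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm_len=6, max_spacer=4):
    """Return (start, spacer_len, arm) for every arm of length arm_len that is
    followed, after 0..max_spacer bases, by its own reverse complement."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * arm_len + 1):
        arm = seq[start:start + arm_len]
        target = reverse_complement(arm)
        for spacer in range(max_spacer + 1):
            second = start + arm_len + spacer
            # A slice running off the end is shorter than target and cannot match.
            if seq[second:second + arm_len] == target:
                hits.append((start, spacer, arm))
    return hits


# Toy sequence (invented): arm ACGGTC, 3-base spacer, then GACCGT.
toy = "TT" + "ACGGTC" + "AAA" + "GACCGT" + "TT"
print(find_inverted_repeats(toy))
```

    The scan is quadratic in the worst case and only finds perfect arms of one fixed length; a production tool would typically extend arms, allow mismatches and merge overlapping hits.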

  8. Agonist and antagonist switch DNA motifs recognized by human androgen receptor in prostate cancer

    PubMed Central

    Chen, Zhong; Lan, Xun; Thomas-Ahner, Jennifer M; Wu, Dayong; Liu, Xiangtao; Ye, Zhenqing; Wang, Liguo; Sunkel, Benjamin; Grenade, Cassandra; Chen, Junsheng; Zynger, Debra L; Yan, Pearlly S; Huang, Jiaoti; Nephew, Kenneth P; Huang, Tim H-M; Lin, Shili; Clinton, Steven K; Li, Wei; Jin, Victor X; Wang, Qianben

    2015-01-01

    Human transcription factors recognize specific DNA sequence motifs to regulate transcription. It is unknown whether a single transcription factor is able to bind to distinctly different motifs on chromatin, and if so, what determines the usage of specific motifs. By using a motif-resolution chromatin immunoprecipitation-exonuclease (ChIP-exo) approach, we find that agonist-liganded human androgen receptor (AR) and antagonist-liganded AR bind to two distinctly different motifs, leading to distinct transcriptional outcomes in prostate cancer cells. Further analysis on clinical prostate tissues reveals that the binding of AR to these two distinct motifs is involved in prostate carcinogenesis. Together, these results suggest that unique ligands may switch DNA motifs recognized by ligand-dependent transcription factors in vivo. Our findings also provide a broad mechanistic foundation for understanding ligand-specific induction of gene expression profiles. PMID:25535248

  9. TAF4, a subunit of transcription factor II D, directs promoter occupancy of nuclear receptor HNF4A during post-natal hepatocyte differentiation.

    PubMed

    Alpern, Daniil; Langer, Diana; Ballester, Benoit; Le Gras, Stephanie; Romier, Christophe; Mengus, Gabrielle; Davidson, Irwin

    2014-09-10

    The functions of the TAF subunits of mammalian TFIID in physiological processes remain poorly characterised. In this study, we describe a novel function of TAFs in directing genomic occupancy of a transcriptional activator. Using liver-specific inactivation in mice, we show that the TAF4 subunit of TFIID is required for post-natal hepatocyte maturation. TAF4 promotes pre-initiation complex (PIC) formation at post-natal expressed liver function genes and down-regulates a subset of embryonic expressed genes by increased RNA polymerase II pausing. The TAF4-TAF12 heterodimer interacts directly with HNF4A and in vivo TAF4 is necessary to maintain HNF4A-directed embryonic gene expression at post-natal stages and promotes HNF4A occupancy of functional cis-regulatory elements adjacent to the transcription start sites of post-natal expressed genes. Stable HNF4A occupancy of these regulatory elements requires TAF4-dependent PIC formation highlighting that these are mutually dependent events. Local promoter-proximal HNF4A-TFIID interactions therefore act as instructive signals for post-natal hepatocyte differentiation.

  10. MISCORE: a new scoring function for characterizing DNA regulatory motifs in promoter sequences

    PubMed Central

    2012-01-01

    Background Computational approaches for finding DNA regulatory motifs in promoter sequences are useful to biologists because they reduce experimental costs and speed up the discovery of de novo binding sites. It is important for rule-based or clustering-based motif searching schemes to evaluate the similarity between a k-mer (a k-length subsequence) and a motif model effectively and efficiently, without assuming the independence of nucleotides in motif models and without employing computationally expensive Markov chain models to estimate the background probabilities of k-mers. It is also beneficial to use a priori knowledge in developing advanced searching tools. Results This paper presents a new scoring function, termed MISCORE, for functional motif characterization and evaluation. MISCORE is free from: (i) any assumption on model dependency; and (ii) the use of a Markov chain model for background modeling. It integrates the compositional complexity of motif instances into the function. Performance evaluations with comparison to the well-known Maximum a Posteriori (MAP) score and Information Content (IC) have shown that MISCORE has promising capabilities to separate and recognize functional DNA motifs and their instances from non-functional ones. Conclusions MISCORE is a fast computational tool for candidate motif characterization, evaluation and selection. It allows a priori known motif models to be embedded for computing motif-to-motif similarity, which is an advantage over the IC and MAP scores. In addition to the merits mentioned above, MISCORE can automatically filter out some repetitive k-mers from a motif model because compositional complexity is built into the function. Consequently, the merits of the proposed MISCORE in terms of both motif signal modeling power and computational efficiency will make it more applicable in the development of computational motif discovery tools. PMID:23282090
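
    The abstract does not give the MISCORE formula, so it is not reproduced here. For orientation, the following is a minimal Python sketch of the Information Content (IC) baseline that MISCORE is compared against, computed in its common textbook form as the per-column relative entropy against a uniform background. The function name, the `pseudocount` parameter and the uniform-background assumption are choices made for this example and may differ from the exact IC variant used in the paper.

```python
import math
from collections import Counter


def information_content(instances, alphabet="ACGT", pseudocount=0.0):
    """Total information content (bits) of aligned, equal-length motif instances.

    Each column contributes sum_b p_b * log2(p_b / q), with q = 1/len(alphabet)
    (uniform background, an assumption of this sketch).
    """
    length = len(instances[0])
    if any(len(s) != length for s in instances):
        raise ValueError("all motif instances must have the same length")
    denom = len(instances) + pseudocount * len(alphabet)
    total = 0.0
    for pos in range(length):
        counts = Counter(s[pos] for s in instances)
        for base in alphabet:
            p = (counts[base] + pseudocount) / denom
            if p > 0:  # 0 * log(0) is taken as 0
                total += p * math.log2(p * len(alphabet))
    return total


# A perfectly conserved 4-mer carries 2 bits per column.
print(information_content(["ACGT", "ACGT", "ACGT"]))
```

    Because each column is scored independently, this baseline embodies the position-independence assumption that the abstract says MISCORE avoids.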

  11. DNA nanotechnology based on i-motif structures.

    PubMed

    Dong, Yuanchen; Yang, Zhongqiang; Liu, Dongsheng

    2014-06-17

    CONSPECTUS: Most biological processes happen at the nanometer scale, and understanding the energy transformations and material transportation mechanisms within living organisms has proved challenging. To better understand the secrets of life, researchers have investigated artificial molecular motors and devices over the past decade because such systems can mimic certain biological processes. DNA nanotechnology based on i-motif structures is one system that has played an important role in these investigations. In this Account, we summarize recent advances in functional DNA nanotechnology based on i-motif structures. The i-motif is a DNA quadruplex that forms when four stretches of cytosine repeats form C·CH⁺ base pairs, whose stabilization requires slightly acidic conditions. This unique property has produced the first DNA molecular motor driven by pH changes. The motor is reliable, and studies show that it is capable of millisecond running speeds, comparable to the speed of natural protein motors. With careful design, the output of these types of motors was combined to bend micrometer-sized cantilevers. Using established DNA nanostructure assembly and functionalization methods, researchers can easily integrate the motor within other DNA assembled structures and functional units, producing DNA molecular devices with new functions such as suprahydrophobic/suprahydrophilic smart surfaces that switch, intelligent nanopores triggered by pH changes, molecular logic gates, and DNA nanosprings. Recently, researchers have produced motors driven by light and electricity, which have allowed DNA motors to be integrated within silicon-based nanodevices. Moreover, some devices based on i-motif structures have proven useful for investigating processes within living cells. The pH-responsiveness of the i-motif structure also provides a way to control the stepwise assembly of DNA nanostructures. In addition, because of the stability of the i-motif, this

  12. Motif discovery with data mining in 3D protein structure databases: discovery, validation and prediction of the U-shape zinc binding ("Huf-Zinc") motif.

    PubMed

    Maurer-Stroh, Sebastian; Gao, He; Han, Hao; Baeten, Lies; Schymkowitz, Joost; Rousseau, Frederic; Zhang, Louxin; Eisenhaber, Frank

    2013-02-01

    Data mining in protein databases derived from more fundamental protein 3D structure and sequence databases has considerable unearthed potential for the discovery of sequence motif–structural motif–function relationships, as the finding of the U-shape (Huf-Zinc) motif, originally a small student's project, exemplifies. The metal ion zinc is critically involved in universal biological processes, ranging from protein-DNA complexes and transcription regulation to enzymatic catalysis and metabolic pathways. Proteins have evolved a series of motifs to specifically recognize and bind zinc ions. Many of these, the so-called zinc fingers, are structurally independent globular domains with discontinuous binding motifs made up of residues mostly far apart in sequence. Through a systematic approach starting from the BRIX structure fragment database, we discovered that there exists another predictable subset of zinc-binding motifs that not only have a conserved continuous sequence pattern but also share a characteristic local conformation, despite being included in totally different overall folds. While this does not allow general prediction of all Zn binding motifs, an HMM-based web server, Huf-Zinc, is available for prediction of these novel, as well as conventional, zinc finger motifs in protein sequences. The Huf-Zinc webserver can be freely accessed through this URL (http://mendel.bii.a-star.edu.sg/METHODS/hufzinc/).

  13. Identifying combinatorial regulation of transcription factors and binding motifs

    PubMed Central

    Kato, Mamoru; Hata, Naoya; Banerjee, Nilanjana; Futcher, Bruce; Zhang, Michael Q

    2004-01-01

    Background Combinatorial interaction of transcription factors (TFs) is important for gene regulation. Although various genomic datasets are relevant to this issue, each dataset provides relatively weak evidence on its own. Developing methods that can integrate different sequence, expression and localization data have become important. Results Here we use a novel method that integrates chromatin immunoprecipitation (ChIP) data with microarray expression data and with combinatorial TF-motif analysis. We systematically identify combinations of transcription factors and of motifs. The various combinations of TFs involved multiple binding mechanisms. We reconstruct a new combinatorial regulatory map of the yeast cell cycle in which cell-cycle regulation can be drawn as a chain of extended TF modules. We find that the pairwise combination of a TF for an early cell-cycle phase and a TF for a later phase is often used to control gene expression at intermediate times. Thus the number of distinct times of gene expression is greater than the number of transcription factors. We also see that some TF modules control branch points (cell-cycle entry and exit), and in the presence of appropriate signals they can allow progress along alternative pathways. Conclusions Combining different data sources can increase statistical power as demonstrated by detecting TF interactions and composite TF-binding motifs. The original picture of a chain of simple cell-cycle regulators can be extended to a chain of composite regulatory modules: different modules may share a common TF component in the same pathway or a TF component cross-talking to other pathways. PMID:15287978

  14. CENTDIST: discovery of co-associated factors by motif distribution

    PubMed Central

    Zhang, Zhizhuo; Chang, Cheng Wei; Goh, Wan Ling; Sung, Wing-Kin; Cheung, Edwin

    2011-01-01

    Transcription factors (TFs) do not function alone but work together with other TFs (called co-TFs) in a combinatorial fashion to precisely control the transcription of target genes. Mining co-TFs is thus important to understand the mechanism of transcriptional regulation. Although existing methods can identify co-TFs, their accuracy depends heavily on the chosen background model and other parameters such as the enrichment window size and the PWM score cut-off. In this study, we have developed a novel web-based co-motif scanning program called CENTDIST (http://compbio.ddns.comp.nus.edu.sg/~chipseq/centdist/). In comparison to current co-motif scanning programs, CENTDIST does not require the input of any user-specific parameters and background information. Instead, CENTDIST automatically determines the best set of parameters and ranks co-TF motifs based on their distribution around ChIP-seq peaks. We tested CENTDIST on 14 ChIP-seq data sets and found CENTDIST is more accurate than existing methods. In particular, we applied CENTDIST on an Androgen Receptor (AR) ChIP-seq data set from a prostate cancer cell line and correctly predicted all known co-TFs (eight TFs) of AR in the top 20 hits as well as discovering AP4 as a novel co-TF of AR (which was missed by existing methods). Taken together, CENTDIST, which exploits the imbalanced nature of co-TF binding, is a user-friendly, parameter-less and powerful predictive web-based program for understanding the mechanism of transcriptional co-regulation. PMID:21602269

  15. Identification of imine reductase-specific sequence motifs.

    PubMed

    Fademrecht, Silvia; Scheller, Philipp N; Nestl, Bettina M; Hauer, Bernhard; Pleiss, Jürgen

    2016-05-01

    Chiral amines are valuable building blocks for the production of a variety of pharmaceuticals, agrochemicals and other specialty chemicals. Only recently, imine reductases (IREDs) were discovered which catalyze the stereoselective reduction of imines to chiral amines. Although several IREDs were biochemically characterized in the last few years, knowledge of the reaction mechanism and the molecular basis of substrate specificity and stereoselectivity is limited. To gain further insights into the sequence-function relationships, the Imine Reductase Engineering Database (www.IRED.BioCatNet.de) was established and a systematic analysis of 530 putative IREDs was performed. A standard numbering scheme based on R-IRED-Sk was introduced to facilitate the identification and communication of structurally equivalent positions in different proteins. A conservation analysis revealed a highly conserved cofactor binding region and a predominantly hydrophobic substrate binding cleft. Two IRED-specific motifs were identified, the cofactor binding motif GLGxMGx(5)[ATS]x(4)Gx(4)[VIL]WNR[TS]x(2)[KR] and the active site motif Gx[DE]x[GDA]x[APS]x(3){K}x[ASL]x[LMVIAG]. Our results indicate a preference toward NADPH for all IREDs and explain why, despite their sequence similarity to β-hydroxyacid dehydrogenases (β-HADs), no conversion of β-hydroxyacids has been observed. Superfamily-specific conservations were investigated to explore the molecular basis of their stereopreference. Based on our analysis and previous experimental results on IRED mutants, an exclusive role of standard position 187 for stereoselectivity is excluded. Alternatively, two standard positions 139 and 194 were identified which are superfamily-specifically conserved and differ in R- and S-selective enzymes.
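
    The two motifs quoted in this abstract appear to use a PROSITE-like notation (x for any residue, x(n) for n residues, [..] for allowed residues, {..} for excluded residues). As a minimal sketch under that assumption, the notation can be translated into a regular expression for scanning sequences. The helper name prosite_to_regex is hypothetical and is not taken from the paper or from the IRED database.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-like motif into a Python regular expression.

    Assumed notation: x = any residue, x(n) = n residues,
    [ABC] = one of A/B/C, {K} = any residue except K.
    Whitespace and '-' separators are ignored.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "x":
            out.append(".")
            i += 1
        elif ch == "[":
            j = pattern.index("]", i)
            out.append(pattern[i:j + 1])
            i = j + 1
        elif ch == "{":
            j = pattern.index("}", i)
            out.append("[^" + pattern[i + 1:j] + "]")
            i = j + 1
        elif ch == "(":
            j = pattern.index(")", i)
            out.append("{" + pattern[i + 1:j].strip() + "}")
            i = j + 1
        elif ch.isspace() or ch == "-":
            i += 1
        else:
            out.append(ch)  # literal residue
            i += 1
    return "".join(out)


# The two motifs as given in the abstract.
COFACTOR_MOTIF = "GLGxMGx(5)[ATS]x(4)Gx(4)[VIL]WNR[TS]x(2)[KR]"
ACTIVE_SITE_MOTIF = "Gx[DE]x[GDA]x[APS]x(3){K}x[ASL]x[LMVIAG]"

COFACTOR_RE = re.compile(prosite_to_regex(COFACTOR_MOTIF))
ACTIVE_SITE_RE = re.compile(prosite_to_regex(ACTIVE_SITE_MOTIF))
```

    A sequence can then be screened with COFACTOR_RE.search(seq) and ACTIVE_SITE_RE.search(seq); a hit only indicates a pattern match and says nothing about catalytic activity.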

  16. Genomic analysis of membrane protein families: abundance and conserved motifs

    PubMed Central

    Liu, Yang; Engelman, Donald M; Gerstein, Mark

    2002-01-01

    Background Polytopic membrane proteins can be related to each other on the basis of the number of transmembrane helices and sequence similarities. Building on the Pfam classification of protein domain families, and using transmembrane-helix prediction and sequence-similarity searching, we identified a total of 526 well-characterized membrane protein families in 26 recently sequenced genomes. To this we added a clustering of a number of predicted but unclassified membrane proteins, resulting in a total of 637 membrane protein families. Results Analysis of the occurrence and composition of these families revealed several interesting trends. The number of assigned membrane protein domains has an approximately linear relationship to the total number of open reading frames (ORFs) in 26 genomes studied. Caenorhabditis elegans is an apparent outlier, because of its high representation of seven-span transmembrane (7-TM) chemoreceptor families. In all genomes, including that of C. elegans, the number of distinct membrane protein families has a logarithmic relation to the number of ORFs. Glycine, proline, and tyrosine locations tend to be conserved in transmembrane regions within families, whereas isoleucine, valine, and methionine locations are relatively mutable. Analysis of motifs in putative transmembrane helices reveals that GxxxG and GxxxxxxG (which can be written GG4 and GG7, respectively; see Materials and methods) are among the most prevalent. This was noted in earlier studies; we now find these motifs are particularly well conserved in families, however, especially those corresponding to transporters, symporters, and channels. Conclusions We carried out a genome-wide analysis on patterns of the classified polytopic membrane protein families and analyzed the distribution of conserved amino acids and motifs in the transmembrane helix regions in these families. PMID:12372142
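
    The pair notation mentioned in this abstract (GG4 for GxxxG, GG7 for GxxxxxxG) describes two residues at positions i and i+4 or i+7. A minimal sketch of how such pairs could be located in a predicted transmembrane helix follows; the function name and the example helix are hypothetical, and this is not the authors' pipeline.

```python
def find_pair_motif(seq: str, first: str, second: str, spacing: int) -> list:
    """Return 0-based positions i where seq[i] == first and
    seq[i + spacing] == second (e.g. G, G, 4 for GxxxG, written GG4)."""
    return [
        i
        for i in range(len(seq) - spacing)
        if seq[i] == first and seq[i + spacing] == second
    ]


# Hypothetical helix segment, for illustration only.
helix = "LLGAVAGLLIG"
gg4_sites = find_pair_motif(helix, "G", "G", 4)  # GxxxG
gg7_sites = find_pair_motif(helix, "G", "G", 7)  # GxxxxxxG
```

    Comparing positions directly, rather than using a regular expression, reports overlapping occurrences such as two consecutive GxxxG motifs that share a glycine.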

  17. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

    SciTech Connect

    Minovitsky, Simon; Stegmaier, Philip; Kel, Alexander; Kondrashov, Alexey S.; Dubchak, Inna

    2007-02-21

    Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by patterns in

  18. Evolving DNA motifs to predict GeneChip probe performance

    PubMed Central

    Langdon, WB; Harrison, AP

    2009-01-01

    Background Affymetrix High Density Oligonucleotide Arrays (HDONA) simultaneously measure expression of thousands of genes using millions of probes. We use correlations between measurements for the same gene across 6685 human tissue samples from NCBI's GEO database to indicate the quality of individual HG-U133A probes. Low correlation indicates a poor probe. Results Regular expressions can be automatically created from a Backus-Naur form (BNF) context-free grammar using strongly typed genetic programming. Conclusion The automatically produced motif is better at predicting poor DNA sequences than an existing human generated RE, suggesting runs of Cytosine and Guanine and mixtures should all be avoided. PMID:19298675

  19. Nucleic Acid i-Motif Structures in Analytical Chemistry.

    PubMed

    Alba, Joan Josep; Sadurní, Anna; Gargallo, Raimundo

    2016-09-02

    Under the appropriate experimental conditions of pH and temperature, cytosine-rich segments in DNA or RNA sequences may produce a characteristic folded structure known as an i-motif. Besides its potential role in vivo, which is still under investigation, this structure has attracted increasing interest in other fields due to its sharp, fast and reversible pH-driven conformational changes. This "on/off" switch at molecular level is being used in nanotechnology and analytical chemistry to develop nanomachines and sensors, respectively. This paper presents a review of the latest applications of this structure in the field of chemical analysis.

  20. Dysprosium-carboxylate nanomeshes with tunable cavity size and assembly motif through ionic interactions.

    PubMed

    Cirera, B; Đorđević, L; Otero, R; Gallego, J M; Bonifazi, D; Miranda, R; Ecija, D

    2016-09-28

    We report the design of dysprosium directed metallo-supramolecular architectures on a pristine Cu(111) surface. By an appropriate selection of the ditopic molecular linkers equipped with terminal carboxylic groups (TPA, PDA and TDA species), we create reticular and mononuclear metal-organic nanomeshes of tunable internodal distance, which are stabilized by eight-fold Dy···O interactions. A thermal annealing treatment for the reticular Dy:TDA architecture gives rise to an unprecedented quasi-hexagonal nanostructure based on dinuclear Dy clusters, exhibiting a unique six-fold Dy···O bonding motif. All metallo-supramolecular architectures are stable at room temperature. Our results open new avenues for the engineering of supramolecular architectures on surfaces incorporating f-block elements forming thermally robust nanoarchitectures through ionic bonds.

  1. Fast and Accurate Discovery of Degenerate Linear Motifs in Protein Sequences

    PubMed Central

    Levy, Emmanuel D.; Michnick, Stephen W.

    2014-01-01

    Linear motifs mediate a wide variety of cellular functions, which makes their characterization in protein sequences crucial to understanding cellular systems. However, the short length and degenerate nature of linear motifs make their discovery a difficult problem. Here, we introduce MotifHound, an algorithm particularly suited for the discovery of small and degenerate linear motifs. MotifHound performs an exact and exhaustive enumeration of all motifs present in proteins of interest, including all of their degenerate forms, and scores the overrepresentation of each motif based on its occurrence in proteins of interest relative to a background (e.g., proteome) using the hypergeometric distribution. To assess MotifHound, we benchmarked it together with state-of-the-art algorithms. The benchmark consists of 11,880 sets of proteins from S. cerevisiae; in each set, we artificially spiked-in one motif varying in terms of three key parameters, (i) number of occurrences, (ii) length and (iii) the number of degenerate or “wildcard” positions. The benchmark enabled the evaluation of the impact of these three properties on the performance of the different algorithms. The results showed that MotifHound and SLiMFinder were the most accurate in detecting degenerate linear motifs. Interestingly, MotifHound was 15 to 20 times faster at comparable accuracy and performed best in the discovery of highly degenerate motifs. We complemented the benchmark by an analysis of proteins experimentally shown to bind the FUS1 SH3 domain from S. cerevisiae. Using the full-length protein partners as sole information, MotifHound recapitulated most experimentally determined motifs binding to the FUS1 SH3 domain. Moreover, these motifs exhibited properties typical of SH3 binding peptides, e.g., high intrinsic disorder and evolutionary conservation, despite the fact that none of these properties were used as prior information. MotifHound is available (http://michnick.bcm.umontreal.ca or http
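
    The overrepresentation scoring described in this abstract, motif occurrence in proteins of interest relative to a background using the hypergeometric distribution, can be illustrated with a short sketch. It assumes the proteins of interest are a subset of the background and treats a motif with wildcard positions as a regular expression. The function names and toy sequences are hypothetical, and MotifHound's exhaustive enumeration of motifs is not reproduced here.

```python
import re
from math import comb


def hypergeom_tail(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when drawing n proteins from a background of N,
    of which K contain the motif."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def motif_enrichment(foreground, background, motif_regex):
    """Count proteins containing the motif in both sets and return
    (foreground count, hypergeometric p-value).
    Assumes every foreground protein is also in the background."""
    pattern = re.compile(motif_regex)
    N = len(background)
    K = sum(1 for s in background if pattern.search(s))
    n = len(foreground)
    k = sum(1 for s in foreground if pattern.search(s))
    return k, hypergeom_tail(k, N, K, n)


# Toy data, for illustration only: two proteins carry P..P.
background = ["MKPAAPLL", "QQPTTPAA"] + ["MKLLAAGG"] * 8
foreground = background[:2]
count, p_value = motif_enrichment(foreground, background, r"P..P")
```

    Each protein is counted once regardless of how many times the motif occurs in it, which matches the per-protein occurrence wording of the abstract but is still an assumption about the exact statistic.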

  2. Arsenite Interacts Selectively with Zinc Finger Proteins Containing C3H1 or C4 Motifs*

    PubMed Central

    Zhou, Xixi; Sun, Xi; Cooper, Karen L.; Wang, Feng; Liu, Ke Jian; Hudson, Laurie G.

    2011-01-01

    Arsenic inhibits DNA repair and enhances the genotoxicity of DNA-damaging agents such as benzo[a]pyrene and ultraviolet radiation. Arsenic interaction with DNA repair proteins containing functional zinc finger motifs is one proposed mechanism to account for these observations. Here, we report that arsenite binds to both CCHC DNA-binding zinc fingers of the DNA repair protein PARP-1 (poly(ADP-ribose) polymerase-1). Furthermore, trivalent arsenite coordinated with all three cysteine residues as demonstrated by MS/MS. MALDI-TOF-MS analysis of peptides harboring site-directed substitutions of cysteine with histidine residues within the PARP-1 zinc finger revealed that arsenite bound to peptides containing three or four cysteine residues, but not to peptides with two cysteines, demonstrating arsenite binding selectivity. This finding was not unique to PARP-1; arsenite did not bind to a peptide representing the CCHH zinc finger of the DNA repair protein aprataxin, but did bind to an aprataxin peptide mutated to a CCHC zinc finger. To investigate the impact of arsenite on PARP-1 zinc finger function, we measured the zinc content and DNA-binding capacity of PARP-1 immunoprecipitated from arsenite-exposed cells. PARP-1 zinc content and DNA binding were decreased by 76 and 80%, respectively, compared with protein isolated from untreated cells. We observed comparable decreases in zinc content for XPA (xeroderma pigmentosum group A) protein (CCCC zinc finger), but not SP-1 (specificity protein-1) or aprataxin (CCHH zinc finger). These findings demonstrate that PARP-1 is a direct molecular target of arsenite and that arsenite interacts selectively with zinc finger motifs containing three or more cysteine residues. PMID:21550982

  3. Single-molecule study of thymidine glycol and i-motif through the alpha-hemolysin ion channel

    NASA Astrophysics Data System (ADS)

    He, Lidong

    Nanopore-based devices have emerged as a single-molecule detection and analysis tool for a wide range of applications. Through electrophoretically driving DNA molecules across a nanosized pore, a lot of information can be received, including unfolding kinetics and DNA-protein interactions. This single-molecule method has the potential to sequence kilobase length DNA polymers without amplification or labeling, approaching "the third generation" genome sequencing for around $1000 within 24 hours. alpha-Hemolysin biological nanopores have the advantages of excellent stability, low-noise level, and precise site-directed mutagenesis for engineering this protein nanopore. The first work presented in this thesis established the current signal of the thymidine glycol lesion in DNA oligomers through an immobilization experiment. The thymidine glycol enantiomers were differentiated from each other by different current blockage levels. Also, the effect of bulky hydrophobic adducts to the current blockage was investigated. Secondly, the alpha-hemolysin nanopore was used to study the human telomere i-motif and RET oncogene i-motif at a single-molecule level. In Chapter 3, it was demonstrated that the alpha-hemolysin nanopore can differentiate an i-motif form and single-strand DNA form at different pH values based on the same sequence. In addition, it shows potential to differentiate the folding topologies generated from the same DNA sequence.

  4. A conserved motif in JNK/p38-specific MAPK phosphatases as a determinant for JNK1 recognition and inactivation

    PubMed Central

    Liu, Xin; Zhang, Chen-Song; Lu, Chang; Lin, Sheng-Cai; Wu, Jia-Wei; Wang, Zhi-Xin

    2016-01-01

    Mitogen-activated protein kinases (MAPKs), important in a large array of signalling pathways, are tightly controlled by a cascade of protein kinases and by MAPK phosphatases (MKPs). MAPK signalling efficiency and specificity is modulated by protein–protein interactions between individual MAPKs and the docking motifs in cognate binding partners. Two types of docking interactions have been identified: D-motif-mediated interaction and FXF-docking interaction. Here we report the crystal structure of JNK1 bound to the catalytic domain of MKP7 at 2.4-Å resolution, providing high-resolution structural insight into the FXF-docking interaction. The 285FNFL288 segment in MKP7 directly binds to a hydrophobic site on JNK1 that is near the MAPK insertion and helix αG. Biochemical studies further reveal that this highly conserved structural motif is present in all members of the MKP family, and the interaction mode is universal and critical for the MKP-MAPK recognition and biological function. PMID:26988444

  5. Sarcosine and betaine crystals upon cooling: structural motifs unstable at high pressure become stable at low temperatures.

    PubMed

    Kapustin, E A; Minkov, V S; Boldyreva, E V

    2015-02-07

    The crystal structures of N-methyl derivatives of the simplest amino acid glycine, namely sarcosine (C3H7NO2) and betaine (C5H11NO2), were studied upon cooling by single-crystal X-ray diffraction and single-crystal polarized Raman spectroscopy. The effects of decreasing temperature and increasing hydrostatic pressure on the crystal structures were compared. In particular, we have studied the behavior upon cooling of those structural motifs in the crystals, which are involved in structural rearrangement during pressure-induced phase transitions. In contrast to their high sensitivity to hydrostatic compression, the crystals of both sarcosine and betaine are stable to cooling down to 5 K. Similarly to most α-amino acids, the crystal structures of the two compounds are most rigid upon cooling in the direction of the main structural motif, namely head-to-tail chains (linked via the strongest N-H···O hydrogen bonds and dipole-dipole interactions in the case of sarcosine, or exclusively by dipole-dipole interactions in the case of betaine). The anisotropy of linear strain in betaine does not differ much upon cooling and on hydrostatic compression, whereas this is not the case for sarcosine. Although the interactions between certain structural motifs in sarcosine and betaine weaken as a result of phase transitions induced by pressure, the same interactions strengthen when volume reduction results from cooling.

  6. Structure of the central RNA recognition motif of human TIA-1 at 1.95 A resolution

    SciTech Connect

    Kumar, Amit O.; Swenson, Matthew C.; Benning, Matthew M.; Kielkopf, Clara L.

    2008-03-21

    T-cell-restricted intracellular antigen-1 (TIA-1) regulates alternative pre-mRNA splicing in the nucleus, and mRNA translation in the cytoplasm, by recognizing uridine-rich sequences of RNAs. As a step towards understanding RNA recognition by this regulatory factor, the X-ray structure of the central RNA recognition motif (RRM2) of human TIA-1 is presented at 1.95 A resolution. Comparison with structurally homologous RRM-RNA complexes identifies residues at the RNA interfaces that are conserved in TIA-1-RRM2. The versatile capability of RNP motifs to interact with either proteins or RNA is reinforced by symmetry-related protein-protein interactions mediated by the RNP motifs of TIA-1-RRM2. Importantly, the TIA-1-RRM2 structure reveals the locations of mutations responsible for inhibiting nuclear import. In contrast with previous assumptions, the mutated residues are buried within the hydrophobic interior of the domain, where they would be likely to destabilize the RRM fold rather than directly inhibit RNA binding.

  7. Characterization of the GXXXG motif in the first transmembrane segment of Japanese encephalitis virus precursor membrane (prM) protein.

    PubMed

    Lin, Ying-Ju; Peng, Jia-Guan; Wu, Suh-Chin

    2010-05-24

    The interaction between prM and E proteins in flavivirus-infected cells is a major driving force for the assembly of flavivirus particles. We used site-directed mutagenesis to study the potential role of the transmembrane domains of the prM proteins of Japanese encephalitis virus (JEV) in prM-E heterodimerization as well as subviral particle formation. Alanine insertion scanning mutagenesis within the GXXXG motif in the first transmembrane segment of JEV prM protein affected the prM-E heterodimerization; its specificity was confirmed by replacing the two glycines of the GXXXG motif with alanine, leucine and valine. The GXXXG motif was found to be conserved in the JEV serocomplex viruses but not other flavivirus groups. These mutants with alanine inserted in the two prM transmembrane segments all impaired subviral particle formation in cell cultures. The prM transmembrane domains of JEV may play important roles in prM-E heterodimerization and viral particle assembly.

  8. SLiMDisc: short, linear motif discovery, correcting for common evolutionary descent

    PubMed Central

    Davey, Norman E.; Shields, Denis C.; Edwards, Richard J.

    2006-01-01

    Many important interactions of proteins are facilitated by short, linear motifs (SLiMs) within a protein's primary sequence. Our aim was to establish robust methods for discovering putative functional motifs. The strongest evidence for such motifs is obtained when the same motifs occur in unrelated proteins, evolving by convergence. In practise, searches for such motifs are often swamped by motifs shared in related proteins that are identical by descent. Predictions of motifs among sets of biologically related proteins, including those both with and without detectable similarity, were made using the TEIRESIAS algorithm. The number of motif occurrences arising through common evolutionary descent was normalized based on treatment of BLAST local alignments. Motifs were ranked according to a score derived from the product of the normalized number of occurrences and the information content. The method was shown to significantly outperform methods that do not discount evolutionary relatedness, when applied to known SLiMs from a subset of the eukaryotic linear motif (ELM) database. An implementation of Multiple Spanning Tree weighting outperformed two other weighting schemes, in a variety of settings. PMID:16855291

  9. Systematic discovery and characterization of regulatory motifs in ENCODE TF binding experiments

    PubMed Central

    Kheradpour, Pouya; Kellis, Manolis

    2014-01-01

    Recent advances in technology have led to a dramatic increase in the number of available transcription factor ChIP-seq and ChIP-chip data sets. Understanding the motif content of these data sets is an important step in understanding the underlying mechanisms of regulation. Here we provide a systematic motif analysis for 427 human ChIP-seq data sets using motifs curated from the literature and also discovered de novo using five established motif discovery tools. We use a systematic pipeline for calculating motif enrichment in each data set, providing a principled way for choosing between motif variants found in the literature and for flagging potentially problematic data sets. Our analysis confirms the known specificity of 41 of the 56 analyzed factor groups and reveals motifs of potential cofactors. We also use cell type-specific binding to find factors active in specific conditions. The resource we provide is accessible both for browsing a small number of factors and for performing large-scale systematic analyses. We provide motif matrices, instances and enrichments in each of the ENCODE data sets. The motifs discovered here have been used in parallel studies to validate the specificity of antibodies, understand cooperativity between data sets and measure the variation of motif binding across individuals and species. PMID:24335146

  10. Small yet effective: the ethylene responsive element binding factor-associated amphiphilic repression (EAR) motif.

    PubMed

    Kagale, Sateesh; Rozwadowski, Kevin

    2010-06-01

    The Ethylene-responsive element binding factor-associated Amphiphilic Repression (EAR) motif is a small yet distinct regulatory motif that is conserved in many plant transcriptional regulator (TR) proteins associated with diverse biological functions. We have previously established a list of high-confidence Arabidopsis EAR repressors, the EAR repressome, comprising 219 TRs belonging to 21 different TR families. This class of proteins and the sequence context of the EAR motif exhibited a high degree of conservation across evolutionarily diverse plant species. Our comprehensive genome-wide analysis enabled refining EAR motifs as comprising either LxLxL or DLNxxP. Comparing the representation of these sequence signatures in TRs to that of other repressor motifs we show that the EAR motif is the one most frequently represented, detected in 10 to 25% of the TRs from diverse plant species. The mechanisms involved in regulation of EAR motif function and the cellular fates of EAR repressors are currently not well understood. Our earlier analysis had implicated amino acid residues flanking the EAR motifs in regulation of their functionality. Here, we present additional evidence supporting possible regulation of EAR motif function by phosphorylation of integral or adjacent Ser and/or Thr residues. Additionally, we discuss potential novel roles of EAR motifs in plant-pathogen interaction and processes other than transcriptional repression.
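
    The two sequence signatures named in this abstract, LxLxL and DLNxxP, can be searched for with a single regular expression. The sketch below assumes x stands for any residue; the function name and example sequence are hypothetical. A pattern match alone does not establish repressor function.

```python
import re

# Lookahead so that overlapping occurrences (e.g. LxLxLxL) are all reported.
EAR_PATTERN = re.compile(r"(?=(L.L.L|DLN..P))")


def find_ear_motifs(protein_seq: str) -> list:
    """Return (0-based start, matched text) for every LxLxL or DLNxxP site."""
    return [(m.start(), m.group(1)) for m in EAR_PATTERN.finditer(protein_seq)]


# Hypothetical sequence, for illustration only.
sites = find_ear_motifs("MDLNAAPKKLELHLQ")
```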

  11. Powerful Identification of Cis-regulatory SNPs in Human Primary Monocytes Using Allele-Specific Gene Expression

    PubMed Central

    Almlöf, Jonas Carlsson; Lundmark, Per; Lundmark, Anders; Ge, Bing; Maouche, Seraya; Göring, Harald H. H.; Liljedahl, Ulrika; Enström, Camilla; Brocheton, Jessy; Proust, Carole; Godefroy, Tiphaine; Sambrook, Jennifer G.; Jolley, Jennifer; Crisp-Hihn, Abigail; Foad, Nicola; Lloyd-Jones, Heather; Stephens, Jonathan; Gwilliam, Rhian; Rice, Catherine M.; Hengstenberg, Christian; Samani, Nilesh J.; Erdmann, Jeanette; Schunkert, Heribert; Pastinen, Tomi; Deloukas, Panos; Goodall, Alison H.; Ouwehand, Willem H.; Cambien, François; Syvänen, Ann-Christine

    2012-01-01

    A large number of genome-wide association studies have been performed during the past five years to identify associations between SNPs and human complex diseases and traits. The assignment of a functional role for the identified disease-associated SNP is not straight-forward. Genome-wide expression quantitative trait locus (eQTL) analysis is frequently used as the initial step to define a function while allele-specific gene expression (ASE) analysis has not yet gained a wide-spread use in disease mapping studies. We compared the power to identify cis-acting regulatory SNPs (cis-rSNPs) by genome-wide allele-specific gene expression (ASE) analysis with that of traditional expression quantitative trait locus (eQTL) mapping. Our study included 395 healthy blood donors for whom global gene expression profiles in circulating monocytes were determined by Illumina BeadArrays. ASE was assessed in a subset of these monocytes from 188 donors by quantitative genotyping of mRNA using a genome-wide panel of SNP markers. The performance of the two methods for detecting cis-rSNPs was evaluated by comparing associations between SNP genotypes and gene expression levels in sample sets of varying size. We found that up to 8-fold more samples are required for eQTL mapping to reach the same statistical power as that obtained by ASE analysis for the same rSNPs. The performance of ASE is insensitive to SNPs with low minor allele frequencies and detects a larger number of significantly associated rSNPs using the same sample size as eQTL mapping. An unequivocal conclusion from our comparison is that ASE analysis is more sensitive for detecting cis-rSNPs than standard eQTL mapping. Our study shows the potential of ASE mapping in tissue samples and primary cells which are difficult to obtain in large numbers. PMID:23300628

  12. Cis-Regulatory Control of the Nuclear Receptor Coup-TF Gene in the Sea Urchin Paracentrotus lividus Embryo

    PubMed Central

    Kalampoki, Lamprini G.; Flytzanis, Constantin N.

    2014-01-01

    Coup-TF, an orphan member of the nuclear receptor super family, has a fundamental role in the development of metazoan embryos. The study of the gene's regulatory circuit in the sea urchin embryo will facilitate the placement of this transcription factor in the well-studied embryonic Gene Regulatory Network (GRN). The Paracentrotus lividus Coup-TF gene (PlCoup-TF) is expressed throughout embryonic development preferentially in the oral ectoderm of the gastrula and the ciliary band of the pluteus stage. Two overlapping λ genomic clones, containing three exons and upstream sequences of PlCoup-TF, were isolated from a genomic library. The transcription initiation site was determined and 5′ deletions and individual segments of a 1930 bp upstream region were placed ahead of a GFP reporter cassette and injected into fertilized P.lividus eggs. Module a (−532 to −232), was necessary and sufficient to confer ciliary band expression to the reporter. Comparison of P.lividus and Strongylocentrotus purpuratus upstream Coup-TF sequences, revealed considerable conservation, but none within module a. 5′ and internal deletions into module a, defined a smaller region that confers ciliary band specific expression. Putative regulatory cis-acting elements (RE1, RE2 and RE3) within module a, were specifically bound by proteins in sea urchin embryonic nuclear extracts. Site-specific mutagenesis of these elements resulted in loss of reporter activity (RE1) or ectopic expression (RE2, RE3). It is proposed that sea urchin transcription factors, which bind these three regulatory sites, are necessary for spatial and quantitative regulation of the PlCoup-TF gene at pluteus stage sea urchin embryos. These findings lead to the future identification of these factors and to the hierarchical positioning of PlCoup-TF within the embryonic GRN. PMID:25386650

  13. Cis-regulatory control of the nuclear receptor Coup-TF gene in the sea urchin Paracentrotus lividus embryo.

    PubMed

    Kalampoki, Lamprini G; Flytzanis, Constantin N

    2014-01-01

    Coup-TF, an orphan member of the nuclear receptor super family, has a fundamental role in the development of metazoan embryos. The study of the gene's regulatory circuit in the sea urchin embryo will facilitate the placement of this transcription factor in the well-studied embryonic Gene Regulatory Network (GRN). The Paracentrotus lividus Coup-TF gene (PlCoup-TF) is expressed throughout embryonic development preferentially in the oral ectoderm of the gastrula and the ciliary band of the pluteus stage. Two overlapping λ genomic clones, containing three exons and upstream sequences of PlCoup-TF, were isolated from a genomic library. The transcription initiation site was determined and 5' deletions and individual segments of a 1930 bp upstream region were placed ahead of a GFP reporter cassette and injected into fertilized P.lividus eggs. Module a (-532 to -232), was necessary and sufficient to confer ciliary band expression to the reporter. Comparison of P.lividus and Strongylocentrotus purpuratus upstream Coup-TF sequences, revealed considerable conservation, but none within module a. 5' and internal deletions into module a, defined a smaller region that confers ciliary band specific expression. Putative regulatory cis-acting elements (RE1, RE2 and RE3) within module a, were specifically bound by proteins in sea urchin embryonic nuclear extracts. Site-specific mutagenesis of these elements resulted in loss of reporter activity (RE1) or ectopic expression (RE2, RE3). It is proposed that sea urchin transcription factors, which bind these three regulatory sites, are necessary for spatial and quantitative regulation of the PlCoup-TF gene at pluteus stage sea urchin embryos. These findings lead to the future identification of these factors and to the hierarchical positioning of PlCoup-TF within the embryonic GRN.

  14. Detection and Visualization of Compositionally Similar cis-Regulatory Element Clusters in Orthologous and Coordinately Controlled Genes

    PubMed Central

    Jegga, Anil G.; Sherwood, Shawn P.; Carman, James W.; Pinski, Andrew T.; Phillips, Jerry L.; Pestian, John P.; Aronow, Bruce J.

    2002-01-01

    Evolutionarily conserved noncoding genomic sequences represent a potentially rich source for the discovery of gene regulatory regions. However, detecting and visualizing compositionally similar cis-element clusters in the context of conserved sequences is challenging. We have explored potential solutions and developed an algorithm and visualization method that combines the results of conserved sequence analyses (BLASTZ) with those of transcription factor binding site analyses (MatInspector) (http://trafac.chmcc.org). We define hits as the density of co-occurring cis-element transcription factor (TF)-binding sites measured within a 200-bp moving average window through phylogenetically conserved regions. The results are depicted as a Regulogram, in which the hit count is plotted as a function of position within each of the two genomic regions of the aligned orthologs. Within a high-scoring region, the relative arrangement of shared cis-elements within compositionally similar TF-binding site clusters is depicted in a Trafacgram. On the basis of analyses of several training data sets, the approach also allows for the detection of similarities in composition and relative arrangement of cis-element clusters within nonorthologous genes, promoters, and enhancers that exhibit coordinate regulatory properties. Known functional regulatory regions of nonorthologous and less-conserved orthologous genes frequently showed cis-element shuffling, demonstrating that compositional similarity can be more sensitive than sequence similarity. These results show that combining sequence similarity with cis-element compositional similarity provides a powerful aid for the identification of potential control regions. PMID:12213778
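The windowed hit density described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the shared TF-binding site positions have already been obtained (for example from BLASTZ and MatInspector output), and the function names, the half-open window convention, and the sample coordinates are all hypothetical.

```python
from bisect import bisect_left


def window_hit_counts(site_positions, region_length, window=200, step=1):
    """Count binding sites in each window [start, start + window).

    site_positions: 0-based coordinates of shared TF-binding sites
    (hypothetical input; the abstract does not specify a data format).
    Returns a list of (window_start, hit_count) tuples.
    """
    positions = sorted(site_positions)
    counts = []
    last_start = max(region_length - window, 0)
    for start in range(0, last_start + 1, step):
        # Binary search gives the index range of sites inside the window.
        lo = bisect_left(positions, start)
        hi = bisect_left(positions, start + window)
        counts.append((start, hi - lo))
    return counts


def best_window(counts):
    """Return the (window_start, hit_count) pair with the most hits."""
    return max(counts, key=lambda item: item[1])


if __name__ == "__main__":
    # Made-up site coordinates in a 1000-bp conserved region.
    sites = [10, 50, 120, 400, 410, 415, 500, 590]
    profile = window_hit_counts(sites, region_length=1000, window=200, step=100)
    for start, hits in profile:
        print(start, hits)
    print("densest window:", best_window(profile))
```

Plotting the hit count against window position for each of the two aligned orthologs would give a profile in the spirit of the Regulogram; the real tool may weight or normalize hits differently.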

  15. Detection and visualization of compositionally similar cis-regulatory element clusters in orthologous and coordinately controlled genes.

    PubMed

    Jegga, Anil G; Sherwood, Shawn P; Carman, James W; Pinski, Andrew T; Phillips, Jerry L; Pestian, John P; Aronow, Bruce J

    2002-09-01

    Evolutionarily conserved noncoding genomic sequences represent a potentially rich source for the discovery of gene regulatory regions. However, detecting and visualizing compositionally similar cis-element clusters in the context of conserved sequences is challenging. We have explored potential solutions and developed an algorithm and visualization method that combines the results of conserved sequence analyses (BLASTZ) with those of transcription factor binding site analyses (MatInspector) (http://trafac.chmcc.org). We define hits as the density of co-occurring cis-element transcription factor (TF)-binding sites measured within a 200-bp moving average window through phylogenetically conserved regions. The results are depicted as a Regulogram, in which the hit count is plotted as a function of position within each of the two genomic regions of the aligned orthologs. Within a high-scoring region, the relative arrangement of shared cis-elements within compositionally similar TF-binding site clusters is depicted in a Trafacgram. On the basis of analyses of several training data sets, the approach also allows for the detection of similarities in composition and relative arrangement of cis-element clus