cis-regulatory motif directs: Topics by Science.gov

Sample records for cis-regulatory motif directs

A cis-regulatory module activating transcription in the suspensor contains five cis-regulatory elements

DOE PAGES

Henry, Kelli F.; Kawashima, Tomokazu; Goldberg, Robert B.

2015-03-22

Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the Scarlet Runner Bean (Phaseolus coccineus) G564 gene in order to understand how genes are activated specifically in the suspensor during early embryo development. Previously, we showed that a 54-bp fragment of the G564 upstream region is sufficient for suspensor transcription and contains at least three required cis-regulatory sequences, including the 10-bp motif (5'-GAAAAGCGAA-3'), the 10 bp-like motif (5'-GAAAAACGAA-3'), and Region 2 motif (partial sequence 5'-TTGGT-3'). Here, we use site-directed mutagenesis experiments in transgenic tobacco globular-stage embryos to identify two additional cis-regulatory elements within the 54-bp cis-regulatory module that are required for G564 suspensor transcription: the Fifth motif (5'-GAGTTA-3') and a third 10-bp-related sequence (5'-GAAAACCACA-3'). Further deletion of the 54-bp fragment revealed that a 47-bp fragment containing the five motifs (the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs) is sufficient for suspensor transcription, and represents a cis-regulatory module. A consensus sequence for each type of motif was determined by comparing motif sequences shown to activate suspensor transcription. Phylogenetic analyses suggest that the regulation of G564 is evolutionarily conserved. Lastly, a homologous cis-regulatory module was found upstream of the G564 ortholog in the Common Bean (Phaseolus vulgaris), indicating that the regulation of G564 is evolutionarily conserved in closely related bean species.
A cis-regulatory module activating transcription in the suspensor contains five cis-regulatory elements

DOE Office of Scientific and Technical Information (OSTI.GOV)

Henry, Kelli F.; Kawashima, Tomokazu; Goldberg, Robert B.

Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the Scarlet Runner Bean (Phaseolus coccineus) G564 gene in order to understand how genes are activated specifically in the suspensor during early embryo development. Previously, we showed that a 54-bp fragment of the G564 upstream region is sufficient for suspensor transcription and contains at least three required cis-regulatory sequences, including the 10-bp motif (5'-GAAAAGCGAA-3'), the 10 bp-like motif (5'-GAAAAACGAA-3'), and Region 2 motif (partial sequence 5'-TTGGT-3'). Here, we use site-directed mutagenesis experiments in transgenic tobacco globular-stage embryos to identify two additional cis-regulatory elements within the 54-bp cis-regulatory module that are required for G564 suspensor transcription: the Fifth motif (5'-GAGTTA-3') and a third 10-bp-related sequence (5'-GAAAACCACA-3'). Further deletion of the 54-bp fragment revealed that a 47-bp fragment containing the five motifs (the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs) is sufficient for suspensor transcription, and represents a cis-regulatory module. A consensus sequence for each type of motif was determined by comparing motif sequences shown to activate suspensor transcription. Phylogenetic analyses suggest that the regulation of G564 is evolutionarily conserved. Lastly, a homologous cis-regulatory module was found upstream of the G564 ortholog in the Common Bean (Phaseolus vulgaris), indicating that the regulation of G564 is evolutionarily conserved in closely related bean species.
A cis-regulatory module activating transcription in the suspensor contains five cis-regulatory elements.

PubMed

Henry, Kelli F; Kawashima, Tomokazu; Goldberg, Robert B

2015-06-01

Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the Scarlet Runner Bean (Phaseolus coccineus) G564 gene in order to understand how genes are activated specifically in the suspensor during early embryo development. Previously, we showed that a 54-bp fragment of the G564 upstream region is sufficient for suspensor transcription and contains at least three required cis-regulatory sequences, including the 10-bp motif (5'-GAAAAGCGAA-3'), the 10 bp-like motif (5'-GAAAAACGAA-3'), and Region 2 motif (partial sequence 5'-TTGGT-3'). Here, we use site-directed mutagenesis experiments in transgenic tobacco globular-stage embryos to identify two additional cis-regulatory elements within the 54-bp cis-regulatory module that are required for G564 suspensor transcription: the Fifth motif (5'-GAGTTA-3') and a third 10-bp-related sequence (5'-GAAAACCACA-3'). Further deletion of the 54-bp fragment revealed that a 47-bp fragment containing the five motifs (the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs) is sufficient for suspensor transcription, and represents a cis-regulatory module. A consensus sequence for each type of motif was determined by comparing motif sequences shown to activate suspensor transcription. Phylogenetic analyses suggest that the regulation of G564 is evolutionarily conserved. A homologous cis-regulatory module was found upstream of the G564 ortholog in the Common Bean (Phaseolus vulgaris), indicating that the regulation of G564 is evolutionarily conserved in closely related bean species.
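As a rough illustration of how the five motifs named in this abstract could be located in a candidate upstream sequence, the sketch below does an exact-match scan on both strands. The motif strings are quoted from the abstract; the function names, the dictionary layout and the toy sequence are assumptions made for this example and are not the authors' method, which relied on site-directed mutagenesis rather than string matching.

```python
# Motif sequences are quoted from the abstract above; all names are illustrative.
G564_MOTIFS = {
    "10-bp": "GAAAAGCGAA",
    "10-bp-like": "GAAAAACGAA",
    "10-bp-related": "GAAAACCACA",
    "Region 2 (partial)": "TTGGT",
    "Fifth": "GAGTTA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motifs(sequence, motifs=G564_MOTIFS):
    """Return sorted (start, strand, name) tuples for every exact match.

    start is 0-based on the forward strand; strand "-" means the
    reverse complement of the motif was found at that position.
    """
    sequence = sequence.upper()
    hits = []
    for name, motif in motifs.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = sequence.find(pattern)
            while start != -1:
                hits.append((start, strand, name))
                # Step by one so overlapping occurrences are also reported.
                start = sequence.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical toy sequence: one forward 10-bp motif, one reverse-strand Fifth motif.
toy = "CC" + "GAAAAGCGAA" + "TT" + reverse_complement("GAGTTA") + "AA"
hits = find_motifs(toy)
```

Exact matching is the simplest possible choice here; the abstract reports that a consensus was derived for each motif type, so a real scan would more plausibly use degenerate patterns or position weight matrices.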
Comparative genomics of metabolic capacities of regulons controlled by cis-regulatory RNA motifs in bacteria.

PubMed

Sun, Eric I; Leyn, Semen A; Kazanov, Marat D; Saier, Milton H; Novichkov, Pavel S; Rodionov, Dmitry A

2013-09-02

In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels. An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and evolutionary analysis. The obtained genome
An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes.

PubMed

Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin

2016-08-09

Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, has several difficulties for optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to Escherichia coli k12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance
On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

NASA Astrophysics Data System (ADS)

Tarpine, Ryan; Istrail, Sorin

The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.
CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining

PubMed Central

Navarro, Carmen; Lopez, Francisco J.; Cano, Carlos; Garcia-Alcalde, Fernando; Blanco, Armando

2014-01-01

Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools allow the detection of significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) scope limited to promoter or conserved regions of the genome; 2) inability to identify combinations involving more than two motifs; 3) a requirement for prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner allows a blind search of CRMs without any prior information about target CRMs or limitation in the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent-Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accessible at: http://genome2.ugr.es/cisminer. CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by
Functional cis-regulatory modules encoded by mouse-specific endogenous retrovirus

PubMed Central

Sundaram, Vasavi; Choudhary, Mayank N. K.; Pehrsson, Erica; Xing, Xiaoyun; Fiore, Christopher; Pandey, Manishi; Maricque, Brett; Udawatta, Methma; Ngo, Duc; Chen, Yujie; Paguntalan, Asia; Ray, Tammy; Hughes, Ava; Cohen, Barak A.; Wang, Ting

2017-01-01

Cis-regulatory modules contain multiple transcription factor (TF)-binding sites and integrate the effects of each TF to control gene expression in specific cellular contexts. Transposable elements (TEs) are uniquely equipped to deposit their regulatory sequences across a genome, which could also contain cis-regulatory modules that coordinate the control of multiple genes with the same regulatory logic. We provide the first evidence of mouse-specific TEs that encode a module of TF-binding sites in mouse embryonic stem cells (ESCs). The majority (77%) of the individual TEs tested exhibited enhancer activity in mouse ESCs. By mutating individual TF-binding sites within the TE, we identified a module of TF-binding motifs that cooperatively enhanced gene expression. Interestingly, we also observed the same motif module in the in silico constructed ancestral TE that also acted cooperatively to enhance gene expression. Our results suggest that ancestral TE insertions might have brought in cis-regulatory modules into the mouse genome. PMID:28348391
LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms.

PubMed

Yang, Peng; Wu, Min; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie

2014-02-17

As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Recently, an algorithm called "LDsplit" has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. It is the first computational method that utilizes the genetic variation of
LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms

PubMed Central

2014-01-01

Background: As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Results: Recently, an algorithm called “LDsplit” has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. Conclusions: LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. It is the first computational method that
Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation

PubMed Central

Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P. M.; Zhu, Xin-Guang

2016-01-01

Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5′UTR, 3′UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5′UTR, 3′UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. PMID:27436282
Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation.

PubMed

Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P M; Zhu, Xin-Guang

2016-09-01

Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5'UTR, 3'UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5'UTR, 3'UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. © The Author 2016. Published by Oxford University Press on behalf of the Society for Experimental Biology.
Redundant CArG Box Cis-motif Activity Mediates SHATTERPROOF2 Transcriptional Regulation during Arabidopsis thaliana Gynoecium Development

PubMed Central

Sehra, Bhupinder; Franks, Robert G.

2017-01-01

In the Arabidopsis thaliana seed pod, pod shatter and seed dispersal properties are in part determined by the development of a longitudinally orientated dehiscence zone (DZ) that derives from cells of the gynoecial valve margin (VM). Transcriptional regulation of the MADS protein-encoding transcription factor genes SHATTERPROOF1 (SHP1) and SHATTERPROOF2 (SHP2) is critical for proper VM identity specification and later on for DZ development. Current models of SHP1 and SHP2 regulation indicate that the transcription factors FRUITFULL (FUL) and REPLUMLESS (RPL) repress these SHP genes in the developing valve and replum domains, respectively. Thus the expression of the SHP genes is restricted to the VM. FUL encodes a MADS-box containing transcription factor that is predicted to act through CArG-box containing cis-regulatory motifs. Here we delimit functional modules within the SHP2 cis-regulatory region and examine the functional importance of CArG box motifs within these regulatory regions. We have characterized a 2.2 kb region upstream of the SHP2 translation start site that drives early and late medial domain expression in the gynoecium, as well as expression within the VM and DZ. We identified two separable, independent cis-regulatory modules, a 1 kb promoter region and a 700 bp enhancer region, that are capable of giving VM and DZ expression. Our results argue for multiple independent cis-regulatory modules that support SHP2 expression during VM development and may contribute to the robustness of SHP2 expression in this tissue. Additionally, three closely positioned CArG box motifs located in the SHP2 upstream regulatory region were mutated in the context of the 2.2 kb reporter construct. Simultaneously mutating all three CArG boxes caused a moderate de-repression of the SHP2 reporter that was detected within the valve domain, suggesting that these CArG boxes are involved in SHP2 repression in the valve. PMID:29085379
A cis-regulatory logic simulator.

PubMed

Zeigler, Robert D; Gertz, Jason; Cohen, Barak A

2007-07-27

A major goal of computational studies of gene regulation is to accurately predict the expression of genes based on the cis-regulatory content of their promoters. The development of computational methods to decode the interactions among cis-regulatory elements has been slow, in part, because it is difficult to know, without extensive experimental validation, whether a particular method identifies the correct cis-regulatory interactions that underlie a given set of expression data. There is an urgent need for test expression data in which the interactions among cis-regulatory sites that produce the data are known. The ability to rapidly generate such data sets would facilitate the development and comparison of computational methods that predict gene expression patterns from promoter sequence. We developed a gene expression simulator which generates expression data using user-defined interactions between cis-regulatory sites. The simulator can incorporate additive, cooperative, competitive, and synergistic interactions between regulatory elements. Constraints on the spacing, distance, and orientation of regulatory elements and their interactions may also be defined and Gaussian noise can be added to the expression values. The simulator allows for a data transformation that simulates the sigmoid shape of expression levels from real promoters. We found good agreement between sets of simulated promoters and predicted regulatory modules from real expression data. We present several data sets that may be useful for testing new methodologies for predicting gene expression from promoter sequence. We developed a flexible gene expression simulator that rapidly generates large numbers of simulated promoters and their corresponding transcriptional output based on specified interactions between cis-regulatory sites. When appropriate rule sets are used, the data generated by our simulator faithfully reproduces experimentally derived data sets. We anticipate that using simulated
info-gibbs: a motif discovery algorithm that directly optimizes information content during sampling.

PubMed

Defrance, Matthieu; van Helden, Jacques

2009-10-15

Discovering cis-regulatory elements in genome sequence remains a challenging issue. Several methods rely on the optimization of some target scoring function. The information content (IC) or relative entropy of the motif has proven to be a good estimator of transcription factor DNA binding affinity. However, these information-based metrics are usually used as a posteriori statistics rather than during the motif search process itself. We introduce here info-gibbs, a Gibbs sampling algorithm that efficiently optimizes the IC or the log-likelihood ratio (LLR) of the motif while keeping computation time low. The method compares well with existing methods like MEME, BioProspector, Gibbs or GAME on both synthetic and biological datasets. Our study shows that motif discovery techniques can be enhanced by directly focusing the search on the motif IC or the motif LLR. http://rsat.ulb.ac.be/rsat/info-gibbs
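The information content (IC) named in this abstract can be computed from a set of aligned sites as the relative entropy of each column against a background distribution, summed over columns. The sketch below shows only that score, under the assumption of a uniform background by default; it is not the info-gibbs sampler, and the function name and parameters are hypothetical.

```python
import math
from collections import Counter


def information_content(sites, background=None, pseudocount=0.0):
    """Return the information content, in bits, of aligned equal-length sites.

    IC = sum over columns of sum over bases of f * log2(f / p), where f is the
    observed base frequency in the column and p is the background frequency.
    """
    if background is None:
        # Assumption for this sketch: uniform background over A, C, G, T.
        background = {base: 0.25 for base in "ACGT"}
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")

    denom = len(sites) + 4 * pseudocount
    total = 0.0
    for col in range(width):
        counts = Counter(site[col].upper() for site in sites)
        for base, p_bg in background.items():
            freq = (counts[base] + pseudocount) / denom
            if freq > 0:  # a zero-frequency term contributes nothing
                total += freq * math.log2(freq / p_bg)
    return total


# Perfectly conserved columns carry 2 bits each under a uniform background.
conserved = information_content(["ACGT", "ACGT", "ACGT"])
# A column where every base is equally frequent carries no information.
flat = information_content(["A", "C", "G", "T"])
```

A sampler that optimizes this quantity directly would re-evaluate it as candidate sites are swapped in and out, which is why an inexpensive column-wise computation matters.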
Structural complexity of Dengue virus untranslated regions: cis-acting RNA motifs and pseudoknot interactions modulating functionality of the viral genome

PubMed Central

Sztuba-Solinska, Joanna; Teramoto, Tadahisa; Rausch, Jason W.; Shapiro, Bruce A.; Padmanabhan, Radhakrishnan; Le Grice, Stuart F. J.

2013-01-01

The Dengue virus (DENV) genome contains multiple cis-acting elements required for translation and replication. Previous studies indicated that a 719-nt subgenomic minigenome (DENV-MINI) is an efficient template for translation and (−) strand RNA synthesis in vitro. We performed a detailed structural analysis of DENV-MINI RNA, combining chemical acylation techniques, Pb2+ ion-induced hydrolysis and site-directed mutagenesis. Our results highlight protein-independent 5′–3′ terminal interactions involving hybridization between recognized cis-acting motifs. Probing analyses identified tandem dumbbell structures (DBs) within the 3′ terminus spaced by single-stranded regions, internal loops and hairpins with embedded GNRA-like motifs. Analysis of conserved motifs and top loops (TLs) of these dumbbells, and their proposed interactions with downstream pseudoknot (PK) regions, predicted an H-type pseudoknot involving TL1 of the 5′ DB and the complementary region, PK2. As disrupting the TL1/PK2 interaction, via ‘flipping’ mutations of PK2, previously attenuated DENV replication, this pseudoknot may participate in regulation of RNA synthesis. Computer modeling implied that this motif might function as autonomous structural/regulatory element. In addition, our studies targeting elements of the 3′ DB and its complementary region PK1 indicated that communication between 5′–3′ terminal regions strongly depends on structure and sequence composition of the 5′ cyclization region. PMID:23531545
Evolution of New cis-Regulatory Motifs Required for Cell-Specific Gene Expression in Caenorhabditis

PubMed Central

Félix, Marie-Anne

2016-01-01

Patterning of C. elegans vulval cell fates relies on inductive signaling. In this induction event, a single cell, the gonadal anchor cell, secretes LIN-3/EGF and induces three out of six competent precursor cells to acquire a vulval fate. We previously showed that this developmental system is robust to a four-fold variation in lin-3/EGF genetic dose. Here using single-molecule FISH, we find that the mean level of expression of lin-3 in the anchor cell is remarkably conserved. No change in lin-3 expression level could be detected among C. elegans wild isolates and only a low level of change—less than 30%—in the Caenorhabditis genus and in Oscheius tipulae. In C. elegans, lin-3 expression in the anchor cell is known to require three transcription factor binding sites, specifically two E-boxes and a nuclear-hormone-receptor (NHR) binding site. Mutation of any of these three elements in C. elegans results in a dramatic decrease in lin-3 expression. Yet only a single E-box is found in the Drosophilae supergroup of Caenorhabditis species, including C. angaria, while the NHR-binding site likely only evolved at the base of the Elegans group. We find that a transgene from C. angaria bearing a single E-box is sufficient for normal expression in C. elegans. Even a short 58 bp cis-regulatory fragment from C. angaria with this single E-box is able to replace the three transcription factor binding sites at the endogenous C. elegans lin-3 locus, resulting in the wild-type expression level. Thus, regulatory evolution occurring in cis within a 58 bp lin-3 fragment, results in a strict requirement for the NHR binding site and a second E-box in C. elegans. This single-cell, single-molecule, quantitative and functional evo-devo study demonstrates that conserved expression levels can hide extensive change in cis-regulatory site requirements and highlights the evolution of new cis-regulatory elements required for cell-specific gene expression. PMID:27588814
In silico analysis of cis-acting regulatory elements in 5' regulatory regions of sucrose transporter gene families in rice (Oryza sativa Japonica) and Arabidopsis thaliana.

PubMed

Ibraheem, Omodele; Botha, Christiaan E J; Bradley, Graeme

2010-12-01

The regulation of gene expression involves a multifarious regulatory system. Each gene contains a unique combination of cis-acting regulatory sequence elements in the 5' regulatory region that determines its temporal and spatial expression. Cis-acting regulatory elements are essential transcriptional gene regulatory units; they control many biological processes and stress responses. Thus a full understanding of the transcriptional gene regulation system will depend on successful functional analyses of cis-acting elements. Cis-acting regulatory elements present within the 5' regulatory region of the sucrose transporter gene families in rice (Oryza sativa Japonica cultivar-group) and Arabidopsis thaliana were identified using a bioinformatics approach. The possible cis-acting regulatory elements were predicted by scanning 1.5 kbp of the 5' regulatory regions upstream of the translational start sites of the sucrose transporter genes, using the PlantCARE, PLACE and Genomatix MatInspector professional databases. Several cis-acting regulatory elements that are associated with plant development, plant hormonal regulation and stress response were identified, and were present in varying frequencies within the 1.5 kbp of 5' regulatory region, among them: A-box, RY, CAT, Pyrimidine-box, Sucrose-box, ABRE, ARF, ERE, GARE, Me-JA, ARE, DRE, GA-motif, GATA, GT-1, MYC, MYB, W-box, and I-box. This result reveals the probable cis-acting regulatory elements that may be involved in the expression and regulation of sucrose transporter gene families in rice and Arabidopsis thaliana during cellular development or environmental stress conditions. Copyright © 2010 Elsevier Ltd. All rights reserved.
PreCisIon: PREdiction of CIS-regulatory elements improved by gene's positION.

PubMed

Elati, Mohamed; Nicolle, Rémy; Junier, Ivan; Fernández, David; Fekih, Rim; Font, Julio; Képès, François

2013-02-01

Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases.
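The sensitivity and specificity figures quoted above follow from a confusion matrix over predicted and known gene targets. As a minimal sketch (the function name and the toy labels are illustrative assumptions, not taken from PreCisIon), they can be computed as:

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = target gene).

    Assumes both classes occur at least once in y_true; otherwise a
    division by zero would occur.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    sensitivity = tp / (tp + fn)   # fraction of true targets recovered
    specificity = tn / (tn + fp)   # fraction of non-targets correctly rejected
    return sensitivity, specificity


# Toy example: 4 known targets, 6 known non-targets (hypothetical labels).
known = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(known, predicted)
```

In this toy case three of four targets are recovered and four of six non-targets are rejected.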
Evaluation of phylogenetic footprint discovery for predicting bacterial cis-regulatory elements and revealing their evolution.

PubMed

Janky, Rekin's; van Helden, Jacques

2008-01-23

The detection of conserved motifs in promoters of orthologous genes (phylogenetic footprints) has become a common strategy to predict cis-acting regulatory elements. Several software tools are routinely used to raise hypotheses about regulation. However, these tools are generally used as black boxes, with default parameters. A systematic evaluation of optimal parameters for a footprint discovery strategy can bring a sizeable improvement to the predictions. We evaluate the performances of a footprint discovery approach based on the detection of over-represented spaced motifs. This method is particularly suitable for (but not restricted to) Bacteria, since such motifs are typically bound by factors containing a Helix-Turn-Helix domain. We evaluated footprint discovery in 368 Escherichia coli K12 genes with annotated sites, under 40 different combinations of parameters (taxonomical level, background model, organism-specific filtering, operon inference). Motifs are assessed both at the levels of correctness and significance. We further report a detailed analysis of 181 bacterial orthologs of the LexA repressor. Distinct motifs are detected at various taxonomical levels, including the 7 previously characterized taxon-specific motifs. In addition, we highlight a significantly stronger conservation of half-motifs in Actinobacteria, relative to Firmicutes, suggesting an intermediate state in specificity switching between the two Gram-positive phyla, and thereby revealing the on-going evolution of LexA auto-regulation. The footprint discovery method proposed here shows excellent results with E. coli and can readily be extended to predict cis-acting regulatory signals and propose testable hypotheses in bacterial genomes for which nothing is known about regulation.

Cis-regulatory Elements and Human Evolution

PubMed Central

Siepel, Adam

2014-01-01

Modification of gene regulation has long been considered an important force in human evolution, particularly through changes to cis-regulatory elements (CREs) that function in transcriptional regulation. For decades, however, the study of cis-regulatory evolution was severely limited by the available data. New data sets describing the locations of CREs and genetic variation within and between species have now made it possible to study CRE evolution much more directly on a genome-wide scale. Here, we review recent research on the evolution of CREs in humans based on large-scale genomic data sets. We consider inferences based on primate divergence, human polymorphism, and combinations of divergence and polymorphism. We then consider “new frontiers” in this field stemming from recent research on transcriptional regulation. PMID:25218861
Quantitative statistical analysis of cis-regulatory sequences in ABA/VP1- and CBF/DREB1-regulated genes of Arabidopsis.

PubMed

Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R

2005-09-01

We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
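The core idea described above, finding oligo sequences that are overrepresented in promoters of coregulated genes, can be illustrated with a small k-mer counting sketch. This is not the MotifFinder program itself; the function names, the pseudocount, and the toy sequences are assumptions made for illustration, and a real analysis would add a statistical test.

```python
from collections import Counter


def kmer_presence(seqs, k):
    """Count, for each k-mer, how many sequences contain it at least once."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def enrichment(foreground, background, k=6, pseudo=1.0):
    """Rank k-mers by the ratio of foreground to background promoter fractions.

    A pseudocount (an assumption of this sketch) avoids division by zero
    for k-mers that never occur in the background set.
    """
    fg = kmer_presence(foreground, k)
    bg = kmer_presence(background, k)
    n_fg, n_bg = len(foreground), len(background)
    ratios = {}
    for kmer, c in fg.items():
        fg_frac = (c + pseudo) / (n_fg + pseudo)
        bg_frac = (bg.get(kmer, 0) + pseudo) / (n_bg + pseudo)
        ratios[kmer] = fg_frac / bg_frac
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)


# Toy promoters (hypothetical): every foreground sequence carries ACGTGT.
coregulated = ["TTACGTGTAA", "GGACGTGTCC", "CCACGTGTGG"]
control = ["TTTTTTTTTT", "GGGGGGGGGG", "CCCCCCCCCC", "AAAAAAAAAA"]
ranked = enrichment(coregulated, control, k=6)
```

Counting presence per promoter rather than total occurrences keeps a single repeat-rich promoter from dominating the ranking; that is a design choice of this sketch, not a claim about the published method.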
A New Algorithm for Identifying Cis-Regulatory Modules Based on Hidden Markov Model

PubMed Central

2017-01-01

The discovery of cis-regulatory modules (CRMs) is the key to understanding mechanisms of transcription regulation. Since CRMs have specific regulatory structures that are the basis for the regulation of gene expression, how the regulatory structure of CRMs is modeled has a considerable impact on the performance of CRM identification. The paper proposes a CRM discovery algorithm called ComSPS. ComSPS builds a regulatory structure model of CRMs based on a hidden Markov model (HMM) by exploring the rules of CRM transcriptional grammar that govern the internal motif site arrangement of CRMs. We test ComSPS on three benchmark datasets and compare it with five existing methods. Experimental results show that ComSPS outperforms these methods. PMID:28497059
Phosphorylation effects on cis/trans isomerization and the backbone conformation of serine-proline motifs: accelerated molecular dynamics analysis.

PubMed

Hamelberg, Donald; Shen, Tongye; McCammon, J Andrew

2005-02-16

The presence of serine/threonine-proline motifs in proteins provides a conformational switching mechanism of the backbone through the cis/trans isomerization of the peptidyl-prolyl (omega) bond. The reversible phosphorylation of the serine/threonine modulates this switching in regulatory proteins to alter signaling and transcription. However, the mechanism is not well understood. This is partly because cis/trans isomerization is a very slow process and, hence, difficult to study. We have used our accelerated molecular dynamics method to study the cis/trans proline isomerization, preferred backbone conformation of a serine-proline motif, and the effects of phosphorylation of the serine residue. We demonstrate that, unlike normal molecular dynamics, the accelerated molecular dynamics allows for the system to escape very easily from the trans isomer to cis isomer, and vice versa. Moreover, for both the unphosphorylated and phosphorylated peptides, the statistical thermodynamic properties are recaptured, and the results are consistent with experimental values. Isomerization of the proline omega bond is shown to be asymmetric and strongly dependent on the psi backbone angle before and after phosphorylation. The rates of escape decrease after phosphorylation. Also, the alpha-helical backbone conformation is more favored after phosphorylation. This accelerated molecular dynamics approach provides a general approach for enhancing the conformational transitions of molecular systems without having prior knowledge of the location of the minima and barriers on the potential-energy landscape.
A cis-regulatory sequence driving metabolic insecticide resistance in mosquitoes: functional characterisation and signatures of selection.

PubMed

Wilding, Craig S; Smith, Ian; Lynd, Amy; Yawson, Alexander Egyir; Weetman, David; Paine, Mark J I; Donnelly, Martin J

2012-09-01

Although cytochrome P450 (CYP450) enzymes are frequently up-regulated in mosquitoes resistant to insecticides, no regulatory motifs driving these expression differences with relevance to wild populations have been identified. Transposable elements (TEs) are often enriched upstream of those CYP450s involved in insecticide resistance, leading to the assumption that they contribute regulatory motifs that directly underlie the resistance phenotype. A partial CuRE1 (Culex Repetitive Element 1) transposable element is found directly upstream of CYP9M10, a cytochrome P450 implicated previously in larval resistance to permethrin in the ISOP450 strain of Culex quinquefasciatus, but is absent from the equivalent genomic region of a susceptible strain. Via expression of CYP9M10 in Escherichia coli we have now demonstrated time- and NADPH-dependent permethrin metabolism, prerequisites for confirmation of a role in metabolic resistance, and through qPCR shown that CYP9M10 is >20-fold over-expressed in ISOP450 compared to a susceptible strain. In a fluorescent reporter assay the region upstream of CYP9M10 from ISOP450 drove 10× expression compared to the equivalent region (lacking CuRE1) from the susceptible strain. Close correspondence with the gene expression fold-change implicates the upstream region including CuRE1 as a cis-regulatory element involved in resistance. Only a single CuRE1-bearing allele, identical to the CuRE1-bearing allele in the resistant strain, is found throughout Sub-Saharan Africa, in contrast to the diversity encountered in non-CuRE1 alleles. This suggests a single origin and subsequent spread due to selective advantage. CuRE1 is detectable using a simple diagnostic. When applied to C. quinquefasciatus larvae from Ghana we have demonstrated a significant association with permethrin resistance in multiple field sites (mean Odds Ratio = 3.86) suggesting this marker has relevance to natural populations of vector mosquitoes. However, when CuRE1 was excised
Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

PubMed Central

Fauteux, François; Strömvik, Martina V

2009-01-01

Background: Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results: We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. Conclusion: Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination of conserved motifs
Diverse Cis-Regulatory Mechanisms Contribute to Expression Evolution of Tandem Gene Duplicates

PubMed Central

Baudouin-Gonzalez, Luís; Santos, Marília A; Tempesta, Camille; Sucena, Élio; Roch, Fernando; Tanaka, Kohtaro

2017-01-01

Pairs of duplicated genes generally display a combination of conserved expression patterns inherited from their unduplicated ancestor and newly acquired domains. However, how the cis-regulatory architecture of duplicated loci evolves to produce these expression patterns is poorly understood. We have directly examined the gene-regulatory evolution of two tandem duplicates, the Drosophila Ly6 genes CG9336 and CG9338, which arose at the base of the drosophilids between 40 and 60 Ma. Comparing the expression patterns of the two paralogs in four Drosophila species with that of the unduplicated ortholog in the tephritid Ceratitis capitata, we show that they diverged from each other as well as from the unduplicated ortholog. Moreover, the expression divergence appears to have occurred close to the duplication event and also more recently in a lineage-specific manner. The comparison of the tissue-specific cis-regulatory modules (CRMs) controlling the paralog expression in the four Drosophila species indicates that diverse cis-regulatory mechanisms, including the novel tissue-specific enhancers, differential inactivation, and enhancer sharing, contributed to the expression evolution. Our analysis also reveals a surprisingly variable cis-regulatory architecture, in which the CRMs driving conserved expression domains change in number, location, and specificity. Altogether, this study provides a detailed historical account that uncovers a highly dynamic picture of how the paralog expression patterns and their underlying cis-regulatory landscape evolve. We argue that our findings will encourage studying cis-regulatory evolution at the whole-locus level to understand how interactions between enhancers and other regulatory levels shape the evolution of gene expression. PMID:28961967
Modeling gene regulatory network motifs using statecharts

PubMed Central

2012-01-01

Background: Gene regulatory networks are widely used by biologists to describe the interactions among genes, proteins and other components at the intra-cellular level. Recently, a great effort has been devoted to give gene regulatory networks a formal semantics based on existing computational frameworks. For this purpose, we consider Statecharts, which are a modular, hierarchical and executable formal model widely used to represent software systems. We use Statecharts for modeling small and recurring patterns of interactions in gene regulatory networks, called motifs. Results: We present an improved method for modeling gene regulatory network motifs using Statecharts and we describe the successful modeling of several motifs, including those which could not be modeled or whose models could not be distinguished using the method of a previous proposal. We model motifs in an easy and intuitive way by taking advantage of the visual features of Statecharts. Our modeling approach is able to simulate some interesting temporal properties of gene regulatory network motifs: the delay in the activation and the deactivation of the "output" gene in the coherent type-1 feedforward loop, the pulse in the incoherent type-1 feedforward loop, the bistability nature of double positive and double negative feedback loops, the oscillatory behavior of the negative feedback loop, and the "lock-in" effect of positive autoregulation. Conclusions: We present a Statecharts-based approach for the modeling of gene regulatory network motifs in biological systems. The basic motifs used to build more complex networks (that is, simple regulation, reciprocal regulation, feedback loop, feedforward loop, and autoregulation) can be faithfully described and their temporal dynamics can be analyzed. PMID:22536967
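The delayed activation and deactivation of the output gene in the coherent type-1 feedforward loop can be illustrated with a much simpler formalism than Statecharts. The following is a minimal synchronous Boolean sketch, assuming one time step per regulatory interaction and a choice of AND or OR logic at the output gene; the function name and the update rule are assumptions of this sketch, not the authors' model.

```python
def simulate_c1_ffl(x_input, gate="and"):
    """Simulate a coherent type-1 feedforward loop: X -> Y, X -> Z, Y -> Z.

    x_input is a sequence of booleans for the input gene X. Y and Z are
    updated synchronously, one step after their regulators change.
    Returns a list of (x, y, z) states, one per time step.
    """
    y, z = False, False
    trace = []
    for x in x_input:
        trace.append((x, y, z))
        new_y = x                                      # Y follows X
        new_z = (x and y) if gate == "and" else (x or y)
        y, z = new_y, new_z
    return trace


# X switches on at step 1 and off at step 5.
signal = [False, True, True, True, True, False, False, False]
z_and = [int(z) for _, _, z in simulate_c1_ffl(signal, "and")]
z_or = [int(z) for _, _, z in simulate_c1_ffl(signal, "or")]
# z_and == [0, 0, 0, 1, 1, 1, 0, 0]: activation of Z lags by an extra step.
# z_or  == [0, 0, 1, 1, 1, 1, 1, 0]: deactivation of Z lags by an extra step.
```

With AND logic the output waits for both X and Y, so switching on is delayed; with OR logic the lingering Y keeps Z on, so switching off is delayed.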
Identification, occurrence, and validation of DRE and ABRE Cis-regulatory motifs in the promoter regions of genes of Arabidopsis thaliana.

PubMed

Mishra, Sonal; Shukla, Aparna; Upadhyay, Swati; Sanchita; Sharma, Pooja; Singh, Seema; Phukan, Ujjal J; Meena, Abha; Khan, Feroz; Tripathi, Vineeta; Shukla, Rakesh Kumar; Shrama, Ashok

2014-04-01

Plants possess a complex co-regulatory network which helps them to elicit a response under diverse adverse conditions. We used an in silico approach to identify the genes with both DRE and ABRE motifs in their promoter regions in Arabidopsis thaliana. Our results showed that Arabidopsis contains a set of 2,052 genes with ABRE and DRE motifs in their promoter regions. Approximately 72% or more of the total predicted 2,052 genes had a gap distance of less than 400 bp between DRE and ABRE motifs. For positional orientation of the DRE and ABRE motifs, we found that the DR form (one in direct and the other one in reverse orientation) was more prevalent than other forms. These predicted 2,052 genes include 155 transcription factors. Using microarray data from The Arabidopsis Information Resource (TAIR) database, we present 44 transcription factors out of 155 which are upregulated by more than twofold in response to osmotic stress and ABA treatment. Fifty-one transcripts from those predicted above were validated using semiquantitative expression analysis to support the microarray data in TAIR. Taken together, we report a set of genes containing both DRE and ABRE motifs in their promoter regions in A. thaliana, which can be useful to understand the role of ABA under osmotic stress conditions. © 2013 Institute of Botany, Chinese Academy of Sciences.
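The gap-distance criterion described above can be sketched as a scan for two motifs in one promoter, followed by a filter on the number of bases between them. The motif patterns below (a DRE-like core `[AG]CCGAC` and an ABRE-like core `ACGTG`) are simplified assumptions chosen for illustration, not the definitions used in the study, and only the forward strand is scanned.

```python
import re

# Simplified, assumed core patterns; a real analysis would use curated
# motif definitions and also scan the reverse strand.
DRE = re.compile(r"[AG]CCGAC")
ABRE = re.compile(r"ACGTG")


def close_pairs(seq, max_gap=400):
    """Return (dre_start, abre_start, gap) for non-overlapping motif pairs
    whose gap (bases between the two motifs) is below max_gap."""
    seq = seq.upper()
    dre_hits = [(m.start(), m.end()) for m in DRE.finditer(seq)]
    abre_hits = [(m.start(), m.end()) for m in ABRE.finditer(seq)]
    pairs = []
    for d_start, d_end in dre_hits:
        for a_start, a_end in abre_hits:
            # Gap is measured from the end of the upstream motif to the
            # start of the downstream one; overlaps give a negative gap.
            gap = a_start - d_end if a_start >= d_end else d_start - a_end
            if 0 <= gap < max_gap:
                pairs.append((d_start, a_start, gap))
    return pairs


# Hypothetical promoter fragment: DRE-like site, 10-base spacer, ABRE-like site.
promoter = "TT" + "ACCGAC" + "T" * 10 + "ACGTG" + "TT"
hits = close_pairs(promoter)
```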
Genome-wide targeted prediction of ABA responsive genes in rice based on over-represented cis-motif in co-expressed genes.

PubMed

Lenka, Sangram K; Lohia, Bikash; Kumar, Abhay; Chinnusamy, Viswanathan; Bansal, Kailash C

2009-02-01

Abscisic acid (ABA), the popular plant stress hormone, plays a key role in regulating a subset of stress-responsive genes. These genes respond to ABA through specific transcription factors which bind to cis-regulatory elements present in their promoters. We discovered that the CGMCACGTGB motif, which contains the ABA Responsive Element (ABRE) core (ACGT), is over-represented among the promoters of ABA-responsive co-expressed genes in rice. A targeted gene prediction strategy using this motif led to the identification of 402 protein-coding genes potentially regulated by the ABA-dependent molecular genetic network. RT-PCR analysis of 45 arbitrarily chosen genes from the 402 predicted genes confirmed 80% accuracy of our prediction. Plant Gene Ontology (GO) analysis of ABA-responsive genes showed enrichment of signal transduction and stress-related genes among diverse functional categories.
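The CGMCACGTGB motif is written in IUPAC degenerate nucleotide codes (M = A or C; B = C, G or T). A minimal sketch of how such a motif could be scanned against a promoter sequence is to translate it into a regular expression; the helper names and the test sequence are illustrative assumptions, and only the forward strand is searched.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif string into a regular-expression pattern."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(seq, motif):
    """Return (start, matched_sequence) for every occurrence of the motif,
    including overlapping ones (found through a lookahead)."""
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


pattern_text = iupac_to_regex("CGMCACGTGB")
sites = find_motif("AAACGACACGTGTAAA", "CGMCACGTGB")
```

A genome-wide prediction would apply the same scan to every promoter and to the reverse complement as well.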
Deciphering the transcriptional cis-regulatory code.

PubMed

Yáñez-Cuna, J Omar; Kvon, Evgeny Z; Stark, Alexander

2013-01-01

Information about developmental gene expression resides in defined regulatory elements, called enhancers, in the non-coding part of the genome. Although cells reliably utilize enhancers to orchestrate gene expression, a cis-regulatory code that would allow their interpretation has remained one of the greatest challenges of modern biology. In this review, we summarize studies from the past three decades that describe progress towards revealing the properties of enhancers and discuss how recent approaches are providing unprecedented insights into regulatory elements in animal genomes. Over the next years, we believe that the functional characterization of regulatory sequences in entire genomes, combined with recent computational methods, will provide a comprehensive view of genomic regulatory elements and their building blocks and will enable researchers to begin to understand the sequence basis of the cis-regulatory code. Copyright © 2012 Elsevier Ltd. All rights reserved.
Validation of Skeletal Muscle cis-Regulatory Module Predictions Reveals Nucleotide Composition Bias in Functional Enhancers

PubMed Central

Kwon, Andrew T.; Chou, Alice Yi; Arenillas, David J.; Wasserman, Wyeth W.

2011-01-01

We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs were validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences identifies nucleotide sequence composition as an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions. PMID:22144875
Promoter analysis reveals cis-regulatory motifs associated with the expression of the WRKY transcription factor CrWRKY1 in Catharanthus roseus.

PubMed

Yang, Zhirong; Patra, Barunava; Li, Runzhi; Pattanaik, Sitakanta; Yuan, Ling

2013-12-01

WRKY transcription factors (TFs) are emerging as an important group of regulators of plant secondary metabolism. However, the cis-regulatory elements associated with their regulation have not been well characterized. We have previously demonstrated that CrWRKY1, a member of subgroup III of the WRKY TF family, regulates biosynthesis of terpenoid indole alkaloids in the ornamental and medicinal plant, Catharanthus roseus. Here, we report the isolation and functional characterization of the CrWRKY1 promoter. In silico analysis of the promoter sequence reveals the presence of several potential TF binding motifs, indicating the involvement of additional TFs in the regulation of the TIA pathway. The CrWRKY1 promoter can drive the expression of a β-glucuronidase (GUS) reporter gene in native (C. roseus protoplasts and transgenic hairy roots) and heterologous (transgenic tobacco seedlings) systems. Analysis of 5'- or 3'-end deletions indicates that the sequence located between positions -140 to -93 bp and -3 to +113 bp, relative to the transcription start site, is critical for promoter activity. Mutation analysis shows that two overlapping as-1 elements and a CT-rich motif contribute significantly to promoter activity. The CrWRKY1 promoter is induced in response to methyl jasmonate (MJ) treatment and the promoter region between -230 and -93 bp contains a putative MJ-responsive element. The CrWRKY1 promoter can potentially be used as a tool to isolate novel TFs involved in the regulation of the TIA pathway.
Genomic analysis reveals major determinants of cis-regulatory variation in Capsella grandiflora

PubMed Central

Steige, Kim A.; Laenen, Benjamin; Reimegård, Johan; Slotte, Tanja

2017-01-01

Understanding the causes of cis-regulatory variation is a long-standing aim in evolutionary biology. Although cis-regulatory variation has long been considered important for adaptation, we still have a limited understanding of the selective importance and genomic determinants of standing cis-regulatory variation. To address these questions, we studied the prevalence, genomic determinants, and selective forces shaping cis-regulatory variation in the outcrossing plant Capsella grandiflora. We first identified a set of 1,010 genes with common cis-regulatory variation using analyses of allele-specific expression (ASE). Population genomic analyses of whole-genome sequences from 32 individuals showed that genes with common cis-regulatory variation (i) are under weaker purifying selection and (ii) undergo less frequent positive selection than other genes. We further identified genomic determinants of cis-regulatory variation. Gene body methylation (gbM) was a major factor constraining cis-regulatory variation, whereas presence of nearby transposable elements (TEs) and tissue specificity of expression increased the odds of ASE. Our results suggest that most common cis-regulatory variation in C. grandiflora is under weak purifying selection, and that gene-specific functional constraints are more important for the maintenance of cis-regulatory variation than genome-scale variation in the intensity of selection. Our results agree with previous findings that suggest TE silencing affects nearby gene expression, and provide evidence for a link between gbM and cis-regulatory constraint, possibly reflecting greater dosage sensitivity of body-methylated genes. Given the extensive conservation of gbM in flowering plants, this suggests that gbM could be an important predictor of cis-regulatory variation in a wide range of plant species. PMID:28096395
Direct AUC optimization of regulatory motifs.

PubMed

Zhu, Lin; Zhang, Hong-Bo; Huang, De-Shuang

2017-07-15

The discovery of transcription factor binding site (TFBS) motifs is essential for untangling the complex mechanism of genetic variation under different developmental and environmental conditions. Among the large number of computational approaches for de novo identification of TFBS motifs, discriminative motif learning (DML) methods have proven promising for harnessing the discovery power of the large amount of accumulated high-throughput binding data. However, they have to sacrifice accuracy for speed and could fail to fully utilize the information of the input sequences. We propose a novel algorithm called CDAUC for optimizing DML-learned motifs based on the area under the receiver-operating characteristic curve (AUC) criterion, which has been widely used in the literature to evaluate the significance of extracted motifs. We show that when the considered AUC loss function is optimized in a coordinate-wise manner, the cost function of each resultant sub-problem is a piece-wise constant function, whose optimal value can be found exactly and efficiently. Further, a key step of each iteration of CDAUC can be efficiently solved as a computational geometry problem. Experimental results on real-world high-throughput datasets illustrate that CDAUC outperforms competing methods for refining DML motifs, while being one order of magnitude faster. Meanwhile, preliminary results show that CDAUC may also be useful for improving the interpretability of convolutional kernels generated by the emerging deep learning approaches for predicting TF sequence specificities. CDAUC is available at: https://drive.google.com/drive/folders/0BxOW5MtIZbJjNFpCeHlBVWJHeW8 . Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved.
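The AUC criterion mentioned above can be computed directly from motif scores on bound (positive) and unbound (negative) sequences: it equals the fraction of positive/negative pairs in which the positive sequence scores higher, with ties counted as one half. The sketch below shows only this evaluation criterion, not the coordinate-wise optimization performed by CDAUC; the function name and scores are illustrative assumptions.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Assumes both lists are non-empty. Runs in O(len(pos) * len(neg)),
    which is fine for a small illustration.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical motif scores for bound and unbound sequences.
bound = [0.9, 0.8, 0.4]
unbound = [0.5, 0.3, 0.1]
value = auc(bound, unbound)   # 8 of 9 pairs are ranked correctly
```

Because the AUC depends only on the ranking of scores, it changes in discrete jumps as a motif parameter varies, which is consistent with the piece-wise constant sub-problems described in the abstract.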
Functional Evolution of a cis-Regulatory Module

PubMed Central

Palsson, Arnar; Alekseeva, Elena; Bergman, Casey M; Nathan, Janaki; Kreitman, Martin

2005-01-01

Lack of knowledge about how regulatory regions evolve in relation to their structure–function may limit the utility of comparative sequence analysis in deciphering cis-regulatory sequences. To address this we applied reverse genetics to carry out a functional genetic complementation analysis of a eukaryotic cis-regulatory module—the even-skipped stripe 2 enhancer—from four Drosophila species. The evolution of this enhancer is non-clock-like, with important functional differences between closely related species and functional convergence between distantly related species. Functional divergence is attributable to differences in activation levels rather than spatiotemporal control of gene expression. Our findings have implications for understanding enhancer structure–function, mechanisms of speciation and computational identification of regulatory modules. PMID:15757364
NAC transcription factor genes: genome-wide identification, phylogenetic, motif and cis-regulatory element analysis in pigeonpea (Cajanus cajan (L.) Millsp.).

PubMed

Satheesh, Viswanathan; Jagannadham, P Tej Kumar; Chidambaranathan, Parameswaran; Jain, P K; Srinivasan, R

2014-12-01

The NAC (NAM, ATAF and CUC) proteins are plant-specific transcription factors implicated in development and stress responses. In the present study 88 pigeonpea NAC genes were identified from the recently published draft genome of pigeonpea by using homology based and de novo prediction programmes. These sequences were further subjected to phylogenetic, motif and promoter analyses. In motif analysis, highly conserved motifs were identified in the NAC domain and also in the C-terminal region of the NAC proteins. A phylogenetic reconstruction using pigeonpea, Arabidopsis and soybean NAC genes revealed 33 putative stress-responsive pigeonpea NAC genes. Several stress-responsive cis-elements were identified through in silico analysis of the promoters of these putative stress-responsive genes. This analysis is the first report of NAC gene family in pigeonpea and will be useful for the identification and selection of candidate genes associated with stress tolerance.
DMINDA: an integrated web server for DNA motif identification and analyses

PubMed Central

Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

2014-01-01

DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. PMID:24753419
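Two of the functions listed above, scanning for instances of a query motif (ii) and co-occurrence analysis (iv), can be sketched in a few lines. This is only an illustration of the general idea, not DMINDA's implementation or API: the function names, the IUPAC-to-regex translation, and the toy promoters are assumptions, and only the given strand is scanned.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def scan(consensus, seq):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    body = "".join(IUPAC[c] for c in consensus)
    # A lookahead lets overlapping instances be reported.
    pattern = re.compile("(?=(" + body + "))")
    return [m.start() for m in pattern.finditer(seq)]


def co_occurrence(motif_a, motif_b, promoters):
    """Names of promoters containing at least one instance of both motifs."""
    return [
        name for name, seq in promoters.items()
        if scan(motif_a, seq) and scan(motif_b, seq)
    ]


# Toy promoter sequences (invented for this example).
promoters = {
    "geneA": "TTGACAGGCTATAATGC",
    "geneB": "CCCCTATAATCCCC",
    "geneC": "TTGACACCCCCCCC",
}
hits = scan("TATAWT", promoters["geneA"])
both = co_occurrence("TTGACA", "TATAWT", promoters)
print(hits, both)
```

A real analysis would also scan the reverse strand and attach a statistical score to each hit, as the abstract indicates the server does using a control set.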
Decoding a Signature-Based Model of Transcription Cofactor Recruitment Dictated by Cardinal Cis-Regulatory Elements in Proximal Promoter Regions

PubMed Central

Benner, Christopher; Hutt, Kasey R.; Stunnenberg, Rieka; Garcia-Bassets, Ivan

2013-01-01

Genome-wide maps of DNase I hypersensitive sites (DHSs) reveal that most human promoters contain perpetually active cis-regulatory elements between −150 bp and +50 bp (−150/+50 bp) relative to the transcription start site (TSS). Transcription factors (TFs) recruit cofactors (chromatin remodelers, histone/protein-modifying enzymes, and scaffold proteins) to these elements in order to organize the local chromatin structure and coordinate the balance of post-translational modifications nearby, contributing to the overall regulation of transcription. However, the rules of TF-mediated cofactor recruitment to the −150/+50 bp promoter regions remain poorly understood. Here, we provide evidence for a general model in which a series of cis-regulatory elements (here termed ‘cardinal’ motifs) prefer acting individually, rather than in fixed combinations, within the −150/+50 bp regions to recruit TFs that dictate cofactor signatures distinctive of specific promoter subsets. Subsequently, human promoters can be subclassified based on the presence of cardinal elements and their associated cofactor signatures. In this study, furthermore, we have focused on promoters containing the nuclear respiratory factor 1 (NRF1) motif as the cardinal cis-regulatory element and have identified the pervasive association of NRF1 with the cofactor lysine-specific demethylase 1 (LSD1/KDM1A). This signature might be distinctive of promoters regulating nuclear-encoded mitochondrial and other particular genes in at least some cells. Together, we propose that decoding a signature-based, expanded model of control at proximal promoter regions should lead to a better understanding of coordinated regulation of gene transcription. PMID:24244184
Favorable genomic environments for cis-regulatory evolution: A novel theoretical framework.

PubMed

Maeso, Ignacio; Tena, Juan J

2016-09-01

Cis-regulatory changes are arguably the primary evolutionary source of animal morphological diversity. With the recent explosion of genome-wide comparisons of the cis-regulatory content in different animal species, it is now possible to infer general principles underlying enhancer evolution. However, these studies have also revealed numerous discrepancies and paradoxes, suggesting that the mechanistic causes and modes of cis-regulatory evolution are still not well understood and are probably much more complex than generally appreciated. Here, we argue that the mutational mechanisms and genomic regions generating new regulatory activities must comply with the constraints imposed by the molecular properties of cis-regulatory elements (CREs) and the organizational features of long-range chromatin interactions. Accordingly, we propose a new integrative evolutionary framework for cis-regulatory evolution based on two major premises for the origin of novel enhancer activity: (i) an accessible chromatin environment and (ii) compatibility with the 3D structure and interactions of pre-existing CREs. Mechanisms and DNA sequences not fulfilling these premises will be less likely to have a measurable impact on gene expression and, as such, will make a minor contribution to the evolution of gene regulation. Finally, we discuss current comparative cis-regulatory data in the light of this new evolutionary model, and propose that the two most prominent mechanisms for the evolution of cis-regulatory changes are the overprinting of ancestral CREs and the exaptation of transposable elements. Copyright © 2015 Elsevier Ltd. All rights reserved.

Abundant raw material for cis-regulatory evolution in humans

NASA Technical Reports Server (NTRS)

Rockman, Matthew V.; Wray, Gregory A.

2002-01-01

Changes in gene expression and regulation--due in particular to the evolution of cis-regulatory DNA sequences--may underlie many evolutionary changes in phenotypes, yet little is known about the distribution of such variation in populations. We present in this study the first survey of experimentally validated functional cis-regulatory polymorphism. These data are derived from more than 140 polymorphisms involved in the regulation of 107 genes in Homo sapiens, the eukaryote species with the most available data. We find that functional cis-regulatory variation is widespread in the human genome and that the consequent variation in gene expression is twofold or greater for 63% of the genes surveyed. Transcription factor-DNA interactions are highly polymorphic, and regulatory interactions have been gained and lost within human populations. On average, humans are heterozygous at more functional cis-regulatory sites (>16,000) than at amino acid positions (<13,000), in part because of an overrepresentation among the former in multiallelic tandem repeat variation, especially (AC)(n) dinucleotide microsatellites. The role of microsatellites in gene expression variation may provide a larger store of heritable phenotypic variation, and a more rapid mutational input of such variation, than has been realized. Finally, we outline the distinctive consequences of cis-regulatory variation for the genotype-phenotype relationship, including ubiquitous epistasis and genotype-by-environment interactions, as well as underappreciated modes of pleiotropy and overdominance. Ordinary small-scale mutations contribute to pervasive variation in transcription rates and consequently to patterns of human phenotypic variation.
Direct activation of a notochord cis-regulatory module by Brachyury and FoxA in the ascidian Ciona intestinalis.

PubMed

Passamaneck, Yale J; Katikala, Lavanya; Perrone, Lorena; Dunn, Matthew P; Oda-Ishii, Izumi; Di Gregorio, Anna

2009-11-01

The notochord is a defining feature of the chordate body plan. Experiments in ascidian, frog and mouse embryos have shown that co-expression of Brachyury and FoxA class transcription factors is required for notochord development. However, studies on the cis-regulatory sequences mediating the synergistic effects of these transcription factors are complicated by the limited knowledge of notochord genes and cis-regulatory modules (CRMs) that are directly targeted by both. We have identified an easily testable model for such investigations in a 155-bp notochord-specific CRM from the ascidian Ciona intestinalis. This CRM contains functional binding sites for both Ciona Brachyury (Ci-Bra) and FoxA (Ci-FoxA-a). By combining point mutation analysis and misexpression experiments, we demonstrate that binding of both transcription factors to this CRM is necessary and sufficient to activate transcription. To gain insights into the cis-regulatory criteria controlling its activity, we investigated the organization of the transcription factor binding sites within the 155-bp CRM. The 155-bp sequence contains two Ci-Bra binding sites with identical core sequences but opposite orientations, only one of which is required for enhancer activity. Changes in both orientation and spacing of these sites substantially affect the activity of the CRM, as clusters of identical sites found in the Ciona genome with different arrangements are unable to activate transcription in notochord cells. This work presents the first evidence of a synergistic interaction between Brachyury and FoxA in the activation of an individual notochord CRM, and highlights the importance of transcription factor binding site arrangement for its function.
DMINDA: an integrated web server for DNA motif identification and analyses.

PubMed

Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

2014-07-01

DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Mouse regulatory DNA landscapes reveal global principles of cis-regulatory evolution.

PubMed

Vierstra, Jeff; Rynes, Eric; Sandstrom, Richard; Zhang, Miaohua; Canfield, Theresa; Hansen, R Scott; Stehling-Sun, Sandra; Sabo, Peter J; Byron, Rachel; Humbert, Richard; Thurman, Robert E; Johnson, Audra K; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Giste, Erika; Haugen, Eric; Dunn, Douglas; Wilken, Matthew S; Josefowicz, Steven; Samstein, Robert; Chang, Kai-Hsin; Eichler, Evan E; De Bruijn, Marella; Reh, Thomas A; Skoultchi, Arthur; Rudensky, Alexander; Orkin, Stuart H; Papayannopoulou, Thalia; Treuting, Piper M; Selleri, Licia; Kaul, Rajinder; Groudine, Mark; Bender, M A; Stamatoyannopoulos, John A

2014-11-21

To study the evolutionary dynamics of regulatory DNA, we mapped >1.3 million deoxyribonuclease I-hypersensitive sites (DHSs) in 45 mouse cell and tissue types, and systematically compared these with human DHS maps from orthologous compartments. We found that the mouse and human genomes have undergone extensive cis-regulatory rewiring that combines branch-specific evolutionary innovation and loss with widespread repurposing of conserved DHSs to alternative cell fates, and that this process is mediated by turnover of transcription factor (TF) recognition elements. Despite pervasive evolutionary remodeling of the location and content of individual cis-regulatory regions, within orthologous mouse and human cell types the global fraction of regulatory DNA bases encoding recognition sites for each TF has been strictly conserved. Our findings provide new insights into the evolutionary forces shaping mammalian regulatory DNA landscapes. Copyright © 2014, American Association for the Advancement of Science.
Cis-regulatory Evolution of Chalcone-Synthase Expression in the Genus Arabidopsis

PubMed Central

de Meaux, Juliette; Pop, A.; Mitchell-Olds, T.

2006-01-01

The contribution of cis-regulation to adaptive evolutionary change is believed to be essential, yet little is known about the evolutionary rules that govern regulatory sequences. Here, we characterize the short-term evolutionary dynamics of a cis-regulatory region within and among two closely related species, A. lyrata and A. halleri, and compare our findings to A. thaliana. We focused on the cis-regulatory region of chalcone synthase (CHS), a key enzyme involved in the synthesis of plant secondary metabolites. We observed patterns of nucleotide diversity that differ among species but do not depart from neutral expectations. Using intra- and interspecific F1 progeny, we have evaluated functional cis-regulatory variation in response to light and herbivory, environmental cues, which are known to induce CHS expression. We find that substantial cis-regulatory variation segregates within and among populations as well as between species, some of which results from interspecific genetic introgression. We further demonstrate that, in A. thaliana, CHS cis-regulation in response to herbivory is greater than in A. lyrata or A. halleri. Our work indicates that the evolutionary dynamics of a cis-regulatory region is characterized by pervasive functional variation, achieved mostly by modification of response modules to one but not all environmental cues. Our study did not detect the footprint of selection on this variation. PMID:17028316
Repeated cis-regulatory tuning of a metabolic bottleneck gene during evolution.

PubMed

Kuang, Meihua Christina; Kominek, Jacek; Alexander, William G; Cheng, Jan-Fang; Wrobel, Russell L; Hittinger, Chris Todd

2018-05-21

Repeated evolutionary events imply underlying genetic constraints that can make evolutionary mechanisms predictable. Morphological traits are thought to evolve frequently through cis-regulatory changes because these mechanisms bypass constraints in pleiotropic genes that are reused during development. In contrast, the constraints acting on metabolic traits during evolution are less well studied. Here we show how a metabolic bottleneck gene has repeatedly adopted similar cis-regulatory solutions during evolution, likely due to its pleiotropic role integrating flux from multiple metabolic pathways. Specifically, the genes encoding phosphoglucomutase activity (PGM1/PGM2), which connect GALactose catabolism to glycolysis, have gained and lost direct regulation by the transcription factor Gal4 several times during yeast evolution. Through targeted mutations of predicted Gal4-binding sites in yeast genomes, we show this galactose-mediated regulation of PGM1/2 supports vigorous growth on galactose in multiple yeast species, including Saccharomyces uvarum and Lachancea kluyveri. Furthermore, the addition of galactose-inducible PGM1 alone is sufficient to improve the growth on galactose of multiple species that lack this regulation, including Saccharomyces cerevisiae. The strong association between regulation of PGM1/2 by Gal4 even enables remarkably accurate predictions of galactose growth phenotypes between closely related species. This repeated mode of evolution suggests that this specific cis-regulatory connection is a common way that diverse yeasts can govern flux through the pathway, likely due to the constraints imposed by this pleiotropic bottleneck gene. Since metabolic pathways are highly interconnected, we argue that cis-regulatory evolution might be widespread at pleiotropic genes that control metabolic bottlenecks and intersections.
Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hong, R. L., Hamaguchi, L., Busch, M. A., and Weigel, D.

2003-06-01

OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.
Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas.

PubMed

Mathelier, Anthony; Lefebvre, Calvin; Zhang, Allen W; Arenillas, David J; Ding, Jiarui; Wasserman, Wyeth W; Shah, Sohrab P

2015-04-23

With the rapid increase of whole-genome sequencing of human cancers, an important opportunity to analyze and characterize somatic mutations lying within cis-regulatory regions has emerged. A focus on protein-coding regions to identify nonsense or missense mutations disruptive to protein structure and/or function has led to important insights; however, the impact on gene expression of mutations lying within cis-regulatory regions remains under-explored. We analyzed somatic mutations from 84 matched tumor-normal whole genomes from B-cell lymphomas with accompanying gene expression measurements to elucidate the extent to which these cancers are disrupted by cis-regulatory mutations. We characterize mutations overlapping a high quality set of well-annotated transcription factor binding sites (TFBSs), covering a similar portion of the genome as protein-coding exons. Our results indicate that cis-regulatory mutations overlapping predicted TFBSs are enriched in promoter regions of genes involved in apoptosis or growth/proliferation. By integrating gene expression data with mutation data, our computational approach culminates with identification of cis-regulatory mutations most likely to participate in dysregulation of the gene expression program. The impact can be measured along with protein-coding mutations to highlight key mutations disrupting gene expression and pathways in cancer. Our study yields specific genes with disrupted expression triggered by genomic mutations in either the coding or the regulatory space. It implies that mutated regulatory components of the genome contribute substantially to cancer pathways. Our analyses demonstrate that identifying genomically altered cis-regulatory elements coupled with analysis of gene expression data will augment biological interpretation of mutational landscapes of cancers.
MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data.

PubMed

Ozaki, Haruka; Iwasaki, Wataru

2016-08-01

As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif. Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs. By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.
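The idea of describing every k-mer in a set of ChIP-Seq sequences can be sketched as follows. This is not the MOCCS algorithm or its statistic: the "mean distance from the sequence centre" summary, the function names, and the toy sequences are assumptions chosen only to show how k-mers that cluster near peak centres might be singled out.

```python
from collections import defaultdict


def kmer_center_distances(seqs, k):
    """Map each k-mer to the distances (bp) between its midpoint and the sequence midpoint."""
    dist = defaultdict(list)
    for seq in seqs:
        center = (len(seq) - 1) / 2
        for s in range(len(seq) - k + 1):
            kmer_mid = s + (k - 1) / 2
            dist[seq[s:s + k]].append(abs(kmer_mid - center))
    return dist


def mean_distance(dist):
    """Average centre distance per k-mer; smaller means more centrally located."""
    return {kmer: sum(d) / len(d) for kmer, d in dist.items()}


# Toy sequences assumed to be centred on peak summits (invented for this example).
peaks = ["AACACGTGAA", "TTCACGTGTT"]
means = mean_distance(kmer_center_distances(peaks, 6))
most_central = min(means, key=means.get)
print(most_central, means[most_central])
```

With real data, thousands of peak-centred sequences would be pooled, and a k-mer would be judged against the distribution expected for positions unrelated to the peak centre rather than by a raw mean.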
Gene regulatory and signaling networks exhibit distinct topological distributions of motifs

NASA Astrophysics Data System (ADS)

Ferreira, Gustavo Rodrigues; Nakaya, Helder Imoto; Costa, Luciano da Fontoura

2018-04-01

The biological processes of cellular decision making and differentiation involve a plethora of signaling pathways and gene regulatory circuits. These networks in turn exhibit a multitude of motifs playing crucial parts in regulating network activity. Here we compare the topological placement of motifs in gene regulatory and signaling networks and observe that it suggests different evolutionary strategies in motif distribution for distinct cellular subnetworks.
cisMEP: an integrated repository of genomic epigenetic profiles and cis-regulatory modules in Drosophila

PubMed Central

2014-01-01

Background: Cis-regulatory modules (CRMs), or the DNA sequences required for regulating gene expression, play a central role in biological research on transcriptional regulation in metazoan species. At present, the systematic understanding of CRMs still relies mainly on computational methods because experimental methods are time-consuming and small in scale. But the accuracy and reliability of different CRM prediction tools are still unclear. Without comparative cross-analysis of the results and combinatorial consideration of extra experimental information, there is no easy way to assess the confidence of the predicted CRMs. This limits the genome-wide understanding of CRMs. Description: It is known that transcription factor binding and epigenetic profiles tend to determine the functions of CRMs in gene transcriptional regulation. Thus integration of genome-wide epigenetic profiles with systematically predicted CRMs can greatly help researchers evaluate the prediction confidence and decipher the possible transcriptional regulatory functions of these potential CRMs. However, these data are still fragmentary in the literature. Here we performed computational genome-wide screening for potential CRMs using different prediction tools and constructed a pioneering database, cisMEP (cis-regulatory module epigenetic profile database), to integrate these computationally identified CRMs with genomic epigenetic profile data. cisMEP collects literature-curated TFBS location data and nine types of epigenetic data for assessing the confidence of these potential CRMs and deciphering the possible CRM functionality. Conclusions: cisMEP aims to provide a user-friendly interface for researchers to assess the confidence of different potential CRMs and to understand the functions of CRMs through experimentally identified epigenetic profiles. The deposited potential CRMs and experimental epigenetic profiles for confidence assessment provide experimentally testable
Two negative cis-regulatory regions involved in fruit-specific promoter activity from watermelon (Citrullus vulgaris S.).

PubMed

Yin, Tao; Wu, Hanying; Zhang, Shanglong; Lu, Hongyu; Zhang, Lingxiao; Xu, Yong; Chen, Daming; Liu, Jingmei

2009-01-01

A 1.8 kb 5'-flanking region of the large subunit of ADP-glucose pyrophosphorylase, isolated from watermelon (Citrullus vulgaris S.), has fruit-specific promoter activity in transgenic tomato plants. Two negative regulatory regions, from -986 to -959 and from -472 to -424, were identified in this promoter region by fine deletion analyses. Removal of both regions led to constitutive expression in epidermal cells. Gain-of-function experiments showed that these two regions were sufficient to inhibit RFP (red fluorescent protein) expression in transformed epidermal cells when fused to the cauliflower mosaic virus (CaMV) 35S minimal promoter. Gel mobility shift experiments demonstrated the presence of leaf nuclear factors that interact with these two elements. A TCCAAAA motif was identified in these two regions, as well as one in the reverse orientation, which was confirmed to be a novel specific cis-element. A quantitative beta-glucuronidase (GUS) activity assay of stable transgenic tomato plants showed that the activities of chimeric promoters harbouring only one of the two cis-elements, or both, were approximately 10-fold higher in fruits than in leaves. These data confirm that the TCCAAAA motif functions as a fruit-specific element by inhibiting gene expression in leaves.
Two negative cis-regulatory regions involved in fruit-specific promoter activity from watermelon (Citrullus vulgaris S.)

PubMed Central

Yin, Tao; Wu, Hanying; Zhang, Shanglong; Liu, Jingmei; Lu, Hongyu; Zhang, Lingxiao; Xu, Yong; Chen, Daming

2009-01-01

A 1.8 kb 5′-flanking region of the large subunit of ADP-glucose pyrophosphorylase, isolated from watermelon (Citrullus vulgaris S.), has fruit-specific promoter activity in transgenic tomato plants. Two negative regulatory regions, from –986 to –959 and from –472 to –424, were identified in this promoter region by fine deletion analyses. Removal of both regions led to constitutive expression in epidermal cells. Gain-of-function experiments showed that these two regions were sufficient to inhibit RFP (red fluorescent protein) expression in transformed epidermal cells when fused to the cauliflower mosaic virus (CaMV) 35S minimal promoter. Gel mobility shift experiments demonstrated the presence of leaf nuclear factors that interact with these two elements. A TCCAAAA motif was identified in these two regions, as well as one in the reverse orientation, which was confirmed to be a novel specific cis-element. A quantitative β-glucuronidase (GUS) activity assay of stable transgenic tomato plants showed that the activities of chimeric promoters harbouring only one of the two cis-elements, or both, were ∼10-fold higher in fruits than in leaves. These data confirm that the TCCAAAA motif functions as a fruit-specific element by inhibiting gene expression in leaves. PMID:19073962
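Locating a short motif such as TCCAAAA in both orientations can be sketched as below. The example assumes that "reverse orientation" means the reverse complement (TTTTGGA) read on the same strand. The toy sequence is invented and is not the watermelon promoter, and the function names are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return (0-based start, strand) for every occurrence on either strand."""
    rc = reverse_complement(motif)
    width = len(motif)
    hits = []
    for s in range(len(seq) - width + 1):
        window = seq[s:s + width]
        if window == motif:
            hits.append((s, "+"))
        # Skip the reverse-strand report for palindromic motifs to avoid duplicates.
        if window == rc and rc != motif:
            hits.append((s, "-"))
    return hits


toy = "GGTCCAAAACGTTTTGGAGG"  # invented sequence with one site in each orientation
sites = find_motif(toy, "TCCAAAA")
print(sites)
```

Coordinates relative to a transcription start site, such as the −986 to −959 region in the abstract, would be obtained by subtracting the TSS offset from these 0-based positions.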
A systematic analysis of a mi-RNA inter-pathway regulatory motif

PubMed Central

2013-01-01

Background: The continuing discovery of new types and functions of small non-coding RNAs suggests the presence of regulatory mechanisms far more complex than the ones currently used to study and design Gene Regulatory Networks. MicroRNAs (miRNAs), for example, have been found to be part of several intra-pathway regulatory motifs. However, inter-pathway regulatory mechanisms have often been neglected and require further investigation. Results: In this paper we present the result of a systems biology study aimed at analyzing a high-level inter-pathway regulatory motif called Pathway Protection Loop, not previously described, in which miRNAs seem to play a crucial role in the successful behavior and activation of a pathway. Through the automatic analysis of a large set of publicly available databases, we found statistical evidence that this inter-pathway regulatory motif is very common in several classes of KEGG Homo sapiens pathways and contributes to creating a complex regulatory network involving several pathways connected by this specific motif. The role of this motif also seems to be confirmed by a deeper review of other research activities on selected representative pathways. Conclusions: Although previous studies suggested transcriptional regulation mechanisms at the pathway level such as the Pathway Protection Loop, a high-level analysis like the one proposed in this paper is still missing. The understanding of higher-level regulatory motifs could, for instance, lead to new approaches in the identification of therapeutic targets because it could unveil new and “indirect” paths to activate or silence a target pathway. However, a lot of work still needs to be done to better uncover this high-level inter-pathway regulation, including extending the analysis to other small non-coding RNA molecules. PMID:24152805
Assessment of composite motif discovery methods.

PubMed

Klepper, Kjetil; Sandve, Geir K; Abul, Osman; Johansen, Jostein; Drablos, Finn

2008-02-26

Computational discovery of regulatory elements is an important area of bioinformatics research and more than a hundred motif discovery methods have been published. Traditionally, most of these methods have addressed the problem of single motif discovery - discovering binding motifs for individual transcription factors. In higher organisms, however, transcription factors usually act in combination with nearby bound factors to induce specific regulatory behaviours. Hence, recent focus has shifted from single motifs to the discovery of sets of motifs bound by multiple cooperating transcription factors, so called composite motifs or cis-regulatory modules. Given the large number and diversity of methods available, independent assessment of methods becomes important. Although there have been several benchmark studies of single motif discovery, no similar studies have previously been conducted concerning composite motif discovery. We have developed a benchmarking framework for composite motif discovery and used it to evaluate the performance of eight published module discovery tools. Benchmark datasets were constructed based on real genomic sequences containing experimentally verified regulatory modules, and the module discovery programs were asked to predict both the locations of these modules and to specify the single motifs involved. To aid the programs in their search, we provided position weight matrices corresponding to the binding motifs of the transcription factors involved. In addition, selections of decoy matrices were mixed with the genuine matrices on one dataset to test the response of programs to varying levels of noise. Although some of the methods tested tended to score somewhat better than others overall, there were still large variations between individual datasets and no single method performed consistently better than the rest in all situations. The variation in performance on individual datasets also shows that the new benchmark datasets represents a
Genome-wide colonization of gene regulatory elements by G4 DNA motifs

PubMed Central

Du, Zhuo; Zhao, Yiqiang; Li, Ning

2009-01-01

G-quadruplex (or G4 DNA), a stable four-stranded structure found in guanine-rich regions, is implicated in the transcriptional regulation of genes involved in growth and development. Previous studies on the role of G4 DNA in gene regulation mostly focused on genomic regions proximal to transcription start sites (TSSs). To gain a more comprehensive understanding of the regulatory role of G4 DNA, we examined the landscape of potential G4 DNA motifs (PG4Ms) in the human genome and found that G4 motifs, not restricted to those found in the TSS-proximal regions, are biased toward gene-associated regions. Significantly, analyses of G4 motifs in seven types of well-known gene regulatory elements revealed a constitutive enrichment pattern, and the clusters of G4 motifs tend to be colocalized with regulatory elements. Considering our analysis from a genome evolutionary perspective, we found evidence that the occurrence and accumulation of certain progenitors and canonical G4 DNA motifs within regulatory regions were progressively favored by natural selection. Our results suggest that G4 DNA motifs are ‘colonized’ in regulatory regions, supporting a likely genome-wide role of G4 DNA in gene regulation. We hypothesize that G4 DNA is a regulatory apparatus situated in regulatory elements, acting as a molecular switch that can modulate the role of the host functional regions, by transition in DNA structure. PMID:19759215
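As a rough illustration of what scanning a sequence for potential G4 motifs can look like, the sketch below uses a regular expression. The abstract does not state the motif definition the authors used; the pattern here (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) is a commonly used convention and is an assumption, as are the function name and the test sequence. Only the given strand is scanned.

```python
import re

# Assumed motif definition (not taken from the abstract):
# G{3+} loop{1-7} G{3+} loop{1-7} G{3+} loop{1-7} G{3+}
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")

def find_g4_motifs(seq):
    """Return (start, matched_text) for non-overlapping potential G4 motifs."""
    return [(m.start(), m.group()) for m in G4_PATTERN.finditer(seq.upper())]

# Made-up example sequence containing four GGG runs
print(find_g4_motifs("TTGGGAGGGTGGGCGGGTT"))
```

To cover G4 motifs on the opposite strand, the same scan could be run on the reverse complement of the sequence.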
Global Profiling of Rice and Poplar Transcriptomes Highlights Key Conserved Circadian-Controlled Pathways and cis-Regulatory Modules

PubMed Central

Filichkin, Sergei A.; Breton, Ghislain; Priest, Henry D.; Dharmawardhana, Palitha; Jaiswal, Pankaj; Fox, Samuel E.; Michael, Todd P.; Chory, Joanne; Kay, Steve A.; Mockler, Todd C.

2011-01-01

Arabidopsis of three major classes of cis-regulatory modules within the plant circadian network: the morning (ME, GBOX), evening (EE, GATA), and midnight (PBX/TBX/SBX) modules. Identification of identical overrepresented motifs in the promoters of cycling genes from different species suggests that the core diurnal/circadian cis-regulatory network is deeply conserved between mono- and dicotyledonous species. PMID:21694767
Hydrogenation of fluoroarenes: Direct access to all-cis-(multi)fluorinated cycloalkanes.

PubMed

Wiesenfeldt, Mario P; Nairoukh, Zackaria; Li, Wei; Glorius, Frank

2017-09-01

All-cis-multifluorinated cycloalkanes exhibit intriguing electronic properties. In particular, they display extremely high dipole moments perpendicular to the aliphatic ring, making them highly desired motifs in material science. Very few such motifs have been prepared, as their syntheses require multistep sequences from diastereoselectively prefunctionalized precursors. Herein we report a synthetic strategy to access these valuable materials via the rhodium-cyclic (alkyl)(amino)carbene (CAAC)-catalyzed hydrogenation of readily available fluorinated arenes in hexane. This route enables the scalable single-step preparation of an abundance of multisubstituted and multifluorinated cycloalkanes, including all-cis-1,2,3,4,5,6-hexafluorocyclohexane as well as cis-configured fluorinated aliphatic heterocycles. Copyright © 2017, American Association for the Advancement of Science.
Revealing cell cycle control by combining model-based detection of periodic expression with novel cis-regulatory descriptors

PubMed Central

Andersson, Claes R; Hvidsten, Torgeir R; Isaksson, Anders; Gustafsson, Mats G; Komorowski, Jan

2007-01-01

Background: We address the issue of explaining the presence or absence of phase-specific transcription in budding yeast cultures under different conditions. To this end we use a model-based detector of gene expression periodicity to divide genes into classes depending on their behavior in experiments using different synchronization methods. While computational inference of gene regulatory circuits typically relies on expression similarity (clustering) in order to find classes of potentially co-regulated genes, this method instead takes advantage of known time profile signatures related to the studied process. Results: We explain the regulatory mechanisms of the inferred periodic classes with cis-regulatory descriptors that combine upstream sequence motifs with experimentally determined binding of transcription factors. By systematic statistical analysis we show that periodic classes are best explained by combinations of descriptors rather than single descriptors, and that different combinations correspond to periodic expression in different classes. We also find evidence for additive regulation in that the combinations of cis-regulatory descriptors associated with genes periodically expressed in fewer conditions are frequently subsets of combinations associated with genes periodically expressed in more conditions. Finally, we demonstrate that our approach retrieves combinations that are more specific towards known cell-cycle related regulators than the frequently used clustering approach. Conclusion: The results illustrate how a model-based approach to expression analysis may be particularly well suited to detect biologically relevant mechanisms. Our new approach makes it possible to provide more refined hypotheses about regulatory mechanisms of the cell cycle and it can easily be adjusted to reveal regulation of other, non-periodic, cellular processes. PMID:17939860
cis-Regulatory Mutations Are a Genetic Cause of Human Limb Malformations

PubMed Central

VanderMeer, Julia E.; Ahituv, Nadav

2011-01-01

The underlying mutations that cause human limb malformations are often difficult to determine, particularly for limb malformations that occur as isolated traits. Evidence from a variety of studies shows that cis-regulatory mutations, specifically in enhancers, can lead to some of these isolated limb malformations. Here, we provide a review of human limb malformations that have been shown to be caused by enhancer mutations and propose that cis-regulatory mutations will continue to be identified as the cause of additional human malformations as our understanding of regulatory sequences improves. PMID:21509892

Pathogenic adaptation of intracellular bacteria by rewiring a cis-regulatory input function.

PubMed

Osborne, Suzanne E; Walthers, Don; Tomljenovic, Ana M; Mulder, David T; Silphaduang, Uma; Duong, Nancy; Lowden, Michael J; Wickham, Mark E; Waller, Ross F; Kenney, Linda J; Coombes, Brian K

2009-03-10

The acquisition of DNA by horizontal gene transfer enables bacteria to adapt to previously unexploited ecological niches. Although horizontal gene transfer and mutation of protein-coding sequences are well-recognized forms of pathogen evolution, the evolutionary significance of cis-regulatory mutations in creating phenotypic diversity through altered transcriptional outputs is not known. We show the significance of regulatory mutation for pathogen evolution by mapping and then rewiring a cis-regulatory module controlling a gene required for murine typhoid. Acquisition of a binding site for the Salmonella pathogenicity island-2 regulator, SsrB, enabled the srfN gene, ancestral to the Salmonella genus, to play a role in pathoadaptation of S. typhimurium to a host animal. We identified the evolved cis-regulatory module and quantified the fitness gain that this regulatory output accrues for the bacterium using competitive infections of host animals. Our findings highlight a mechanism of pathogen evolution involving regulatory mutation that is selected because of the fitness advantage the new regulatory output provides the incipient clones.
Pathogenic adaptation of intracellular bacteria by rewiring a cis-regulatory input function

PubMed Central

Osborne, Suzanne E.; Walthers, Don; Tomljenovic, Ana M.; Mulder, David T.; Silphaduang, Uma; Duong, Nancy; Lowden, Michael J.; Wickham, Mark E.; Waller, Ross F.; Kenney, Linda J.; Coombes, Brian K.

2009-01-01

The acquisition of DNA by horizontal gene transfer enables bacteria to adapt to previously unexploited ecological niches. Although horizontal gene transfer and mutation of protein-coding sequences are well-recognized forms of pathogen evolution, the evolutionary significance of cis-regulatory mutations in creating phenotypic diversity through altered transcriptional outputs is not known. We show the significance of regulatory mutation for pathogen evolution by mapping and then rewiring a cis-regulatory module controlling a gene required for murine typhoid. Acquisition of a binding site for the Salmonella pathogenicity island-2 regulator, SsrB, enabled the srfN gene, ancestral to the Salmonella genus, to play a role in pathoadaptation of S. typhimurium to a host animal. We identified the evolved cis-regulatory module and quantified the fitness gain that this regulatory output accrues for the bacterium using competitive infections of host animals. Our findings highlight a mechanism of pathogen evolution involving regulatory mutation that is selected because of the fitness advantage the new regulatory output provides the incipient clones. PMID:19234126
Form and function in gene regulatory networks: the structure of network motifs determines fundamental properties of their dynamical state space.

PubMed

Ahnert, S E; Fink, T M A

2016-07-01

Network motifs have been studied extensively over the past decade, and certain motifs, such as the feed-forward loop, play an important role in regulatory networks. Recent studies have used Boolean network motifs to explore the link between form and function in gene regulatory networks and have found that the structure of a motif does not strongly determine its function, if this is defined in terms of the gene expression patterns the motif can produce. Here, we offer a different, higher-level definition of the 'function' of a motif, in terms of two fundamental properties of its dynamical state space as a Boolean network. One is the basin entropy, which is a complexity measure of the dynamics of Boolean networks. The other is the diversity of cyclic attractor lengths that a given motif can produce. Using these two measures, we examine all 104 topologically distinct three-node motifs and show that the structural properties of a motif, such as the presence of feedback loops and feed-forward loops, predict fundamental characteristics of its dynamical state space, which in turn determine aspects of its functional versatility. We also show that these higher-level properties have a direct bearing on real regulatory networks, as both basin entropy and cycle length diversity show a close correspondence with the prevalence, in neural and genetic regulatory networks, of the 13 connected motifs without self-interactions that have been studied extensively in the literature. © 2016 The Authors.
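The two measures named in the abstract above can be sketched for a small example. The code below assumes synchronous updating and takes basin entropy to be the Shannon entropy (in bits) of the distribution of basin-of-attraction sizes; the abstract gives neither detail, so both are assumptions. The three-node ring of repressors is a hypothetical motif chosen for illustration, not one analysed by the authors.

```python
from itertools import product
from math import log2

def step(state, rules):
    """Synchronously update every node using its Boolean rule."""
    return tuple(int(rule(state)) for rule in rules)

def attractor_basins(rules):
    """Map each attractor (a frozenset of states) to the size of its basin."""
    basins = {}
    for start in product((0, 1), repeat=len(rules)):
        seen, state = [], start
        while state not in seen:          # follow the trajectory until it repeats
            seen.append(state)
            state = step(state, rules)
        cycle = frozenset(seen[seen.index(state):])
        basins[cycle] = basins.get(cycle, 0) + 1
    return basins

def basin_entropy(basins):
    """Shannon entropy of the basin-size distribution (assumed definition)."""
    total = sum(basins.values())
    return -sum((b / total) * log2(b / total) for b in basins.values())

# Hypothetical motif: A is repressed by C, B by A, C by B (no self-interactions)
rules = [lambda s: not s[2], lambda s: not s[0], lambda s: not s[1]]

basins = attractor_basins(rules)
cycle_lengths = sorted({len(cycle) for cycle in basins})
print(cycle_lengths, round(basin_entropy(basins), 4))
```

For this ring the eight states split into a two-state cycle and a six-state cycle, so the motif yields two distinct attractor lengths and a basin entropy below one bit.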
The twilight zone of cis element alignments.

PubMed

Sebastian, Alvaro; Contreras-Moreira, Bruno

2013-02-01

Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein-DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein-DNA interfaces is significantly more conserved than their sequence and measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments.
Chaotic Motifs in Gene Regulatory Networks

PubMed Central

Zhang, Zhaoyang; Ye, Weiming; Qian, Yu; Zheng, Zhigang; Huang, Xuhui; Hu, Gang

2012-01-01

Chaos should occur often in gene regulatory networks (GRNs) which have been widely described by nonlinear coupled ordinary differential equations, if their dimensions are no less than 3. It is therefore puzzling that chaos has never been reported in GRNs in nature and is also extremely rare in models of GRNs. On the other hand, the topic of motifs has attracted great attention in studying biological networks, and network motifs are suggested to be elementary building blocks that carry out some key functions in the network. In this paper, chaotic motifs (subnetworks with chaos) in GRNs are systematically investigated. The conclusion is that: (i) chaos can only appear through competitions between different oscillatory modes with rivaling intensities. Conditions required for chaotic GRNs are found to be very strict, which make chaotic GRNs extremely rare. (ii) Chaotic motifs are explored as the simplest few-node structures capable of producing chaos, and serve as the intrinsic source of chaos of random few-node GRNs. Several optimal motifs causing chaos with atypically high probability are figured out. (iii) Moreover, we discovered that a number of special oscillators can never produce chaos. These structures bring some advantages on rhythmic functions and may help us understand the robustness of diverse biological rhythms. (iv) The methods of dominant phase-advanced driving (DPAD) and DPAD time fraction are proposed to quantitatively identify chaotic motifs and to explain the origin of chaotic behaviors in GRNs. PMID:22792171
Parallel evolution of chordate cis-regulatory code for development.

PubMed

Doglio, Laura; Goode, Debbie K; Pelleri, Maria C; Pauls, Stefan; Frabetti, Flavia; Shimeld, Sebastian M; Vavouri, Tanya; Elgar, Greg

2013-11-01

Urochordates are the closest relatives of vertebrates and at the larval stage, possess a characteristic bilateral chordate body plan. In vertebrates, the genes that orchestrate embryonic patterning are in part regulated by highly conserved non-coding elements (CNEs), yet these elements have not been identified in urochordate genomes. Consequently the evolution of the cis-regulatory code for urochordate development remains largely uncharacterised. Here, we use genome-wide comparisons between C. intestinalis and C. savignyi to identify putative urochordate cis-regulatory sequences. Ciona conserved non-coding elements (ciCNEs) are associated with largely the same key regulatory genes as vertebrate CNEs. Furthermore, some of the tested ciCNEs are able to activate reporter gene expression in both zebrafish and Ciona embryos, in a pattern that at least partially overlaps that of the gene they associate with, despite the absence of sequence identity. We also show that the ability of a ciCNE to up-regulate gene expression in vertebrate embryos can in some cases be localised to short sub-sequences, suggesting that functional cross-talk may be defined by small regions of ancestral regulatory logic, although functional sub-sequences may also be dispersed across the whole element. We conclude that the structure and organisation of cis-regulatory modules is very different between vertebrates and urochordates, reflecting their separate evolutionary histories. However, functional cross-talk still exists because the same repertoire of transcription factors has likely guided their parallel evolution, exploiting similar sets of binding sites but in different combinations.
Network-directed cis-mediator analysis of normal prostate tissue expression profiles reveals downstream regulatory associations of prostate cancer susceptibility loci.

PubMed

Larson, Nicholas B; McDonnell, Shannon K; Fogarty, Zach; Larson, Melissa C; Cheville, John; Riska, Shaun; Baheti, Saurabh; Weber, Alexandra M; Nair, Asha A; Wang, Liang; O'Brien, Daniel; Davila, Jaime; Schaid, Daniel J; Thibodeau, Stephen N

2017-10-17

Large-scale genome-wide association studies have identified multiple single-nucleotide polymorphisms associated with risk of prostate cancer. Many of these genetic variants are presumed to be regulatory in nature; however, follow-up expression quantitative trait loci (eQTL) association studies have to date been restricted largely to cis-acting associations due to study limitations. While trans-eQTL scans suffer from high testing dimensionality, recent evidence indicates most trans-eQTL associations are mediated by cis-regulated genes, such as transcription factors. Leveraging a data-driven gene co-expression network, we conducted a comprehensive cis-mediator analysis using RNA-Seq data from 471 normal prostate tissue samples to identify downstream regulatory associations of previously identified prostate cancer risk variants. We discovered multiple trans-eQTL associations that were significantly mediated by cis-regulated transcripts, four of which involved risk locus 17q12, proximal transcription factor HNF1B, and target trans-genes with known HNF response elements (MIA2, SRC, SEMA6A, KIF12). We additionally identified evidence of cis-acting down-regulation of MSMB via rs10993994 corresponding to reduced co-expression of NDRG1. The majority of these cis-mediator relationships demonstrated trans-eQTL replicability in 87 prostate tissue samples from the Gene-Tissue Expression Project. These findings provide further biological context to known risk loci and outline new hypotheses for investigation into the etiology of prostate cancer.
Identifying Cis-Regulatory Changes Involved in the Evolution of Aerobic Fermentation in Yeasts

PubMed Central

Lin, Zhenguo; Wang, Tzi-Yuan; Tsai, Bing-Shi; Wu, Fang-Ting; Yu, Fu-Jung; Tseng, Yu-Jung; Sung, Huang-Mo; Li, Wen-Hsiung

2013-01-01

Gene regulation change has long been recognized as an important mechanism for phenotypic evolution. We used the evolution of yeast aerobic fermentation as a model to explore how gene regulation has evolved and how this process has contributed to phenotypic evolution and adaptation. Most eukaryotes fully oxidize glucose to CO2 and H2O in mitochondria to maximize energy yield, whereas some yeasts, such as Saccharomyces cerevisiae and its relatives, predominantly ferment glucose into ethanol even in the presence of oxygen, a phenomenon known as aerobic fermentation. We examined the genome-wide gene expression levels among 12 different yeasts and found that a group of genes involved in the mitochondrial respiration process showed the largest reduction in gene expression level during the evolution of aerobic fermentation. Our analysis revealed that the downregulation of these genes was significantly associated with massive loss of binding motifs of Cbf1p in the fermentative yeasts. Our experimental assays confirmed the binding of Cbf1p to the predicted motif and the activator role of Cbf1p. In summary, our study laid a foundation to unravel the long-time mystery about the genetic basis of evolution of aerobic fermentation, providing new insights into understanding the role of cis-regulatory changes in phenotypic evolution. PMID:23650209
cis-Regulatory control of the initial neurogenic pattern of onecut gene expression in the sea urchin embryo.

PubMed

Barsi, Julius C; Davidson, Eric H

2016-01-01

Specification of the ciliated band (CB) of echinoid embryos executes three spatial functions essential for postgastrular organization. These are establishment of a band about 5 cells wide which delimits and bounds other embryonic territories; definition of a neurogenic domain within this band; and generation within it of arrays of ciliary cells that bear the special long cilia from which the structure derives its name. In Strongylocentrotus purpuratus the spatial coordinates of the future ciliated band are initially and exactly determined by the disposition of a ring of cells that transcriptionally activate the onecut homeodomain regulatory gene, beginning in blastula stage, long before the appearance of the CB per se. Thus the cis-regulatory apparatus that governs onecut expression in the blastula directly reveals the genomic sequence code by which these aspects of the spatial organization of the embryo are initially determined. We screened the entire onecut locus and its flanking region for transcriptionally active cis-regulatory elements, and by means of BAC recombineered deletions identified three separated and required cis-regulatory modules that execute different functions. The operating logic of the crucial spatial control module accounting for the spectacularly precise and beautiful early onecut expression domain depends on spatial repression. Previously predicted oral ectoderm and aboral ectoderm repressors were identified by cis-regulatory mutation as the products of goosecoid and irxa genes respectively, while the pan-ectodermal activator SoxB1 supplies a transcriptional driver function. Copyright © 2015. Published by Elsevier Inc.
Sequence information gain based motif analysis.

PubMed

Maynou, Joan; Pairó, Erola; Marco, Santiago; Perera, Alexandre

2015-11-09

The detection of regulatory regions in candidate sequences is essential for the understanding of the regulation of a particular gene and the mechanisms involved. This paper proposes a novel methodology based on information theoretic metrics for finding regulatory sequences in promoter regions. This methodology (SIGMA) has been tested on genomic sequence data for Homo sapiens and Mus musculus. SIGMA has been compared with different publicly available alternatives for motif detection, such as MEME/MAST, Biostrings (Bioconductor package), MotifRegressor, and previous work such as Qresiduals projections or information theoretic based detectors. Comparative results, in the form of Receiver Operating Characteristic curves, show how, in 70% of the studied Transcription Factor Binding Sites, the SIGMA detector has a better performance and behaves more robustly than the methods compared, while having a similar computational time. The performance of SIGMA can be explained by its parametric simplicity in the modelling of the non-linear co-variability in the binding motif positions. Sequence Information Gain based Motif Analysis is a generalisation of a non-linear model of the cis-regulatory sequences detection based on Information Theory. This generalisation allows us to detect transcription factor binding sites with maximum performance disregarding the covariability observed in the positions of the training set of sequences. SIGMA is freely available to the public at http://b2slab.upc.edu.
Genome-wide prediction of cis-regulatory regions using supervised deep learning methods.

PubMed

Li, Yifeng; Shi, Wenqiang; Wasserman, Wyeth W

2018-05-31

In the human genome, 98% of DNA sequences are non-protein-coding regions that were previously disregarded as junk DNA. In fact, non-coding regions host a variety of cis-regulatory regions which precisely control the expression of genes. Thus, identifying active cis-regulatory regions in the human genome is critical for understanding gene regulation and assessing the impact of genetic variation on phenotype. The developments of high-throughput sequencing and machine learning technologies make it possible to predict cis-regulatory regions genome wide. Based on rich data resources such as the Encyclopedia of DNA Elements (ENCODE) and the Functional Annotation of the Mammalian Genome (FANTOM) projects, we introduce DECRES based on supervised deep learning approaches for the identification of enhancer and promoter regions in the human genome. Due to their ability to discover patterns in large and complex data, the introduction of deep learning methods enables a significant advance in our knowledge of the genomic locations of cis-regulatory regions. Using models for well-characterized cell lines, we identify key experimental features that contribute to the predictive performance. Applying DECRES, we delineate locations of 300,000 candidate enhancers genome wide (6.8% of the genome, of which 40,000 are supported by bidirectional transcription data), and 26,000 candidate promoters (0.6% of the genome). The predicted annotations of cis-regulatory regions will provide broad utility for genome interpretation from functional genomics to clinical applications. The DECRES model demonstrates potentials of deep learning technologies when combined with high-throughput sequencing data, and inspires the development of other advanced neural network models for further improvement of genome annotations.
Novel green tissue-specific synthetic promoters and cis-regulatory elements in rice.

PubMed

Wang, Rui; Zhu, Menglin; Ye, Rongjian; Liu, Zuoxiong; Zhou, Fei; Chen, Hao; Lin, Yongjun

2015-12-11

As an important part of synthetic biology, synthetic promoter has gradually become a hotspot in current biology. The purposes of the present study were to synthesize green tissue-specific promoters and to discover green tissue-specific cis-elements. We first assembled several regulatory sequences related to tissue-specific expression in different combinations, aiming to obtain novel green tissue-specific synthetic promoters. GUS assays of the transgenic plants indicated 5 synthetic promoters showed green tissue-specific expression patterns and different expression efficiencies in various tissues. Subsequently, we scanned and counted the cis-elements in different tissue-specific promoters based on the plant cis-elements database PLACE and the rice cDNA microarray database CREP for green tissue-specific cis-element discovery, resulting in 10 potential cis-elements. The flanking sequence of one potential core element (GEAT) was predicted by bioinformatics. Then, the combination of GEAT and its flanking sequence was functionally identified with synthetic promoter. GUS assays of the transgenic plants proved its green tissue-specificity. Furthermore, the function of GEAT flanking sequence was analyzed in detail with site-directed mutagenesis. Our study provides an example for the synthesis of rice tissue-specific promoters and develops a feasible method for screening and functional identification of tissue-specific cis-elements with their flanking sequences at the genome-wide level in rice.
Global reorganisation of cis-regulatory units upon lineage commitment of human embryonic stem cells

PubMed Central

Freire-Pritchett, Paula; Schoenfelder, Stefan; Várnai, Csilla; Wingett, Steven W; Cairns, Jonathan; Collier, Amanda J; García-Vílchez, Raquel; Furlan-Magaril, Mayra; Osborne, Cameron S; Fraser, Peter; Rugg-Gunn, Peter J; Spivakov, Mikhail

2017-01-01

Long-range cis-regulatory elements such as enhancers coordinate cell-specific transcriptional programmes by engaging in DNA looping interactions with target promoters. Deciphering the interplay between the promoter connectivity and activity of cis-regulatory elements during lineage commitment is crucial for understanding developmental transcriptional control. Here, we use Promoter Capture Hi-C to generate a high-resolution atlas of chromosomal interactions involving ~22,000 gene promoters in human pluripotent and lineage-committed cells, identifying putative target genes for known and predicted enhancer elements. We reveal extensive dynamics of cis-regulatory contacts upon lineage commitment, including the acquisition and loss of promoter interactions. This spatial rewiring occurs preferentially with predicted changes in the activity of cis-regulatory elements and is associated with changes in target gene expression. Our results provide a global and integrated view of promoter interactome dynamics during lineage commitment of human pluripotent cells. DOI: http://dx.doi.org/10.7554/eLife.21926.001 PMID:28332981
Creating and validating cis-regulatory maps of tissue-specific gene expression regulation

PubMed Central

O'Connor, Timothy R.; Bailey, Timothy L.

2014-01-01

Predicting which genomic regions control the transcription of a given gene is a challenge. We present a novel computational approach for creating and validating maps that associate genomic regions (cis-regulatory modules–CRMs) with genes. The method infers regulatory relationships that explain gene expression observed in a test tissue using widely available genomic data for ‘other’ tissues. To predict the regulatory targets of a CRM, we use cross-tissue correlation between histone modifications present at the CRM and expression at genes within 1 Mbp of it. To validate cis-regulatory maps, we show that they yield more accurate models of gene expression than carefully constructed control maps. These gene expression models predict observed gene expression from transcription factor binding in the CRMs linked to that gene. We show that our maps are able to identify long-range regulatory interactions and improve substantially over maps linking genes and CRMs based on either the control maps or a ‘nearest neighbor’ heuristic. Our results also show that it is essential to include CRMs predicted in multiple tissues during map-building, that H3K27ac is the most informative histone modification, and that CAGE is the most informative measure of gene expression for creating cis-regulatory maps. PMID:25200088
The twilight zone of cis element alignments

PubMed Central

Sebastian, Alvaro; Contreras-Moreira, Bruno

2013-01-01

Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein–DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein–DNA interfaces is significantly more conserved than their sequence and measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments. PMID:23268451
BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements.

PubMed

De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

2015-12-01

The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller. Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
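The idea of treating a motif as a word over the IUPAC alphabet and checking in which species' promoters it occurs can be sketched as follows. This is a simplified illustration in Python, not BLSSpeller's Java/MapReduce implementation: it does not compute the branch length score, it searches one strand only, and the promoter sequences and species names are made up.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(word):
    """Compile an IUPAC word into a regular expression."""
    return re.compile("".join(IUPAC[ch] for ch in word.upper()))

def species_with_motif(word, promoters):
    """Return the sorted names of species whose promoter contains the word."""
    pattern = iupac_to_regex(word)
    return sorted(name for name, seq in promoters.items()
                  if pattern.search(seq.upper()))

# Hypothetical orthologous promoter fragments
promoters = {
    "species_a": "TTGACGTGGCAA",
    "species_b": "AAACACGTGTTT",
    "species_c": "GGGTTTAAACCC",
}

print(species_with_motif("CACGTG", promoters))
print(species_with_motif("NACGTG", promoters))
```

The degenerate word matches in more species than the exact word, which is the kind of trade-off a conservation score over the species tree is meant to weigh.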
D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs

PubMed Central

Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

2009-01-01

Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Currently, genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined aligned motif sequence set and motif width. For retrieval of known motif sequences, users can access commonly used databases such as TFD, RegulonDB, DBTBS, and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the resulting matrix can be converted into different file formats according to user requirements. It provides the possibility to identify conserved motifs in co-regulated genes or the whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. Availability: http://203.190.147.116/dmatrix/ PMID:19759861
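The abstract above does not spell out the "simple statistical approach" used by D-MATRIX, so the following Python sketch shows only one common way to build a weight matrix from aligned motif sequences: per-position counts, then log-odds scores against a uniform background with a pseudocount. The function names, the pseudocount of 1, the background of 0.25 and the four example sites are assumptions made for illustration, not details of the tool.

```python
import math

BASES = "ACGT"

def count_matrix(sites):
    """Count each base at each column of equally long, aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all aligned sites must have the same width")
    counts = {base: [0] * width for base in BASES}
    for site in sites:
        for i, base in enumerate(site.upper()):
            counts[base][i] += 1
    return counts

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert counts to log2 odds scores relative to a uniform background."""
    width = len(counts["A"])
    n_sites = sum(counts[base][0] for base in BASES)
    denom = n_sites + 4 * pseudocount
    return {
        base: [math.log2((counts[base][i] + pseudocount) / denom / background)
               for i in range(width)]
        for base in BASES
    }

def score(sequence, matrix):
    """Score a sequence of motif width by summing per-position weights."""
    return sum(matrix[base][i] for i, base in enumerate(sequence.upper()))

# Hypothetical aligned sites (not taken from any database).
example_sites = ["TACGAT", "TATAAT", "TATAAT", "GATACT"]
counts = count_matrix(example_sites)
weights = log_odds_matrix(counts)
```

A sequence resembling the aligned sites, such as TATAAT, scores higher under these weights than an unrelated one, such as GGGGGG, which is the property a genome-wide scan with the matrix would rely on.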
D-MATRIX: a web tool for constructing weight matrix of conserved DNA motifs.

PubMed

Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

2009-07-27

Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Currently, genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined aligned motif sequence set and motif width. For retrieval of known motif sequences, users can access commonly used databases such as TFD, RegulonDB, DBTBS, and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the resulting matrix can be converted into different file formats according to user requirements. It provides the possibility to identify conserved motifs in co-regulated genes or the whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. http://203.190.147.116/dmatrix/
Cis-regulatory landscapes of four cell types of the retina

PubMed Central

Hartl, Dominik; Jüttner, Josephine

2017-01-01

The retina is composed of ∼50 cell-types with specific functions for the process of vision. Identification of the cis-regulatory elements active in retinal cell-types is key to elucidate the networks controlling this diversity. Here, we combined transcriptome and epigenome profiling to map the regulatory landscape of four cell-types isolated from mouse retinas including rod and cone photoreceptors as well as rare inter-neuron populations such as horizontal and starburst amacrine cells. Integration of this information reveals sequence determinants and candidate transcription factors for controlling cellular specialization. Additionally, we refined parallel reporter assays to enable studying the transcriptional activity of large collections of sequences in individual cell-types isolated from a tissue. We provide proof of concept for this approach and its scalability by characterizing the transcriptional capacity of several hundred putative regulatory sequences within individual retinal cell-types. This generates a catalogue of cis-regulatory regions active in retinal cell types and we further demonstrate their utility as a potential resource for cellular tagging and manipulation. PMID:29059322
A minimal murine Msx-1 gene promoter. Organization of its cis-regulatory motifs and their role in transcriptional activation in cells in culture and in transgenic mice.

PubMed

Takahashi, T; Guron, C; Shetty, S; Matsui, H; Raghow, R

1997-09-05

To dissect the cis-regulatory elements of the murine Msx-1 promoter, which lacks a conventional TATA element, a putative Msx-1 promoter DNA fragment (from -1282 to +106 base pairs (bp)) or its congeners containing site-specific alterations were fused to luciferase reporter and introduced into NIH3T3 and C2C12 cells, and the expression of luciferase was assessed in transient expression assays. The functional consequences of the sequential 5' deletions of the promoter revealed that multiple positive and negative regulatory elements participate in regulating transcription of the Msx-1 gene. Surprisingly, however, the optimal expression of Msx-1 promoter in either NIH3T3 or C2C12 cells required only 165 bp of the upstream sequence to warrant detailed examination of its structure. Therefore, the functional consequences of site-specific deletions and point mutations of the cis-acting elements of the minimal Msx-1 promoter were systematically examined. Concomitantly, potential transcriptional factor(s) interacting with the cis-acting elements of the minimal promoter were also studied by gel electrophoretic mobility shift assays and DNase I footprinting. Combined analyses of the minimal promoter by DNase I footprinting, electrophoretic mobility shift assays, and super shift assays with specific antibodies revealed that 5'-flanking regions from -161 to -154 and from -26 to -13 of the Msx-1 promoter contain an authentic E box (proximal E box), capable of binding a protein immunologically related to the upstream stimulating factor 1 (USF-1) and a GC-rich sequence motif which can bind to Sp1 (proximal Sp1), respectively. Additionally, we observed that the promoter activation was seriously hampered if the proximal E box was removed or mutated, and the promoter activity was eliminated completely if the proximal Sp1 site was similarly altered. Absolute dependence of the Msx-1 minimal promoter on Sp1 could be demonstrated by transient expression assays in the Sp1-deficient

BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements

PubMed Central

De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

2015-01-01

Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254488
Organization of cis-acting regulatory elements in osmotic- and cold-stress-responsive promoters.

PubMed

Yamaguchi-Shinozaki, Kazuko; Shinozaki, Kazuo

2005-02-01

cis-Acting regulatory elements are important molecular switches involved in the transcriptional regulation of a dynamic network of gene activities controlling various biological processes, including abiotic stress responses, hormone responses and developmental processes. In particular, understanding regulatory gene networks in stress response cascades depends on successful functional analyses of cis-acting elements. The ever-improving accuracy of transcriptome expression profiling has led to the identification of various combinations of cis-acting elements in the promoter regions of stress-inducible genes involved in stress and hormone responses. Here we discuss major cis-acting elements, such as the ABA-responsive element (ABRE) and the dehydration-responsive element/C-repeat (DRE/CRT), that are a vital part of ABA-dependent and ABA-independent gene expression in osmotic and cold stress responses.
Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

PubMed Central

Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

2015-01-01

Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930
Integration of Bioinformatics and Synthetic Promoters Leads to the Discovery of Novel Elicitor-Responsive cis-Regulatory Sequences in Arabidopsis

PubMed Central

Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J.; Hehl, Reinhard

2012-01-01

A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction. PMID:22744985
Using reporter gene assays to identify cis regulatory differences between humans and chimpanzees.

PubMed

Chabot, Adrien; Shrit, Ralla A; Blekhman, Ran; Gilad, Yoav

2007-08-01

Most phenotypic differences between human and chimpanzee are likely to result from differences in gene regulation, rather than changes to protein-coding regions. To date, however, only a handful of human-chimpanzee nucleotide differences leading to changes in gene regulation have been identified. To home in on differences in regulatory elements between human and chimpanzee, we focused on 10 genes that were previously found to be differentially expressed between the two species. We then designed reporter gene assays for the putative human and chimpanzee promoters of the 10 genes. Of seven promoters that we found to be active in human liver cell lines, human and chimpanzee promoters had significantly different activity in four cases, three of which recapitulated the gene expression difference seen in the microarray experiment. For these three genes, we were therefore able to demonstrate that a change in cis influences expression differences between humans and chimpanzees. Moreover, using site-directed mutagenesis on one construct, the promoter for the DDA3 gene, we were able to identify three nucleotides that together lead to a cis regulatory difference between the species. High-throughput application of this approach can provide a map of regulatory element differences between humans and our close evolutionary relatives.
CisMapper: predicting regulatory interactions from transcription factor ChIP-seq data

PubMed Central

O'Connor, Timothy; Bodén, Mikael

2017-01-01

Identifying the genomic regions and regulatory factors that control the transcription of genes is an important, unsolved problem. The current method of choice predicts transcription factor (TF) binding sites using chromatin immunoprecipitation followed by sequencing (ChIP-seq), and then links the binding sites to putative target genes solely on the basis of the genomic distance between them. Evidence from chromatin conformation capture experiments shows that this approach is inadequate due to long-distance regulation via chromatin looping. We present CisMapper, which predicts the regulatory targets of a TF using the correlation between a histone mark at the TF's bound sites and the expression of each gene across a panel of tissues. Using both chromatin conformation capture and differential expression data, we show that CisMapper is more accurate at predicting the target genes of a TF than the distance-based approaches currently used, and is particularly advantageous for predicting the long-range regulatory interactions typical of tissue-specific gene expression. CisMapper also predicts which TF binding sites regulate a given gene more accurately than using genomic distance. Unlike distance-based methods, CisMapper can predict which transcription start site of a gene is regulated by a particular binding site of the TF. PMID:28204599
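The core idea stated in the abstract above, linking a binding site to a gene through the correlation between a histone mark at the site and the gene's expression across tissues, can be sketched as follows. This is only an illustration of correlation-based linking under simple assumptions: it uses a plain Pearson correlation, the function names are hypothetical, and the tissue values are invented. CisMapper's actual scoring and statistics are not described in the abstract and are not reproduced here.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / math.sqrt(var_x * var_y)

def rank_targets(mark_signal, expression):
    """Rank candidate genes by correlation with the mark at one bound site.

    mark_signal: histone-mark level at the site, one value per tissue.
    expression:  gene name -> expression level per tissue, same tissue order.
    """
    scores = {gene: pearson(mark_signal, values)
              for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Invented values for four tissues (illustrative only).
site_mark = [1.0, 2.0, 3.0, 4.0]
gene_expression = {
    "geneA": [2.0, 4.0, 6.0, 8.0],   # tracks the mark
    "geneB": [5.0, 5.0, 5.0, 5.0],   # constant across tissues
    "geneC": [8.0, 6.0, 4.0, 2.0],   # anti-correlated with the mark
}
ranking = rank_targets(site_mark, gene_expression)
```

Unlike a nearest-gene rule, nothing in this ranking depends on genomic distance, so a distal gene can outrank a proximal one when its expression follows the mark more closely.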
Cis-regulatory landscapes of four cell types of the retina.

PubMed

Hartl, Dominik; Krebs, Arnaud R; Jüttner, Josephine; Roska, Botond; Schübeler, Dirk

2017-11-16

The retina is composed of ∼50 cell-types with specific functions for the process of vision. Identification of the cis-regulatory elements active in retinal cell-types is key to elucidate the networks controlling this diversity. Here, we combined transcriptome and epigenome profiling to map the regulatory landscape of four cell-types isolated from mouse retinas including rod and cone photoreceptors as well as rare inter-neuron populations such as horizontal and starburst amacrine cells. Integration of this information reveals sequence determinants and candidate transcription factors for controlling cellular specialization. Additionally, we refined parallel reporter assays to enable studying the transcriptional activity of large collections of sequences in individual cell-types isolated from a tissue. We provide proof of concept for this approach and its scalability by characterizing the transcriptional capacity of several hundred putative regulatory sequences within individual retinal cell-types. This generates a catalogue of cis-regulatory regions active in retinal cell types and we further demonstrate their utility as a potential resource for cellular tagging and manipulation. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Conservation and diversity in the cis-regulatory networks that integrate information controlling expression of Hoxa2 in hindbrain and cranial neural crest cells in vertebrates.

PubMed

Tümpel, Stefan; Maconochie, Mark; Wiedemann, Leanne M; Krumlauf, Robb

2002-06-01

The Hoxa2 and Hoxb2 genes are members of paralogy group II and display segmental patterns of expression in the developing vertebrate hindbrain and cranial neural crest cells. Functional analyses have demonstrated that these genes play critical roles in regulating morphogenetic pathways that direct the regional identity and anteroposterior character of hindbrain rhombomeres and neural crest-derived structures. Transgenic regulatory studies have also begun to characterize enhancers and cis-elements for those mouse and chicken genes that direct restricted patterns of expression in the hindbrain and neural crest. In light of the conserved role of Hoxa2 in neural crest patterning in vertebrates and the similarities between paralogs, it is important to understand the extent to which common regulatory networks and elements have been preserved between species and between paralogs. To investigate this problem, we have cloned and sequenced the intergenic region between Hoxa2 and Hoxa3 in the chick HoxA complex and used it for making comparative analyses with the respective human, mouse, and horn shark regions. We have also used transgenic assays in mouse and chick embryos to test the functional activity of Hoxa2 enhancers in heterologous species. Our analysis reveals that three of the critical individual components of the Hoxa2 enhancer region from mouse necessary for hindbrain expression (Krox20, BoxA, and TCT motifs) have been partially conserved. However, their number and organization are highly varied for the same gene in different species and between paralogs within a species. Other essential mouse elements appear to have diverged or are absent in chick and shark. We find the mouse r3/r5 enhancer fails to work in chick embryos and the chick enhancer works poorly in mice. This implies that new motifs have been recruited or utilized to mediate restricted activity of the enhancer in other species. With respect to neural crest regulation, cis-components are embedded among
9-cis-retinoic acid represses estrogen-induced expression of the very low density apolipoprotein II gene.

PubMed

Schippers, I J; Kloppenburg, M; Snippe, L; Ab, G

1994-11-01

The chicken very low density apolipoprotein II (apoVLDLII) gene is estrogen-inducible and specifically expressed in liver. We examined the possible involvement of the retinoid X receptor (RXR) and its ligand 9-cis-retinoic acid (9-cis-RA) in the activation of the apoVLDLII promoter. We first concentrated on a potential RXR recognition site, which deviates at only one position from a perfect direct A/GGGTCA repeat spaced by one nucleotide (DR-1) and was earlier identified as a common HNF-4/COUP-TF recognition site. However, band shift analysis revealed that this imperfect DR-1 motif does not interact with RXR alpha-homodimers. In accordance with this observation we found that this regulatory element does not mediate transactivation through RXR alpha in the presence of 9-cis-RA. However, our experiments revealed another, unexpected, effect of 9-cis-RA. Instead of stimulating, 9-cis-RA attenuated estrogen-induced expression of transfected estrogen-responsive VLDL-CAT reporter plasmids. This repression appeared to take place through the main estrogen response element (ERE) of the gene. Importantly, 9-cis-RA also strongly repressed the estrogen-induced expression of the endogenous apoVLDLII gene in cultured chicken hepatoma cells.
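The DR-1 element described above, a direct repeat of A/GGGTCA with a one-nucleotide spacer, can be written in IUPAC notation as RGGTCANRGGTCA. The sketch below scans a sequence for this consensus and reports how many positions deviate, which is one way to find sites that, like the one in the abstract, differ from a perfect repeat at a single position. The function names and test sequences are hypothetical; the actual apoVLDLII site sequence is not given in the abstract and is not reconstructed here.

```python
# Bases permitted by each IUPAC symbol used in the DR-1 consensus.
ALLOWED = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "N": "ACGT",
}

# Direct repeat of RGGTCA separated by one arbitrary nucleotide (DR-1).
DR1_CONSENSUS = "RGGTCANRGGTCA"

def count_mismatches(window, consensus=DR1_CONSENSUS):
    """Number of positions where the window violates the consensus."""
    return sum(base not in ALLOWED[symbol]
               for base, symbol in zip(window, consensus))

def find_dr1(sequence, max_mismatches=1):
    """Return (start, site, mismatches) for each window within the limit."""
    sequence = sequence.upper()
    width = len(DR1_CONSENSUS)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = count_mismatches(window)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits

# Invented sequences: one perfect DR-1, one deviating at the last position.
perfect_hits = find_dr1("TTAGGTCAAAGGTCATT")
imperfect_hits = find_dr1("AGGTCAAAGGTCG")
```

A sequence match of this kind says nothing about binding: in the study above, the imperfect DR-1 did not interact with RXR alpha homodimers in band shift assays.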
Expression, subcellular localization, and cis-regulatory structure of duplicated phytoene synthase genes in melon (Cucumis melo L.).

PubMed

Qin, Xiaoqiong; Coku, Ardian; Inoue, Kentaro; Tian, Li

2011-10-01

Carotenoids perform many critical functions in plants, animals, and humans. It is therefore important to understand carotenoid biosynthesis and its regulation in plants. Phytoene synthase (PSY) catalyzes the first committed and rate-limiting step in carotenoid biosynthesis. While PSY is present as a single copy gene in Arabidopsis, duplicated PSY genes have been identified in many economically important monocot and dicot crops. CmPSY1 was previously identified from melon (Cucumis melo L.), but was not functionally characterized. We isolated a second PSY gene, CmPSY2, from melon in this work. CmPSY2 possesses a unique intron/exon structure that has not been observed in other plant PSYs. Both CmPSY1 and CmPSY2 are functional in vitro, but exhibit distinct expression patterns in different melon tissues and during fruit development, suggesting differential regulation of the duplicated melon PSY genes. In vitro chloroplast import assays verified the plastidic localization of CmPSY1 and CmPSY2 despite the lack of an obvious plastid target peptide in CmPSY2. Promoter motif analysis of the duplicated melon and tomato PSY genes and the Arabidopsis PSY revealed distinctive cis-regulatory structures of melon PSYs and identified gibberellin-responsive motifs in all PSYs except for SlPSY1, which has not been reported previously. Overall, these data provide new insights into the evolutionary history of plant PSY genes and the regulation of PSY expression by developmental and environmental signals that may involve different regulatory networks.
Putative bovine topological association domains and CTCF binding motifs can reduce the search space for causative regulatory variants of complex traits.

PubMed

Wang, Min; Hancock, Timothy P; Chamberlain, Amanda J; Vander Jagt, Christy J; Pryce, Jennie E; Cocks, Benjamin G; Goddard, Mike E; Hayes, Benjamin J

2018-05-24

Topological association domains (TADs) are chromosomal domains characterised by frequent internal DNA-DNA interactions. The transcription factor CTCF binds to conserved DNA sequence patterns called CTCF binding motifs to either prohibit or facilitate chromosomal interactions. TADs and CTCF binding motifs control gene expression, but they are not yet well defined in the bovine genome. In this paper, we sought to improve the annotation of bovine TADs and CTCF binding motifs, and assess whether the new annotation can reduce the search space for cis-regulatory variants. We used genomic synteny to map TADs and CTCF binding motifs from humans, mice, dogs and macaques to the bovine genome. We found that our mapped TADs exhibited the same hallmark properties of those sourced from experimental data, such as housekeeping genes, transfer RNA genes, CTCF binding motifs, short interspersed elements, H3K4me3 and H3K27ac. We showed that runs of genes with the same pattern of allele-specific expression (ASE) (either favouring paternal or maternal allele) were often located in the same TAD or between the same conserved CTCF binding motifs. Analyses of variance showed that when averaged across all bovine tissues tested, TADs explained 14% of ASE variation (standard deviation, SD: 0.056), while CTCF explained 27% (SD: 0.078). Furthermore, we showed that the quantitative trait loci (QTLs) associated with gene expression variation (eQTLs) or ASE variation (aseQTLs), which were identified from mRNA transcripts from 141 lactating cows' white blood and milk cells, were highly enriched at putative bovine CTCF binding motifs. The linearly-furthermost, and most-significant aseQTL and eQTL for each genic target were located within the same TAD as the gene more often than expected (Chi-Squared test P-value < 0.001). Our results suggest that genomic synteny can be used to functionally annotate conserved transcriptional components, and provides a tool to reduce the search space for causative
CRX ChIP-seq reveals the cis-regulatory architecture of mouse photoreceptors

PubMed Central

Corbo, Joseph C.; Lawrence, Karen A.; Karlstetter, Marcus; Myers, Connie A.; Abdelaziz, Musa; Dirkes, William; Weigelt, Karin; Seifert, Martin; Benes, Vladimir; Fritsche, Lars G.; Weber, Bernhard H.F.; Langmann, Thomas

2010-01-01

Approximately 98% of mammalian DNA is noncoding, yet we understand relatively little about the function of this enigmatic portion of the genome. The cis-regulatory elements that control gene expression reside in noncoding regions and can be identified by mapping the binding sites of tissue-specific transcription factors. Cone-rod homeobox (CRX) is a key transcription factor in photoreceptor differentiation and survival, but its in vivo targets are largely unknown. Here, we used chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq) on CRX to identify thousands of cis-regulatory regions around photoreceptor genes in adult mouse retina. CRX directly regulates downstream photoreceptor transcription factors and their target genes via a network of spatially distributed regulatory elements around each locus. CRX-bound regions act in a synergistic fashion to activate transcription and contain multiple CRX binding sites which interact in a spacing- and orientation-dependent manner to fine-tune transcript levels. CRX ChIP-seq was also performed on Nrl−/− retinas, which represent an enriched source of cone photoreceptors. Comparison with the wild-type ChIP-seq data set identified numerous rod- and cone-specific CRX-bound regions as well as many shared elements. Thus, CRX combinatorially orchestrates the transcriptional networks of both rods and cones by coordinating the expression of photoreceptor genes including most retinal disease genes. In addition, this study pinpoints thousands of noncoding regions of relevance to both Mendelian and complex retinal disease. PMID:20693478
Subtle Changes in Motif Positioning Cause Tissue-Specific Effects on Robustness of an Enhancer's Activity

PubMed Central

Erceg, Jelena; Saunders, Timothy E.; Girardot, Charles; Devos, Damien P.; Hufnagel, Lars; Furlong, Eileen E. M.

2014-01-01

Deciphering the specific contribution of individual motifs within cis-regulatory modules (CRMs) is crucial to understanding how gene expression is regulated and how this process is affected by sequence variation. But despite vast improvements in the ability to identify where transcription factors (TFs) bind throughout the genome, we are limited in our ability to relate information on motif occupancy to function from sequence alone. Here, we engineered 63 synthetic CRMs to systematically assess the relationship between variation in the content and spacing of motifs within CRMs to CRM activity during development using Drosophila transgenic embryos. In over half the cases, very simple elements containing only one or two types of TF binding motifs were capable of driving specific spatio-temporal patterns during development. Different motif organizations provide different degrees of robustness to enhancer activity, ranging from binary on-off responses to more subtle effects including embryo-to-embryo and within-embryo variation. By quantifying the effects of subtle changes in motif organization, we were able to model biophysical rules that explain CRM behavior and may contribute to the spatial positioning of CRM activity in vivo. For the same enhancer, the effects of small differences in motif positions varied in developmentally related tissues, suggesting that gene expression may be more susceptible to sequence variation in one tissue compared to another. This result has important implications for human eQTL studies in which many associated mutations are found in cis-regulatory regions, though the mechanism for how they affect tissue-specific gene expression is often not understood. PMID:24391522
In silico evolution of the hunchback gene indicates redundancy in cis-regulatory organization and spatial gene expression

PubMed Central

Zagrijchuk, Elizaveta A.; Sabirov, Marat A.; Holloway, David M.; Spirov, Alexander V.

2014-01-01

Biological development depends on the coordinated expression of genes in time and space. Developmental genes have extensive cis-regulatory regions which control their expression. These regions are organized in a modular manner, with different modules controlling expression at different times and locations. Both how modularity evolved and what function it serves are open questions. We present a computational model for the cis-regulation of the hunchback (hb) gene in the fruit fly (Drosophila). We simulate evolution (using an evolutionary computation approach from computer science) to find the optimal cis-regulatory arrangements for fitting experimental hb expression patterns. We find that the cis-regulatory region tends to readily evolve modularity. These cis-regulatory modules (CRMs) do not tend to control single spatial domains, but show a multi-CRM/multi-domain correspondence. We find that the CRM-domain correspondence seen in Drosophila evolves with a high probability in our model, supporting the biological relevance of the approach. The partial redundancy resulting from multi-CRM control may confer some biological robustness against corruption of regulatory sequences. The technique developed on hb could readily be applied to other multi-CRM developmental genes. PMID:24712536
Deep conservation of cis-regulatory elements in metazoans

PubMed Central

Maeso, Ignacio; Irimia, Manuel; Tena, Juan J.; Casares, Fernando; Gómez-Skarmeta, José Luis

2013-01-01

Despite the vast morphological variation observed across phyla, animals share multiple basic developmental processes orchestrated by a common ancestral gene toolkit. These genes interact with each other building complex gene regulatory networks (GRNs), which are encoded in the genome by cis-regulatory elements (CREs) that serve as computational units of the network. Although GRN subcircuits involved in ancient developmental processes are expected to be at least partially conserved, identification of CREs that are conserved across phyla has remained elusive. Here, we review recent studies that revealed such deeply conserved CREs do exist, discuss the difficulties associated with their identification and describe new approaches that will facilitate this search. PMID:24218633
Cis-acting elements in its 3′ UTR mediate post-transcriptional regulation of KRAS

PubMed Central

Kim, Minlee; Kogan, Nicole; Slack, Frank J.

2016-01-01

Multiple RNA-binding proteins and non-coding RNAs, such as microRNAs (miRNAs), are involved in post-transcriptional gene regulation through recognition motifs in the 3′ untranslated region (UTR) of their target genes. The KRAS gene encodes a key signaling protein, and its messenger RNA (mRNA) contains an exceptionally long 3′ UTR; this suggests that it may be subject to a highly complex set of regulatory processes. However, 3′ UTR-dependent regulation of KRAS expression has not been explored in detail. Using extensive deletion and mutational analyses combined with luciferase reporter assays, we have identified inhibitory and stabilizing cis-acting regions within the KRAS 3′ UTR that may interact with miRNAs and RNA-binding proteins, such as HuR. Particularly, we have identified an AU-rich 49-nt fragment in the KRAS 3′ UTR that is required for KRAS 3′ UTR reporter repression. This element contains a miR-185 complementary element, and we show that overexpression of miR-185 represses endogenous KRAS mRNA and protein in vitro. In addition, we have identified another 49-nt fragment that is required to promote KRAS 3′ UTR reporter expression. These findings indicate that multiple cis-regulatory motifs in the 3′ UTR of KRAS finely modulate its expression, and sequence alterations within a binding motif may disrupt the precise functions of trans-regulatory factors, potentially leading to aberrant KRAS expression. PMID:26930719
The identification of cis-regulatory elements: A review from a machine learning perspective.

PubMed

Li, Yifeng; Chen, Chih-Yu; Kaye, Alice M; Wasserman, Wyeth W

2015-12-01

The majority of the human genome consists of non-coding regions that have been called junk DNA. However, recent studies have revealed that these regions contain cis-regulatory elements, such as promoters, enhancers, silencers, and insulators. These regulatory elements can play crucial roles in controlling gene expression in specific cell types, conditions, and developmental stages. Disruption of these regions could contribute to phenotype changes. Precisely identifying regulatory elements is key to deciphering the mechanisms underlying transcriptional regulation. Cis-regulatory events are complex processes that involve chromatin accessibility, transcription factor binding, DNA methylation, histone modifications, and the interactions between them. The development of next-generation sequencing techniques has allowed us to capture these genomic features in depth. Applied analysis of genome sequences for clinical genetics has increased the urgency for detecting these regions. However, the complexity of cis-regulatory events and the deluge of sequencing data require accurate and efficient computational approaches, in particular, machine learning techniques. In this review, we describe machine learning approaches for predicting transcription factor binding sites, enhancers, and promoters, primarily driven by next-generation sequencing data. Data sources are provided in order to facilitate testing of novel methods. The purpose of this review is to attract computational experts and data scientists to advance this field. Crown Copyright © 2015. Published by Elsevier Ireland Ltd. All rights reserved.
Detecting cis-regulatory binding sites for cooperatively binding proteins

PubMed Central

van Oeffelen, Liesbeth; Cornelis, Pierre; Van Delm, Wouter; De Ridder, Fedor; De Moor, Bart; Moreau, Yves

2008-01-01

Several methods are available to predict cis-regulatory modules in DNA based on position weight matrices. However, the performance of these methods generally depends on a number of additional parameters that cannot be derived from sequences and are difficult to estimate because they have no physical meaning. As the best way to detect cis-regulatory modules is the way in which the proteins recognize them, we developed a new scoring method that utilizes the underlying physical binding model. This method requires no additional parameter to account for multiple binding sites; and the only necessary parameters to model homotypic cooperative interactions are the distances between adjacent protein binding sites in basepairs, and the corresponding cooperative binding constants. The heterotypic cooperative binding model requires one more parameter per cooperatively binding protein, which is the concentration multiplied by the partition function of this protein. In a case study on the bacterial ferric uptake regulator, we show that our scoring method for homotypic cooperatively binding proteins significantly outperforms other PWM-based methods where biophysical cooperativity is not taken into account. PMID:18400778
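As a rough illustration of what a physical binding model with cooperativity looks like, the sketch below evaluates a generic two-site thermodynamic model. It is a textbook-style example, not the scoring method of the paper: the function name, the association constants `k1` and `k2`, the protein concentration `conc`, and the cooperativity factor `omega` are all assumptions introduced here.

```python
def two_site_occupancy(conc, k1, k2, omega):
    """Generic two-site equilibrium binding model (illustrative only).

    conc  : free protein concentration (hypothetical units)
    k1,k2 : association constants of site 1 and site 2
    omega : cooperativity factor (1.0 means independent binding)

    Returns (partition function, P(both sites bound), P(at least one bound)).
    """
    w_empty = 1.0                           # statistical weight of bare DNA
    w_site1 = k1 * conc                     # only site 1 occupied
    w_site2 = k2 * conc                     # only site 2 occupied
    w_both = omega * k1 * k2 * conc ** 2    # both occupied, with cooperativity
    z = w_empty + w_site1 + w_site2 + w_both
    return z, w_both / z, (z - w_empty) / z


if __name__ == "__main__":
    for omega in (1.0, 10.0):
        z, p_both, p_any = two_site_occupancy(1.0, 1.0, 1.0, omega)
        print(f"omega={omega}: Z={z}, P(both)={p_both:.3f}, P(any)={p_any:.3f}")
```

With all other parameters fixed, raising `omega` above 1 shifts probability toward the doubly bound state, which is the effect a cooperativity-aware score is meant to capture.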
Identification of a transient Sox5 expressing progenitor population in the neonatal ventral forebrain by a novel cis-regulatory element

PubMed Central

Hao, Hailing; Li, Ying; Tzatzalos, Evangeline; Gilbert, Jordana; Zala, Dhara; Bhaumik, Mantu; Cai, Li

2014-01-01

Precise control of lineage-specific gene expression in neural stem/progenitor cells is crucial for generating the diversity of neuronal and glial cell types in the central nervous system (CNS). The mechanism underlying such gene regulation, however, is not fully elucidated. Here, we report that a 377 bp evolutionarily conserved DNA fragment (CR5), located approximately 32 kbp upstream of the Olig2 transcription start site, acts as a cis-regulator of gene expression during development of the neonatal forebrain. CR5 is active in a time-specific and brain region-restricted manner. CR5 activity is not detected in the embryonic stage, but is detected exclusively in a subset of Sox5+ cells in the neonatal ventral forebrain. Furthermore, we show that the Sox5 binding motif in CR5 is important for this cell-specific gene regulatory activity; mutation of the Sox5 binding motif in CR5 alters reporter gene expression with different cellular composition. Together, our study provides new insights into the regulation of cell-specific gene expression during CNS development. PMID:24954155
Function, dynamics and evolution of network motif modules in integrated gene regulatory networks of worm and plant.

PubMed

Defoort, Jonas; Van de Peer, Yves; Vermeirssen, Vanessa

2018-06-05

Gene regulatory networks (GRNs) consist of different molecular interactions that closely work together to establish proper gene expression in time and space. Especially in higher eukaryotes, many questions remain on how these interactions collectively coordinate gene regulation. We study high quality GRNs consisting of undirected protein-protein, genetic and homologous interactions, and directed protein-DNA, regulatory and miRNA-mRNA interactions in the worm Caenorhabditis elegans and the plant Arabidopsis thaliana. Our data-integration framework integrates interactions in composite network motifs, clusters these in biologically relevant, higher-order topological network motif modules, overlays these with gene expression profiles and discovers novel connections between modules and regulators. Similar modules exist in the integrated GRNs of worm and plant. We show how experimental or computational methodologies underlying a certain data type impact network topology. Through phylogenetic decomposition, we found that proteins of worm and plant tend to functionally interact with proteins of a similar age, while at the regulatory level TFs favor same age, but also older target genes. Despite some influence of the duplication mode difference, we also observe at the motif and module level for both species a preference for age homogeneity for undirected and age heterogeneity for directed interactions. This leads to a model where novel genes are added together to the GRNs in a specific biological functional context, regulated by one or more TFs that also target older genes in the GRNs. Overall, we detected topological, functional and evolutionary properties of GRNs that are potentially universal in all species.

iFORM: Incorporating Find Occurrence of Regulatory Motifs.

PubMed

Ren, Chao; Chen, Hebing; Yang, Bite; Liu, Feng; Ouyang, Zhangyi; Bo, Xiaochen; Shu, Wenjie

2016-01-01

Accurately identifying the binding sites of transcription factors (TFs) is crucial to understanding the mechanisms of transcriptional regulation and human disease. We present incorporating Find Occurrence of Regulatory Motifs (iFORM), an easy-to-use and efficient tool for scanning DNA sequences with TF motifs described as position weight matrices (PWMs). Both performance assessment with a receiver operating characteristic (ROC) curve and a correlation-based approach demonstrated that iFORM achieves higher accuracy and sensitivity by integrating five classical motif discovery programs using Fisher's combined probability test. We have used iFORM to provide accurate results on a variety of data in the ENCODE Project and the NIH Roadmap Epigenomics Project, and the tool has demonstrated its utility in further elucidating individual roles of functional elements. Both the source and binary codes for iFORM can be freely accessed at https://github.com/wenjiegroup/iFORM. The identified TF binding sites across human cell and tissue types using iFORM have been deposited in the Gene Expression Omnibus under the accession ID GSE53962.
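Fisher's combined probability test, which the abstract names as the way iFORM integrates several programs, can be sketched in a few lines. This is a minimal standalone illustration and not iFORM's code: the function name is hypothetical, and it assumes the input p-values are independent. The statistic is X = -2 Σ ln p_i, which follows a chi-square distribution with 2k degrees of freedom under the null; for an even number of degrees of freedom the tail probability has a closed form, so no external library is needed.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    Returns (chi-square statistic, combined p-value).
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    # Survival function of chi-square with 2k degrees of freedom:
    #   exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


if __name__ == "__main__":
    # Hypothetical p-values reported for one site by three different scanners.
    stat, p = fisher_combined_pvalue([0.04, 0.10, 0.02])
    print(f"statistic={stat:.3f}, combined p={p:.5f}")
```

A single p-value passes through unchanged, which is a convenient sanity check on any implementation.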
TrawlerWeb: an online de novo motif discovery tool for next-generation sequencing datasets.

PubMed

Dang, Louis T; Tondl, Markus; Chiu, Man Ho H; Revote, Jerico; Paten, Benedict; Tano, Vincent; Tokolyi, Alex; Besse, Florence; Quaife-Ryan, Greg; Cumming, Helen; Drvodelic, Mark J; Eichenlaub, Michael P; Hallab, Jeannette C; Stolper, Julian S; Rossello, Fernando J; Bogoyevitch, Marie A; Jans, David A; Nim, Hieu T; Porrello, Enzo R; Hudson, James E; Ramialison, Mirana

2018-04-05

A strong focus of the post-genomic era is mining of the non-coding regulatory genome in order to unravel the function of regulatory elements that coordinate gene expression (Nat 489:57-74, 2012; Nat 507:462-70, 2014; Nat 507:455-61, 2014; Nat 518:317-30, 2015). Whole-genome approaches based on next-generation sequencing (NGS) have provided insight into the genomic location of regulatory elements throughout different cell types, organs and organisms. These technologies are now widespread and commonly used in laboratories from various fields of research. This highlights the need for fast and user-friendly software tools dedicated to extracting cis-regulatory information contained in these regulatory regions; for instance transcription factor binding site (TFBS) composition. Ideally, such tools should not require prior programming knowledge to ensure they are accessible for all users. We present TrawlerWeb, a web-based version of the Trawler_standalone tool (Nat Methods 4:563-5, 2007; Nat Protoc 5:323-34, 2010), to allow for the identification of enriched motifs in DNA sequences obtained from next-generation sequencing experiments in order to predict their TFBS composition. TrawlerWeb is designed for online queries with standard options common to web-based motif discovery tools. In addition, TrawlerWeb provides three unique new features: 1) TrawlerWeb allows the input of BED files directly generated from NGS experiments, 2) it automatically generates an input-matched biologically relevant background, and 3) it displays resulting conservation scores for each instance of the motif found in the input sequences, which assists the researcher in prioritising the motifs to validate experimentally. Finally, to date, this web-based version of Trawler_standalone remains the fastest online de novo motif discovery tool compared to other popular web-based software, while generating predictions with high accuracy. TrawlerWeb provides users with a fast, simple and easy-to-use web
A conserved C-terminal RXG motif in the NgBR subunit of cis-prenyltransferase is critical for prenyltransferase activity.

PubMed

Grabińska, Kariona A; Edani, Ban H; Park, Eon Joo; Kraehling, Jan R; Sessa, William C

2017-10-20

cis-Prenyltransferases (cis-PTs) constitute a large family of enzymes conserved during evolution and present in all domains of life. In eukaryotes and archaea, cis-PT is the first enzyme committed to the synthesis of dolichyl phosphate, an obligate lipid carrier in protein glycosylation reactions. The homodimeric bacterial enzyme, undecaprenyl diphosphate synthase, generates 11 isoprene units and has been structurally and mechanistically characterized in great detail. Recently, we discovered that unlike undecaprenyl diphosphate synthase, mammalian cis-PT is a heteromer consisting of NgBR (Nus1) and hCIT (dehydrodolichol diphosphate synthase) subunits, and this composition has been confirmed in plants and fungal cis-PTs. Here, we establish the first purification system for heteromeric cis-PT and show that both NgBR and hCIT subunits function in catalysis and substrate binding. Finally, we identified a critical RXG sequence in the C-terminal tail of NgBR that is conserved and essential for enzyme activity across phyla. In summary, our findings show that eukaryotic cis-PT is composed of the NgBR and hCIT subunits. The strong conservation of the RXG motif among NgBR orthologs indicates that this subunit is critical for the synthesis of polyprenol diphosphates and cellular function. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
A novel approach to identifying regulatory motifs in distantly related genomes

PubMed Central

Van Hellemont, Ruth; Monsieurs, Pieter; Thijs, Gert; De Moor, Bart; Van de Peer, Yves; Marchal, Kathleen

2005-01-01

Although proven successful in the identification of regulatory motifs, phylogenetic footprinting methods still show some shortcomings. To assess these difficulties, most apparent when applying phylogenetic footprinting to distantly related organisms, we developed a two-step procedure that combines the advantages of sequence alignment and motif detection approaches. The results on well-studied benchmark datasets indicate that the presented method outperforms other methods when the sequences become either too long or too heterogeneous in size. PMID:16420672
Overview Article: Identifying transcriptional cis-regulatory modules in animal genomes

PubMed Central

Suryamohan, Kushal; Halfon, Marc S.

2014-01-01

Gene expression is regulated through the activity of transcription factors and chromatin modifying proteins acting on specific DNA sequences, referred to as cis-regulatory elements. These include promoters, located at the transcription initiation sites of genes, and a variety of distal cis-regulatory modules (CRMs), the most common of which are transcriptional enhancers. Because regulated gene expression is fundamental to cell differentiation and acquisition of new cell fates, identifying, characterizing, and understanding the mechanisms of action of CRMs is critical for understanding development. CRM discovery has historically been challenging, as CRMs can be located far from the genes they regulate, have few readily-identifiable sequence characteristics, and for many years were not amenable to high-throughput discovery methods. However, the recent availability of complete genome sequences and the development of next-generation sequencing methods has led to an explosion of both computational and empirical methods for CRM discovery in model and non-model organisms alike. Experimentally, CRMs can be identified through chromatin immunoprecipitation directed against transcription factors or histone post-translational modifications, identification of nucleosome-depleted “open” chromatin regions, or sequencing-based high-throughput functional screening. Computational methods include comparative genomics, clustering of known or predicted transcription factor binding sites, and supervised machine-learning approaches trained on known CRMs. All of these methods have proven effective for CRM discovery, but each has its own considerations and limitations, and each is subject to a greater or lesser number of false-positive identifications. Experimental confirmation of predictions is essential, although shortcomings in current methods suggest that additional means of validation need to be developed. PMID:25704908
Optimized mixed Markov models for motif identification

PubMed Central

Huang, Weichun; Umbach, David M; Ohler, Uwe; Li, Leping

2006-01-01

Background: Identifying functional elements, such as transcriptional factor binding sites, is a fundamental step in reconstructing gene regulatory networks and remains a challenging issue, largely due to limited availability of training samples. Results: We introduce a novel and flexible model, the Optimized Mixture Markov model (OMiMa), and related methods to allow adjustment of model complexity for different motifs. In comparison with other leading methods, OMiMa can incorporate more than the NNSplice's pairwise dependencies; OMiMa avoids model over-fitting better than the Permuted Variable Length Markov Model (PVLMM); and OMiMa requires smaller training samples than the Maximum Entropy Model (MEM). Testing on both simulated and actual data (regulatory cis-elements and splice sites), we found OMiMa's performance superior to the other leading methods in terms of prediction accuracy, required size of training data or computational time. Our OMiMa system, to our knowledge, is the only motif finding tool that incorporates automatic selection of the best model. OMiMa is freely available at [1]. Conclusion: Our optimized mixture of Markov models represents an alternative to the existing methods for modeling dependent structures within a biological motif. Our model is conceptually simple and effective, and can improve prediction accuracy and/or computational speed over other leading methods. PMID:16749929
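To make "dependent structures within a motif" concrete, the sketch below trains a position-specific first-order Markov model on aligned sites, so that each position depends on its left neighbor. It is a deliberately simplified stand-in and not OMiMa itself, which according to the abstract mixes models and selects complexity automatically; the function names, the pseudocount, and the toy sites are assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"


def train_first_order(sites, pseudocount=1.0):
    """Fit start and per-position transition probabilities from aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    # Distribution of the first base.
    start_counts = {b: pseudocount for b in ALPHABET}
    for s in sites:
        start_counts[s[0]] += 1
    total = sum(start_counts.values())
    start = {b: c / total for b, c in start_counts.items()}
    # One transition table per position: P(base at pos | base at pos - 1).
    trans = []
    for pos in range(1, length):
        counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
        for s in sites:
            counts[s[pos - 1]][s[pos]] += 1
        table = {}
        for a in ALPHABET:
            row_total = sum(counts[a].values())
            table[a] = {b: counts[a][b] / row_total for b in ALPHABET}
        trans.append(table)
    return start, trans


def log_likelihood(model, seq):
    """Log-probability of seq under the first-order model."""
    start, trans = model
    if len(seq) != len(trans) + 1:
        raise ValueError("sequence length does not match the model")
    ll = math.log(start[seq[0]])
    for pos in range(1, len(seq)):
        ll += math.log(trans[pos - 1][seq[pos - 1]][seq[pos]])
    return ll


if __name__ == "__main__":
    # Toy training set with two site "families" (hypothetical data).
    model = train_first_order(["ACGT", "ACGT", "TGCA", "TGCA"])
    for candidate in ("ACGT", "ACCA"):
        print(candidate, round(log_likelihood(model, candidate), 3))
```

In this toy set a model with independent positions would score the chimeric site ACCA as highly as ACGT, because every base of ACCA is common at its own position; the first-order model penalizes the unseen C-to-C step.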
RSAT 2015: Regulatory Sequence Analysis Tools

PubMed Central

Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A.; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M.; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

2015-01-01

RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632
RSAT 2018: regulatory sequence analysis tools 20th anniversary.

PubMed

Nguyen, Nga Thi Thuy; Contreras-Moreira, Bruno; Castro-Mondragon, Jaime A; Santana-Garcia, Walter; Ossio, Raul; Robles-Espinoza, Carla Daniela; Bahin, Mathieu; Collombet, Samuel; Vincens, Pierre; Thieffry, Denis; van Helden, Jacques; Medina-Rivera, Alejandra; Thomas-Chollier, Morgane

2018-05-02

RSAT (Regulatory Sequence Analysis Tools) is a suite of modular tools for the detection and the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, including from genome-wide datasets like ChIP-seq/ATAC-seq, (ii) motif scanning, (iii) motif analysis (quality assessment, comparisons and clustering), (iv) analysis of regulatory variations, (v) comparative genomics. Six public servers jointly support 10 000 genomes from all kingdoms. Six novel or refactored programs have been added since the 2015 NAR Web Software Issue, including updated programs to analyse regulatory variants (retrieve-variation-seq, variation-scan, convert-variations), along with tools to extract sequences from a list of coordinates (retrieve-seq-bed), to select motifs from motif collections (retrieve-matrix), and to extract orthologs based on Ensembl Compara (get-orthologs-compara). Three use cases illustrate the integration of new and refactored tools to the suite. This Anniversary update gives a 20-year perspective on the software suite. RSAT is well-documented and available through Web sites, SOAP/WSDL (Simple Object Access Protocol/Web Services Description Language) web services, virtual machines and stand-alone programs at http://www.rsat.eu/.
Cis-Regulatory Changes Associated with a Recent Mating System Shift and Floral Adaptation in Capsella

PubMed Central

Steige, Kim A.; Reimegård, Johan; Koenig, Daniel; Scofield, Douglas G.; Slotte, Tanja

2015-01-01

The selfing syndrome constitutes a suite of floral and reproductive trait changes that have evolved repeatedly across many evolutionary lineages in response to the shift to selfing. Convergent evolution of the selfing syndrome suggests that these changes are adaptive, yet our understanding of the detailed molecular genetic basis of the selfing syndrome remains limited. Here, we investigate the role of cis-regulatory changes during the recent evolution of the selfing syndrome in Capsella rubella, which split from the outcrosser Capsella grandiflora less than 200 ka. We assess allele-specific expression (ASE) in leaves and flower buds at a total of 18,452 genes in three interspecific F1 C. grandiflora x C. rubella hybrids. Using a hierarchical Bayesian approach that accounts for technical variation using genomic reads, we find evidence for extensive cis-regulatory changes. On average, 44% of the assayed genes show evidence of ASE; however, only 6% show strong allelic expression biases. Flower buds, but not leaves, show an enrichment of cis-regulatory changes in genomic regions responsible for floral and reproductive trait divergence between C. rubella and C. grandiflora. We further detected an excess of heterozygous transposable element (TE) insertions near genes with ASE, and TE insertions targeted by uniquely mapping 24-nt small RNAs were associated with reduced expression of nearby genes. Our results suggest that cis-regulatory changes have been important during the recent adaptive floral evolution in Capsella and that differences in TE dynamics between selfing and outcrossing species could be important for rapid regulatory divergence in association with mating system shifts. PMID:26318184
Mapping cis- and trans-regulatory effects across multiple tissues in twins

PubMed Central

Grundberg, Elin; Small, Kerrin S.; Hedman, Åsa K.; Nica, Alexandra C.; Buil, Alfonso; Keildson, Sarah; Bell, Jordana T.; Yang, Tsun-Po; Meduri, Eshwar; Barrett, Amy; Nisbett, James; Sekowska, Magdalena; Wilk, Alicja; Shin, So-Youn; Glass, Daniel; Travers, Mary; Min, Josine L.; Ring, Sue; Ho, Karen; Thorleifsson, Gudmar; Kong, Augustine; Thorsteindottir, Unnur; Ainali, Chrysanthi; Dimas, Antigone S.; Hassanali, Neelam; Ingle, Catherine; Knowles, David; Krestyaninova, Maria; Lowe, Christopher E.; Di Meglio, Paola; Montgomery, Stephen B.; Parts, Leopold; Potter, Simon; Surdulescu, Gabriela; Tsaprouni, Loukia; Tsoka, Sophia; Bataille, Veronique; Durbin, Richard; Nestle, Frank O.; O’Rahilly, Stephen; Soranzo, Nicole; Lindgren, Cecilia M.; Zondervan, Krina T.; Ahmadi, Kourosh R.; Schadt, Eric E.; Stefansson, Kari; Smith, George Davey; McCarthy, Mark I.; Deloukas, Panos; Dermitzakis, Emmanouil T.; Spector, Tim D.

2013-01-01

Sequence-based variation in gene expression is a key driver of disease risk. Common variants regulating expression in cis have been mapped in many eQTL studies typically in single tissues from unrelated individuals. Here, we present a comprehensive analysis of gene expression across multiple tissues conducted in a large set of mono- and dizygotic twins that allows systematic dissection of genetic (cis and trans) and non-genetic effects on gene expression. Using identity-by-descent estimates, we show that at least 40% of the total heritable cis-effect on expression cannot be accounted for by common cis-variants, a finding which exposes the contribution of low frequency and rare regulatory variants with respect to both transcriptional regulation and complex trait susceptibility. We show that a substantial proportion of gene expression heritability is trans to the structural gene and identify several replicating trans-variants which act predominantly in a tissue-restricted manner and may regulate the transcription of many genes. PMID:22941192
MotifMark: Finding regulatory motifs in DNA sequences.

PubMed

Hassanzadeh, Hamid Reza; Kolhe, Pushkar; Isbell, Charles L; Wang, May D

2017-07-01

The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity between proteins and DNA motifs. Despite their success, these technologies have their own limitations and fall short in precise characterization of motifs, and as a result, require further downstream analysis to extract useful and interpretable information from a haystack of noisy and inaccurate data. Here we propose MotifMark, a new algorithm based on graph theory and machine learning, that can find binding sites on candidate probes and rank their specificity in regard to the underlying transcription factor. We developed a pipeline to analyze experimental data derived from compact universal protein binding microarrays and benchmarked it against two of the most accurate motif search methods. Our results indicate that MotifMark can be a viable alternative technique for prediction of motif from protein binding microarrays and possibly other related high-throughput techniques.
Using SCOPE to identify potential regulatory motifs in coregulated genes.

PubMed

Martyanov, Viktor; Gross, Robert H

2011-05-31

SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data. In this article, we utilize a web version of SCOPE to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs and has been used in other studies. The three algorithms that comprise SCOPE are BEAM, which finds non-degenerate motifs (ACCGGT), PRISM, which finds degenerate motifs (ASCGWT), and SPACER, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well. Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor. Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run. SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from
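The three motif classes named in this abstract (non-degenerate, degenerate, and bipartite) are all written with IUPAC nucleotide codes, so a single helper can locate any of them in a sequence. The sketch below is not part of SCOPE and does no statistical scoring; the function names and the example sequences are hypothetical, and only the forward strand is searched.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'ASCGWT' into a compiled pattern."""
    body = "".join(IUPAC[ch] for ch in motif.upper())
    # The lookahead lets overlapping occurrences be reported as well.
    return re.compile("(?=(" + body + "))")


def find_motif(motif, sequence):
    """Return (0-based start, matched text) for each forward-strand hit."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    print(find_motif("ACCGGT", "GGACCGGTAA"))              # non-degenerate
    print(find_motif("ASCGWT", "TTACCGATGGAGCGTTAA"))      # degenerate
    print(find_motif("ACCnnnnnnnnGGT", "ACCATATATATGGT"))  # bipartite
```

A real search would normally also scan the reverse complement and compare hit counts against a background model before calling a motif over-represented.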
Edge usage, motifs, and regulatory logic for cell cycling genetic networks

NASA Astrophysics Data System (ADS)

Zagorski, M.; Krzywicki, A.; Martin, O. C.

2013-01-01

The cell cycle is a tightly controlled process, yet it shows marked differences across species. Which of its structural features follow solely from the ability to control gene expression? We tackle this question in silico by examining the ensemble of all regulatory networks which satisfy the constraint of producing a given sequence of gene expressions. We focus on three cell cycle profiles coming from baker's yeast, fission yeast, and mammals. First, we show that the networks in each of the ensembles use just a few interactions that are repeatedly reused as building blocks. Second, we find an enrichment in network motifs that is similar in the two yeast cell cycle systems investigated. These motifs do not have autonomous functions, yet they reveal a regulatory logic for cell cycling based on a feed-forward cascade of activating interactions.
Genome-wide computational analysis reveals cardiomyocyte-specific transcriptional Cis-regulatory motifs that enable efficient cardiac gene therapy.

PubMed

Rincon, Melvin Y; Sarcar, Shilpita; Danso-Abeam, Dina; Keyaerts, Marleen; Matrai, Janka; Samara-Kuko, Ermira; Acosta-Sanchez, Abel; Athanasopoulos, Takis; Dickson, George; Lahoutte, Tony; De Bleser, Pieter; VandenDriessche, Thierry; Chuah, Marinee K

2015-01-01

Gene therapy is a promising emerging therapeutic modality for the treatment of cardiovascular diseases and hereditary diseases that afflict the heart. Hence, there is a need to develop robust cardiac-specific expression modules that allow for stable expression of the gene of interest in cardiomyocytes. We therefore explored a new approach based on a genome-wide bioinformatics strategy that revealed novel cardiac-specific cis-acting regulatory modules (CS-CRMs). These transcriptional modules contained evolutionary-conserved clusters of putative transcription factor binding sites that correspond to a "molecular signature" associated with robust gene expression in the heart. We then validated these CS-CRMs in vivo using an adeno-associated viral vector serotype 9 that drives a reporter gene from a quintessential cardiac-specific α-myosin heavy chain promoter. Most de novo designed CS-CRMs resulted in a >10-fold increase in cardiac gene expression. The most robust CRMs enhanced cardiac-specific transcription 70- to 100-fold. Expression was sustained and restricted to cardiomyocytes. We then combined the most potent CS-CRM4 with a synthetic heart and muscle-specific promoter (SPc5-12) and obtained a significant 20-fold increase in cardiac gene expression compared to the cytomegalovirus promoter. This study underscores the potential of rational vector design to improve the robustness of cardiac gene therapy.
Promzea: a pipeline for discovery of co-regulatory motifs in maize and other plant species and its application to the anthocyanin and phlobaphene biosynthetic pathways and the Maize Development Atlas.

PubMed

Liseron-Monfils, Christophe; Lewis, Tim; Ashlock, Daniel; McNicholas, Paul D; Fauteux, François; Strömvik, Martina; Raizada, Manish N

2013-03-15

The discovery of genetic networks and cis-acting DNA motifs underlying their regulation is a major objective of transcriptome studies. The recent release of the maize genome (Zea mays L.) has facilitated in silico searches for regulatory motifs. Several algorithms exist to predict cis-acting elements, but none have been adapted for maize. A benchmark data set was used to evaluate the accuracy of three motif discovery programs: BioProspector, Weeder and MEME. Analysis showed that each motif discovery tool had limited accuracy and appeared to retrieve a distinct set of motifs. Therefore, using the benchmark, statistical filters were optimized to reduce the false discovery ratio, and then remaining motifs from all programs were combined to improve motif prediction. These principles were integrated into a user-friendly pipeline for motif discovery in maize called Promzea, available at http://www.promzea.org and on the Discovery Environment of the iPlant Collaborative website. Promzea was subsequently expanded to include rice and Arabidopsis. Within Promzea, a user enters cDNA sequences or gene IDs; corresponding upstream sequences are retrieved from the maize genome. Predicted motifs are filtered, combined and ranked. Promzea searches the chosen plant genome for genes containing each candidate motif, providing the user with the gene list and corresponding gene annotations. Promzea was validated in silico using a benchmark data set: the Promzea pipeline showed a 22% increase in nucleotide sensitivity compared to the best standalone program tool, Weeder, with equivalent nucleotide specificity. Promzea was also validated by its ability to retrieve the experimentally defined binding sites of transcription factors that regulate the maize anthocyanin and phlobaphene biosynthetic pathways. Promzea predicted additional promoter motifs, and genome-wide motif searches by Promzea identified 127 non-anthocyanin/phlobaphene genes that each contained all five predicted promoter
Computational Approaches to Identify Promoters and cis-Regulatory Elements in Plant Genomes

PubMed Central

Rombauts, Stephane; Florquin, Kobe; Lescot, Magali; Marchal, Kathleen; Rouzé, Pierre; Van de Peer, Yves

2003-01-01

The identification of promoters and their regulatory elements is one of the major challenges in bioinformatics and integrates comparative, structural, and functional genomics. Many different approaches have been developed to detect conserved motifs in a set of genes that are either coregulated or orthologous. However, although recent approaches seem promising, in general, unambiguous identification of regulatory elements is not straightforward. The delineation of promoters is even harder, due to its complex nature, and in silico promoter prediction is still in its infancy. Here, we review the different approaches that have been developed for identifying promoters and their regulatory elements. We discuss the detection of cis-acting regulatory elements using word-counting or probabilistic methods (so-called “search by signal” methods) and the delineation of promoters by considering both sequence content and structural features (“search by content” methods). As an example of search by content, we explored in greater detail the association of promoters with CpG islands. However, due to differences in sequence content, the parameters used to detect CpG islands in humans and other vertebrates cannot be used for plants. Therefore, a preliminary attempt was made to define parameters that could possibly define CpG and CpNpG islands in Arabidopsis, by exploring the compositional landscape around the transcriptional start site. To this end, a data set of more than 5,000 gene sequences was built, including the promoter region, the 5′-untranslated region, and the first introns and coding exons. Preliminary analysis shows that promoter location based on the detection of potential CpG/CpNpG islands in the Arabidopsis genome is not straightforward. Nevertheless, because the landscape of CpG/CpNpG islands differs considerably between promoters and introns on the one side and exons (whether coding or not) on the other, more sophisticated approaches can probably be
Lmx1b-targeted cis-regulatory modules involved in limb dorsalization.

PubMed

Haro, Endika; Watson, Billy A; Feenstra, Jennifer M; Tegeler, Luke; Pira, Charmaine U; Mohan, Subburaman; Oberg, Kerby C

2017-06-01

Lmx1b is a homeodomain transcription factor responsible for limb dorsalization. Despite striking double-ventral (loss-of-function) and double-dorsal (gain-of-function) limb phenotypes, no direct gene targets in the limb have been confirmed. To determine direct targets, we performed a chromatin immunoprecipitation against Lmx1b in mouse limbs at embryonic day 12.5 followed by next-generation sequencing (ChIP-seq). Nearly 84% (n = 617) of the Lmx1b-bound genomic intervals (LBIs) identified overlap with chromatin regulatory marks indicative of potential cis-regulatory modules (PCRMs). In addition, 73 LBIs mapped to CRMs that are known to be active during limb development. We compared Lmx1b-bound PCRMs with genes regulated by Lmx1b and found 292 PCRMs within 1 Mb of 254 Lmx1b-regulated genes. Gene ontological analysis suggests that Lmx1b targets extracellular matrix production, bone/joint formation, axonal guidance, vascular development, cell proliferation and cell movement. We validated the functional activity of a PCRM associated with joint-related Gdf5 that provides a mechanism for Lmx1b-mediated joint modification and a PCRM associated with Lmx1b that suggests a role in autoregulation. This is the first report to describe genome-wide Lmx1b binding during limb development, directly linking Lmx1b to targets that accomplish limb dorsalization. © 2017. Published by The Company of Biologists Ltd.
A polymorphism in a conserved posttranscriptional regulatory motif alters bone morphogenetic protein 2 (BMP2) RNA:protein interactions.

PubMed

Fritz, David T; Jiang, Shan; Xu, Junwang; Rogers, Melissa B

2006-07-01

The bone morphogenetic protein (BMP)2 gene has been genetically linked to osteoporosis and osteoarthritis. We have shown that the 3'-untranslated regions (UTR) of BMP2 genes from mammals to fishes are extraordinarily conserved. This indicates that the BMP2 3'-UTR is under stringent selective pressure. We present evidence that the conserved region is a strong posttranscriptional regulator of BMP2 expression. Polymorphisms in cis-regulatory elements have been proven to influence susceptibility to a growing number of diseases. A common single nucleotide polymorphism (SNP) disrupts a putative posttranscriptional regulatory motif, an AU-rich element, within the BMP2 3'-UTR. The affinity of specific proteins for the rs15705 SNP sequence differs from their affinity for the normal human sequence. More importantly, the in vitro decay rate of RNAs with the SNP is higher than that of RNAs with the normal sequence. Such changes in mRNA:protein interactions may influence the posttranscriptional mechanisms that control BMP2 gene expression. The consequent alterations in BMP2 protein levels may influence the development or physiology of bone or other BMP2-influenced tissues.
RSAT 2015: Regulatory Sequence Analysis Tools.

PubMed

Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

2015-07-01

RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Distal regulatory regions restrict the expression of cis-linked genes to the tapetal cells.

PubMed

Franco, Luciana O; de O Manes, Carmem Lara; Hamdi, Said; Sachetto-Martins, Gilberto; de Oliveira, Dulce E

2002-04-24

The oleosin glycine-rich protein genes Atgrp-6, Atgrp-7, and Atgrp-8 occur in clusters in the Arabidopsis genome and are expressed specifically in the tapetum cells. The cis-regulatory regions involved in the tissue-specific gene expression were investigated by fusing different segments of the gene cluster to the uidA reporter gene. Common distal regulatory regions were identified that coordinate expression of the sequential genes. At least two of these genes were regulated spatially by proximal and distal sequences. The cis-acting elements (122 bp upstream of the transcriptional start point) drive the uidA expression to floral tissues, whereas distal 5' upstream regions restrict the gene activity to tapetal cells.

PhyloGibbs-MP: Module Prediction and Discriminative Motif-Finding by Gibbs Sampling

PubMed Central

Siddharthan, Rahul

2008-01-01

PhyloGibbs, our recent Gibbs-sampling motif-finder, takes phylogeny into account in detecting binding sites for transcription factors in DNA and assigns posterior probabilities to its predictions obtained by sampling the entire configuration space. Here, in an extension called PhyloGibbs-MP, we widen the scope of the program, addressing two major problems in computational regulatory genomics. First, PhyloGibbs-MP can localise predictions to small, undetermined regions of a large input sequence, thus effectively predicting cis-regulatory modules (CRMs) ab initio while simultaneously predicting binding sites in those modules—tasks that are usually done by two separate programs. PhyloGibbs-MP's performance at such ab initio CRM prediction is comparable with or superior to dedicated module-prediction software that use prior knowledge of previously characterised transcription factors. Second, PhyloGibbs-MP can predict motifs that differentiate between two (or more) different groups of regulatory regions, that is, motifs that occur preferentially in one group over the others. While other “discriminative motif-finders” have been published in the literature, PhyloGibbs-MP's implementation has some unique features and flexibility. Benchmarks on synthetic and actual genomic data show that this algorithm is successful at enhancing predictions of differentiating sites and suppressing predictions of common sites and compares with or outperforms other discriminative motif-finders on actual genomic data. Additional enhancements include significant performance and speed improvements, the ability to use “informative priors” on known transcription factors, and the ability to output annotations in a format that can be visualised with the Generic Genome Browser. In stand-alone motif-finding, PhyloGibbs-MP remains competitive, outperforming PhyloGibbs-1.0 and other programs on benchmark data. PMID:18769735
Unraveling transcriptional control and cis-regulatory codes using the software suite GeneACT

PubMed Central

Cheung, Tom Hiu; Kwan, Yin Lam; Hamady, Micah; Liu, Xuedong

2006-01-01

Deciphering gene regulatory networks requires the systematic identification of functional cis-acting regulatory elements. We present a suite of web-based bioinformatics tools, called GeneACT, that can rapidly detect evolutionarily conserved transcription factor binding sites or microRNA target sites that are either unique or over-represented in differentially expressed genes from DNA microarray data. GeneACT provides graphic visualization and extraction of common regulatory sequence elements in the promoters and 3'-untranslated regions that are conserved across multiple mammalian species. PMID:17064417
Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord

PubMed Central

José-Edwards, Diana S.; Oda-Ishii, Izumi; Kugler, Jamie E.; Passamaneck, Yale J.; Katikala, Lavanya; Nibu, Yutaka; Di Gregorio, Anna

2015-01-01

A main challenge of modern biology is to understand how specific constellations of genes are activated to differentiate cells and give rise to distinct tissues. This study focuses on elucidating how gene expression is initiated in the notochord, an axial structure that provides support and patterning signals to embryos of humans and all other chordates. Although numerous notochord genes have been identified, the regulatory DNAs that orchestrate development and propel evolution of this structure by eliciting notochord gene expression remain mostly uncharted, and the information on their configuration and recurrence is still quite fragmentary. Here we used the simple chordate Ciona for a systematic analysis of notochord cis-regulatory modules (CRMs), and investigated their composition, architectural constraints, predictive ability and evolutionary conservation. We found that most Ciona notochord CRMs relied upon variable combinations of binding sites for the transcription factors Brachyury and/or Foxa2, which can act either synergistically or independently from one another. Notably, one of these CRMs contains a Brachyury binding site juxtaposed to an (AC) microsatellite, an unusual arrangement also found in Brachyury-bound regulatory regions in mouse. In contrast, different subsets of CRMs relied upon binding sites for transcription factors of widely diverse families. Surprisingly, we found that neither intra-genomic nor interspecific conservation of binding sites were reliably predictive hallmarks of notochord CRMs. We propose that rather than obeying a rigid sequence-based cis-regulatory code, most notochord CRMs are rather unique. Yet, this study uncovered essential elements recurrently used by divergent chordates as basic building blocks for notochord CRMs. PMID:26684323
Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord.

PubMed

José-Edwards, Diana S; Oda-Ishii, Izumi; Kugler, Jamie E; Passamaneck, Yale J; Katikala, Lavanya; Nibu, Yutaka; Di Gregorio, Anna

2015-12-01

A main challenge of modern biology is to understand how specific constellations of genes are activated to differentiate cells and give rise to distinct tissues. This study focuses on elucidating how gene expression is initiated in the notochord, an axial structure that provides support and patterning signals to embryos of humans and all other chordates. Although numerous notochord genes have been identified, the regulatory DNAs that orchestrate development and propel evolution of this structure by eliciting notochord gene expression remain mostly uncharted, and the information on their configuration and recurrence is still quite fragmentary. Here we used the simple chordate Ciona for a systematic analysis of notochord cis-regulatory modules (CRMs), and investigated their composition, architectural constraints, predictive ability and evolutionary conservation. We found that most Ciona notochord CRMs relied upon variable combinations of binding sites for the transcription factors Brachyury and/or Foxa2, which can act either synergistically or independently from one another. Notably, one of these CRMs contains a Brachyury binding site juxtaposed to an (AC) microsatellite, an unusual arrangement also found in Brachyury-bound regulatory regions in mouse. In contrast, different subsets of CRMs relied upon binding sites for transcription factors of widely diverse families. Surprisingly, we found that neither intra-genomic nor interspecific conservation of binding sites were reliably predictive hallmarks of notochord CRMs. We propose that rather than obeying a rigid sequence-based cis-regulatory code, most notochord CRMs are rather unique. Yet, this study uncovered essential elements recurrently used by divergent chordates as basic building blocks for notochord CRMs.
Systematic comparison of the response properties of protein and RNA mediated gene regulatory motifs.

PubMed

Iyengar, Bharat Ravi; Pillai, Beena; Venkatesh, K V; Gadgil, Chetan J

2017-05-30

We present a framework enabling the dissection of the effects of motif structure (feedback or feedforward), the nature of the controller (RNA or protein), and the regulation mode (transcriptional, post-transcriptional or translational) on the response to a step change in the input. We have used a common model framework for gene expression where both motif structures have an activating input and repressing regulator, with the same set of parameters, to enable a comparison of the responses. We studied the global sensitivity of the system properties, such as steady-state gain, overshoot, peak time, and peak duration, to parameters. We find that, in all motifs, overshoot correlated negatively whereas peak duration varied concavely with peak time. Differences in the other system properties were found to be mainly dependent on the nature of the controller rather than the motif structure. Protein mediated motifs showed a higher degree of adaptation i.e. a tendency to return to baseline levels; in particular, feedforward motifs exhibited perfect adaptation. RNA mediated motifs had a mild regulatory effect; they also exhibited a lower peaking tendency and mean overshoot. Protein mediated feedforward motifs showed higher overshoot and lower peak time compared to the corresponding feedback motifs.
COPS: Detecting Co-Occurrence and Spatial Arrangement of Transcription Factor Binding Motifs in Genome-Wide Datasets

PubMed Central

Lohmann, Ingrid

2012-01-01

In multi-cellular organisms, spatiotemporal activity of cis-regulatory DNA elements depends on their occupancy by different transcription factors (TFs). In recent years, genome-wide ChIP-on-Chip, ChIP-Seq and DamID assays have been extensively used to unravel the combinatorial interaction of TFs with cis-regulatory modules (CRMs) in the genome. Even though genome-wide binding profiles are increasingly becoming available for different TFs, single TF binding profiles are in most cases not sufficient for dissecting complex regulatory networks. Thus, potent computational tools detecting statistically significant and biologically relevant TF-motif co-occurrences in genome-wide datasets are essential for analyzing context-dependent transcriptional regulation. We have developed COPS (Co-Occurrence Pattern Search), a new bioinformatics tool based on a combination of association rules and Markov chain models, which detects co-occurring TF binding sites (BSs) on genomic regions of interest. COPS scans DNA sequences for frequent motif patterns using a Frequent-Pattern tree based data mining approach, which allows efficient performance of the software with respect to both data structure and implementation speed, in particular when mining large datasets. Since transcriptional gene regulation very often relies on the formation of regulatory protein complexes mediated by closely adjoining TF binding sites on CRMs, COPS additionally detects preferred short distance between co-occurring TF motifs. The performance of our software with respect to biological significance was evaluated using three published datasets containing genomic regions that are independently bound by several TFs involved in a defined biological process. In sum, COPS is a fast, efficient and user-friendly tool mining statistically and biologically significant TFBS co-occurrences and therefore allows the identification of TFs that combinatorially regulate gene expression. PMID:23272209
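The idea of reporting motif pairs that co-occur within a short distance can be illustrated with a minimal Python sketch. It is not the COPS implementation: it uses exact string matching instead of association rules, Markov chain models or a Frequent-Pattern tree, and the function names, example sequence and `max_gap` parameter are assumptions made for illustration only.

```python
import re
from itertools import product


def find_sites(sequence, motif):
    """Return 0-based start positions of every (possibly overlapping) exact match."""
    # A lookahead lets re.finditer report overlapping occurrences.
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", sequence)]


def cooccurrences(sequence, motif_a, motif_b, max_gap):
    """List (pos_a, pos_b, gap) for non-overlapping site pairs with gap <= max_gap.

    The gap is the number of bases between the end of the upstream site
    and the start of the downstream site (hypothetical definition).
    """
    pairs = []
    for a, b in product(find_sites(sequence, motif_a), find_sites(sequence, motif_b)):
        if a < b:
            gap = b - (a + len(motif_a))
        else:
            gap = a - (b + len(motif_b))
        if 0 <= gap <= max_gap:
            pairs.append((a, b, gap))
    return pairs


if __name__ == "__main__":
    # Toy sequence containing one GATA site and one TGACG site.
    print(cooccurrences("AAGATAACCTGACGTTT", "GATA", "TGACG", max_gap=5))
```

A real analysis would score sites with position weight matrices and test whether the observed co-occurrence counts exceed a background expectation; the sketch only shows the distance bookkeeping.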
Statistics of optimal information flow in ensembles of regulatory motifs

NASA Astrophysics Data System (ADS)

Crisanti, Andrea; De Martino, Andrea; Fiorentino, Jonathan

2018-02-01

Genetic regulatory circuits universally cope with different sources of noise that limit their ability to coordinate input and output signals. In many cases, optimal regulatory performance can be thought to correspond to configurations of variables and parameters that maximize the mutual information between inputs and outputs. Since the mid-2000s, such optima have been well characterized in several biologically relevant cases. Here we use methods of statistical field theory to calculate the statistics of the maximal mutual information (the "capacity") achievable by tuning the input variable only in an ensemble of regulatory motifs, such that a single controller regulates N targets. Assuming (i) sufficiently large N, (ii) quenched random kinetic parameters, and (iii) small noise affecting the input-output channels, we can accurately reproduce numerical simulations both for the mean capacity and for the whole distribution. Our results provide insight into the inherent variability in effectiveness occurring in regulatory systems with heterogeneous kinetic parameters.
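For reference, the quantities named in this abstract have standard textbook definitions, written here for a discrete input X and output Y. The symbols are assumptions chosen for illustration and are not taken from the paper, which may work with continuous variables and integrals instead of sums.

```latex
I(X;Y) = \sum_{x}\sum_{y} p(x)\,p(y \mid x)\,\log_2 \frac{p(y \mid x)}{p(y)},
\qquad p(y) = \sum_{x'} p(x')\,p(y \mid x'),
\qquad C = \max_{p(x)} I(X;Y).
```

Here the channel p(y | x) is fixed by the kinetic parameters and the noise, and the capacity C is obtained by optimizing over the input distribution p(x) only.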
The nuclear OXPHOS genes in insecta: a common evolutionary origin, a common cis-regulatory motif, a common destiny for gene duplicates

PubMed Central

Porcelli, Damiano; Barsanti, Paolo; Pesole, Graziano; Caggese, Corrado

2007-01-01

Background: When orthologous sequences from species distributed throughout an optimal range of divergence times are available, comparative genomics is a powerful tool to address problems such as the identification of the forces that shape gene structure during evolution, although the functional constraints involved may vary in different genes and lineages. Results: We identified and annotated in the MitoComp2 dataset the orthologs of 68 nuclear genes controlling oxidative phosphorylation in 11 Drosophilidae species and in five non-Drosophilidae insects, and compared them with each other and with their counterparts in three vertebrates (Fugu rubripes, Danio rerio and Homo sapiens) and in the cnidarian Nematostella vectensis, taking into account conservation of gene structure and regulatory motifs, and preservation of gene paralogs in the genome. Comparative analysis indicates that the ancestral insect OXPHOS genes were intron rich and that extensive intron loss and lineage-specific intron gain occurred during evolution. Comparison with vertebrates and cnidarians also shows that many OXPHOS gene introns predate the cnidarian/Bilateria evolutionary split. The nuclear respiratory gene element (NRG) has played a key role in the evolution of the insect OXPHOS genes; it is constantly conserved in the OXPHOS orthologs of all the insect species examined, while their duplicates either completely lack the element or possess only relics of the motif. Conclusion: Our observations reinforce the notion that the common ancestor of most animal phyla had intron-rich genes, and suggest that changes in the pattern of expression of the gene facilitate the fixation of duplications in the genome and the development of novel genetic functions. PMID:18315839
In silico modeling of epigenetic-induced changes in photoreceptor cis-regulatory elements.

PubMed

Hossain, Reafa A; Dunham, Nicholas R; Enke, Raymond A; Berndsen, Christopher E

2018-01-01

DNA methylation is a well-characterized epigenetic repressor of mRNA transcription in many plant and vertebrate systems. However, the mechanism of this repression is not fully understood. The process of transcription is controlled by proteins that regulate recruitment and activity of RNA polymerase by binding to specific cis-regulatory sequences. Cone-rod homeobox (CRX) is a well-characterized mammalian transcription factor that controls photoreceptor cell-specific gene expression. Although much is known about the functions and DNA binding specificity of CRX, little is known about how DNA methylation modulates CRX binding affinity to genomic cis-regulatory elements. We used bisulfite pyrosequencing of human ocular tissues to measure DNA methylation levels of the regulatory regions of RHO, PDE6B, PAX6, and LINE1 retrotransposon repeats. To describe the molecular mechanism of repression, we used molecular modeling to illustrate the effect of DNA methylation on human RHO regulatory sequences. In this study, we demonstrate an inverse correlation between DNA methylation in regulatory regions adjacent to the human RHO and PDE6B genes and their subsequent transcription in human ocular tissues. Docking of CRX to the DNA models shows that CRX interacts with the grooves of these sequences, suggesting changes in groove structure could regulate binding. Molecular dynamics simulations of the RHO promoter and enhancer regions show changes in the flexibility and groove width upon epigenetic modification. Models also demonstrate changes in the local dynamics of CRX binding sites within RHO regulatory sequences which may account for the repression of CRX-dependent transcription. Collectively, these data demonstrate epigenetic regulation of CRX binding sites in human retinal tissue and provide insight into the mechanism of this mode of epigenetic regulation to be tested in future experiments.
Changes in cis-regulatory elements of a key floral regulator are associated with divergence of inflorescence architectures.

PubMed

Kusters, Elske; Della Pina, Serena; Castel, Rob; Souer, Erik; Koes, Ronald

2015-08-15

Higher plant species diverged extensively with regard to the moment (flowering time) and position (inflorescence architecture) at which flowers are formed. This seems largely caused by variation in the expression patterns of conserved genes that specify floral meristem identity (FMI), rather than changes in the encoded proteins. Here, we report a functional comparison of the promoters of homologous FMI genes from Arabidopsis, petunia, tomato and Antirrhinum. Analysis of promoter-reporter constructs in petunia and Arabidopsis, as well as complementation experiments, showed that the divergent expression of leafy (LFY) and the petunia homolog aberrant leaf and flower (ALF) results from alterations in the upstream regulatory network rather than cis-regulatory changes. The divergent expression of unusual floral organs (UFO) from Arabidopsis, and the petunia homolog double top (DOT), however, is caused by the loss or gain of cis-regulatory promoter elements, which respond to trans-acting factors that are expressed in similar patterns in both species. Introduction of pUFO:UFO causes no obvious defects in Arabidopsis, but in petunia it causes the precocious and ectopic formation of flowers. This provides an example of how a change in a cis-regulatory region can account for a change in the plant body plan. © 2015. Published by The Company of Biologists Ltd.
Identification of N-Terminal Lobe Motifs that Determine the Kinase Activity of the Catalytic Domains and Regulatory Strategies of Src and Csk Protein Tyrosine Kinases

PubMed Central

Huang, Kezhen; Wang, Yue-Hao; Brown, Alex; Sun, Gongqin

2009-01-01

Csk and Src protein tyrosine kinases are structurally homologous, but use opposite regulatory strategies. The isolated catalytic domain of Csk is intrinsically inactive and is activated by interactions with the regulatory SH3 and SH2 domains, while the isolated catalytic domain of Src is intrinsically active and is suppressed by interactions with the regulatory SH3 and SH2 domains. The structural basis for why one isolated catalytic domain is intrinsically active while the other is inactive is not clear. In this current study, we identify the structural elements in the N-terminal lobe of the catalytic domain that render the Src catalytic domain active. These structural elements include the α-helix C region, a β-turn between the β-4 and β-5 strands, and an Arg residue at the beginning of the catalytic domain. These three motifs interact with each other to activate the Src catalytic domain, but the equivalent motifs in Csk directly interact with the regulatory domains that are important for Csk activation. The Src motifs can be grafted to the Csk catalytic domain to obtain an active Csk catalytic domain. These results, together with available Src and Csk tertiary structures, reveal an important structural switch that determines the kinase activity of a catalytic domain and dictates the regulatory strategy of a kinase. PMID:19244618
Are Interactions between cis-Regulatory Variants Evidence for Biological Epistasis or Statistical Artifacts?

PubMed

Fish, Alexandra E; Capra, John A; Bush, William S

2016-10-06

The importance of epistasis, or statistical interactions between genetic variants, to the development of complex disease in humans has been controversial. Genome-wide association studies of statistical interactions influencing human traits have recently become computationally feasible and have identified many putative interactions. However, statistical models used to detect interactions can be confounded, which makes it difficult to be certain that observed statistical interactions are evidence for true molecular epistasis. In this study, we investigate whether there is evidence for epistatic interactions between genetic variants within the cis-regulatory region that influence gene expression after accounting for technical, statistical, and biological confounding factors. We identified 1,119 (FDR = 5%) interactions that appear to regulate gene expression in human lymphoblastoid cell lines, a tightly controlled, largely genetically determined phenotype. Many of these interactions replicated in an independent dataset (90 of 803 tested, Bonferroni threshold). We then performed an exhaustive analysis of both known and novel confounders, including ceiling/floor effects, missing genotype combinations, haplotype effects, single variants tagged through linkage disequilibrium, and population stratification. Every interaction could be explained by at least one of these confounders, and replication in independent datasets did not protect against some confounders. Assuming that the confounding factors provide a more parsimonious explanation for each interaction, we find it unlikely that cis-regulatory interactions contribute strongly to human gene expression, which calls into question the relevance of cis-regulatory interactions for other human phenotypes. We additionally propose several best practices for epistasis testing to protect future studies from confounding. Copyright © 2016 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
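A statistical interaction between two variants is commonly tested with a linear model that includes a product term. The form below is a generic sketch under that assumption; the abstract does not state the exact model used, and the symbols (expression level E, genotype dosages g1 and g2, coefficients beta) are illustrative.

```latex
E_i = \beta_0 + \beta_1 g_{1i} + \beta_2 g_{2i} + \beta_3\, g_{1i} g_{2i} + \varepsilon_i,
\qquad H_0 : \beta_3 = 0 .
```

Rejecting the null hypothesis indicates a statistical interaction only; as the abstract stresses, confounders such as ceiling/floor effects or a single untyped variant in linkage disequilibrium with both tested variants can produce a nonzero product term without molecular epistasis.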
G4 motifs affect origin positioning and efficiency in two vertebrate replicators

PubMed Central

Valton, Anne-Laure; Hassan-Zadeh, Vahideh; Lema, Ingrid; Boggetto, Nicole; Alberti, Patrizia; Saintomé, Carole; Riou, Jean-François; Prioleau, Marie-Noëlle

2014-01-01

DNA replication ensures the accurate duplication of the genome at each cell cycle. It begins at specific sites called replication origins. Genome-wide studies in vertebrates have recently identified a consensus G-rich motif potentially able to form G-quadruplexes (G4) in most replication origins. However, there is no experimental evidence to demonstrate that G4 are actually required for replication initiation. We show here, with two model origins, that G4 motifs are required for replication initiation. Two G4 motifs cooperate in one of our model origins. The other contains only one critical G4, and its orientation determines the precise position of the replication start site. Point mutations affecting the stability of this G4 in vitro also impair origin function. Finally, this G4 is not sufficient for origin activity and must cooperate with a 200-bp cis-regulatory element. In conclusion, our study strongly supports the predicted essential role of G4 in replication initiation. PMID:24521668
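Putative G4-forming sequences are often located with a simple pattern search. The Python sketch below uses a widely used heuristic (four runs of at least three guanines separated by loops of one to seven bases); this pattern is an assumption made for illustration, not the consensus motif or the method of the study, and the function name is hypothetical.

```python
import re

# Heuristic pattern for a putative intramolecular G-quadruplex:
# four G-runs (>= 3 G each) separated by three loops of 1-7 bases.
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def find_putative_g4(sequence):
    """Return (start, end, matched_sequence) for each non-overlapping match.

    Only the given strand is scanned; a G4 on the opposite strand would
    appear here as C-runs and is not reported by this sketch.
    """
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    print(find_putative_g4("ATGGGAGGGTGGGAGGGTA"))
```

A match only flags a candidate: whether a sequence actually folds into a G4 has to be checked experimentally, which is what the point mutations described in the abstract address.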
DNA motif alignment by evolving a population of Markov chains.

PubMed

Bi, Chengpeng

2009-01-30

Deciphering cis-regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been expended. The Markov chain Monte Carlo (MCMC) method such as Gibbs motif samplers has been widely employed to solve the de novo motif-finding problem through sequence local alignment. Nonetheless, the MCMC-based motif samplers still suffer from local maxima like EM. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often independently run a multitude of times, but without information exchange between different chains. Hence it would be worth a new algorithm design enabling such information exchange. This paper presents a novel motif-finding algorithm by evolving a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). It is progressively updated through a series of local alignments stochastically sampled. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, or dubbed as the IMC motif algorithm, is also devised to compare with its PMC counterpart. Experimental studies demonstrate that the performance could be improved if pooled information were used to run a population of motif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.
Comparing anterior and posterior Hox complex formation reveals guidelines for predicting cis-regulatory elements

PubMed Central

Uhl, Juli D.; Cook, Tiffany A.; Gebelein, Brian

2010-01-01

Hox transcription factors specify numerous cell fates along the anterior-posterior axis by regulating the expression of downstream target genes. While expression analysis has uncovered large numbers of de-regulated genes in cells with altered Hox activity, determining which are direct versus indirect targets has remained a significant challenge. Here, we characterize the DNA binding activity of Hox transcription factor complexes on eight experimentally verified cis-regulatory elements. Hox factors regulate the activity of each element by forming protein complexes with two cofactor proteins, Extradenticle (Exd) and Homothorax (Hth). Using comparative DNA binding assays, we found that a number of flexible arrangements of Hox, Exd, and Hth binding sites mediate cooperative transcription factor complexes. Moreover, analysis of a Distal-less regulatory element (DMXR) that is repressed by abdominal Hox factors revealed that suboptimal binding sites can be combined to form high affinity transcription complexes. Lastly, we determined that the anterior Hox factors are more dependent upon Exd and Hth for complex formation than posterior Hox factors. Based upon these findings, we suggest a general set of guidelines to serve as a basis for designing bioinformatics algorithms aimed at identifying Hox regulatory elements using the wealth of recently sequenced genomes. PMID:20398649
The evolution of cichlid fish egg-spots is linked with a cis-regulatory change

PubMed Central

Santos, M. Emília; Braasch, Ingo; Boileau, Nicolas; Meyer, Britta S.; Sauteur, Loïc; Böhne, Astrid; Belting, Heinz-Georg; Affolter, Markus; Salzburger, Walter

2014-01-01

The origin of novel phenotypic characters is a key component in organismal diversification; yet, the mechanisms underlying the emergence of such evolutionary novelties are largely unknown. Here we examine the origin of egg-spots, an evolutionary innovation of the most species-rich group of cichlids, the haplochromines, where these conspicuous male fin colour markings are involved in mating. Applying a combination of RNAseq, comparative genomics and functional experiments, we identify two novel pigmentation genes, fhl2a and fhl2b, and show that especially the more rapidly evolving b-paralog is associated with egg-spot formation. We further find that egg-spot bearing haplochromines, but not other cichlids, feature a transposable element in the cis-regulatory region of fhl2b. Using transgenic zebrafish, we finally demonstrate that this region shows specific enhancer activities in iridophores, a type of pigment cells found in egg-spots, suggesting that a cis-regulatory change is causally linked to the gain of expression in egg-spot bearing haplochromines. PMID:25296686
A Catalogue of Putative cis-Regulatory Interactions Between Long Non-coding RNAs and Proximal Coding Genes Based on Correlative Analysis Across Diverse Human Tumors.

PubMed

Basu, Swaraj; Larsson, Erik

2018-05-31

Antisense transcripts and other long non-coding RNAs are pervasive in mammalian cells, and some of these molecules have been proposed to regulate proximal protein-coding genes in cis. For example, non-coding transcription can contribute to inactivation of tumor suppressor genes in cancer, and antisense transcripts have been implicated in the epigenetic inactivation of imprinted genes. However, our knowledge is still limited and more such regulatory interactions likely await discovery. Here, we make use of available gene expression data from a large compendium of human tumors to generate hypotheses regarding non-coding-to-coding cis-regulatory relationships with emphasis on negative associations, as these are less likely to arise for reasons other than cis-regulation. We document a large number of possible regulatory interactions, including 193 coding/non-coding pairs that show expression patterns compatible with negative cis-regulation. Importantly, by this approach we capture several known cases, and many of the involved coding genes have known roles in cancer. Our study provides a large catalog of putative non-coding/coding cis-regulatory pairs that may serve as a basis for further experimental validation and characterization. Copyright © 2018 Basu and Larsson.
Shared Enhancer Activity in the Limbs and Phallus and Functional Divergence of a Limb-Genital cis-Regulatory Element in Snakes.

PubMed

Infante, Carlos R; Mihala, Alexandra G; Park, Sungdae; Wang, Jialiang S; Johnson, Kenji K; Lauderdale, James D; Menke, Douglas B

2015-10-12

The amniote phallus and limbs differ dramatically in their morphologies but share patterns of signaling and gene expression in early development. Thus far, the extent to which genital and limb transcriptional networks also share cis-regulatory elements has remained unexplored. We show that many limb enhancers are retained in snake genomes, suggesting that these elements may function in non-limb tissues. Consistent with this, our analysis of cis-regulatory activity in mice and Anolis lizards reveals that patterns of enhancer activity in embryonic limbs and genitalia overlap heavily. In mice, deletion of HLEB, an enhancer of Tbx4, produces defects in hindlimbs and genitalia, establishing the importance of this limb-genital enhancer for development of these different appendages. Further analyses demonstrate that the HLEB of snakes has lost hindlimb enhancer function while retaining genital activity. Our findings identify roles for Tbx4 in genital development and highlight deep similarities in cis-regulatory activity between limbs and genitalia. Copyright © 2015 Elsevier Inc. All rights reserved.
Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

PubMed Central

Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

2012-01-01

Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings: A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086
CisSERS: Customizable in silico sequence evaluation for restriction sites

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sharpe, Richard M.; Koepke, Tyson; Harper, Artemus

High-throughput sequencing continues to produce an immense volume of information that is processed and assembled into mature sequence data. Here, data analysis tools are urgently needed that leverage the embedded DNA sequence polymorphisms and consequent changes to restriction sites or sequence motifs in a high-throughput manner to enable biological experimentation. CisSERS was developed as a standalone open source tool to analyze sequence datasets and provide biologists with individual or comparative genome organization information in terms of presence and frequency of patterns or motifs such as restriction enzymes. Predicted agarose gel visualization of the custom analyses results was also integrated to enhance the usefulness of the software. CisSERS offers several novel functionalities, such as handling of large and multiple datasets in parallel, multiple restriction enzyme site detection and custom motif detection features, which are seamlessly integrated with real time agarose gel visualization. Using a simple FASTA-formatted file as input, CisSERS utilizes the REBASE enzyme database. Results from CisSERS enable the user to make decisions for designing genotyping by sequencing experiments, reduced representation sequencing, 3’UTR sequencing, and cleaved amplified polymorphic sequence (CAPS) molecular markers for large sample sets. CisSERS is a Java-based graphical user interface built around a Perl backbone. Several of the applications of CisSERS including CAPS molecular marker development were successfully validated using wet-lab experimentation. Here, we present the tool CisSERS and results from in-silico and corresponding wet-lab analyses demonstrating that CisSERS is a technology platform solution that facilitates efficient data utilization in genomics and genetics studies.

CisSERS: Customizable in silico sequence evaluation for restriction sites

DOE PAGES

Sharpe, Richard M.; Koepke, Tyson; Harper, Artemus; ...

2016-04-12

High-throughput sequencing continues to produce an immense volume of information that is processed and assembled into mature sequence data. Here, data analysis tools are urgently needed that leverage the embedded DNA sequence polymorphisms and consequent changes to restriction sites or sequence motifs in a high-throughput manner to enable biological experimentation. CisSERS was developed as a standalone open source tool to analyze sequence datasets and provide biologists with individual or comparative genome organization information in terms of presence and frequency of patterns or motifs such as restriction enzymes. Predicted agarose gel visualization of the custom analyses results was also integrated to enhance the usefulness of the software. CisSERS offers several novel functionalities, such as handling of large and multiple datasets in parallel, multiple restriction enzyme site detection and custom motif detection features, which are seamlessly integrated with real time agarose gel visualization. Using a simple FASTA-formatted file as input, CisSERS utilizes the REBASE enzyme database. Results from CisSERS enable the user to make decisions for designing genotyping by sequencing experiments, reduced representation sequencing, 3’UTR sequencing, and cleaved amplified polymorphic sequence (CAPS) molecular markers for large sample sets. CisSERS is a Java-based graphical user interface built around a Perl backbone. Several of the applications of CisSERS including CAPS molecular marker development were successfully validated using wet-lab experimentation. Here, we present the tool CisSERS and results from in-silico and corresponding wet-lab analyses demonstrating that CisSERS is a technology platform solution that facilitates efficient data utilization in genomics and genetics studies.
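The restriction-site detection and predicted gel output described in the abstract above can be illustrated with a minimal sketch. This is not CisSERS code: the small `ENZYMES` table, the function names, and the example sequence are assumptions made for illustration, and a real analysis would load recognition sites from REBASE. Only the top strand is scanned, which is sufficient here because the three sites are palindromic.

```python
import re

# Hypothetical mini-table: enzyme -> (recognition site, cut offset within the site).
# EcoRI cuts G^AATTC, BamHI cuts G^GATCC, HindIII cuts A^AGCTT.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "HindIII": ("AAGCTT", 1),
}

def find_sites(seq, site):
    """Return 0-based start positions of every match, including overlapping ones."""
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]

def fragment_lengths(seq, site, cut_offset):
    """Fragment lengths from digesting a linear molecule at every site."""
    cuts = [pos + cut_offset for pos in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

# Made-up 30-bp example sequence with two EcoRI sites and one BamHI site.
seq = "AAGAATTCTTTTGGATCCAAAAGAATTCGG"
for name, (site, offset) in ENZYMES.items():
    print(name, find_sites(seq, site), fragment_lengths(seq, site, offset))
```

The sorted fragment lengths are what a predicted agarose gel lane would display; comparing the lists for two alleles is the basis of a CAPS marker.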
Genetic validation of whole-transcriptome sequencing for mapping expression affected by cis-regulatory variation.

PubMed

Babak, Tomas; Garrett-Engele, Philip; Armour, Christopher D; Raymond, Christopher K; Keller, Mark P; Chen, Ronghua; Rohl, Carol A; Johnson, Jason M; Attie, Alan D; Fraser, Hunter B; Schadt, Eric E

2010-08-13

Identifying associations between genotypes and gene expression levels using microarrays has enabled systematic interrogation of regulatory variation underlying complex phenotypes. This approach has vast potential for functional characterization of disease states, but its prohibitive cost, given hundreds to thousands of individual samples from populations have to be genotyped and expression profiled, has limited its widespread application. Here we demonstrate that genomic regions with allele-specific expression (ASE) detected by sequencing cDNA are highly enriched for cis-acting expression quantitative trait loci (cis-eQTL) identified by profiling of 500 animals in parallel, with up to 90% agreement on the allele that is preferentially expressed. We also observed widespread noncoding and antisense ASE and identified several allele-specific alternative splicing variants. Monitoring ASE by sequencing cDNA from as little as one sample is a practical alternative to expression genetics for mapping cis-acting variation that regulates RNA transcription and processing.
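As a rough illustration of how allele-specific expression might be called from cDNA read counts at a heterozygous site, the sketch below applies an exact two-sided binomial test against a 50:50 expectation. The abstract does not state which test the authors used, so the test choice, the function name, and the read counts are all assumptions.

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= cutoff))

# Hypothetical read counts for the two alleles at one heterozygous SNP.
ref_reads, alt_reads = 70, 30
p_value = binom_two_sided_p(ref_reads, ref_reads + alt_reads)
print(f"ASE p-value: {p_value:.2e}")
```

In practice such a test would be applied per site with multiple-testing correction, and reference-mapping bias can shift the null proportion away from 0.5.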
Identification of a cis-Regulatory Element Involved in Phytochrome Down-Regulated Expression of the Pea Small GTPase Gene pra2

PubMed Central

Inaba, Takehito; Nagano, Yukio; Sakakibara, Toshihiro; Sasaki, Yukiko

1999-01-01

The pra2 gene encodes a pea (Pisum sativum) small GTPase belonging to the YPT/rab family, and its expression is down-regulated by light, mediated by phytochrome. We have isolated and characterized a genomic clone of this gene and constructed a fusion DNA of its 5′-upstream region in front of the gene for firefly luciferase. Using this construct in a transient assay, we determined a pra2 cis-regulatory region sufficient to direct the light down-regulation of the luciferase reporter gene. Both 5′- and internal deletion analyses revealed that the 93-bp sequence between −734 and −642 from the transcriptional start site was important for phytochrome down-regulation. Gain-of-function analysis showed that this 93-bp region could confer light down-regulation when fused to the cauliflower mosaic virus 35S promoter. Furthermore, linker-scanning analysis showed that a 12-bp sequence within the 93-bp region mediated phytochrome down-regulation. Gel-retardation analysis showed the presence of a nuclear factor that was specifically bound to the 12-bp sequence in vitro. These results indicate that this element is a cis-regulatory element involved in phytochrome down-regulated expression. PMID:10364400
Genetic validation of whole-transcriptome sequencing for mapping expression affected by cis-regulatory variation

PubMed Central

2010-01-01

Background: Identifying associations between genotypes and gene expression levels using microarrays has enabled systematic interrogation of regulatory variation underlying complex phenotypes. This approach has vast potential for functional characterization of disease states, but its prohibitive cost, given hundreds to thousands of individual samples from populations have to be genotyped and expression profiled, has limited its widespread application. Results: Here we demonstrate that genomic regions with allele-specific expression (ASE) detected by sequencing cDNA are highly enriched for cis-acting expression quantitative trait loci (cis-eQTL) identified by profiling of 500 animals in parallel, with up to 90% agreement on the allele that is preferentially expressed. We also observed widespread noncoding and antisense ASE and identified several allele-specific alternative splicing variants. Conclusion: Monitoring ASE by sequencing cDNA from as little as one sample is a practical alternative to expression genetics for mapping cis-acting variation that regulates RNA transcription and processing. PMID:20707912
Rapid evolution of cis-regulatory sequences via local point mutations

NASA Technical Reports Server (NTRS)

Stone, J. R.; Wray, G. A.

2001-01-01

Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
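The idea of a binding site arising through local point mutations can be sketched with a toy simulation. This is a deliberately simplified single-lineage model, not the authors' population simulation: it ignores drift and fixation, and the motif, sequence length, and (unrealistically high) mutation rate are assumptions chosen so the example runs quickly.

```python
import random

def generations_until_site(length, motif, mu, max_gen, rng):
    """Mutate a random sequence each generation until `motif` appears.
    Returns the generation at which it is first present, or None."""
    seq = [rng.choice("ACGT") for _ in range(length)]
    for gen in range(max_gen + 1):
        if motif in "".join(seq):
            return gen
        for i in range(length):
            if rng.random() < mu:  # point mutation to a different base
                seq[i] = rng.choice([b for b in "ACGT" if b != seq[i]])
    return None

# Hypothetical parameters: 200-bp region, 7-bp motif, per-base rate 0.01.
rng = random.Random(1)
result = generations_until_site(200, "TGACTCA", 0.01, 2000, rng)
print("first appearance at generation:", result)
```

Repeating the run over many seeds gives a distribution of waiting times, which is the kind of quantity the abstract's conclusion about microevolutionary timescales refers to.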
Upstream mononucleotide A-repeats play a cis-regulatory role in mammals through the DICER1 and Ago proteins.

PubMed

Aporntewan, Chatchawit; Pin-on, Piyapat; Chaiyaratana, Nachol; Pongpanich, Monnat; Boonyaratanakornkit, Viroj; Mutirangura, Apiwat

2013-10-01

A-repeats are the simplest form of tandem repeats and are found ubiquitously throughout genomes. These mononucleotide repeats have been widely believed to be non-functional 'junk' DNA. However, studies in yeasts suggest that A-repeats play crucial biological functions, and their role in humans remains largely unknown. Here, we showed a non-random pattern of distribution of sense A- and T-repeats within 20 kb around transcription start sites (TSSs) in the human genome. Different distributions of these repeats are observed upstream and downstream of TSSs. Sense A-repeats are enriched upstream, whereas sense T-repeats are enriched downstream of TSSs. This enrichment directly correlates with repeat size. Genes with different functions contain different lengths of repeats. In humans, tissue-specific genes are enriched for short repeats of <10 bp, whereas housekeeping genes are enriched for long repeats of ≥10 bp. We demonstrated that DICER1 and Argonaute proteins are required for the cis-regulatory role of A-repeats. Moreover, in the presence of a synthetic polymer that mimics an A-repeat, protein binding to A-repeats was blocked, resulting in a dramatic change in the expression of genes containing upstream A-repeats. Our findings suggest a length-dependent cis-regulatory function of A-repeats and that Argonaute proteins serve as trans-acting factors, binding to A-repeats.
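The length classes mentioned in the abstract (short repeats of <10 bp versus long repeats of ≥10 bp) can be tallied with a short sketch. The function names, the minimum run length of 2, and the example upstream sequence are assumptions for illustration; the authors' actual pipeline is not described in the abstract.

```python
import re
from collections import Counter

def a_repeat_lengths(seq, min_len=2):
    """Lengths of maximal runs of A that are at least min_len long."""
    runs = (len(m.group()) for m in re.finditer(r"A+", seq.upper()))
    return [n for n in runs if n >= min_len]

def classify(lengths, threshold=10):
    """Count runs as 'short' (< threshold) or 'long' (>= threshold)."""
    return Counter("long" if n >= threshold else "short" for n in lengths)

# Made-up upstream sequence containing runs of 12, 4, 1 and 10 adenines.
upstream = "GC" + "A" * 12 + "GT" + "A" * 4 + "CCA" + "T" + "A" * 10 + "G"
lengths = a_repeat_lengths(upstream)
print(lengths, dict(classify(lengths)))
```

Scanning for sense T-repeats only requires changing the pattern to `T+`.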
Complex interactions between cis-regulatory modules in native conformation are critical for Drosophila snail expression

PubMed Central

Dunipace, Leslie; Ozdemir, Anil; Stathopoulos, Angelike

2011-01-01

It has been shown in several organisms that multiple cis-regulatory modules (CRMs) of a gene locus can be active concurrently to support similar spatiotemporal expression. To understand the functional importance of such seemingly redundant CRMs, we examined two CRMs from the Drosophila snail gene locus, which are both active in the ventral region of pre-gastrulation embryos. By performing a deletion series in a ∼25 kb DNA rescue construct using BAC recombineering and site-directed transgenesis, we demonstrate that the two CRMs are not redundant. The distal CRM is absolutely required for viability, whereas the proximal CRM is required only under extreme conditions such as high temperature. Consistent with their distinct requirements, the CRMs support distinct expression patterns: the proximal CRM exhibits an expanded expression domain relative to endogenous snail, whereas the distal CRM exhibits almost complete overlap with snail except at the anterior-most pole. We further show that the distal CRM normally limits the increased expression domain of the proximal CRM and that the proximal CRM serves as a 'damper' for the expression levels driven by the distal CRM. Thus, the two CRMs interact in cis in a non-additive fashion and these interactions may be important for fine-tuning the domains and levels of gene expression. PMID:21813571
Massively parallel cis-regulatory analysis in the mammalian central nervous system

PubMed Central

Shen, Susan Q.; Myers, Connie A.; Hughes, Andrew E.O.; Byrne, Leah C.; Flannery, John G.; Corbo, Joseph C.

2016-01-01

Cis-regulatory elements (CREs, e.g., promoters and enhancers) regulate gene expression, and variants within CREs can modulate disease risk. Next-generation sequencing has enabled the rapid generation of genomic data that predict the locations of CREs, but a bottleneck lies in functionally interpreting these data. To address this issue, massively parallel reporter assays (MPRAs) have emerged, in which barcoded reporter libraries are introduced into cells, and the resulting barcoded transcripts are quantified by next-generation sequencing. Thus far, MPRAs have been largely restricted to assaying short CREs in a limited repertoire of cultured cell types. Here, we present two advances that extend the biological relevance and applicability of MPRAs. First, we adapt exome capture technology to instead capture candidate CREs, thereby tiling across the targeted regions and markedly increasing the length of CREs that can be readily assayed. Second, we package the library into adeno-associated virus (AAV), thereby allowing delivery to target organs in vivo. As a proof of concept, we introduce a capture library of about 46,000 constructs, corresponding to roughly 3500 DNase I hypersensitive (DHS) sites, into the mouse retina by ex vivo plasmid electroporation and into the mouse cerebral cortex by in vivo AAV injection. We demonstrate tissue-specific cis-regulatory activity of DHSs and provide examples of high-resolution truncation mutation analysis for multiplex parsing of CREs. Our approach should enable massively parallel functional analysis of a wide range of CREs in any organ or species that can be infected by AAV, such as nonhuman primates and human stem cell–derived organoids. PMID:26576614
KIRMES: kernel-based identification of regulatory modules in euchromatic sequences.

PubMed

Schultheiss, Sebastian J; Busch, Wolfgang; Lohmann, Jan U; Kohlbacher, Oliver; Rätsch, Gunnar

2009-08-15

Understanding transcriptional regulation is one of the main challenges in computational biology. An important problem is the identification of transcription factor (TF) binding sites in promoter regions of potential TF target genes. It is typically approached by position weight matrix-based motif identification algorithms using Gibbs sampling, or heuristics to extend seed oligos. Such algorithms succeed in identifying single, relatively well-conserved binding sites, but tend to fail when it comes to the identification of combinations of several degenerate binding sites, as those often found in cis-regulatory modules. We propose a new algorithm that combines the benefits of existing motif finding with the ones of support vector machines (SVMs) to find degenerate motifs in order to improve the modeling of regulatory modules. In experiments on microarray data from Arabidopsis thaliana, we were able to show that the newly developed strategy significantly improves the recognition of TF targets. The python source code (open source-licensed under GPL), the data for the experiments and a Galaxy-based web service are available at http://www.fml.mpg.de/raetsch/suppl/kirmes/.
DNaseI Hypersensitivity and Ultraconservation Reveal Novel, Interdependent Long-Range Enhancers at the Complex Pax6 Cis-Regulatory Region

PubMed Central

McBride, David J.; Buckle, Adam; van Heyningen, Veronica; Kleinjan, Dirk A.

2011-01-01

The PAX6 gene plays a crucial role in development of the eye, brain, olfactory system and endocrine pancreas. Consistent with its pleiotropic role the gene exhibits a complex developmental expression pattern which is subject to strict spatial, temporal and quantitative regulation. Control of expression depends on a large array of cis-elements residing in an extended genomic domain around the coding region of the gene. The minimal essential region required for proper regulation of this complex locus has been defined through analysis of human aniridia-associated breakpoints and YAC transgenic rescue studies of the mouse smalleye mutant. We have carried out a systematic DNase I hypersensitive site (HS) analysis across 200 kb of this critical region of mouse chromosome 2E3 to identify putative regulatory elements. Mapping the identified HSs onto a percent identity plot (PIP) shows many HSs correspond to recognisable genomic features such as evolutionarily conserved sequences, CpG islands and retrotransposon derived repeats. We then focussed on a region previously shown to contain essential long range cis-regulatory information, the Pax6 downstream regulatory region (DRR), allowing comparison of mouse HS data with previous human HS data for this region. Reporter transgenic mice for two of the HS sites, HS5 and HS6, show that they function as tissue specific regulatory elements. In addition we have characterised enhancer activity of an ultra-conserved cis-regulatory region located near Pax6, termed E60. All three cis-elements exhibit multiple spatio-temporal activities in the embryo that overlap between themselves and other elements in the locus. Using a deletion set of YAC reporter transgenic mice we demonstrate functional interdependence of the elements. Finally, we use the HS6 enhancer as a marker for the migration of precerebellar neuro-epithelium cells to the hindbrain precerebellar nuclei along the posterior and anterior extramural streams allowing visualisation of
Do motifs reflect evolved function?--No convergent evolution of genetic regulatory network subgraph topologies.

PubMed

Knabe, Johannes F; Nehaniv, Chrystopher L; Schilstra, Maria J

2008-01-01

Methods that analyse the topological structure of networks have recently become quite popular. Whether motifs (subgraph patterns that occur more often than in randomized networks) have specific functions as elementary computational circuits has been cause for debate. As the question is difficult to resolve with currently available biological data, we approach the issue using networks that abstractly model natural genetic regulatory networks (GRNs) which are evolved to show dynamical behaviors. Specifically one group of networks was evolved to be capable of exhibiting two different behaviors ("differentiation") in contrast to a group with a single target behavior. In both groups we find motif distribution differences within the groups to be larger than differences between them, indicating that evolutionary niches (target functions) do not necessarily mold network structure uniquely. These results show that variability operators can have a stronger influence on network topologies than selection pressures, especially when many topologies can create similar dynamics. Moreover, analysis of motif functional relevance by lesioning did not suggest that motifs were of greater importance to the functioning of the network than arbitrary subgraph patterns. Only when drastically restricting network size, so that one motif corresponds to a whole functionally evolved network, was preference for particular connection patterns found. This suggests that in non-restricted, bigger networks, entanglement with the rest of the network hinders topological subgraph analysis.
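To make the notion of a subgraph pattern concrete, the sketch below counts one common three-node pattern, the feed-forward loop, in a small directed network. The function name and the toy edge list are assumptions. Deciding whether a pattern is a motif in the sense used above would additionally require comparing the count against randomized networks, which is omitted here.

```python
from itertools import permutations

def count_feed_forward_loops(edges):
    """Count ordered node triples (x, y, z) with edges x->y, y->z and x->z."""
    edge_set = set(edges)
    nodes = {n for edge in edge_set for n in edge}
    return sum(
        1
        for x, y, z in permutations(nodes, 3)
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set
    )

# Toy regulatory network: A->B->C with shortcut A->C, and B->C->D with shortcut B->D.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("B", "D")]
print(count_feed_forward_loops(edges))
```

The brute-force enumeration is cubic in the number of nodes, which is adequate for small evolved networks of the kind discussed but not for genome-scale graphs.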
Homeostasis in a feed forward loop gene regulatory motif.

PubMed

Antoneli, Fernando; Golubitsky, Martin; Stewart, Ian

2018-05-14

The internal state of a cell is affected by inputs from the extra-cellular environment such as external temperature. If some output, such as the concentration of a target protein, remains approximately constant as inputs vary, the system exhibits homeostasis. Special sub-networks called motifs are unusually common in gene regulatory networks (GRNs), suggesting that they may have a significant biological function. Potentially, one such function is homeostasis. In support of this hypothesis, we show that the feed-forward loop GRN produces homeostasis. Here the inputs are subsumed into a single parameter that affects only the first node in the motif, and the output is the concentration of a target protein. The analysis uses the notion of infinitesimal homeostasis, which occurs when the input-output map has a critical point (zero derivative). In model equations such points can be located using implicit differentiation. If the second derivative of the input-output map also vanishes, the critical point is a chair: the output rises roughly linearly, then flattens out (the homeostasis region or plateau), and then starts to rise again. Chair points are a common cause of homeostasis. In more complicated equations or networks, numerical exploration would have to augment analysis. Thus, in terms of finding chairs, this paper presents a proof of concept. We apply this method to a standard family of differential equations modeling the feed-forward loop GRN, and deduce that chair points occur. This function determines the production of a particular mRNA and the resulting chair points are found analytically. The same method can potentially be used to find homeostasis regions in other GRNs. In the discussion and conclusion section, we also discuss why homeostasis in the motif may persist even when the rest of the network is taken into account. Copyright © 2018 Elsevier Ltd. All rights reserved.
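The definitions used in the abstract above can be written out as follows. The notation is assumed for illustration: $X(\mathcal{I})$ is a family of equilibria of $\dot{X} = F(X, \mathcal{I})$, $x_o(\mathcal{I})$ is the output coordinate (the input-output map), and the non-vanishing third derivative is the usual nondegeneracy condition for a chair, which the abstract does not state explicitly.

```latex
% Equilibria depend on the input parameter:
F\bigl(X(\mathcal{I}), \mathcal{I}\bigr) = 0 .

% Implicit differentiation gives the derivative of the equilibrium:
D_X F \cdot X'(\mathcal{I}) + \frac{\partial F}{\partial \mathcal{I}} = 0
\quad\Longrightarrow\quad
X'(\mathcal{I}) = -\bigl(D_X F\bigr)^{-1} \frac{\partial F}{\partial \mathcal{I}} .

% Infinitesimal homeostasis at \mathcal{I}_0: critical point of the input-output map
x_o'(\mathcal{I}_0) = 0 .

% Chair point: the second derivative also vanishes
x_o'(\mathcal{I}_0) = x_o''(\mathcal{I}_0) = 0, \qquad x_o'''(\mathcal{I}_0) \neq 0 ,

% so that near \mathcal{I}_0 the output is locally cubic:
x_o(\mathcal{I}) \approx x_o(\mathcal{I}_0)
  + \tfrac{1}{6}\, x_o'''(\mathcal{I}_0)\,(\mathcal{I} - \mathcal{I}_0)^3 .
```

The cubic local form is what produces the rise, plateau, rise shape described in the abstract.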
Searching for statistically significant regulatory modules.

PubMed

Bailey, Timothy L; Noble, William Stafford

2003-10-01

The regulatory machinery controlling gene expression is complex, frequently requiring multiple, simultaneous DNA-protein interactions. The rate at which a gene is transcribed may depend upon the presence or absence of a collection of transcription factors bound to the DNA near the gene. Locating transcription factor binding sites in genomic DNA is difficult because the individual sites are small and tend to occur frequently by chance. True binding sites may be identified by their tendency to occur in clusters, sometimes known as regulatory modules. We describe an algorithm for detecting occurrences of regulatory modules in genomic DNA. The algorithm, called mcast, takes as input a DNA database and a collection of binding site motifs that are known to operate in concert. mcast uses a motif-based hidden Markov model with several novel features. The model incorporates motif-specific p-values, thereby allowing scores from motifs of different widths and specificities to be compared directly. The p-value scoring also allows mcast to only accept motif occurrences with significance below a user-specified threshold, while still assigning better scores to motif occurrences with lower p-values. mcast can search long DNA sequences, modeling length distributions between motifs within a regulatory module, but ignoring length distributions between modules. The algorithm produces a list of predicted regulatory modules, ranked by E-value. We validate the algorithm using simulated data as well as real data sets from fruitfly and human. http://meme.sdsc.edu/MCAST/paper
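The clustering idea behind the abstract above can be sketched in a much simpler form than the hidden Markov model it describes. The code below is not mcast: it filters motif hits by a p-value threshold, groups hits separated by at most a maximum gap, and scores each group by summed -log10(p) so that lower p-values contribute more. The function name, the scoring rule, and the example hits are assumptions, and no E-values are computed.

```python
from math import log10

def cluster_hits(hits, p_threshold, max_gap):
    """Group (position, p_value) hits that pass the threshold into clusters.
    Returns (score, start, end) tuples sorted by descending score."""
    kept = sorted(h for h in hits if h[1] < p_threshold)
    clusters = []
    for pos, p in kept:
        if clusters and pos - clusters[-1][-1][0] <= max_gap:
            clusters[-1].append((pos, p))  # extend the current cluster
        else:
            clusters.append([(pos, p)])  # start a new cluster
    scored = [(sum(-log10(p) for _, p in c), c[0][0], c[-1][0]) for c in clusters]
    return sorted(scored, reverse=True)

# Hypothetical motif hits as (position, p-value) pairs.
hits = [(100, 1e-5), (130, 1e-4), (160, 0.5), (900, 1e-3), (170, 1e-6)]
for score, start, end in cluster_hits(hits, p_threshold=0.01, max_gap=50):
    print(f"module {start}-{end}: score {score:.1f}")
```

A fixed gap cut-off is a crude stand-in for the length distributions between motifs that the published model captures.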
Conserved Non-Coding Regulatory Signatures in Arabidopsis Co-Expressed Gene Modules

PubMed Central

Spangler, Jacob B.; Ficklin, Stephen P.; Luo, Feng; Freeling, Michael; Feltus, F. Alex

2012-01-01

Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome. PMID:23024789
Multiple cis-regulatory elements are involved in the complex regulation of the sieve element-specific MtSEO-F1 promoter from Medicago truncatula.

PubMed

Bucsenez, M; Rüping, B; Behrens, S; Twyman, R M; Noll, G A; Prüfer, D

2012-09-01

The sieve element occlusion (SEO) gene family includes several members that are expressed specifically in immature sieve elements (SEs) in the developing phloem of dicotyledonous plants. To determine how this restricted expression profile is achieved, we analysed the SE-specific Medicago truncatula SEO-F1 promoter (PMtSEO-F1) by constructing deletion, substitution and hybrid constructs and testing them in transgenic tobacco plants using green fluorescent protein as a reporter. This revealed four promoter regions, each containing cis-regulatory elements that activate transcription in SEs. One of these segments also contained sufficient information to suppress PMtSEO-F1 transcription in the phloem companion cells (CCs). Subsequent in silico analysis revealed several candidate cis-regulatory elements that PMtSEO-F1 shares with other SEO promoters. These putative sieve element boxes (PSE boxes) are promising candidates for cis-regulatory elements controlling the SE-specific expression of PMtSEO-F1. © 2012 German Botanical Society and The Royal Botanical Society of the Netherlands.
A genomic regulatory network for development

NASA Technical Reports Server (NTRS)

Davidson, Eric H.; Rast, Jonathan P.; Oliveri, Paola; Ransick, Andrew; Calestani, Cristina; Yuh, Chiou-Hwa; Minokawa, Takuya; Amore, Gabriele; Hinman, Veronica; Arenas-Mena, Cesar;

2002-01-01

Development of the body plan is controlled by large networks of regulatory genes. A gene regulatory network that controls the specification of endoderm and mesoderm in the sea urchin embryo is summarized here. The network was derived from large-scale perturbation analyses, in combination with computational methodologies, genomic data, cis-regulatory analysis, and molecular embryology. The network contains over 40 genes at present, and each node can be directly verified at the DNA sequence level by cis-regulatory analysis. Its architecture reveals specific and general aspects of development, such as how given cells generate their ordained fates in the embryo and why the process moves inexorably forward in developmental time.

Structural basis for genome wide recognition of 5-bp GC motifs by SMAD transcription factors.

PubMed

Martin-Malpartida, Pau; Batet, Marta; Kaczmarska, Zuzanna; Freier, Regina; Gomes, Tiago; Aragón, Eric; Zou, Yilong; Wang, Qiong; Xi, Qiaoran; Ruiz, Lidia; Vea, Angela; Márquez, José A; Massagué, Joan; Macias, Maria J

2017-12-12

Smad transcription factors activated by TGF-β or by BMP receptors form trimeric complexes with Smad4 to target specific genes for cell fate regulation. The CAGAC motif has been considered as the main binding element for Smad2/3/4, whereas Smad1/5/8 have been thought to preferentially bind GC-rich elements. However, chromatin immunoprecipitation analysis in embryonic stem cells showed extensive binding of Smad2/3/4 to GC-rich cis-regulatory elements. Here, we present the structural basis for specific binding of Smad3 and Smad4 to GC-rich motifs in the goosecoid promoter, a nodal-regulated differentiation gene. The structures revealed a 5-bp consensus sequence GGC(GC)|(CG) as the binding site for both TGF-β and BMP-activated Smads and for Smad4. These 5GC motifs are highly represented as clusters in Smad-bound regions genome-wide. Our results provide a basis for understanding the functional adaptability of Smads in different cellular contexts, and their dependence on lineage-determining transcription factors to target specific genes in TGF-β and BMP pathways.
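A genome scan for the 5-bp GC motifs mentioned above might look like the sketch below. It assumes that the consensus written as GGC(GC)|(CG) stands for the two sequences GGCGC and GGCCG; that reading, the function names and the output format are assumptions of this example rather than details given in the abstract.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
FIVE_GC = re.compile(r"(?=(GGC(?:GC|CG)))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_5gc(seq):
    """Return (start, strand, site) for every 5GC match on either strand.

    start is always given in forward-strand coordinates (0-based).
    """
    hits = [(m.start(), "+", m.group(1)) for m in FIVE_GC.finditer(seq)]
    rc = reverse_complement(seq)
    for m in FIVE_GC.finditer(rc):
        hits.append((len(seq) - m.start() - 5, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    print(find_5gc("ATGGCGCTA"))
    print(find_5gc("AAGCGCCAA"))
```

Counting hits within a sliding window would be one simple way to look for the clustering that the abstract reports in Smad-bound regions.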
Cis-regulatory element based targeted gene finding: genome-wide identification of abscisic acid- and abiotic stress-responsive genes in Arabidopsis thaliana.

PubMed

Zhang, Weixiong; Ruan, Jianhua; Ho, Tuan-Hua David; You, Youngsook; Yu, Taotao; Quatrano, Ralph S

2005-07-15

A fundamental problem of computational genomics is identifying the genes that respond to certain endogenous cues and environmental stimuli. This problem can be referred to as targeted gene finding. Since gene regulation is mainly determined by the binding of transcription factors and cis-regulatory DNA sequences, most existing gene annotation methods, which exploit the conservation of open reading frames, are not effective in finding target genes. A viable approach to targeted gene finding is to exploit the cis-regulatory elements that are known to be responsible for the transcription of target genes. Given such cis-elements, putative target genes whose promoters contain the elements can be identified. As a case study, we apply the above approach to predict the genes in model plant Arabidopsis thaliana which are inducible by a phytohormone, abscisic acid (ABA), and abiotic stress, such as drought, cold and salinity. We first construct and analyze two ABA specific cis-elements, ABA-responsive element (ABRE) and its coupling element (CE), in A.thaliana, based on their conservation in rice and other cereal plants. We then use the ABRE-CE module to identify putative ABA-responsive genes in A.thaliana. Based on RT-PCR verification and the results from literature, this method has an accuracy rate of 67.5% for the top 40 predictions. The cis-element based targeted gene finding approach is expected to be widely applicable since a large number of cis-elements in many species are available.
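The promoter-scanning step of cis-element based targeted gene finding can be sketched as follows. The two patterns used here (`ACGTG` for an ABRE-like core and `CGCGTG` for a coupling-element-like site) and the distance limit are placeholders chosen for illustration; they are not the models constructed in the study, and only the forward strand is scanned.

```python
import re


def find_sites(pattern, seq):
    """0-based start positions of all (possibly overlapping) matches."""
    return [m.start() for m in re.finditer(f"(?={pattern})", seq)]


def find_modules(promoter, abre="ACGTG", ce="CGCGTG", max_distance=40):
    """Return (abre_pos, ce_pos) pairs lying within max_distance bases."""
    abre_pos = find_sites(abre, promoter)
    ce_pos = find_sites(ce, promoter)
    return [(a, c) for a in abre_pos for c in ce_pos
            if abs(a - c) <= max_distance]


def rank_genes(promoters, **kwargs):
    """Rank gene identifiers by the number of candidate modules found."""
    counts = {gene: len(find_modules(seq, **kwargs))
              for gene, seq in promoters.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy = {
        "geneA": "TTTACGTGGCAAAACGCGTGTT",  # hypothetical promoter
        "geneB": "TTTACGTGGCAAAATTTTTTTT",  # ABRE-like site only
    }
    print(rank_genes(toy))
```

Predictions from a scan like this would still need experimental verification, as the abstract does with RT-PCR.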
Motif enrichment tool.

PubMed

Blatti, Charles; Sinha, Saurabh

2014-07-01

The Motif Enrichment Tool (MET) provides an online interface that enables users to find major transcriptional regulators of their gene sets of interest. MET searches the appropriate regulatory region around each gene and identifies which transcription factor DNA-binding specificities (motifs) are statistically overrepresented. Motif enrichment analysis is currently available for many metazoan species including human, mouse, fruit fly, planaria and flowering plants. MET also leverages high-throughput experimental data such as ChIP-seq and DNase-seq from ENCODE and ModENCODE to identify the regulatory targets of a transcription factor with greater precision. The results from MET are produced in real time and are linked to a genome browser for easy follow-up analysis. Use of the web tool is free and open to all, and there is no login requirement. ADDRESS: http://veda.cs.uiuc.edu/MET/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
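One common way to test whether a motif is statistically overrepresented in a gene set is a one-sided hypergeometric test. The abstract does not state which statistic MET uses, so the sketch below is a generic illustration with assumed function and parameter names, not a description of the tool.

```python
from math import comb


def hypergeom_enrichment(k, population, with_motif, sample):
    """P(X >= k) for a hypergeometric draw.

    population: number of genes in the genome-wide background
    with_motif: background genes whose regulatory region has the motif
    sample:     size of the user's gene set
    k:          genes in the gene set that have the motif
    """
    upper = min(with_motif, sample)
    favourable = sum(
        comb(with_motif, i) * comb(population - with_motif, sample - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, sample)


if __name__ == "__main__":
    # Hypothetical numbers: 40 of 200 selected genes carry the motif,
    # against 1000 of 20000 genes in the background.
    print(hypergeom_enrichment(40, 20000, 1000, 200))
```

When many motifs are tested at once, the resulting p-values would normally be corrected for multiple testing.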

Genetic mapping uncovers cis-regulatory landscape of RNA editing.

PubMed

Ramaswami, Gokul; Deng, Patricia; Zhang, Rui; Anna Carbone, Mary; Mackay, Trudy F C; Li, Jin Billy

2015-09-16

Adenosine-to-inosine (A-to-I) RNA editing, catalysed by ADAR enzymes conserved in metazoans, plays an important role in neurological functions. Although the fine-tuning mechanism provided by A-to-I RNA editing is important, the underlying rules governing ADAR substrate recognition are not well understood. We apply a quantitative trait loci (QTL) mapping approach to identify genetic variants associated with variability in RNA editing. With very accurate measurement of RNA editing levels at 789 sites in 131 Drosophila melanogaster strains, here we identify 545 editing QTLs (edQTLs) associated with differences in RNA editing. We demonstrate that many edQTLs can act through changes in the local secondary structure for edited dsRNAs. Furthermore, we find that edQTLs located outside of the edited dsRNA duplex are enriched in secondary structure, suggesting that distal dsRNA structure beyond the editing site duplex affects RNA editing efficiency. Our work will facilitate the understanding of the cis-regulatory code of RNA editing.
Spectroscopic studies on peptides and proteins with cysteine-containing heme regulatory motifs (HRM).

PubMed

Schubert, Erik; Florin, Nicole; Duthie, Fraser; Henning Brewitz, H; Kühl, Toni; Imhof, Diana; Hagelueken, Gregor; Schiemann, Olav

2015-07-01

The role of heme as a cofactor in enzymatic reactions has been studied for a long time and in great detail. Recently it was discovered that heme can also serve as a signalling molecule in cells but so far only few examples of this regulation have been studied. In order to discover new potentially heme-regulated proteins, we screened protein sequence databases for bacterial proteins that contain sequence features like a Cysteine-Proline (CP) motif, which is known for its heme-binding propensity. Based on this search we synthesized a series of these potential heme regulatory motifs (HRMs). We used cw EPR spectroscopy to investigate whether these sequences do indeed bind to heme and if the spin state of heme is changed upon interaction with the peptides. The corresponding proteins of two potential HRMs, FeoB and GlpF, were expressed and purified and their interaction with heme was studied by cw EPR and UV-Visible (UV-Vis) spectroscopy. Copyright © 2015 Elsevier Inc. All rights reserved.
Motif discovery and motif finding from genome-mapped DNase footprint data.

PubMed

Kulakovskiy, Ivan V; Favorov, Alexander V; Makeev, Vsevolod J

2009-09-15

Footprint data is an important source of information on transcription factor recognition motifs. However, a footprinting fragment can contain no sequences similar to known protein recognition sites. Inspection of genome fragments nearby can help to identify missing site positions. Genome fragments containing footprints were supplied to a pipeline that constructed a position weight matrix (PWM) for different motif lengths and selected the optimal PWM. Fragments were aligned with the SeSiMCMC sampler and a new heuristic algorithm, Bigfoot. Footprints with missing hits were found for approximately 50% of factors. Adding only 2 bp on both sides of a footprinting fragment recovered most hits. We automatically constructed motifs for 41 Drosophila factors. The new motifs can recognize footprints with greater sensitivity at the same false positive rate than existing models. We also discuss possible overfitting of the constructed motifs. Software and the collection of regulatory motifs are freely available at http://line.imb.ac.ru/DMMPMM.
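For readers unfamiliar with position weight matrices, the sketch below builds a log-odds PWM from a set of already aligned sites and uses it to find the best-scoring window in a sequence. It does not reproduce the SeSiMCMC or Bigfoot alignment steps; the pseudocount, the uniform background and the toy sites are assumptions for illustration.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds PWM (one dict per column) from equal-length aligned sites."""
    width = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(width):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total)
                            / background)
            for base in "ACGT"
        })
    return pwm


def best_hit(pwm, seq):
    """Return (score, position) of the highest-scoring window in seq."""
    width = len(pwm)
    scored = [
        (sum(pwm[i][seq[start + i]] for i in range(width)), start)
        for start in range(len(seq) - width + 1)
    ]
    return max(scored)


if __name__ == "__main__":
    toy_sites = ["TATAA", "TATAA", "TATAT", "TACAA"]  # hypothetical sites
    pwm = build_pwm(toy_sites)
    print(best_hit(pwm, "GGGTATAACCC"))
```

Extending each footprint by a few bases before scanning, as the abstract describes, would simply mean passing a longer `seq` to `best_hit`.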
De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes.

PubMed

Zolotarov, Yevgen; Strömvik, Martina

2015-01-01

Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved.
Cis-regulatory signatures of orthologous stress-associated bZIP transcription factors from rice, sorghum and Arabidopsis based on phylogenetic footprints

PubMed Central

2012-01-01

Background: The potential contribution of upstream sequence variation to the unique features of orthologous genes is just beginning to be unraveled. A core subset of stress-associated bZIP transcription factors from rice (Oryza sativa) formed ten clusters of orthologous groups (COG) with genes from the monocot sorghum (Sorghum bicolor) and dicot Arabidopsis (Arabidopsis thaliana). The total cis-regulatory information content of each stress-associated COG was examined by phylogenetic footprinting to reveal ortholog-specific, lineage-specific and species-specific conservation patterns. Results: The most apparent pattern observed was the occurrence of spatially conserved ‘core modules’ among the COGs but not among paralogs. These core modules are comprised of various combinations of two to four putative transcription factor binding site (TFBS) classes associated with either developmental or stress-related functions. Outside the core modules are specific stress (ABA, oxidative, abiotic, biotic) or organ-associated signals, which may be functioning as ‘regulatory fine-tuners’ and further define lineage-specific and species-specific cis-regulatory signatures. Orthologous monocot and dicot promoters have distinct TFBS classes involved in disease and oxidative-regulated expression, while the orthologous rice and sorghum promoters have distinct combinations of root-specific signals, a pattern that is not particularly conserved in Arabidopsis. Conclusions: Patterns of cis-regulatory conservation imply that each ortholog has distinct signatures, further suggesting that they are potentially unique in a regulatory context despite the presumed conservation of broad biological function during speciation. Based on the observed patterns of conservation, we postulate that core modules are likely primary determinants of basal developmental programming, which may be integrated with and further elaborated by additional intrinsic or extrinsic signals in conjunction with lineage
Curated collection of yeast transcription factor DNA binding specificity data reveals novel structural and gene regulatory insights

PubMed Central

2011-01-01

Background: Transcription factors (TFs) play a central role in regulating gene expression by interacting with cis-regulatory DNA elements associated with their target genes. Recent surveys have examined the DNA binding specificities of most Saccharomyces cerevisiae TFs, but a comprehensive evaluation of their data has been lacking. Results: We analyzed in vitro and in vivo TF-DNA binding data reported in previous large-scale studies to generate a comprehensive, curated resource of DNA binding specificity data for all characterized S. cerevisiae TFs. Our collection comprises DNA binding site motifs and comprehensive in vitro DNA binding specificity data for all possible 8-bp sequences. Investigation of the DNA binding specificities within the basic leucine zipper (bZIP) and VHT1 regulator (VHR) TF families revealed unexpected plasticity in TF-DNA recognition: intriguingly, the VHR TFs, newly characterized by protein binding microarrays in this study, recognize bZIP-like DNA motifs, while the bZIP TF Hac1 recognizes a motif highly similar to the canonical E-box motif of basic helix-loop-helix (bHLH) TFs. We identified several TFs with distinct primary and secondary motifs, which might be associated with different regulatory functions. Finally, integrated analysis of in vivo TF binding data with protein binding microarray data lends further support for indirect DNA binding in vivo by sequence-specific TFs. Conclusions: The comprehensive data in this curated collection allow for more accurate analyses of regulatory TF-DNA interactions, in-depth structural studies of TF-DNA specificity determinants, and future experimental investigations of the TFs' predicted target genes and regulatory roles. PMID:22189060
Prenatal Exposure of Mice to Diethylstilbestrol Disrupts T-Cell Differentiation by Regulating Fas/Fas Ligand Expression through Estrogen Receptor Element and Nuclear Factor-κB Motifs

PubMed Central

Singh, Narendra P.; Singh, Udai P.; Nagarkatti, Prakash S.

2012-01-01

Prenatal exposure to diethylstilbestrol (DES) is known to cause altered immune functions and increased susceptibility to autoimmune disease in humans. In the current study, we investigated the effect of prenatal exposure to DES on thymocyte differentiation involving apoptotic pathways. Prenatal DES exposure caused thymic atrophy, apoptosis, and up-regulation of Fas and Fas ligand (FasL) expression in thymocytes. To examine the mechanism underlying DES-mediated regulation of Fas and FasL, we performed luciferase assays using T cells transfected with luciferase reporter constructs containing full-length Fas or FasL promoters. There was significant luciferase induction in the presence of Fas or FasL promoters after DES exposure. Further analysis demonstrated the presence of several cis-regulatory motifs on both Fas and FasL promoters. When DES-induced transcription factors were analyzed, estrogen receptor element (ERE), nuclear factor κB (NF-κB), nuclear factor of activated T cells (NF-AT), and activator protein-1 motifs on the Fas promoter, as well as ERE, NF-κB, and NF-AT motifs on the FasL promoter, showed binding affinity with the transcription factors. Electrophoretic mobility-shift assays were performed to verify the binding affinity of cis-regulatory motifs of Fas or FasL promoters with transcription factors. There was shift in mobility of probes (ERE or NF-κB2) of both Fas and FasL in the presence of nuclear proteins from DES-treated cells, and the shift was specific to DES because these probes failed to shift their mobility in the presence of nuclear proteins from vehicle-treated cells. Together, the current study demonstrates that prenatal exposure to DES triggers significant alterations in apoptotic molecules expressed on thymocytes, which may affect T-cell differentiation and cause long-term effects on the immune functions. PMID:22888145
Prenatal exposure of mice to diethylstilbestrol disrupts T-cell differentiation by regulating Fas/Fas ligand expression through estrogen receptor element and nuclear factor-κB motifs.

PubMed

Singh, Narendra P; Singh, Udai P; Nagarkatti, Prakash S; Nagarkatti, Mitzi

2012-11-01

Prenatal exposure to diethylstilbestrol (DES) is known to cause altered immune functions and increased susceptibility to autoimmune disease in humans. In the current study, we investigated the effect of prenatal exposure to DES on thymocyte differentiation involving apoptotic pathways. Prenatal DES exposure caused thymic atrophy, apoptosis, and up-regulation of Fas and Fas ligand (FasL) expression in thymocytes. To examine the mechanism underlying DES-mediated regulation of Fas and FasL, we performed luciferase assays using T cells transfected with luciferase reporter constructs containing full-length Fas or FasL promoters. There was significant luciferase induction in the presence of Fas or FasL promoters after DES exposure. Further analysis demonstrated the presence of several cis-regulatory motifs on both Fas and FasL promoters. When DES-induced transcription factors were analyzed, estrogen receptor element (ERE), nuclear factor κB (NF-κB), nuclear factor of activated T cells (NF-AT), and activator protein-1 motifs on the Fas promoter, as well as ERE, NF-κB, and NF-AT motifs on the FasL promoter, showed binding affinity with the transcription factors. Electrophoretic mobility-shift assays were performed to verify the binding affinity of cis-regulatory motifs of Fas or FasL promoters with transcription factors. There was shift in mobility of probes (ERE or NF-κB2) of both Fas and FasL in the presence of nuclear proteins from DES-treated cells, and the shift was specific to DES because these probes failed to shift their mobility in the presence of nuclear proteins from vehicle-treated cells. Together, the current study demonstrates that prenatal exposure to DES triggers significant alterations in apoptotic molecules expressed on thymocytes, which may affect T-cell differentiation and cause long-term effects on the immune functions.
An analysis of the positional distribution of DNA motifs in promoter regions and its biological relevance.

PubMed

Casimiro, Ana C; Vinga, Susana; Freitas, Ana T; Oliveira, Arlindo L

2008-02-07

Motif finding algorithms have developed in their ability to use computationally efficient methods to detect patterns in biological sequences. However, the subsequent classification of the output still suffers from some limitations, which makes it difficult to assess the biological significance of the motifs found. Previous work has highlighted the existence of positional bias of motifs in DNA sequences, which might not only indicate that the pattern is important, but also provide hints about the positions where these patterns occur preferentially. We propose to integrate position uniformity tests and over-representation tests to improve the accuracy of the classification of motifs. Using artificial data, we compared three different statistical tests (Chi-Square, Kolmogorov-Smirnov and a Chi-Square bootstrap) to assess whether a given motif occurs uniformly in the promoter region of a gene. Using the test that performed best on this dataset, we proceeded to study the positional distribution of several well-known cis-regulatory elements in the promoter sequences of different organisms (S. cerevisiae, H. sapiens, D. melanogaster, E. coli and several dicotyledonous plants). The results show that position conservation is relevant for the transcriptional machinery. We conclude that many biologically relevant motifs appear heterogeneously distributed in the promoter region of genes, and therefore that non-uniformity is a good indicator of biological relevance and can complement the commonly used over-representation tests. In this article we present the results obtained for the S. cerevisiae data sets.
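A position uniformity test of the Chi-Square kind mentioned above can be sketched by binning motif start positions along the promoter and comparing the counts with a uniform expectation. The bin count, the function name and the toy positions below are assumptions; the comparison value 9.488 is the standard 5% critical value of the chi-square distribution with 4 degrees of freedom.

```python
def chi_square_uniform(positions, region_length, n_bins=5):
    """Chi-square statistic for uniformity of motif positions.

    positions:     0-based motif start positions within the promoter
    region_length: length of the promoter region scanned
    Returns (statistic, per-bin counts); degrees of freedom = n_bins - 1.
    """
    counts = [0] * n_bins
    for pos in positions:
        index = min(pos * n_bins // region_length, n_bins - 1)
        counts[index] += 1
    expected = len(positions) / n_bins
    statistic = sum((c - expected) ** 2 / expected for c in counts)
    return statistic, counts


CRITICAL_5PCT_DF4 = 9.488  # chi-square critical value, alpha = 0.05, df = 4

if __name__ == "__main__":
    clustered = [10, 20, 30, 40, 50, 60, 70, 80, 90, 95]
    spread = [50, 150, 250, 350, 450] * 2
    for name, data in (("clustered", clustered), ("spread", spread)):
        stat, counts = chi_square_uniform(data, region_length=500)
        print(name, counts, stat, stat > CRITICAL_5PCT_DF4)
```

With so few positions per bin the chi-square approximation is rough, which is one reason a bootstrap variant such as the one named in the abstract can be preferable.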
BET Bromodomain Inhibition Releases the Mediator Complex from Select cis-Regulatory Elements.

PubMed

Bhagwat, Anand S; Roe, Jae-Seok; Mok, Beverly Y L; Hohmann, Anja F; Shi, Junwei; Vakoc, Christopher R

2016-04-19

The bromodomain and extraterminal (BET) protein BRD4 can physically interact with the Mediator complex, but the relevance of this association to the therapeutic effects of BET inhibitors in cancer is unclear. Here, we show that BET inhibition causes a rapid release of Mediator from a subset of cis-regulatory elements in the genome of acute myeloid leukemia (AML) cells. These sites of Mediator eviction were highly correlated with transcriptional suppression of neighboring genes, which are enriched for targets of the transcription factor MYB and for functions related to leukemogenesis. A shRNA screen of Mediator in AML cells identified the MED12, MED13, MED23, and MED24 subunits as performing a similar regulatory function to BRD4 in this context, including a shared role in sustaining a block in myeloid maturation. These findings suggest that the interaction between BRD4 and Mediator has functional importance for gene-specific transcriptional activation and for AML maintenance. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Identification of cis-elements and evaluation of upstream regulatory region of a rice anther-specific gene, OSIPP3, conferring pollen-specific expression in Oryza sativa (L.) ssp. indica.

PubMed

Manimaran, P; Raghurami Reddy, M; Bhaskar Rao, T; Mangrauthia, Satendra K; Sundaram, R M; Balachandran, S M

2015-12-01

Pollen-specific expression. Promoters comprise various cis-regulatory elements that control the development and physiology of plants by regulating gene expression. Progressive 5' deletion analysis of promoter fragments is widely used to understand promoter specificity and to identify functional cis-acting elements. We evaluated the activity of the regulatory elements in 5' promoter deletion sequences of the anther-specific gene OSIPP3, viz. OSIPP3-Δ1 (1504 bp), OSIPP3-Δ2 (968 bp), OSIPP3-Δ3 (388 bp) and OSIPP3-Δ4 (286 bp), through expression of the GUS transgene in rice. In silico analysis of the 1504-bp sequence, which harbors different copy numbers of cis-acting regulatory elements such as POLLENLELAT52, GTGANTG10 and the enhancer elements of LAT52 and LAT56, indicated that these elements were essential for high-level expression in pollen. Histochemical GUS analysis of the transgenic plants revealed that the 1504- and 968-bp fragments directed GUS expression in roots and anthers, while the 388- and 286-bp fragments restricted GUS expression to pollen only; of these, the 388-bp fragment conferred strong GUS expression. Further, GUS staining of different panicle development stages (P1-P6) confirmed that the GUS gene was preferentially expressed only at the P6 stage (late pollen stage). qRT-PCR analysis revealed 23-fold higher expression of the GUS transcript in OSIPP3-Δ1, followed by OSIPP3-Δ2 (eightfold) and OSIPP3-Δ3 (threefold), compared with OSIPP3-Δ4. Based on our results, we propose that, of the two smaller fragments, the 388-bp upstream regulatory region is a promising candidate for pollen-specific expression of agronomically important transgenes in rice.
The G-Box Transcriptional Regulatory Code in Arabidopsis

PubMed Central

Shepherd, Samuel J.K.; Brestovitsky, Anna; Dickinson, Patrick; Biswas, Surojit

2017-01-01

Plants have significantly more transcription factor (TF) families than animals and fungi, and plant TF families tend to contain more genes; these expansions are linked to adaptation to environmental stressors. Many TF family members bind to similar or identical sequence motifs, such as G-boxes (CACGTG), so it is difficult to predict regulatory relationships. We determined that the flanking sequences near G-boxes help determine in vitro specificity but that this is insufficient to predict the transcription pattern of genes near G-boxes. Therefore, we constructed a gene regulatory network that identifies the set of bZIPs and bHLHs that are most predictive of the expression of genes downstream of perfect G-boxes. This network accurately predicts transcriptional patterns and reconstructs known regulatory subnetworks. Finally, we present Ara-BOX-cis (araboxcis.org), a Web site that provides interactive visualizations of the G-box regulatory network, a useful resource for generating predictions for gene regulatory relations. PMID:28864470
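Because the abstract stresses that sequences flanking a G-box contribute to binding specificity, a first analysis step is often to list every perfect G-box together with its flanks. The sketch below does this for a single sequence; the flank width and function name are assumptions, and since CACGTG is its own reverse complement a forward-strand scan finds sites on both strands.

```python
GBOX = "CACGTG"  # perfect G-box core, as given in the abstract


def gbox_flanks(seq, flank=2):
    """Return (start, left_flank, right_flank) for each perfect G-box.

    Flanks are truncated at the sequence ends.
    """
    hits = []
    start = seq.find(GBOX)
    while start != -1:
        left = seq[max(0, start - flank):start]
        right = seq[start + len(GBOX):start + len(GBOX) + flank]
        hits.append((start, left, right))
        start = seq.find(GBOX, start + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical promoter fragment containing one G-box.
    print(gbox_flanks("AAGCCACGTGGCTT"))
```

Tabulating the flanks across many promoters would give the kind of input from which flank preferences could be compared, though, as the abstract notes, flanks alone were not sufficient to predict expression.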
[Analysis of cis-regulatory element distribution in gene promoters of Gossypium raimondii and Arabidopsis thaliana].

PubMed

Sun, Gao-Fei; He, Shou-Pu; Du, Xiong-Ming

2013-10-01

Cotton genomic studies have boomed since the release of the Gossypium raimondii draft genome. In this study, cis-regulatory elements (CREs) in the 1-kb sequence upstream of the 5' UTR of annotated genes were selected and scanned in the Arabidopsis thaliana (At) and Gossypium raimondii (Gr) genomes, based on the PLACE database (Plant cis-acting Regulatory DNA Elements). By the definition used in this study, 44 (12.3%) and 57 (15.5%) CREs showed a "peak-like" distribution in the selected 1-kb sequences of the two genomes, respectively. Thirty-four of them were peak-like distributed in both genomes and could be further categorized into 4 types based on their core sequences. The coincidence of the TATABOX peak position with its actual position (-30 bp) indicated that the position of a common CRE is conserved across genes, which suggests that the peak position of these CREs is likely the actual position at which transcription factors bind. The position of a common CRE also differed between the two genomes, owing to stronger length variation of the 5' UTR in Gr than in At. Furthermore, most of the peak-like CREs were located in the region from -110 bp to 0 bp, which suggests that a concentrated distribution may be conducive to interactions among transcription factors and may thereby regulate the expression of downstream genes.
Multiple Dileucine-like Motifs Direct VGLUT1 Trafficking

PubMed Central

Foss, Sarah M.; Li, Haiyan; Santos, Magda S.; Edwards, Robert H.

2013-01-01

The vesicular glutamate transporters (VGLUTs) package glutamate into synaptic vesicles, and the two principal isoforms VGLUT1 and VGLUT2 have been suggested to influence the properties of release. To understand how a VGLUT isoform might influence transmitter release, we have studied their trafficking and previously identified a dileucine-like endocytic motif in the C terminus of VGLUT1. Disruption of this motif impairs the activity-dependent recycling of VGLUT1, but does not eliminate its endocytosis. We now report the identification of two additional dileucine-like motifs in the N terminus of VGLUT1 that are not well conserved in the other isoforms. In the absence of all three motifs, rat VGLUT1 shows limited accumulation at synaptic sites and no longer responds to stimulation. In addition, shRNA-mediated knockdown of clathrin adaptor proteins AP-1 and AP-2 shows that the C-terminal motif acts largely via AP-2, whereas the N-terminal motifs use AP-1. Without the C-terminal motif, knockdown of AP-1 reduces the proportion of VGLUT1 that responds to stimulation. VGLUT1 thus contains multiple sorting signals that engage distinct trafficking mechanisms. In contrast to VGLUT1, the trafficking of VGLUT2 depends almost entirely on the conserved C-terminal dileucine-like motif: without this motif, a substantial fraction of VGLUT2 redistributes to the plasma membrane and the transporter's synaptic localization is disrupted. Consistent with these differences in trafficking signals, wild-type VGLUT1 and VGLUT2 differ in their response to stimulation. PMID:23804088
Multiple dileucine-like motifs direct VGLUT1 trafficking.

PubMed

Foss, Sarah M; Li, Haiyan; Santos, Magda S; Edwards, Robert H; Voglmaier, Susan M

2013-06-26

The vesicular glutamate transporters (VGLUTs) package glutamate into synaptic vesicles, and the two principal isoforms VGLUT1 and VGLUT2 have been suggested to influence the properties of release. To understand how a VGLUT isoform might influence transmitter release, we have studied their trafficking and previously identified a dileucine-like endocytic motif in the C terminus of VGLUT1. Disruption of this motif impairs the activity-dependent recycling of VGLUT1, but does not eliminate its endocytosis. We now report the identification of two additional dileucine-like motifs in the N terminus of VGLUT1 that are not well conserved in the other isoforms. In the absence of all three motifs, rat VGLUT1 shows limited accumulation at synaptic sites and no longer responds to stimulation. In addition, shRNA-mediated knockdown of clathrin adaptor proteins AP-1 and AP-2 shows that the C-terminal motif acts largely via AP-2, whereas the N-terminal motifs use AP-1. Without the C-terminal motif, knockdown of AP-1 reduces the proportion of VGLUT1 that responds to stimulation. VGLUT1 thus contains multiple sorting signals that engage distinct trafficking mechanisms. In contrast to VGLUT1, the trafficking of VGLUT2 depends almost entirely on the conserved C-terminal dileucine-like motif: without this motif, a substantial fraction of VGLUT2 redistributes to the plasma membrane and the transporter's synaptic localization is disrupted. Consistent with these differences in trafficking signals, wild-type VGLUT1 and VGLUT2 differ in their response to stimulation.
I-motif DNA structures are formed in the nuclei of human cells

NASA Astrophysics Data System (ADS)

Zeraati, Mahdi; Langley, David B.; Schofield, Peter; Moye, Aaron L.; Rouet, Romain; Hughes, William E.; Bryan, Tracy M.; Dinger, Marcel E.; Christ, Daniel

2018-06-01

Human genome function is underpinned by the primary storage of genetic information in canonical B-form DNA, with a second layer of DNA structure providing regulatory control. I-motif structures are thought to form in cytosine-rich regions of the genome and to have regulatory functions; however, in vivo evidence for the existence of such structures has so far remained elusive. Here we report the generation and characterization of an antibody fragment (iMab) that recognizes i-motif structures with high selectivity and affinity, enabling the detection of i-motifs in the nuclei of human cells. We demonstrate that the in vivo formation of such structures is cell-cycle and pH dependent. Furthermore, we provide evidence that i-motif structures are formed in regulatory regions of the human genome, including promoters and telomeric regions. Our results support the notion that i-motif structures provide key regulatory roles in the genome.
Characterization of Cer-1 cis-regulatory region during early Xenopus development.

PubMed

Silva, Ana Cristina; Filipe, Mário; Steinbeisser, Herbert; Belo, José António

2011-05-01

Cerberus-related molecules are well-known Wnt, Nodal, and BMP inhibitors that have been implicated in different processes including anterior–posterior patterning and left–right asymmetry. In both mouse and frog, two Cerberus-related genes have been isolated, mCer-1 and mCer-2, and Xcer and Xcoco, respectively. Until now, little has been known about the mechanisms involved in their transcriptional regulation. Here, we report a heterologous analysis of the mouse Cerberus-1 gene upstream regulatory regions, responsible for its expression in the visceral endodermal cells. Our analysis showed that the consensus sequences for TATA, CAAT, or GC boxes were absent but a TGTGG sequence was present at position -172 to -168 bp, relative to the ATG. Using a series of deletion constructs and transient expression in Xenopus embryos, we found that a fragment of 1.4 kb of Cer-1 promoter sequence could reproduce the endogenous expression pattern of Xenopus cerberus. A 0.7-kb mCer-1 upstream region was able to drive reporter expression to the involuting mesendodermal cells, while further deletions abolished reporter gene expression. Our results suggest that although no sequence similarity was found between mouse and Xenopus cerberus cis-regulatory regions, the signaling cascades regulating cerberus expression during gastrulation are conserved.
The effect of orthology and coregulation on detecting regulatory motifs.

PubMed

Storms, Valerie; Claeys, Marleen; Sanchez, Aminael; De Moor, Bart; Verstuyf, Annemieke; Marchal, Kathleen

2010-02-03

Computational de novo discovery of transcription factor binding sites is still a challenging problem. The growing number of sequenced genomes allows integrating orthology evidence with coregulation information when searching for motifs. Moreover, the more advanced motif detection algorithms explicitly model the phylogenetic relatedness between the orthologous input sequences and thus should be well adapted towards using orthologous information. In this study, we evaluated the conditions under which complementing coregulation with orthologous information improves motif detection for the class of probabilistic motif detection algorithms with an explicit evolutionary model. We designed datasets (real and synthetic) covering different degrees of coregulation and orthologous information to test how well Phylogibbs and Phylogenetic sampler, as representatives of the motif detection algorithms with an evolutionary model, performed compared to MEME, a more classical motif detection algorithm that treats orthologs independently. Under certain conditions detecting motifs in the combined coregulation-orthology space is indeed more efficient than using each space separately, but this is not always the case. Moreover, the difference in success rate between the advanced algorithms and MEME is still marginal. The success rate of motif detection depends on the complex interplay between the added information and the specificities of the applied algorithms. Insights into this relation provide information useful to both developers and users. All benchmark datasets are available at http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Storms_Valerie_PlosONE.
Functionally conserved cis-regulatory elements of COL18A1 identified through zebrafish transgenesis.

PubMed

Kague, Erika; Bessling, Seneca L; Lee, Josephine; Hu, Gui; Passos-Bueno, Maria Rita; Fisher, Shannon

2010-01-15

Type XVIII collagen is a component of basement membranes, and expressed prominently in the eye, blood vessels, liver, and the central nervous system. Homozygous mutations in COL18A1 lead to Knobloch Syndrome, characterized by ocular defects and occipital encephalocele. However, relatively little has been described on the role of type XVIII collagen in development, and nothing is known about the regulation of its tissue-specific expression pattern. We have used zebrafish transgenesis to identify and characterize cis-regulatory sequences controlling expression of the human gene. Candidate enhancers were selected from non-coding sequence associated with COL18A1 based on sequence conservation among mammals. Although these displayed no overt conservation with orthologous zebrafish sequences, four regions nonetheless acted as tissue-specific transcriptional enhancers in the zebrafish embryo, and together recapitulated the major aspects of col18a1 expression. Additional post-hoc computational analysis on positive enhancer sequences revealed alignments between mammalian and teleost sequences, which we hypothesize predict the corresponding zebrafish enhancers; for one of these, we demonstrate functional overlap with the orthologous human enhancer sequence. Our results provide important insight into the biological function and regulation of COL18A1, and point to additional sequences that may contribute to complex diseases involving COL18A1. More generally, we show that combining functional data with targeted analyses for phylogenetic conservation can reveal conserved cis-regulatory elements in the large number of cases where computational alignment alone falls short. Copyright 2009 Elsevier Inc. All rights reserved.
Regulatory logic of pan-neuronal gene expression in C. elegans

PubMed Central

Stefanakis, Nikolaos; Carrera, Ines; Hobert, Oliver

2015-01-01

While neuronal cell types display an astounding degree of phenotypic diversity, most if not all neuron types share a core panel of terminal features. However, little is known about how pan-neuronal expression patterns are genetically programmed. Through an extensive analysis of the cis-regulatory control regions of a battery of pan-neuronal C. elegans genes, including genes involved in synaptic vesicle biology and neuropeptide signaling, we define a common organizational principle in the regulation of pan-neuronal genes in the form of a surprisingly complex array of seemingly redundant, parallel-acting cis-regulatory modules that direct expression to broad, overlapping domains throughout the nervous system. These parallel-acting cis-regulatory modules are responsive to a multitude of distinct trans-acting factors. Neuronal gene expression programs therefore fall into two fundamentally distinct classes. Neuron type-specific genes are generally controlled by discrete and non-redundantly acting regulatory inputs, while pan-neuronal gene expression is controlled by diverse, coincident and seemingly redundant regulatory inputs. PMID:26291158

Structural characterization and regulatory element analysis of the heart isoform of cytochrome c oxidase VIa

NASA Technical Reports Server (NTRS)

Wan, B.; Moreadith, R. W.; Blomqvist, C. G. (Principal Investigator)

1995-01-01

In order to investigate the mechanism(s) governing the striated muscle-specific expression of cytochrome c oxidase VIaH we have characterized the murine gene and analyzed its transcriptional regulatory elements in skeletal myogenic cell lines. The gene is single copy, spans 689 base pairs (bp), and is comprised of three exons. The 5'-ends of transcripts from the gene are heterogeneous, but the most abundant transcript includes a 5'-untranslated region of 30 nucleotides. When fused to the luciferase reporter gene, the 3.5-kilobase 5'-flanking region of the gene directed the expression of the heterologous protein selectively in differentiated Sol8 cells and transgenic mice, recapitulating the pattern of expression of the endogenous gene. Deletion analysis identified a 300-bp fragment sufficient to direct the myotube-specific expression of luciferase in Sol8 cells. The region lacks an apparent TATA element, and sequence motifs predicted to bind NRF-1, NRF-2, ox-box, or PPAR factors known to regulate other nuclear genes encoding mitochondrial proteins are not evident. Mutational analysis, however, identified two cis-elements necessary for the high level expression of the reporter protein: a MEF2 consensus element at -90 to -81 bp and an E-box element at -147 to -142 bp. Additional E-box motifs at closely located positions were mutated without loss of transcriptional activity. The dependence of transcriptional activation of cytochrome c oxidase VIaH on cis-elements similar to those found in contractile protein genes suggests that the striated muscle-specific expression is coregulated by mechanisms that control the lineage-specific expression of several contractile and cytosolic proteins.
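As an illustration of how candidate E-box elements like the one reported at -147 to -142 bp can be located, the sketch below scans an upstream sequence for the generic E-box consensus CANNTG. The function name and the coordinate convention (last upstream base = -1) are assumptions made for this example. The abstract does not say which reference point its coordinates use.

```python
import re

# Generic E-box consensus CANNTG. The lookahead lets overlapping matches be
# reported. The pattern is its own reverse complement, so one strand suffices.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_e_boxes(upstream):
    """Return (start, end, sequence) for each E-box, with inclusive coordinates
    counted back from the 3' end of the upstream sequence (last base = -1)."""
    n = len(upstream)
    hits = []
    for match in E_BOX.finditer(upstream.upper()):
        start = match.start() - n
        hits.append((start, start + 5, match.group(1)))
    return hits

if __name__ == "__main__":
    print(find_e_boxes("A" * 10 + "CAGCTG" + "A" * 4))
```

A sequence match only nominates a site. As the abstract notes, several E-box motifs in this promoter could be mutated without loss of activity, so functional relevance still has to be tested experimentally.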
A HLA class I cis-regulatory element whose activity can be modulated by hormones.

PubMed

Sim, B C; Hui, K M

1994-12-01

To elucidate the basis of the down-regulation in major histocompatibility complex (MHC) class I gene expression and to identify possible DNA-binding regulatory elements that have the potential to interact with class I MHC genes, we have studied the transcriptional regulation of class I HLA genes in human breast carcinoma cells. A 9 base pair (bp) negative cis-regulatory element (NRE) has been identified using band-shift assays employing DNA sequences derived from the 5'-flanking region of HLA class I genes. This 9-bp element, GTCATGGCG, located within exon I of the HLA class I gene, can potently inhibit the expression of a heterologous thymidine kinase (TK) gene promoter and the HLA enhancer element. Furthermore, this regulatory element can exert its suppressive function in either the sense or anti-sense orientation. More interestingly, NRE can suppress dexamethasone-mediated gene activation in the context of the reported glucocorticoid-responsive element (GRE) in MCF-7 cells but has no influence on the estrogen-mediated transcriptional activation of MCF-7 cells in the context of the reported estrogen-responsive element (ERE). Furthermore, the presence of such a regulatory element within the HLA class I gene whose activity can be modulated by hormones correlates well with our observation that the level of HLA class I gene expression can be down-regulated by hormones in human breast carcinoma cells. Such interactions between negative regulatory elements and specific hormone trans-activators are novel and suggest a versatile form of transcriptional control.
The Effect of Orthology and Coregulation on Detecting Regulatory Motifs

PubMed Central

Storms, Valerie; Claeys, Marleen; Sanchez, Aminael; De Moor, Bart; Verstuyf, Annemieke; Marchal, Kathleen

2010-01-01

Background: Computational de novo discovery of transcription factor binding sites is still a challenging problem. The growing number of sequenced genomes allows integrating orthology evidence with coregulation information when searching for motifs. Moreover, the more advanced motif detection algorithms explicitly model the phylogenetic relatedness between the orthologous input sequences and thus should be well adapted towards using orthologous information. In this study, we evaluated the conditions under which complementing coregulation with orthologous information improves motif detection for the class of probabilistic motif detection algorithms with an explicit evolutionary model. Methodology: We designed datasets (real and synthetic) covering different degrees of coregulation and orthologous information to test how well Phylogibbs and Phylogenetic sampler, as representatives of the motif detection algorithms with an evolutionary model, performed compared to MEME, a more classical motif detection algorithm that treats orthologs independently. Results and Conclusions: Under certain conditions detecting motifs in the combined coregulation-orthology space is indeed more efficient than using each space separately, but this is not always the case. Moreover, the difference in success rate between the advanced algorithms and MEME is still marginal. The success rate of motif detection depends on the complex interplay between the added information and the specificities of the applied algorithms. Insights into this relation provide information useful to both developers and users. All benchmark datasets are available at http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Storms_Valerie_PlosONE. PMID:20140085
Cis-regulatory underpinnings of human GLI3 expression in embryonic craniofacial structures and internal organs.

PubMed

Abbasi, Amir A; Minhas, Rashid; Schmidt, Ansgar; Koch, Sabine; Grzeschik, Karl-Heinz

2013-10-01

The zinc finger transcription factor Gli3 is an important mediator of Sonic hedgehog (Shh) signaling. During early embryonic development Gli3 participates in patterning and growth of the central nervous system, face, skeleton, limb, tooth and gut. Precise regulation of the temporal and spatial expression of Gli3 is crucial for the proper specification of these structures in mammals and other vertebrates. Previously we reported a set of human intronic cis-regulators controlling almost the entire known repertoire of endogenous Gli3 expression in mouse neural tube and limbs. However, the genetic underpinning of GLI3 expression in other embryonic domains such as craniofacial structures and internal organs remains elusive. Here we demonstrate in a transgenic mouse assay the potential of a subset of human/fish conserved non-coding sequences (CNEs) residing within GLI3 intronic intervals to induce reporter gene expression at known regions of endogenous Gli3 transcription in embryonic domains other than central nervous system (CNS) and limbs. Highly specific reporter expression was observed in craniofacial structures, eye, gut, and genitourinary system. Moreover, the comparison of expression patterns directed by these intronic cis-acting regulatory elements in mouse and zebrafish embryos suggests that in accordance with sequence conservation, the target site specificity of a subset of these elements remains preserved between these two lineages. Taken together with our recent investigations, it is proposed here that during vertebrate evolution the Gli3 expression control acquired multiple, independently acting, intronic enhancers for spatiotemporal patterning of CNS, limbs, craniofacial structures and internal organs. © 2013 The Authors Development, Growth & Differentiation © 2013 Japanese Society of Developmental Biologists.
Occupancy of tissue-specific cis-regulatory modules by Otx2 and TLE/Groucho for embryonic head specification.

PubMed

Yasuoka, Yuuri; Suzuki, Yutaka; Takahashi, Shuji; Someya, Haruka; Sudou, Norihiro; Haramoto, Yoshikazu; Cho, Ken W; Asashima, Makoto; Sugano, Sumio; Taira, Masanori

2014-07-09

Head specification by the head-selector gene, orthodenticle (otx), is highly conserved among bilaterian lineages. However, the molecular mechanisms by which Otx and other transcription factors (TFs) interact with the genome to direct head formation are largely unknown. Here we employ ChIP-seq and RNA-seq approaches in Xenopus tropicalis gastrulae and find that occupancy of the corepressor, TLE/Groucho, is a better indicator of tissue-specific cis-regulatory modules (CRMs) than the coactivator p300, during early embryonic stages. On the basis of TLE binding and comprehensive CRM profiling, we define two distinct types of Otx2- and TLE-occupied CRMs. Using these devices, Otx2 and other head organizer TFs (for example, Lim1/Lhx1 (activator) or Goosecoid (repressor)) are able to upregulate or downregulate a large battery of target genes in the head organizer. An underlying principle is that Otx marks target genes for head specification to be regulated positively or negatively by partner TFs through specific types of CRMs.
Limitations and potentials of current motif discovery algorithms

PubMed Central

Hu, Jianjun; Li, Bin; Kihara, Daisuke

2005-01-01

Computational methods for de novo identification of gene regulation elements, such as transcription factor binding sites, have proved to be useful for deciphering genetic regulatory networks. However, despite the availability of a large number of algorithms, their strengths and weaknesses are not sufficiently understood. Here, we designed a comprehensive set of performance measures and benchmarked five modern sequence-based motif discovery algorithms using large datasets generated from Escherichia coli RegulonDB. Factors that affect the prediction accuracy, scalability and reliability are characterized. It is revealed that the nucleotide and the binding site level accuracy are very low, while the motif level accuracy is relatively high, which indicates that the algorithms can usually capture at least one correct motif in an input sequence. To exploit diverse predictions from multiple runs of one or more algorithms, a consensus ensemble algorithm has been developed, which achieved 6–45% improvement over the base algorithms by increasing both the sensitivity and specificity. Our study illustrates limitations and potentials of existing sequence-based motif discovery algorithms. Taking advantage of the revealed potentials, several promising directions for further improvements are discussed. Since the sequence-based algorithms are the baseline of most of the modern motif discovery algorithms, this paper suggests substantial improvements would be possible for them. PMID:16284194
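To make the notion of nucleotide-level accuracy concrete, the sketch below compares predicted and annotated binding sites position by position. The exact metric definitions used in the benchmark are not given in the abstract. Sensitivity and positive predictive value as computed here are the commonly used forms and should be read as an assumption, as should the function names and the half-open interval convention.

```python
def covered_positions(sites):
    """Expand half-open (start, end) intervals into the set of covered positions."""
    positions = set()
    for start, end in sites:
        positions.update(range(start, end))
    return positions

def nucleotide_level_accuracy(true_sites, predicted_sites):
    """Return (sensitivity, positive predictive value) at the nucleotide level."""
    true_pos = covered_positions(true_sites)
    pred_pos = covered_positions(predicted_sites)
    tp = len(true_pos & pred_pos)   # annotated positions that were predicted
    fn = len(true_pos - pred_pos)   # annotated positions that were missed
    fp = len(pred_pos - true_pos)   # predicted positions with no annotation
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, ppv

if __name__ == "__main__":
    # A prediction shifted by 5 bp against a 10 bp annotated site.
    print(nucleotide_level_accuracy([(10, 20)], [(15, 25)]))
```

A prediction that overlaps a true site only partially scores poorly by this measure even though it has found the right motif, which is consistent with the abstract's observation that nucleotide-level accuracy is low while motif-level accuracy is relatively high.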
The Caenorhabditis elegans vulva: A post-embryonic gene regulatory network controlling organogenesis

PubMed Central

Ririe, Ted O.; Fernandes, Jolene S.; Sternberg, Paul W.

2008-01-01

The Caenorhabditis elegans vulva is an elegant model for dissecting a gene regulatory network (GRN) that directs postembryonic organogenesis. The mature vulva comprises seven cell types (vulA, vulB1, vulB2, vulC, vulD, vulE, and vulF), each with its own unique pattern of spatial and temporal gene expression. The mechanisms that specify these cell types in a precise spatial pattern are not well understood. Using reverse genetic screens, we identified novel components of the vulval GRN, including nhr-113 in vulA. Several transcription factors (lin-11, lin-29, cog-1, egl-38, and nhr-67) interact with each other and act in concert to regulate target gene expression in the diverse vulval cell types. For example, egl-38 (Pax2/5/8) stabilizes the vulF fate by positively regulating vulF characteristics and by inhibiting characteristics associated with the neighboring vulE cells. nhr-67 and egl-38 regulate cog-1, helping restrict its expression to vulE. Computational approaches have been successfully used to identify functional cis-regulatory motifs in the zmp-1 (zinc metalloproteinase) promoter. These results provide an overview of the regulatory network architecture for each vulval cell type. PMID:19104047
Using RSAT to scan genome sequences for transcription factor binding sites and cis-regulatory modules.

PubMed

Turatsinze, Jean-Valery; Thomas-Chollier, Morgane; Defrance, Matthieu; van Helden, Jacques

2008-01-01

This protocol shows how to detect putative cis-regulatory elements and regions enriched in such elements with the regulatory sequence analysis tools (RSAT) web server (http://rsat.ulb.ac.be/rsat/). The approach applies to known transcription factors, whose binding specificity is represented by position-specific scoring matrices, using the program matrix-scan. The detection of individual binding sites is known to return many false predictions. However, results can be strongly improved by estimating P value, and by searching for combinations of sites (homotypic and heterotypic models). We illustrate the detection of sites and enriched regions with a study case, the upstream sequence of the Drosophila melanogaster gene even-skipped. This protocol is also tested on random control sequences to evaluate the reliability of the predictions. Each task requires a few minutes of computation time on the server. The complete protocol can be executed in about one hour.
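The idea of scanning a sequence with a position-specific scoring matrix can be sketched in a few lines. This is a generic log-odds scan and not a reimplementation of RSAT's matrix-scan: the function names, the pseudocount scheme, the uniform background, and the toy count matrix are all assumptions made for illustration, and no P-value estimation is attempted.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, background=None, pseudocount=1.0):
    """Turn per-position base counts into natural-log odds scores.
    The pseudocount is split across bases in proportion to the background."""
    background = background or {b: 0.25 for b in BASES}
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + pseudocount
        matrix.append({
            b: math.log((column.get(b, 0) + pseudocount * background[b])
                        / total / background[b])
            for b in BASES
        })
    return matrix

def scan(sequence, matrix, threshold):
    """Score every window on the given strand and return (start, site, score)
    for windows scoring at or above the threshold."""
    width = len(matrix)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(matrix[j][base] for j, base in enumerate(window))
        if score >= threshold:
            hits.append((i, window, score))
    return hits

if __name__ == "__main__":
    toy_counts = [{"T": 10}, {"G": 10}, {"A": 10}]  # hypothetical 3 bp motif
    pssm = log_odds_matrix(toy_counts)
    print(scan("CCTGACC", pssm, threshold=3.0))
```

With short matrices and permissive thresholds, a scan like this returns many spurious hits, which is the problem the protocol addresses by estimating P values and searching for combinations of sites.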
Powerful Identification of Cis-regulatory SNPs in Human Primary Monocytes Using Allele-Specific Gene Expression

PubMed Central

Almlöf, Jonas Carlsson; Lundmark, Per; Lundmark, Anders; Ge, Bing; Maouche, Seraya; Göring, Harald H. H.; Liljedahl, Ulrika; Enström, Camilla; Brocheton, Jessy; Proust, Carole; Godefroy, Tiphaine; Sambrook, Jennifer G.; Jolley, Jennifer; Crisp-Hihn, Abigail; Foad, Nicola; Lloyd-Jones, Heather; Stephens, Jonathan; Gwilliam, Rhian; Rice, Catherine M.; Hengstenberg, Christian; Samani, Nilesh J.; Erdmann, Jeanette; Schunkert, Heribert; Pastinen, Tomi; Deloukas, Panos; Goodall, Alison H.; Ouwehand, Willem H.; Cambien, François; Syvänen, Ann-Christine

2012-01-01

A large number of genome-wide association studies have been performed during the past five years to identify associations between SNPs and human complex diseases and traits. The assignment of a functional role for the identified disease-associated SNP is not straightforward. Genome-wide expression quantitative trait locus (eQTL) analysis is frequently used as the initial step to define a function while allele-specific gene expression (ASE) analysis has not yet gained widespread use in disease mapping studies. We compared the power to identify cis-acting regulatory SNPs (cis-rSNPs) by genome-wide allele-specific gene expression (ASE) analysis with that of traditional expression quantitative trait locus (eQTL) mapping. Our study included 395 healthy blood donors for whom global gene expression profiles in circulating monocytes were determined by Illumina BeadArrays. ASE was assessed in a subset of these monocytes from 188 donors by quantitative genotyping of mRNA using a genome-wide panel of SNP markers. The performance of the two methods for detecting cis-rSNPs was evaluated by comparing associations between SNP genotypes and gene expression levels in sample sets of varying size. We found that up to 8-fold more samples are required for eQTL mapping to reach the same statistical power as that obtained by ASE analysis for the same rSNPs. The performance of ASE is insensitive to SNPs with low minor allele frequencies and detects a larger number of significantly associated rSNPs using the same sample size as eQTL mapping. An unequivocal conclusion from our comparison is that ASE analysis is more sensitive for detecting cis-rSNPs than standard eQTL mapping. Our study shows the potential of ASE mapping in tissue samples and primary cells which are difficult to obtain in large numbers. PMID:23300628
Detecting DNA regulatory motifs by incorporating positional trends in information content

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kechris, Katherina J.; van Zwet, Erik; Bickel, Peter J.

2004-05-04

On the basis of the observation that conserved positions in transcription factor binding sites are often clustered together, we propose a simple extension to the model-based motif discovery methods. We assign position-specific prior distributions to the frequency parameters of the model, penalizing deviations from a specified conservation profile. Examples with both simulated and real data show that this extension helps discover motifs as the data become noisier or when there is a competing false motif.
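The "conservation profile" mentioned above is usually expressed as information content per motif position. The sketch below computes that profile for a DNA motif under a uniform background, using the standard formula IC = 2 + Σ p·log2(p) bits. It does not implement the position-specific priors the authors propose, and the function name and input layout are assumptions made for illustration.

```python
import math

def information_content(columns):
    """Return the information content (bits) of each motif position.
    `columns` is a list of dicts mapping base -> frequency (summing to 1).
    Assumes a uniform background, so the maximum is 2 bits per position."""
    profile = []
    for freqs in columns:
        entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
        profile.append(2.0 - entropy)
    return profile

if __name__ == "__main__":
    motif = [
        {"A": 1.0},                                   # fully conserved
        {"A": 0.5, "C": 0.5},                         # two-fold degenerate
        {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}, # uninformative
    ]
    print(information_content(motif))
```

Runs of adjacent high-information positions in such a profile are the clustering of conserved positions that the abstract takes as its starting observation.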
EGRINs (Environmental Gene Regulatory Influence Networks) in Rice That Function in the Response to Water Deficit, High Temperature, and Agricultural Environments

PubMed Central

Hafemeister, Christoph; Nicotra, Adrienne B.; Jagadish, S.V. Krishna; Bonneau, Richard; Purugganan, Michael

2016-01-01

Environmental gene regulatory influence networks (EGRINs) coordinate the timing and rate of gene expression in response to environmental signals. EGRINs encompass many layers of regulation, which culminate in changes in accumulated transcript levels. Here, we inferred EGRINs for the response of five tropical Asian rice (Oryza sativa) cultivars to high temperatures, water deficit, and agricultural field conditions by systematically integrating time-series transcriptome data, patterns of nucleosome-free chromatin, and the occurrence of known cis-regulatory elements. First, we identified 5447 putative target genes for 445 transcription factors (TFs) by connecting TFs with genes harboring known cis-regulatory motifs in nucleosome-free regions proximal to their transcriptional start sites. We then used network component analysis to estimate the regulatory activity for each TF based on the expression of its putative target genes. Finally, we inferred an EGRIN using the estimated transcription factor activity (TFA) as the regulator. The EGRINs include regulatory interactions between 4052 target genes regulated by 113 TFs. We resolved distinct regulatory roles for members of the heat shock factor family, including a putative regulatory connection between abiotic stress and the circadian clock. TFA estimation using network component analysis is an effective way of incorporating multiple genome-scale measurements into network inference. PMID:27655842
The Regulatory Factor ZFHX3 Modifies Circadian Function in SCN via an AT Motif-Driven Axis

PubMed Central

Parsons, Michael J.; Brancaccio, Marco; Sethi, Siddharth; Maywood, Elizabeth S.; Satija, Rahul; Edwards, Jessica K.; Jagannath, Aarti; Couch, Yvonne; Finelli, Mattéa J.; Smyllie, Nicola J.; Esapa, Christopher; Butler, Rachel; Barnard, Alun R.; Chesham, Johanna E.; Saito, Shoko; Joynson, Greg; Wells, Sara; Foster, Russell G.; Oliver, Peter L.; Simon, Michelle M.; Mallon, Ann-Marie; Hastings, Michael H.; Nolan, Patrick M.

2015-01-01

Summary: We identified a dominant missense mutation in the SCN transcription factor Zfhx3, termed short circuit (Zfhx3Sci), which accelerates circadian locomotor rhythms in mice. ZFHX3 regulates transcription via direct interaction with predicted AT motifs in target genes. The mutant protein has a decreased ability to activate consensus AT motifs in vitro. Using RNA sequencing, we found minimal effects on core clock genes in Zfhx3Sci/+ SCN, whereas the expression of neuropeptides critical for SCN intercellular signaling was significantly disturbed. Moreover, mutant ZFHX3 had a decreased ability to activate AT motifs in the promoters of these neuropeptide genes. Lentiviral transduction of SCN slices showed that the ZFHX3-mediated activation of AT motifs is circadian, with decreased amplitude and robustness of these oscillations in Zfhx3Sci/+ SCN slices. In conclusion, by cloning Zfhx3Sci, we have uncovered a circadian transcriptional axis that determines the period and robustness of behavioral and SCN molecular rhythms. PMID:26232227
Regulation of TCF ETS-domain transcription factors by helix-loop-helix motifs.

PubMed

Stinson, Julie; Inoue, Toshiaki; Yates, Paula; Clancy, Anne; Norton, John D; Sharrocks, Andrew D

2003-08-15

DNA binding by the ternary complex factor (TCF) subfamily of ETS-domain transcription factors is tightly regulated by intramolecular and intermolecular interactions. The helix-loop-helix (HLH)-containing Id proteins are trans-acting negative regulators of DNA binding by the TCFs. In the TCF, SAP-2/Net/ERP, intramolecular inhibition of DNA binding is promoted by the cis-acting NID region that also contains an HLH-like motif. The NID also acts as a transcriptional repression domain. Here, we have studied the role of HLH motifs in regulating DNA binding and transcription by the TCF protein SAP-1 and how Cdk-mediated phosphorylation affects the inhibitory activity of the Id proteins towards the TCFs. We demonstrate that the NID region of SAP-1 is an autoinhibitory motif that acts to inhibit DNA binding and also functions as a transcription repression domain. This region can be functionally replaced by fusion of Id proteins to SAP-1, whereby the Id moiety then acts to repress DNA binding in cis. Phosphorylation of the Ids by cyclin-Cdk complexes results in reduction in protein-protein interactions between the Ids and TCFs and relief of their DNA-binding inhibitory activity. In revealing distinct mechanisms through which HLH motifs modulate the activity of TCFs, our results therefore provide further insight into the role of HLH motifs in regulating TCF function and how the inhibitory properties of the trans-acting Id HLH proteins are themselves regulated by phosphorylation.
PAA, WSH, and CIS Overview Self-Study #47656

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schroeder, Rachel Anne

This course presents an overview of the Department of Energy’s (DOE’s) regulatory requirements relevant to the Price-Anderson Amendments Act (PAAA, also referred to as nuclear safety), worker safety and health (WSH), and classified information security (CIS) that are enforceable under the DOE enforcement program; describes the DOE enforcement process; and provides an overview of Los Alamos National Laboratory’s (LANL’s) internal compliance program relative to these DOE regulatory requirements. The LANL PAAA Program is responsible for maintaining LANL’s internal compliance program, which ensures the prompt identification, screening, and reporting of noncompliances to DOE regulatory requirements pertaining to nuclear safety, WSH, and CIS to build the strongest mitigation position for the Laboratory with respect to civil or other penalties.
Modular Evolution of DNA-Binding Preference of a Tbrain Transcription Factor Provides a Mechanism for Modifying Gene Regulatory Networks

PubMed Central

Cheatle Jarvela, Alys M.; Brubaker, Lisa; Vedenko, Anastasia; Gupta, Anisha; Armitage, Bruce A.; Bulyk, Martha L.; Hinman, Veronica F.

2014-01-01

Gene regulatory networks (GRNs) describe the progression of transcriptional states that take a single-celled zygote to a multicellular organism. It is well documented that GRNs can evolve extensively through mutations to cis-regulatory modules (CRMs). Transcription factor proteins that bind these CRMs may also evolve to produce novelty. Coding changes are considered to be rarer, however, because transcription factors are multifunctional and hence are more constrained to evolve in ways that will not produce widespread detrimental effects. Recent technological advances have unearthed a surprising variation in DNA-binding abilities, such that individual transcription factors may recognize both a preferred primary motif and an additional secondary motif. This provides a source of modularity in function. Here, we demonstrate that orthologous transcription factors can also evolve a changed preference for a secondary binding motif, thereby offering an unexplored mechanism for GRN evolution. Using protein-binding microarray, surface plasmon resonance, and in vivo reporter assays, we demonstrate an important difference in DNA-binding preference between Tbrain protein orthologs in two species of echinoderms, the sea star, Patiria miniata, and the sea urchin, Strongylocentrotus purpuratus. Although both orthologs recognize the same primary motif, only the sea star Tbr also has a secondary binding motif. Our in vivo assays demonstrate that this difference may allow for greater evolutionary change in timing of regulatory control. This uncovers a layer of transcription factor binding divergence that could exist for many pairs of orthologs. We hypothesize that this divergence provides modularity that allows orthologous transcription factors to evolve novel roles in GRNs through modification of binding to secondary sites. PMID:25016582
Differences in Krox20-dependent regulation of Hoxa2 and Hoxb2 during hindbrain development.

PubMed

Maconochie, M K; Nonchev, S; Manzanares, M; Marshall, H; Krumlauf, R

2001-05-15

During hindbrain development, segmental regulation of the paralogous Hoxa2 and Hoxb2 genes in rhombomeres (r) 3 and 5 involves Krox20-dependent enhancers that have been conserved during the duplication of the vertebrate Hox clusters from a common ancestor. Examining these evolutionarily related control regions could provide important insight into the degree to which the basic Krox20-dependent mechanisms, cis-regulatory components, and their organization have been conserved. Toward this goal we have performed a detailed functional analysis of a mouse Hoxa2 enhancer capable of directing reporter expression in r3 and r5. The combined activities of five separate cis-regions, in addition to the conserved Krox20 binding sites, are involved in mediating enhancer function. A CTTT (BoxA) motif adjacent to the Krox20 binding sites is important for r3/r5 activity. The BoxA motif is similar to one (Box1) found in the Hoxb2 enhancer and indicates that the close proximity of these Box motifs to Krox20 sites is a common feature of Krox20 targets in vivo. Two other rhombomeric elements (RE1 and RE3) are essential for r3/r5 activity and share common TCT motifs, indicating that they interact with a similar cofactor(s). TCT motifs are also found in the Hoxb2 enhancer, suggesting that they may be another common feature of Krox20-dependent control regions. The two remaining Hoxa2 cis-elements, RE2 and RE4, are not conserved in the Hoxb2 enhancer and define differences in some of the components that can contribute to the Krox20-dependent activities of these enhancers. Furthermore, analysis of regulatory activities of these enhancers in a Krox20 mutant background has uncovered differences in their degree of dependence upon Krox20 for segmental expression. Together, this work has revealed a surprising degree of complexity in the number of cis-elements and regulatory components that contribute to segmental expression mediated by Krox20 and sheds light on the diversity and evolution of Krox20
Distal Limb Patterning Requires Modulation of cis-Regulatory Activities by HOX13

DOE PAGES

Sheth, Rushikesh; Barozzi, Iros; Langlais, David; ...

2016-12-13

The combinatorial expression of Hox genes along the body axes is a major determinant of cell fate and plays a pivotal role in generating the animal body plan. Loss of HOXA13 and HOXD13 transcription factors (HOX13) leads to digit agenesis in mice, but how HOX13 proteins regulate transcriptional outcomes and confer identity to the distal-most limb cells has remained elusive. Here, we report on the genome-wide profiling of HOXA13 and HOXD13 in vivo binding and changes of the transcriptome and chromatin state in the transition from the early to the late-distal limb developmental program, as well as in Hoxa13–/–; Hoxd13–/– limbs. Our results show that proper termination of the early limb transcriptional program and activation of the late-distal limb program are coordinated by the dual action of HOX13 on cis-regulatory modules.
A Monte Carlo-based framework enhances the discovery and interpretation of regulatory sequence motifs

PubMed Central

2012-01-01

Background Discovery of functionally significant short, statistically overrepresented subsequence patterns (motifs) in a set of sequences is a challenging problem in bioinformatics. Oftentimes, not all sequences in the set contain a motif. These non-motif-containing sequences complicate the algorithmic discovery of motifs. Filtering the non-motif-containing sequences from the larger set of sequences while simultaneously determining the identity of the motif is, therefore, desirable and a non-trivial problem in motif discovery research. Results We describe MotifCatcher, a framework that extends the sensitivity of existing motif-finding tools by employing random sampling to effectively remove non-motif-containing sequences from the motif search. We developed two implementations of our algorithm, each built around a commonly used motif-finding tool, and applied our algorithm to three diverse chromatin immunoprecipitation (ChIP) data sets. In each case, the motif finder with the MotifCatcher extension demonstrated improved sensitivity over the motif finder alone. Our approach organizes candidate functionally significant discovered motifs into a tree, which allowed us to gain additional insights. In all cases, we were able to support our findings with experimental work from the literature. Conclusions Our framework demonstrates that additional processing at the sequence entry level can significantly improve the performance of existing motif-finding tools. For each biological data set tested, we were able to propose novel biological hypotheses supported by experimental work from the literature. Specifically, in Escherichia coli, we suggested binding site motifs for 6 non-traditional LexA protein binding sites; in Saccharomyces cerevisiae, we hypothesize 2 disparate mechanisms for novel binding sites of the Cse4p protein; and in Halobacterium sp. NRC-1, we discovered subtle differences in a general transcription factor (GTF) binding site motif across several data sets. We
PROSPECT improves cis-acting regulatory element prediction by integrating expression profile data with consensus pattern searches

PubMed Central

Fujibuchi, Wataru; Anderson, John S. J.; Landsman, David

2001-01-01

Consensus pattern and matrix-based searches designed to predict cis-acting transcriptional regulatory sequences have historically been subject to large numbers of false positives. We sought to decrease false positives by incorporating expression profile data into a consensus pattern-based search method. We have systematically analyzed the expression phenotypes of over 6000 yeast genes, across 121 expression profile experiments, and correlated them with the distribution of 14 known regulatory elements over sequences upstream of the genes. Our method is based on a metric we term probabilistic element assessment (PEA), which is a ranking of potential sites based on sequence similarity in the upstream regions of genes with similar expression phenotypes. For eight of the 14 known elements that we examined, our method had a much higher selectivity than a naïve consensus pattern search. Based on our analysis, we have developed a web-based tool called PROSPECT, which allows consensus pattern-based searching of gene clusters obtained from microarray data. PMID:11574681
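The naive consensus pattern search that this abstract uses as its baseline can be sketched in a few lines. This is only an illustrative sketch, not the PROSPECT implementation: it does not compute the PEA metric, and the function names, the example consensus, and the example upstream sequence are all hypothetical. It translates an IUPAC consensus string into a regular expression and reports every match position, which is the kind of search that tends to produce many false positives when used alone.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus.upper())

def find_sites(consensus, sequence):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # Wrapping the pattern in a lookahead lets overlapping sites be reported.
    lookahead = re.compile("(?=" + consensus_to_regex(consensus) + ")")
    return [m.start() for m in lookahead.finditer(sequence.upper())]

# Hypothetical consensus and upstream sequence, for illustration only.
upstream = "TTGACGTCATGACTCATTT"
print(find_sites("TGASTCA", upstream))
```

A search like this returns every sequence-level match regardless of whether the gene responds in expression data, which is the gap the abstract says expression profiles are meant to close.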
Mechanistic insights into Pin1 peptidyl-prolyl cis-trans isomerization from umbrella sampling simulations.

PubMed

Di Martino, Giovanni Paolo; Masetti, Matteo; Cavalli, Andrea; Recanatini, Maurizio

2014-11-01

The peptidyl-prolyl isomerase Pin1 plays a key role in the regulation of phospho(p)-Ser/Thr-Pro proteins, acting as a molecular timer of the cell cycle. After recognition of these motifs, Pin1 catalyzes the rapid cis-trans isomerization of proline amide bonds of substrates, helping to maintain the equilibrium between the two conformations. Although this enzyme has attracted great interest, its catalytic mechanism has long been debated. Here, the cis-trans isomerization of a model peptide system was investigated by means of umbrella sampling simulations in the Pin1-bound and unbound states. We obtained free energy barriers consistent with experimental data, and identified several enzymatic features directly linked to the acceleration of the prolyl bond isomerization. In particular, an enhanced autocatalysis, the stabilization of perturbed ground state conformations, and the substrate binding in a procatalytic conformation were found to be the main contributions to the lowering of the isomerization free energy barrier. © 2014 Wiley Periodicals, Inc.

Genomic identification of regulatory elements by evolutionary sequence comparison and functional analysis.

PubMed

Loots, Gabriela G

2008-01-01

Despite remarkable recent advances in genomics that have enabled us to identify most of the genes in the human genome, comparable efforts to define transcriptional cis-regulatory elements that control gene expression are lagging behind. The difficulty of this task stems from two equally important problems: our knowledge of how regulatory elements are encoded in genomes remains elementary, and there is a vast genomic search space for regulatory elements, since most of the mammalian genome is noncoding. Comparative genomic approaches are having a remarkable impact on the study of transcriptional regulation in eukaryotes and currently represent the most efficient and reliable methods of predicting noncoding sequences likely to control the patterns of gene expression. By subjecting eukaryotic genomic sequences to computational comparisons and subsequent experimentation, we are inching our way toward a more comprehensive catalog of common regulatory motifs that lie behind fundamental biological processes. We are still far from comprehending how the transcriptional regulatory code is encrypted in the human genome and providing an initial global view of regulatory gene networks, but collectively, the continued development of comparative and experimental approaches will rapidly expand our knowledge of the transcriptional regulome.
The Verrucomicrobia LexA-Binding Motif: Insights into the Evolutionary Dynamics of the SOS Response.

PubMed

Erill, Ivan; Campoy, Susana; Kılıç, Sefa; Barbé, Jordi

2016-01-01

The SOS response is the primary bacterial mechanism to address DNA damage, coordinating multiple cellular processes that include DNA repair, cell division, and translesion synthesis. In contrast to other regulatory systems, the composition of the SOS genetic network and the binding motif of its transcriptional repressor, LexA, have been shown to vary greatly across bacterial clades, making it an ideal system to study the co-evolution of transcription factors and their regulons. Leveraging comparative genomics approaches and prior knowledge on the core SOS regulon, here we define the binding motif of the Verrucomicrobia, a recently described phylum of emerging interest due to its association with eukaryotic hosts. Site directed mutagenesis of the Verrucomicrobium spinosum recA promoter confirms that LexA binds a 14 bp palindromic motif with consensus sequence TGTTC-N4-GAACA. Computational analyses suggest that recognition of this novel motif is determined primarily by changes in base-contacting residues of the third alpha helix of the LexA helix-turn-helix DNA binding motif. In conjunction with comparative genomics analysis of the LexA regulon in the Verrucomicrobia phylum, electrophoretic shift assays reveal that LexA binds to operators in the promoter region of DNA repair genes and a mutagenesis cassette in this organism, and identify previously unreported components of the SOS response. The identification of tandem LexA-binding sites generating instances of other LexA-binding motifs in the lexA gene promoter of Verrucomicrobia species leads us to postulate a novel mechanism for LexA-binding motif evolution. This model, based on gene duplication, successfully addresses outstanding questions in the intricate co-evolution of the LexA protein, its binding motif and the regulatory network it controls.
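The consensus reported above, TGTTC-N4-GAACA, is a spaced palindrome: the right half-site is the reverse complement of the left one, separated by four unconstrained bases, for 14 bp in total. A minimal sketch of how such a consensus could be scanned for in a promoter sequence follows. The function names and the example promoter are hypothetical, and exact matching is an assumption made for brevity; real operators usually tolerate mismatches and would more typically be scored with a position weight matrix.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def scan_spaced_palindrome(seq, half_site="TGTTC", spacer_len=4):
    """Find exact matches to half_site + N{spacer_len} + revcomp(half_site).

    Returns a list of (0-based start, matched site) tuples.
    """
    right = reverse_complement(half_site)
    # Lookahead with a capture group so overlapping sites are not skipped.
    pattern = re.compile(
        "(?=(%s[ACGT]{%d}%s))" % (half_site, spacer_len, right)
    )
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]

# Made-up promoter fragment containing one site, for illustration only.
promoter = "AATTGTTCATATGAACAGG"
print(scan_spaced_palindrome(promoter))
```

Because the motif is palindromic, scanning one strand is sufficient: any site on the reverse strand reads identically on the forward strand apart from its spacer.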
The Transcriptional Complex Between the BCL2 i-Motif and hnRNP LL Is a Molecular Switch for Control of Gene Expression That Can Be Modulated by Small Molecules

PubMed Central

2015-01-01

In a companion paper (DOI: 10.1021/ja410934b) we demonstrate that the C-rich strand of the cis-regulatory element in the BCL2 promoter element is highly dynamic in nature and can form either an i-motif or a flexible hairpin. Under physiological conditions these two secondary DNA structures are found in an equilibrium mixture, which can be shifted by the addition of small molecules that trap out either the i-motif (IMC-48) or the flexible hairpin (IMC-76). In cellular experiments we demonstrate that the addition of these molecules has opposite effects on BCL2 gene expression and furthermore that these effects are antagonistic. In this contribution we have identified a transcriptional factor that recognizes and binds to the BCL2 i-motif to activate transcription. The molecular basis for the recognition of the i-motif by hnRNP LL is determined, and we demonstrate that the protein unfolds the i-motif structure to form a stable single-stranded complex. In subsequent experiments we show that IMC-48 and IMC-76 have opposite, antagonistic effects on the formation of the hnRNP LL–i-motif complex as well as on the transcription factor occupancy at the BCL2 promoter. For the first time we propose that the i-motif acts as a molecular switch that controls gene expression and that small molecules that target the dynamic equilibrium of the i-motif and the flexible hairpin can differentially modulate gene expression. PMID:24559432
The evolutionary capacitor HSP90 buffers the regulatory effects of mammalian endogenous retroviruses.

PubMed

Hummel, Barbara; Hansen, Erik C; Yoveva, Aneliya; Aprile-Garcia, Fernando; Hussong, Rebecca; Sawarkar, Ritwick

2017-03-01

Understanding how genotypes are linked to phenotypes is important in biomedical and evolutionary studies. The chaperone heat-shock protein 90 (HSP90) buffers genetic variation by stabilizing proteins with variant sequences, thereby uncoupling phenotypes from genotypes. Here we report an unexpected role of HSP90 in buffering cis-regulatory variation affecting gene expression. By using the tripartite-motif-containing 28 (TRIM28; also known as KAP1)-mediated epigenetic pathway, HSP90 represses the regulatory influence of endogenous retroviruses (ERVs) on neighboring genes that are critical for mouse development. Our data based on natural variations in the mouse genome show that genes respond to HSP90 inhibition in a manner dependent on their genomic location with regard to strain-specific ERV-insertion sites. The evolutionary-capacitor function of HSP90 may thus have facilitated the exaptation of ERVs as key modifiers of gene expression and morphological diversification. Our findings add a new regulatory layer through which HSP90 uncouples phenotypic outcomes from individual genotypes.
RNA-ID, a highly sensitive and robust method to identify cis-regulatory sequences using superfolder GFP and a fluorescence-based assay.

PubMed

Dean, Kimberly M; Grayhack, Elizabeth J

2012-12-01

We have developed a robust and sensitive method, called RNA-ID, to screen for cis-regulatory sequences in RNA using fluorescence-activated cell sorting (FACS) of yeast cells bearing a reporter in which expression of both superfolder green fluorescent protein (GFP) and yeast codon-optimized mCherry red fluorescent protein (RFP) is driven by the bidirectional GAL1,10 promoter. This method recapitulates previously reported progressive inhibition of translation mediated by increasing numbers of CGA codon pairs, and restoration of expression by introduction of a tRNA with an anticodon that base pairs exactly with the CGA codon. This method also reproduces effects of paromomycin and context on stop codon read-through. Five key features of this method contribute to its effectiveness as a selection for regulatory sequences: The system exhibits greater than a 250-fold dynamic range, a quantitative and dose-dependent response to known inhibitory sequences, exquisite resolution that allows nearly complete physical separation of distinct populations, and a reproducible signal between different cells transformed with the identical reporter, all of which are coupled with simple methods involving ligation-independent cloning, to create large libraries. Moreover, we provide evidence that there are sequences within a 9-nt library that cause reduced GFP fluorescence, suggesting that there are novel cis-regulatory sequences to be found even in this short sequence space. This method is widely applicable to the study of both RNA-mediated and codon-mediated effects on expression.
The BaMM web server for de-novo motif discovery and regulatory sequence analysis.

PubMed

Kiesel, Anja; Roth, Christian; Ge, Wanwan; Wess, Maximilian; Meier, Markus; Söding, Johannes

2018-05-28

The BaMM web server offers four tools: (i) de-novo discovery of enriched motifs in a set of nucleotide sequences, (ii) scanning a set of nucleotide sequences with motifs to find motif occurrences, (iii) searching with an input motif for similar motifs in our BaMM database with motifs for >1000 transcription factors, trained from the GTRD ChIP-seq database and (iv) browsing and keyword searching the motif database. In contrast to most other servers, we represent sequence motifs not by position weight matrices (PWMs) but by Bayesian Markov Models (BaMMs) of order 4, which we showed previously to perform substantially better in ROC analyses than PWMs or first order models. To address the inadequacy of P- and E-values as measures of motif quality, we introduce the AvRec score, the average recall over the TP-to-FP ratio between 1 and 100. The BaMM server is freely accessible without registration at https://bammmotif.mpibpc.mpg.de.
In-silico analysis of cis-acting regulatory elements of pathogenesis-related proteins of Arabidopsis thaliana and Oryza sativa

PubMed Central

Kaur, Amritpreet; Pati, Pratap Kumar; Pati, Aparna Maitra; Nagpal, Avinash Kaur

2017-01-01

Pathogenesis-related (PR) proteins are a family of low-molecular-weight proteins induced in plants under various biotic and abiotic stresses. They play an important role in plant defense mechanisms. PRs have a wide range of functions, acting as hydrolases, peroxidases, chitinases, antifungal agents, protease inhibitors, etc. In the present study, an attempt has been made to analyze promoter regions of PR1, PR2, PR5, PR9, PR10 and PR12 of Arabidopsis thaliana and Oryza sativa. Analysis of cis-element distribution revealed the functional multiplicity of PRs and provides insight into their gene regulation. CpG islands are observed only in rice PRs, which indicates that the monocot genome contains more GC-rich motifs than dicot genomes. Tandem repeats were also observed in the 5’ UTR of PR genes. Thus, the present study provides an understanding of the regulation of PR genes and their versatile roles in plants. PMID:28910327
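The abstract does not say how CpG islands were called, so the following is only a sketch under a stated assumption: it uses the widely cited Gardiner-Garden and Frommer criteria (length of at least 200 bp, GC content above 50%, and an observed/expected CpG ratio above 0.6). The function names and thresholds as parameters are illustrative and are not taken from the study.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)

def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the assumed Gardiner-Garden and Frommer thresholds to one window."""
    return (
        len(seq) >= min_len
        and gc_content(seq) > min_gc
        and cpg_obs_exp(seq) > min_ratio
    )

# Synthetic windows for illustration only, not real promoter sequence.
print(looks_like_cpg_island("CG" * 100))  # GC-rich, CpG-dense window
print(looks_like_cpg_island("AT" * 100))  # AT-only window
```

In practice the test would be applied to sliding windows across each promoter rather than to a single fixed window.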
In-silico analysis of cis-acting regulatory elements of pathogenesis-related proteins of Arabidopsis thaliana and Oryza sativa.

PubMed

Kaur, Amritpreet; Pati, Pratap Kumar; Pati, Aparna Maitra; Nagpal, Avinash Kaur

2017-01-01

Pathogenesis-related (PR) proteins are a family of low-molecular-weight proteins induced in plants under various biotic and abiotic stresses. They play an important role in plant defense mechanisms. PRs have a wide range of functions, acting as hydrolases, peroxidases, chitinases, antifungal agents, protease inhibitors, etc. In the present study, an attempt has been made to analyze promoter regions of PR1, PR2, PR5, PR9, PR10 and PR12 of Arabidopsis thaliana and Oryza sativa. Analysis of cis-element distribution revealed the functional multiplicity of PRs and provides insight into their gene regulation. CpG islands are observed only in rice PRs, which indicates that the monocot genome contains more GC-rich motifs than dicot genomes. Tandem repeats were also observed in the 5' UTR of PR genes. Thus, the present study provides an understanding of the regulation of PR genes and their versatile roles in plants.
Retinal Expression of the Drosophila eyes absent Gene Is Controlled by Several Cooperatively Acting Cis-regulatory Elements

PubMed Central

Neuman, Sarah D.; Bashirullah, Arash; Kumar, Justin P.

2016-01-01

The eyes absent (eya) gene of the fruit fly, Drosophila melanogaster, is a member of an evolutionarily conserved gene regulatory network that controls eye formation in all seeing animals. The loss of eya leads to the complete elimination of the compound eye while forced expression of eya in non-retinal tissues is sufficient to induce ectopic eye formation. Within the developing retina eya is expressed in a dynamic pattern and is involved in tissue specification/determination, cell proliferation, apoptosis, and cell fate choice. In this report we explore the mechanisms by which eya expression is spatially and temporally governed in the developing eye. We demonstrate that multiple cis-regulatory elements function cooperatively to control eya transcription and that spacing between a pair of enhancer elements is important for maintaining correct gene expression. Lastly, we show that the loss of eya expression in sine oculis (so) mutants is the result of massive cell death and a progressive homeotic transformation of retinal progenitor cells into head epidermis. PMID:27930646
A Novel Dual-cre Motif Enables Two-Way Autoregulation of CcpA in Clostridium acetobutylicum.

PubMed

Zhang, Lu; Liu, Yanqiang; Yang, Yunpeng; Jiang, Weihong; Gu, Yang

2018-04-15

The master regulator CcpA (catabolite control protein A) manages a large and complex regulatory network that is essential for cellular physiology and metabolism in Gram-positive bacteria. Although CcpA can affect the expression of target genes by binding to a cis-acting catabolite-responsive element (cre), whether and how the expression of CcpA is regulated remain poorly explored. Here, we report a novel dual-cre motif that is employed by the CcpA in Clostridium acetobutylicum, a typical solventogenic Clostridium species, for autoregulation. Two cre sites are involved in CcpA autoregulation, and they reside in the promoter and coding regions of CcpA. In this dual-cre motif, cre P, in the promoter region, positively regulates ccpA transcription, whereas cre ORF, in the coding region, negatively regulates this transcription, thus enabling two-way autoregulation of CcpA. Although CcpA bound cre P more strongly than cre ORF in vitro, the in vivo assay showed that cre ORF-based repression dominates CcpA autoregulation during the entire fermentation. Finally, a synonymous mutation of cre ORF was made within the coding region, achieving an increased intracellular CcpA expression and improved cellular performance. This study provides new insights into the regulatory role of CcpA in C. acetobutylicum and, moreover, contributes a new engineering strategy for this industrial strain. IMPORTANCE CcpA is known to be a key transcription factor in Gram-positive bacteria. However, it is still unclear whether and how the intracellular CcpA level is regulated, which may be essential for maintaining normal cell physiology and metabolism. We discovered here that CcpA employs a dual-cre motif to autoregulate, enabling dynamic control of its own expression level during the entire fermentation process. This finding answers the questions above and fills a void in our understanding of the regulatory network of CcpA. Interference in CcpA autoregulation leads to improved cellular
Discovery of a Regulatory Motif for Human Satellite DNA Transcription in Response to BATF2 Overexpression.

PubMed

Bai, Xuejia; Huang, Wenqiu; Zhang, Chenguang; Niu, Jing; Ding, Wei

2016-03-01

One of the basic leucine zipper transcription factors, BATF2, has been found to suppress cancer growth and migration. However, little is known about the genes downstream of BATF2. HeLa cells were stably transfected with BATF2, then chromatin immunoprecipitation-sequencing was employed to identify the DNA motifs responsive to BATF2. Comprehensive bioinformatics analyses indicated that the most significant motif discovered as TTCCATT[CT]GATTCCATTC[AG]AT was primarily distributed among the chromosome centromere regions and mostly within human type II satellite DNA. Such motifs were able to prime the transcription of type II satellite DNA in a directional and asymmetrical manner. Consistently, satellite II transcription was up-regulated in BATF2-overexpressing cells. The present study provides insight into understanding the role of BATF2 in tumours and the importance of satellite DNA in the maintenance of genomic stability. Copyright© 2016 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.
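The motif reported above is already written in bracket notation, so it can be used almost directly as a regular expression. Because the abstract describes transcription primed in a directional manner, a scan should report which strand each site lies on. The sketch below is an illustration only: the function names and the test sequence are hypothetical, exact matching is assumed, and nothing here reproduces the authors' bioinformatics pipeline.

```python
import re

# Motif as given in the abstract; [CT] and [AG] are two-base ambiguities.
MOTIF = "TTCCATT[CT]GATTCCATTC[AG]AT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def scan_both_strands(seq, motif=MOTIF):
    """Return sorted (forward-strand start, strand, site) tuples for exact matches."""
    seq = seq.upper()
    pattern = re.compile("(?=(" + motif + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Map the reverse-strand coordinate back onto the forward strand.
        start = len(seq) - m.start() - len(m.group(1))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)

# Made-up sequence carrying one forward-strand site, for illustration only.
example = "GG" + "TTCCATTCGATTCCATTCAAT" + "AA"
print(scan_both_strands(example))
```

Reporting the strand alongside the position keeps the orientation information that a strand-agnostic scan would discard.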
Computational Analyses of Synergism in Small Molecular Network Motifs

PubMed Central

Zhang, Yili; Smolen, Paul; Baxter, Douglas A.; Byrne, John H.

2014-01-01

Cellular functions and responses to stimuli are controlled by complex regulatory networks that comprise a large diversity of molecular components and their interactions. However, achieving an intuitive understanding of the dynamical properties and responses to stimuli of these networks is hampered by their large scale and complexity. To address this issue, analyses of regulatory networks often focus on reduced models that depict distinct, reoccurring connectivity patterns referred to as motifs. Previous modeling studies have begun to characterize the dynamics of small motifs, and to describe ways in which variations in parameters affect their responses to stimuli. The present study investigates how variations in pairs of parameters affect responses in a series of ten common network motifs, identifying concurrent variations that act synergistically (or antagonistically) to alter the responses of the motifs to stimuli. Synergism (or antagonism) was quantified using degrees of nonlinear blending and additive synergism. Simulations identified concurrent variations that maximized synergism, and examined the ways in which it was affected by stimulus protocols and the architecture of a motif. Only a subset of architectures exhibited synergism following paired changes in parameters. The approach was then applied to a model describing interlocked feedback loops governing the synthesis of the CREB1 and CREB2 transcription factors. The effects of motifs on synergism for this biologically realistic model were consistent with those for the abstract models of single motifs. These results have implications for the rational design of combination drug therapies with the potential for synergistic interactions. PMID:24651495
Nitrogen transporter and assimilation genes exhibit developmental stage-selective expression in maize (Zea mays L.) associated with distinct cis-acting promoter motifs.

PubMed

Liseron-Monfils, Christophe; Bi, Yong-Mei; Downs, Gregory S; Wu, Wenqing; Signorelli, Tara; Lu, Guangwen; Chen, Xi; Bondo, Eddie; Zhu, Tong; Lukens, Lewis N; Colasanti, Joseph; Rothstein, Steven J; Raizada, Manish N

2013-10-01

Nitrogen is considered the most limiting nutrient for maize (Zea mays L.), but there is limited understanding of the regulation of nitrogen-related genes during maize development. An Affymetrix 82K maize array was used to analyze the expression of ≤ 46 unique nitrogen uptake and assimilation probes in 50 maize tissues from seedling emergence to 31 d after pollination. Four nitrogen-related expression clusters were identified in roots and shoots corresponding to, or overlapping, juvenile, adult, and reproductive phases of development. Quantitative real time PCR data was consistent with the existence of these distinct expression clusters. Promoters corresponding to each cluster were screened for over-represented cis-acting elements. The 8-bp distal motif of the Arabidopsis 43-bp nitrogen response element (NRE) was over-represented in nitrogen-related maize gene promoters. This conserved motif, referred to here as NRE43-d8, was previously shown to be critical for nitrate-activated transcription of nitrate reductase (NIA1) and nitrite reductase (NIR1) by the NIN-LIKE PROTEIN 6 (NLP6) in Arabidopsis. Here, NRE43-d8 was over-represented in the promoters of maize nitrate and ammonium transporter genes, specifically those that showed peak expression during early-stage vegetative development. This result predicts an expansion of the NRE-NLP6 regulon and suggests that it may have a developmental component in maize. We also report leaf expression of putative orthologs of nitrite transporters (NiTR1), a transporter not previously reported in maize. We conclude by discussing how each of the four transcriptional modules may be responsible for the different nitrogen uptake and assimilation requirements of leaves and roots at different stages of maize development.
Proper regulation of a sperm-specific cis-nat-siRNA is essential for double fertilization in Arabidopsis

USDA-ARS's Scientific Manuscript database

Cis-nat-siRNAs are a recently characterized class of small regulatory RNAs that are widespread in eukaryotes. Despite their abundance, the importance of their regulatory activity is largely unknown. The only functional role for eukaryotic cis-nat-siRNAs that has been described to date is in envir...
Regulatory analysis of the mouse Hoxb3 gene: multiple elements work in concert to direct temporal and spatial patterns of expression.

PubMed

Kwan, C T; Tsang, S L; Krumlauf, R; Sham, M H

2001-04-01

The expression pattern of the mouse Hoxb3 gene is exceptionally complex and dynamic compared with that of other members of the Hoxb cluster. There are multiple types of transcripts for the Hoxb3 gene, and the anterior boundaries of its expression vary at different stages of development. Two enhancers flanking Hoxb3 on the 3' and 5' sides regulate Hoxb2 and Hoxb4, respectively, and these control regions define the two ends of a 28-kb interval in and around the Hoxb3 locus. To assay the regulatory potential of DNA fragments in this interval we have used transgenic analysis with a lacZ reporter gene to locate cis-elements for directing the dynamic patterns of Hoxb3 expression. Our detailed analysis has identified four new and widely spaced cis-acting regulatory regions that can together account for major aspects of the Hoxb3 expression pattern. Elements Ib, IIIa, and IVb control gene expression in neural and mesodermal tissues; element Va controls mesoderm-specific gene expression. The most anterior neural expression domain of Hoxb3 is controlled by an r5 enhancer (element IVa); element IIIa directs reporter expression in the anterior spinal cord and hindbrain up to r6, and the region A enhancer (in element I) mediates posterior neural expression. Hence, the regulation of segmental expression of Hoxb3 in the hindbrain is different from that of Hoxa3, as two separate enhancer elements contribute to expression in r5 and r6. The mesoderm-specific element (Va) directs reporter expression to prevertebra C1 at 12.5 dpc, which is the anterior limit of paraxial mesoderm expression for Hoxb3. When tested in combinations, these cis-elements appear to work as modules in an additive manner to recapitulate the major endogenous expression patterns of Hoxb3 during embryogenesis. Together, our study shows that multiple control elements direct reporter gene expression in diverse tissue-, temporal-, and spatially restricted subset of the endogenous Hoxb3 expression domains and work in
Generation of Chimeric RNAs by cis-splicing of adjacent genes (cis-SAGe) in mammals.

PubMed

Zhuo, Jian-Shu; Jing, Xiao-Yan; Du, Xin; Yang, Xiu-Qin

2018-02-20

Chimeric RNA molecules, which possess exons from two or more independent genes, were traditionally believed to be produced by chromosome rearrangement. However, recent studies revealed that cis-splicing of adjacent genes (cis-SAGe) is one of the major mechanisms underlying the formation of chimeric RNAs. cis-SAGe refers to intergenic splicing of directly adjacent genes with the same transcriptional orientation, resulting in read-through transcripts, termed chimeric RNAs, which contain sequences from two or more parental genes. cis-SAGe was first identified in tumor cells, and its potential role in carcinogenesis has since attracted extensive attention. As research has progressed, cis-SAGe has been found to be ubiquitous in various normal tissues, and it might make a crucial contribution to the formation of novel genes during genome evolution. In this review, we summarize the splicing pattern, expression characteristics, possible mechanisms, and significance of cis-SAGe in mammals. This review should aid general understanding of the current status and development trends of cis-SAGe research.
Cloning and Characterization of 5′ Flanking Regulatory Sequences of AhLEC1B Gene from Arachis Hypogaea L.

PubMed Central

Tang, Guiying; Xu, Pingli; Liu, Wei; Liu, Zhanji; Shan, Lei

2015-01-01

LEAFY COTYLEDON1 (LEC1) is a B subunit of Nuclear Factor Y (NF-YB) transcription factor that mainly accumulates during embryo development. We cloned the 5′ flanking regulatory sequence of AhLEC1B gene, a homolog of Arabidopsis LEC1, and analyzed its regulatory elements using online software. To identify the crucial regulatory region, we generated a series of GUS expression frameworks driven by different length promoters with 5′ terminal and/or 3′ terminal deletion. We further characterized the GUS expression patterns in the transgenic Arabidopsis lines. Our results show that both the 65bp proximal promoter region and the 52bp 5′ UTR of AhLEC1B contain the key motifs required for the essential promoting activity. Moreover, AhLEC1B is preferentially expressed in the embryo and is co-regulated by binding of its upstream genes with both positive and negative corresponding cis-regulatory elements. PMID:26426444
Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space

PubMed Central

Karnik, Rahul; Beer, Michael A.

2015-01-01

The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs. PMID:26465884
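
To make the idea of classifying sequences with a full PWM and a score threshold concrete, here is a minimal Python sketch. It is not the MotifSpec algorithm: the aligned sites, the pseudocount, the uniform background, and the fixed threshold of 4.0 are all illustrative assumptions, whereas the abstract describes a threshold that is learned and a search space that changes dynamically.

```python
import math

def pwm_from_sites(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites."""
    width = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def best_score(pwm, seq):
    """Return the highest log-odds score over all windows of the sequence."""
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )

def classify(pwm, seq, threshold):
    """Call a sequence 'bound' if its best window reaches the threshold."""
    return best_score(pwm, seq) >= threshold

# Hypothetical aligned binding sites, used only to have something to score.
toy_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
toy_pwm = pwm_from_sites(toy_sites)
```

With these toy inputs, `classify(toy_pwm, "CCGGTGACTCACCGG", 4.0)` accepts the sequence carrying the consensus site, while a poly-A sequence falls below the threshold. In a discriminative setting the threshold would be chosen to separate bound from unbound sequences rather than fixed by hand.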
Discovering Sequence Motifs with Arbitrary Insertions and Deletions

PubMed Central

Frith, Martin C.; Saunders, Neil F. W.; Kobe, Bostjan; Bailey, Timothy L.

2008-01-01

Biology is encoded in molecular sequences: deciphering this encoding remains a grand scientific challenge. Functional regions of DNA, RNA, and protein sequences often exhibit characteristic but subtle motifs; thus, computational discovery of motifs in sequences is a fundamental and much-studied problem. However, most current algorithms do not allow for insertions or deletions (indels) within motifs, and the few that do have other limitations. We present a method, GLAM2 (Gapped Local Alignment of Motifs), for discovering motifs allowing indels in a fully general manner, and a companion method GLAM2SCAN for searching sequence databases using such motifs. GLAM2 is a generalization of the gapless Gibbs sampling algorithm. It re-discovers variable-width protein motifs from the PROSITE database significantly more accurately than the alternative methods PRATT and SAM-T2K. Furthermore, it usefully refines protein motifs from the ELM database: in some cases, the refined motifs make orders of magnitude fewer overpredictions than the original ELM regular expressions. GLAM2 performs respectably on the BAliBASE multiple alignment benchmark, and may be superior to leading multiple alignment methods for “motif-like” alignments with N- and C-terminal extensions. Finally, we demonstrate the use of GLAM2 to discover protein kinase substrate motifs and a gapped DNA motif for the LIM-only transcriptional regulatory complex: using GLAM2SCAN, we identify promising targets for the latter. GLAM2 is especially promising for short protein motifs, and it should improve our ability to identify the protein cleavage sites, interaction sites, post-translational modification attachment sites, etc., that underlie much of biology. It may be equally useful for arbitrarily gapped motifs in DNA and RNA, although fewer examples of such motifs are known at present. GLAM2 is public domain software, available for download at http://bioinformatics.org.au/glam2. PMID:18437229

Disease-Causing 7.4 kb Cis-Regulatory Deletion Disrupting Conserved Non-Coding Sequences and Their Interaction with the FOXL2 Promotor: Implications for Mutation Screening

PubMed Central

Dostie, Josée; Lemire, Edmond; Bouchard, Philippe; Field, Michael; Jones, Kristie; Lorenz, Birgit; Menten, Björn; Buysse, Karen; Pattyn, Filip; Friedli, Marc; Ucla, Catherine; Rossier, Colette; Wyss, Carine; Speleman, Frank; De Paepe, Anne; Dekker, Job; Antonarakis, Stylianos E.; De Baere, Elfride

2009-01-01

To date, the contribution of disrupted potentially cis-regulatory conserved non-coding sequences (CNCs) to human disease is most likely underestimated, as no systematic screens for putative deleterious variations in CNCs have been conducted. As a model for monogenic disease we studied the involvement of genetic changes of CNCs in the cis-regulatory domain of FOXL2 in blepharophimosis syndrome (BPES). Fifty-seven molecularly unsolved BPES patients underwent high-resolution copy number screening and targeted sequencing of CNCs. Apart from three larger distant deletions, a de novo deletion as small as 7.4 kb was found at 283 kb 5′ to FOXL2. The deletion appeared to be triggered by an H-DNA-induced double-stranded break (DSB). In addition, it disrupts a novel long non-coding RNA (ncRNA) PISRT1 and 8 CNCs. The regulatory potential of the deleted CNCs was substantiated by in vitro luciferase assays. Interestingly, Chromosome Conformation Capture (3C) of a 625 kb region surrounding FOXL2 in expressing cellular systems revealed physical interactions of three upstream fragments and the FOXL2 core promoter. Importantly, one of these contains the 7.4 kb deleted fragment. Overall, this study revealed the smallest distant deletion causing monogenic disease and impacts upon the concept of mutation screening in human disease and developmental disorders in particular. PMID:19543368
ModuleMiner - improved computational detection of cis-regulatory modules: are there different modes of gene regulation in embryonic development and adult tissues?

PubMed Central

Van Loo, Peter; Aerts, Stein; Thienpont, Bernard; De Moor, Bart; Moreau, Yves; Marynen, Peter

2008-01-01

We present ModuleMiner, a novel algorithm for computationally detecting cis-regulatory modules (CRMs) in a set of co-expressed genes. ModuleMiner outperforms other methods for CRM detection on benchmark data, and successfully detects CRMs in tissue-specific microarray clusters and in embryonic development gene sets. Interestingly, CRM predictions for differentiated tissues exhibit strong enrichment close to the transcription start site, whereas CRM predictions for embryonic development gene sets are depleted in this region. PMID:18394174
Core regulatory network motif underlies the ocellar complex patterning in Drosophila melanogaster

NASA Astrophysics Data System (ADS)

Aguilar-Hidalgo, D.; Lemos, M. C.; Córdoba, A.

2015-03-01

During organogenesis, developmental programs governed by Gene Regulatory Networks (GRNs) define the functionality, size and shape of the different constituents of living organisms. Robustness is thus an essential characteristic that GRNs must fulfill in order to maintain viability and reproducibility in a species. In the present work we analyze the robustness of the patterning of ocellar complex formation in the fly Drosophila melanogaster. We have systematically pruned the GRN that drives the development of this visual system to obtain the minimum pathway able to satisfy this pattern. We found that the mechanism underlying the patterning obeys the dynamics of a 3-node network motif with a double negative feedback loop fed by a morphogenetic gradient that triggers the inhibition in the manner of the French flag problem. A Boolean model of the GRN confirms the robustness of the patterning mechanism, showing the same result for different levels of network complexity. Interestingly, the network provides a steady-state solution in the interocellar part of the pattern and an oscillatory regime in the ocelli. This theoretical result predicts that the ocellar pattern may underlie oscillatory dynamics in its genetic regulation.
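
For readers unfamiliar with Boolean modeling of a double negative feedback loop, the following Python sketch shows how attractors (steady states and cycles) can be enumerated for a generic three-node toy network. The node names, the update rules, and the synchronous update scheme are assumptions made for illustration; they are not the ocellar GRN or the authors' model, and the two-state cycle this toy produces is a property of synchronous updating rather than a reproduction of the reported oscillatory regime.

```python
def step(state, morphogen):
    """One synchronous update of a toy network (signal, x, y).

    signal follows the external morphogen input (hypothetical rule);
    x is activated by signal and repressed by y;
    y is repressed by x, so x and y form a double negative feedback loop.
    """
    signal, x, y = state
    return (morphogen, signal and not y, not x)

def attractor(state, morphogen):
    """Iterate until a state repeats and return the states in the attractor."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, morphogen)
    return seen[seen.index(state):]
```

An attractor of length one is a steady state; a longer list is a cycle. For example, `attractor((True, True, False), True)` returns a single fixed state, whereas `attractor((True, False, False), True)` returns two alternating states.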
Cis-acting elements in the promoter region of the human aldolase C gene.

PubMed

Buono, P; de Conciliis, L; Olivetta, E; Izzo, P; Salvatore, F

1993-08-16

We investigated the cis-acting sequences involved in the expression of the human aldolase C gene by transient transfections into human neuroblastoma cells (SKNBE). We demonstrate that 420 bp of the 5'-flanking DNA direct at high efficiency the transcription of the CAT reporter gene. A deletion between -420 bp and -164 bp causes a 60% decrease of CAT activity. Gel shift and DNase I footprinting analyses revealed four protected elements: A, B, C and D. Competition analyses indicate that Sp1 or factors sharing a similar sequence specificity bind to elements A and B, but not to elements C and D. Sequence analysis shows a half palindromic ERE motif (GGTCA), in elements B and D. Region D binds a transactivating factor which appears also essential to stabilize the initiation complex.
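
As a small illustration of how a short motif such as the GGTCA half-site can be located in a sequence, here is a Python sketch that scans both strands. The example sequence is made up and the function names are hypothetical; the study itself mapped protein-binding elements experimentally by gel shift and DNase I footprinting, not by string search.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(seq, motif):
    """Return sorted (position, strand) pairs for exact motif matches.

    Positions are 0-based and refer to the left end of the match on the
    given sequence; '-' means the reverse complement of the motif matched.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", revcomp(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)
```

For the invented sequence `"AAGGTCATTTGACCAA"`, `find_sites(seq, "GGTCA")` reports one forward match and one reverse-strand match. A palindromic motif would be reported once per strand at the same position, so callers may want to deduplicate in that case.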
Layer-specific chromatin accessibility landscapes reveal regulatory networks in adult mouse visual cortex

PubMed Central

Gray, Lucas T; Yao, Zizhen; Nguyen, Thuc Nghi; Kim, Tae Kyung; Zeng, Hongkui; Tasic, Bosiljka

2017-01-01

Mammalian cortex is a laminar structure, with each layer composed of a characteristic set of cell types with different morphological, electrophysiological, and connectional properties. Here, we define chromatin accessibility landscapes of major, layer-specific excitatory classes of neurons, and compare them to each other and to inhibitory cortical neurons using the Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq). We identify a large number of layer-specific accessible sites, and significant association with genes that are expressed in specific cortical layers. Integration of these data with layer-specific transcriptomic profiles and transcription factor binding motifs enabled us to construct a regulatory network revealing potential key layer-specific regulators, including Cux1/2, Foxp2, Nfia, Pou3f2, and Rorb. This dataset is a valuable resource for identifying candidate layer-specific cis-regulatory elements in adult mouse cortex. DOI: http://dx.doi.org/10.7554/eLife.21883.001 PMID:28112643
Allelic expression mapping across cellular lineages to establish impact of non-coding SNPs

PubMed Central

Adoue, Veronique; Schiavi, Alicia; Light, Nicholas; Almlöf, Jonas Carlsson; Lundmark, Per; Ge, Bing; Kwan, Tony; Caron, Maxime; Rönnblom, Lars; Wang, Chuan; Chen, Shu-Huang; Goodall, Alison H; Cambien, Francois; Deloukas, Panos; Ouwehand, Willem H; Syvänen, Ann-Christine; Pastinen, Tomi

2014-01-01

Most complex disease-associated genetic variants are located in non-coding regions and are therefore thought to be regulatory in nature. Association mapping of differential allelic expression (AE) is a powerful method to identify SNPs with direct cis-regulatory impact (cis-rSNPs). We used AE mapping to identify cis-rSNPs regulating gene expression in 55 and 63 HapMap lymphoblastoid cell lines from a Caucasian and an African population, respectively, 70 fibroblast cell lines, and 188 purified monocyte samples and found 40–60% of these cis-rSNPs to be shared across cell types. We uncover a new class of cis-rSNPs, which disrupt footprint-derived de novo motifs that are predominantly bound by repressive factors and are implicated in disease susceptibility through overlaps with GWAS SNPs. Finally, we provide the proof-of-principle for a new approach for genome-wide functional validation of transcription factor–SNP interactions. By perturbing NFκB action in lymphoblasts, we identified 489 cis-regulated transcripts with altered AE after NFκB perturbation. Altogether, we perform a comprehensive analysis of cis-variation in four cell populations and provide new tools for the identification of functional variants associated to complex diseases. PMID:25326100
Reconstructing directed gene regulatory network by only gene expression data.

PubMed

Zhang, Lu; Feng, Xi Kang; Ng, Yen Kaow; Li, Shuai Cheng

2016-08-18

Accurately identifying gene regulatory networks is an important task in understanding in vivo biological activities. The inference of such networks is often accomplished through the use of gene expression data. Many methods have been developed to evaluate gene expression dependencies between a transcription factor and its target genes, and some methods also eliminate transitive interactions. The regulatory (or edge) direction is undetermined if the target gene is also a transcription factor. Some methods predict the regulatory directions in the gene regulatory networks by locating the eQTL single nucleotide polymorphism, or by observing the gene expression changes when knocking out/down the candidate transcription factors; regrettably, these additional data are usually unavailable, especially for samples derived from human tissues. In this study, we propose the Context Based Dependency Network (CBDN), a method that is able to infer gene regulatory networks with the regulatory directions from gene expression data only. To determine the regulatory direction, CBDN computes the influence of source to target by evaluating the magnitude changes of expression dependencies between the target gene and the others with conditioning on the source gene. CBDN extends the data processing inequality by involving the dependency direction to distinguish between direct and transitive relationships between genes. We also define two types of important regulators which can influence a majority of the genes in the network directly or indirectly. CBDN can detect both of these two types of important regulators by averaging the influence functions of candidate regulator to the other genes. In our experiments with simulated and real data, even with the regulatory direction taken into account, CBDN outperforms the state-of-the-art approaches for inferring gene regulatory network. CBDN identifies the important regulators in the predicted network: 1. TYROBP influences a batch of genes that are
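
The abstract says CBDN extends the data processing inequality (DPI). As background only, the sketch below shows the classic undirected DPI pruning step, in which the weakest edge of every fully connected gene triple is treated as a transitive (indirect) interaction and removed. This is not CBDN, which additionally uses dependency direction; the gene names, the dependency scores, and the `tolerance` parameter are hypothetical.

```python
from itertools import combinations

def dpi_prune(mi, tolerance=0.0):
    """Remove likely transitive edges using the data processing inequality.

    mi maps frozenset({gene_a, gene_b}) to a dependency score such as an
    estimate of mutual information. For every triple whose three edges are
    all present, the edge with the strictly lowest score is dropped.
    """
    genes = sorted({gene for pair in mi for gene in pair})
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if not all(edge in mi for edge in edges):
            continue
        weakest = min(edges, key=lambda edge: mi[edge])
        others = [edge for edge in edges if edge != weakest]
        if mi[weakest] < min(mi[edge] for edge in others) - tolerance:
            to_remove.add(weakest)
    return {edge: score for edge, score in mi.items() if edge not in to_remove}

# Hypothetical scores: A-C is weaker than both A-B and B-C, so it is pruned.
toy_mi = {
    frozenset(("A", "B")): 0.9,
    frozenset(("B", "C")): 0.8,
    frozenset(("A", "C")): 0.3,
}
pruned = dpi_prune(toy_mi)
```

Edges are collected first and removed afterwards so that the result does not depend on the order in which triples are visited.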
Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium.

PubMed

Catania, Francesco; Lynch, Michael

2010-05-04

In protozoa, the identification of preserved motifs by comparative genomics is often impeded by the difficulty of generating reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remain virtually unexplored. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal gene expression in Paramecium; and 3) reveal a largely unexplored (and presumably not restricted to Paramecium) association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.
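
The screening step described above, checking 3' UTRs for a predefined list of 8-mers, can be sketched in a few lines of Python. The gene identifiers, the sequences, and the candidate 8-mer below are invented placeholders, and the function names are hypothetical; the sketch only counts how many sequences contain each candidate at least once and does not attempt the conservation analysis reported in the study.

```python
from collections import Counter

def kmers_in(seq, k=8):
    """Set of all distinct k-mers occurring in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def genes_per_kmer(utrs, candidates, k=8):
    """Count, for each candidate k-mer, the sequences that contain it.

    utrs maps a gene identifier to its 3' UTR sequence; candidates is a set
    of uppercase k-mers. Each gene contributes at most one count per k-mer.
    """
    counts = Counter()
    for seq in utrs.values():
        counts.update(kmers_in(seq.upper(), k) & candidates)
    return counts

# Placeholder data, not real Paramecium sequences.
toy_utrs = {
    "gene1": "AAATTGTAAATAGG",
    "gene2": "ccTGTAAATAcc",
    "gene3": "GGGGGGGGGGGG",
}
toy_counts = genes_per_kmer(toy_utrs, {"TGTAAATA"})
```

Comparing such counts between a gene set of interest (for example ribosomal genes) and the remaining genes is one simple way to ask whether a candidate 8-mer is enriched in that set.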
Thermal cis-trans isomerization of cis,cis-3,7-decadiene - A model for cis-1,4-polybutadiene

NASA Technical Reports Server (NTRS)

Golub, M. A.; Lee, W. M.

1983-01-01

The thermal cis-trans isomerization of cis,cis-3,7-decadiene (DD), a model compound for cis-PBD, is reported. It is demonstrated that the rather low E for the polyalkenamer isomerizations compared with that for the 2-olefins is not an artifact of the solid polymer structures, but rather is characteristic of both small and large molecules possessing pairs of nonconjugated vinylene double bonds in a suitable arrangement.
Regulatory motifs for CREB-binding protein and Nfe2l2 transcription factors in the upstream enhancer of the mitochondrial uncoupling protein 1 gene.

PubMed

Rim, Jong S; Kozak, Leslie P

2002-09-13

Thermogenesis against cold exposure in mammals occurs in brown adipose tissue (BAT) through mitochondrial uncoupling protein (UCP1). Expression of the Ucp1 gene is unique in brown adipocytes and is regulated tightly. The 5'-flanking region of the mouse Ucp1 gene contains cis-acting elements including PPRE, TRE, and four half-site cAMP-responsive elements (CRE) with BAT-specific enhancer elements. In the course of analyzing how these half-site CREs are involved in Ucp1 expression, we found that a DNA regulatory element for NF-E2 overlaps CRE2. Electrophoretic mobility shift assay and competition assays with the CRE2 element indicates that nuclear proteins from BAT, inguinal fat, and retroperitoneal fat tissue interact with the CRE2 motif (CGTCA) in a specific manner. A supershift assay using an antibody against the CRE-binding protein (CREB) shows specific affinity to the complex from CRE2 and nuclear extract of BAT. Additionally, Western blot analysis for phospho-CREB/ATF1 shows an increase in phosphorylation of CREB/ATF1 in HIB-1B cells after norepinephrine treatment. Transient transfection assay using luciferase reporter constructs also indicates that the two half-site CREs are involved in transcriptional regulation of Ucp1 in response to norepinephrine and cAMP. We also show that a second DNA regulatory element for NF-E2 is located upstream of the CRE2 region. This element, which is found in a similar location in the 5'-flanking region of the human and rodent Ucp1 genes, shows specific binding to rat and human NF-E2 by electrophoretic mobility shift assay with nuclear extracts from brown fat. Co-transfections with an Nfe2l2 expression vector and a luciferase reporter construct of the Ucp1 enhancer region provide additional evidence that Nfe2l2 is involved in the regulation of Ucp1 by cAMP-mediated signaling.
SCOPE: a web server for practical de novo motif discovery.

PubMed

Carlson, Jonathan M; Chakravarty, Arijit; DeZiel, Charles E; Gross, Robert H

2007-07-01

SCOPE is a novel parameter-free method for the de novo identification of potential regulatory motifs in sets of coordinately regulated genes. The SCOPE algorithm combines the output of three component algorithms, each designed to identify a particular class of motifs. Using an ensemble learning approach, SCOPE identifies the best candidate motifs from its component algorithms. In tests on experimentally determined datasets, SCOPE identified motifs with a significantly higher level of accuracy than a number of other web-based motif finders run with their default parameters. Because SCOPE has no adjustable parameters, the web server has an intuitive interface, requiring only a set of gene names or FASTA sequences and a choice of species. The most significant motifs found by SCOPE are displayed graphically on the main results page with a table containing summary statistics for each motif. Detailed motif information, including the sequence logo, PWM, consensus sequence and specific matching sites can be viewed through a single click on a motif. SCOPE's efficient, parameter-free search strategy has enabled the development of a web server that is readily accessible to the practising biologist while providing results that compare favorably with those of other motif finders. The SCOPE web server is at .
Comprehensive Human Transcription Factor Binding Site Map for Combinatory Binding Motifs Discovery

PubMed Central

Müller-Molina, Arnoldo J.; Schöler, Hans R.; Araúzo-Bravo, Marcos J.

2012-01-01

To know the map between transcription factors (TFs) and their binding sites is essential to reverse engineer the regulation process. Only about 10%–20% of the transcription factor binding motifs (TFBMs) have been reported. This lack of data hinders understanding gene regulation. To address this drawback, we propose a computational method that exploits never used TF properties to discover the missing TFBMs and their sites in all human gene promoters. The method starts by predicting a dictionary of regulatory “DNA words.” From this dictionary, it distills 4098 novel predictions. To disclose the crosstalk between motifs, an additional algorithm extracts TF combinatorial binding patterns creating a collection of TF regulatory syntactic rules. Using these rules, we narrowed down a list of 504 novel motifs that appear frequently in syntax patterns. We tested the predictions against 509 known motifs confirming that our system can reliably predict ab initio motifs with an accuracy of 81%—far higher than previous approaches. We found that on average, 90% of the discovered combinatorial binding patterns target at least 10 genes, suggesting that to control in an independent manner smaller gene sets, supplementary regulatory mechanisms are required. Additionally, we discovered that the new TFBMs and their combinatorial patterns convey biological meaning, targeting TFs and genes related to developmental functions. Thus, among all the possible available targets in the genome, the TFs tend to regulate other TFs and genes involved in developmental functions. We provide a comprehensive resource for regulation analysis that includes a dictionary of “DNA words,” newly predicted motifs and their corresponding combinatorial patterns. Combinatorial patterns are a useful filter to discover TFBMs that play a major role in orchestrating other factors and thus, are likely to lock/unlock cellular functional clusters. PMID:23209563
Regulatory Evolution and Theoretical Arguments in Evolutionary Biology

ERIC Educational Resources Information Center

Ioannidis, Stavros

2013-01-01

The "cis"-regulatory hypothesis is one of the most important claims of evolutionary developmental biology. In this paper I examine the theoretical argument for "cis"-regulatory evolution and its role within evolutionary theorizing. I show that, although the argument has some weaknesses, it acts as a useful example for the importance of current…
Convergent evolution and mimicry of protein linear motifs in host-pathogen interactions.

PubMed

Chemes, Lucía Beatriz; de Prat-Gay, Gonzalo; Sánchez, Ignacio Enrique

2015-06-01

Pathogen linear motif mimics are highly evolvable elements that facilitate rewiring of host protein interaction networks. Host linear motifs and pathogen mimics differ in sequence, leading to thermodynamic and structural differences in the resulting protein-protein interactions. Moreover, the functional output of a mimic depends on the motif and domain repertoire of the pathogen protein. Regulatory evolution mediated by linear motifs can be understood by measuring evolutionary rates, quantifying positive and negative selection and performing phylogenetic reconstructions of linear motif natural history. Convergent evolution of linear motif mimics is widespread among unrelated proteins from viral, prokaryotic and eukaryotic pathogens and can also take place within individual protein phylogenies. Statistics, biochemistry and laboratory models of infection link pathogen linear motifs to phenotypic traits such as tropism, virulence and oncogenicity. In vitro evolution experiments and analysis of natural sequences suggest that changes in linear motif composition underlie pathogen adaptation to a changing environment.
A 20 bp cis-acting element is both necessary and sufficient to mediate elicitor response of a maize PRms gene.

PubMed

Raventós, D; Jensen, A B; Rask, M B; Casacuberta, J M; Mundy, J; San Segundo, B

1995-01-01

Transient gene expression assays in barley aleurone protoplasts were used to identify a cis-regulatory element involved in the elicitor-responsive expression of the maize PRms gene. Analysis of transcriptional fusions between PRms 5' upstream sequences and a chloramphenicol acetyltransferase reporter gene, as well as chimeric promoters containing PRms promoter fragments or repeated oligonucleotides fused to a minimal promoter, delineated a 20 bp sequence which functioned as an elicitor-response element (ERE). This sequence contains a motif (-246 AATTGACC) similar to sequences found in promoters of other pathogen-responsive genes. The analysis also indicated that an enhancing sequence(s) between -397 and -296 is required for full PRms activation by elicitors. The protein kinase inhibitor staurosporine was found to completely block the transcriptional activation induced by elicitors. These data indicate that protein phosphorylation is involved in the signal transduction pathway leading to PRms expression.
Loss of intramolecular electrostatic interactions and limited conformational ensemble may promote self-association of cis-tau peptide.

PubMed

Barman, Arghya; Hamelberg, Donald

2015-03-01

Self-association of proteins can be triggered by a change in the distribution of the conformational ensemble. Posttranslational modification, such as phosphorylation, can induce a shift in the ensemble of conformations. In the brain of Alzheimer's disease patients, the deposition of intracellular neurofibrillary tangles results from self-aggregation of hyper-phosphorylated tau protein. Biochemical and NMR studies suggest that the cis peptidyl prolyl conformation of a phosphorylated threonine-proline motif in the tau protein renders tau more prone to aggregation than the trans isomer. However, little is known about the role of peptidyl prolyl cis/trans isomerization in tau aggregation. Here, we show that intra-molecular electrostatic interactions are better formed in the trans isomer. We explore the conformational landscape of the tau segment containing the phosphorylated-Thr(231)-Pro(232) motif using accelerated molecular dynamics and show that intra-molecular electrostatic interactions are coupled to the isomeric state of the peptidyl prolyl bond. Our results suggest that the loss of intra-molecular interactions and the more restricted conformational ensemble of the cis isomer could favor self-aggregation. The results are consistent with experiments, providing valuable complementary atomistic insights and a hypothetical model for isomer-specific aggregation of the tau protein.
Identification of Regulatory DNA Elements Using Genome-wide Mapping of DNase I Hypersensitive Sites during Tomato Fruit Development.

PubMed

Qiu, Zhengkun; Li, Ren; Zhang, Shuaibin; Wang, Ketao; Xu, Meng; Li, Jiayang; Du, Yongchen; Yu, Hong; Cui, Xia

2016-08-01

Development and ripening of tomato fruit are precisely controlled by transcriptional regulation, which depends on the orchestrated accessibility of regulatory proteins to promoters and other cis-regulatory DNA elements. This accessibility and its effect on gene expression play a major role in defining the developmental process. To understand the regulatory mechanism and functional elements modulating morphological and anatomical changes during fruit development, we generated genome-wide high-resolution maps of DNase I hypersensitive sites (DHSs) from the fruit tissues of the tomato cultivar "Moneymaker" at 20 days post anthesis as well as break stage. By exploring variation of DHSs across fruit development stages, we pinpointed the most likely hypersensitive sites related to development-specific genes. By detecting binding motifs on DHSs of these development-specific genes or genes in the ascorbic acid biosynthetic pathway, we revealed the common regulatory elements contributing to coordinating gene transcription of plant ripening and specialized metabolic pathways. Our results contribute to a better understanding of the regulatory dynamics of genes involved in tomato fruit development and ripening. Copyright © 2016 The Author. Published by Elsevier Inc. All rights reserved.
Signatures of positive selection in the cis-regulatory sequences of the human oxytocin receptor (OXTR) and arginine vasopressin receptor 1a (AVPR1A) genes.

PubMed

Schaschl, Helmut; Huber, Susanne; Schaefer, Katrin; Windhager, Sonja; Wallner, Bernard; Fieder, Martin

2015-05-13

The evolutionarily highly conserved neurohypophyseal hormones oxytocin and arginine vasopressin play key roles in regulating social cognition and behaviours. The effects of these two peptides are mediated by their specific receptors, which are encoded by the oxytocin receptor (OXTR) and arginine vasopressin receptor 1a (AVPR1A) genes, respectively. In several species, polymorphisms in these genes have been linked to various behavioural traits. Little, however, is known about whether positive selection acts on sequence variants in genes influencing variation in human behaviours. We identified, in both neuroreceptor genes, signatures of balancing selection in the cis-regulative acting sequences such as transcription factor binding and enhancer sequences, as well as in a transcriptional repressor sequence motif. Additionally, in intron 3 of the OXTR gene, the SNP rs59190448 appears to be under positive directional selection. For rs59190448, only one phenotypical association is known so far, but it is in high LD' (>0.8) with loci of known association; i.e., variants associated with key pro-social behaviours and mental disorders in humans. Only for one SNP on the OXTR gene (rs59190448) was a sign of positive directional selection detected with all three methods of selection detection. For rs59190448, however, only one phenotypical association is known, but rs59190448 is in high LD' (>0.8), with variants associated with important pro-social behaviours and mental disorders in humans. We also detected various signatures of balancing selection on both neuroreceptor genes.
Cellular automata simulation of topological effects on the dynamics of feed-forward motifs

PubMed Central

Apte, Advait A; Cain, John W; Bonchev, Danail G; Fong, Stephen S

2008-01-01

Background: Feed-forward motifs are important functional modules in biological and other complex networks. The functionality of feed-forward motifs and other network motifs is largely dictated by the connectivity of the individual network components. While studies on the dynamics of motifs and networks are usually devoted to the temporal or spatial description of processes, this study focuses on the relationship between the specific architecture and the overall rate of the processes of the feed-forward family of motifs, including double and triple feed-forward loops. The search for the most efficient network architecture could be of particular interest for regulatory or signaling pathways in biology, as well as in computational and communication systems. Results: Feed-forward motif dynamics were studied using cellular automata and compared with differential equation modeling. The number of cellular automata iterations needed for a 100% conversion of a substrate into a target product was used as an inverse measure of the transformation rate. Several basic topological patterns were identified that order the specific feed-forward constructions according to the rate of dynamics they enable. At the same number of network nodes and constant other parameters, the bi-parallel and tri-parallel motifs provide higher network efficacy than single feed-forward motifs. Additionally, a topological property of isodynamicity was identified for feed-forward motifs where different network architectures resulted in the same overall rate of the target production. Conclusion: It was shown for classes of structural motifs with feed-forward architecture that network topology affects the overall rate of a process in a quantitatively predictable manner. These fundamental results can be used as a basis for simulating larger networks as combinations of smaller network modules with implications on studying synthetic gene circuits, small regulatory systems, and eventually dynamic whole-cell models.
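The rate measure described in this abstract (iterations needed for 100% conversion of a substrate into a target product) can be illustrated with a much simpler model than the one the authors used. The sketch below is an assumption-laden toy: tokens hop along the edges of a motif with a fixed probability per iteration, which is not the lattice cellular automaton or the rule set of the study. The function name, the `p_move` parameter, and the two example motifs are hypothetical.

```python
import random

def iterations_to_convert(edges, source, target, n_tokens=100, p_move=0.5, seed=0):
    """Count iterations until every token starting at `source` reaches `target`.

    edges: dict mapping each non-target node to a list of successor nodes.
    In each iteration, every token not yet at the target moves along one
    randomly chosen outgoing edge with probability p_move (toy rule, assumed).
    """
    rng = random.Random(seed)
    positions = [source] * n_tokens
    iterations = 0
    while any(pos != target for pos in positions):
        iterations += 1
        for i, pos in enumerate(positions):
            if pos != target and rng.random() < p_move:
                positions[i] = rng.choice(edges[pos])
    return iterations

# Two hypothetical three-node architectures converting A into C.
chain = {"A": ["B"], "B": ["C"]}
feed_forward = {"A": ["B", "C"], "B": ["C"]}

for name, motif in [("chain", chain), ("feed-forward", feed_forward)]:
    print(name, iterations_to_convert(motif, "A", "C"))
```

A lower iteration count corresponds to a higher overall rate, which is how the abstract uses the measure. Any ordering of motifs produced by this toy should not be read as reproducing the published results.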

Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium

PubMed Central

2010-01-01

Background: In protozoa, the identification of preserved motifs by comparative genomics is often impeded by the difficulty of generating reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. Results: By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions: Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal gene expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes. PMID:20441586
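Screening a set of 3' UTRs for a list of candidate 8-mers, as this abstract describes, can be sketched as a simple presence count. This is a minimal illustration, not the pipeline used in the study; the function names, the toy sequences, and the candidate 8-mer `TGTAAATA` are placeholders chosen for the example and are not taken from the paper.

```python
from collections import Counter

def kmer_presence(sequences, k=8):
    """Count, for each k-mer, how many sequences contain it at least once."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A set ensures each k-mer is counted once per sequence.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(seen)
    return counts

def screen_candidates(sequences, candidates, k=8):
    """Return the fraction of sequences that contain each candidate k-mer."""
    counts = kmer_presence(sequences, k)
    n = len(sequences)
    return {kmer: counts[kmer] / n for kmer in candidates}

# Toy 3' UTR sequences, invented for illustration only.
utrs = ["AAATGTAAATAGGC", "CCTGTAAATACC", "GGGGCCCCAAAA"]
result = screen_candidates(utrs, ["TGTAAATA"])
print(result)
```

Comparing such fractions between gene groups (for example ribosomal versus all other genes) is one way to ask whether a motif is enriched in a group. A real analysis would also need a background model and a significance test.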
Scan for Motifs: a webserver for the analysis of post-transcriptional regulatory elements in the 3' untranslated regions (3' UTRs) of mRNAs.

PubMed

Biswas, Ambarish; Brown, Chris M

2014-06-08

Gene expression in vertebrate cells may be controlled post-transcriptionally through regulatory elements in mRNAs. These are usually located in the untranslated regions (UTRs) of mRNA sequences, particularly the 3'UTRs. Scan for Motifs (SFM) simplifies the process of identifying a wide range of regulatory elements on alignments of vertebrate 3'UTRs. SFM includes identification of both RNA Binding Protein (RBP) sites and targets of miRNAs. In addition to searching pre-computed alignments, the tool provides users the flexibility to search their own sequences or alignments. The regulatory elements may be filtered by expected value cutoffs and are cross-referenced back to their respective sources and literature. The output is an interactive graphical representation, highlighting potential regulatory elements and overlaps between them. The output also provides simple statistics and links to related resources for complementary analyses. The overall process is intuitive and fast. As SFM is a free web-application, the user does not need to install any software or databases. Visualisation of the binding sites of different classes of effectors that bind to 3'UTRs will facilitate the study of regulatory elements in 3' UTRs.
Arabidopsis ensemble reverse-engineered gene regulatory network discloses interconnected transcription factors in oxidative stress.

PubMed

Vermeirssen, Vanessa; De Clercq, Inge; Van Parys, Thomas; Van Breusegem, Frank; Van de Peer, Yves

2014-12-01

The abiotic stress response in plants is complex and tightly controlled by gene regulation. We present an abiotic stress gene regulatory network of 200,014 interactions for 11,938 target genes by integrating four complementary reverse-engineering solutions through average rank aggregation on an Arabidopsis thaliana microarray expression compendium. This ensemble performed the most robustly in benchmarking and greatly expands upon the availability of interactions currently reported. Besides recovering 1182 known regulatory interactions, cis-regulatory motifs and coherent functionalities of target genes corresponded with the predicted transcription factors. We provide a valuable resource of 572 abiotic stress modules of coregulated genes with functional and regulatory information, from which we deduced functional relationships for 1966 uncharacterized genes and many regulators. Using gain- and loss-of-function mutants of seven transcription factors grown under control and salt stress conditions, we experimentally validated 141 out of 271 predictions (52% precision) for 102 selected genes and mapped 148 additional transcription factor-gene regulatory interactions (49% recall). We identified an intricate core oxidative stress regulatory network where NAC13, NAC053, ERF6, WRKY6, and NAC032 transcription factors interconnect and function in detoxification. Our work shows that ensemble reverse-engineering can generate robust biological hypotheses of gene regulation in a multicellular eukaryote that can be tested by medium-throughput experimental validation. © 2014 American Society of Plant Biologists. All rights reserved.
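Average rank aggregation, the integration step named in this abstract, can be sketched as follows. The handling of edges that a method does not score (assigning the worst rank plus one) is an assumption of this example, as are the function name and the toy scores; the abstract does not describe these details. The second part recomputes the reported precision and recall under one reading of the numbers, namely that the 148 additionally mapped interactions count as missed predictions.

```python
def average_rank_aggregation(score_tables):
    """Combine several {edge: score} tables by averaging per-method ranks.

    Rank 1 is the highest score within a method. An edge missing from a
    method gets that method's worst rank plus one (an assumed convention).
    """
    edges = set().union(*score_tables)
    rank_sums = dict.fromkeys(edges, 0.0)
    for table in score_tables:
        ordered = sorted(table, key=table.get, reverse=True)
        ranks = {edge: r for r, edge in enumerate(ordered, start=1)}
        worst = len(ordered) + 1
        for edge in edges:
            rank_sums[edge] += ranks.get(edge, worst)
    n = len(score_tables)
    # Sort by mean rank; ties are broken by the edge itself for determinism.
    return sorted(edges, key=lambda e: (rank_sums[e] / n, e))

# Toy scores from two hypothetical inference methods.
methods = [
    {("TF1", "g1"): 0.9, ("TF1", "g2"): 0.2, ("TF2", "g1"): 0.5},
    {("TF1", "g1"): 0.7, ("TF2", "g1"): 0.3},
]
ranking = average_rank_aggregation(methods)
print(ranking)

# Validation figures from the abstract.
validated, predicted, additional = 141, 271, 148
precision = validated / predicted
recall = validated / (validated + additional)  # assumed interpretation
print(round(precision * 100), round(recall * 100))
```

Under that reading, 141/271 gives 52% precision and 141/289 gives 49% recall, matching the figures quoted in the abstract.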
CompariMotif: quick and easy comparisons of sequence motifs.

PubMed

Edwards, Richard J; Davey, Norman E; Shields, Denis C

2008-05-15

CompariMotif is a novel tool for making motif-motif comparisons, identifying and describing similarities between regular expression motifs. CompariMotif can identify a number of different relationships between motifs, including exact matches, variants of degenerate motifs and complex overlapping motifs. Motif relationships are scored using shared information content, allowing the best matches to be easily identified in large comparisons. Many input and search options are available, enabling a list of motifs to be compared to itself (to identify recurring motifs) or to datasets of known motifs. CompariMotif can be run online at http://bioware.ucd.ie/ and is freely available for academic use as a set of open source Python modules under a GNU General Public License from http://bioinformatics.ucd.ie/shields/software/comparimotif/
cDNA cloning, genomic organization and expression analysis during somatic embryogenesis of the translationally controlled tumor protein (TCTP) gene from Japanese larch (Larix leptolepis).

PubMed

Zhang, Li-Feng; Li, Wan-Feng; Han, Su-Ying; Yang, Wen-Hua; Qi, Li-Wang

2013-10-15

A full-length cDNA and genomic sequences of a translationally controlled tumor protein (TCTP) gene were isolated from Japanese larch (Larix leptolepis) and designated LaTCTP. The length of the cDNA was 1,043 bp and contained a 504 bp open reading frame that encodes a predicted protein of 167 amino acids, characterized by two signature sequences of the TCTP protein family. Analysis of the LaTCTP gene structure indicated four introns and five exons, and it is the largest of all currently known TCTP genes in plants. The 5'-flanking promoter region of LaTCTP was cloned using an improved TAIL-PCR technique. In this region we identified many important potential cis-acting elements, such as a Box-W1 (fungal elicitor responsive element), a CAT-box (cis-acting regulatory element related to meristem expression), a CGTCA-motif (cis-acting regulatory element involved in MeJA-responsiveness), a GT1-motif (light responsive element), a Skn-1-motif (cis-acting regulatory element required for endosperm expression) and a TGA-element (auxin-responsive element), suggesting that expression of LaTCTP is highly regulated. Expression analysis demonstrated ubiquitous localization of LaTCTP mRNA in the roots, stems and needles, high mRNA levels in the embryonal-suspensor mass (ESM), browning embryogenic cultures and mature somatic embryos, and low levels of mRNA at day five during somatic embryogenesis. We suggest that LaTCTP might participate in the regulation of somatic embryo development. These results provide a theoretical basis for understanding the molecular regulatory mechanism of LaTCTP and lay the foundation for artificial regulation of somatic embryogenesis. © 2013.
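The relation between the 504 bp open reading frame and the 167-residue protein can be checked with a short calculation. The sketch assumes that the reported ORF length includes the stop codon, which the abstract does not state explicitly but which is consistent with the two numbers; the function name is hypothetical.

```python
def protein_length_from_orf(orf_bp):
    """Number of amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # the stop codon encodes no residue

print(protein_length_from_orf(504))  # 167
```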
ELM: the status of the 2010 eukaryotic linear motif resource

PubMed Central

Gould, Cathryn M.; Diella, Francesca; Via, Allegra; Puntervoll, Pål; Gemünd, Christine; Chabanis-Davidson, Sophie; Michael, Sushama; Sayadi, Ahmed; Bryne, Jan Christian; Chica, Claudia; Seiler, Markus; Davey, Norman E.; Haslam, Niall; Weatheritt, Robert J.; Budd, Aidan; Hughes, Tim; Paś, Jakub; Rychlewski, Leszek; Travé, Gilles; Aasland, Rein; Helmer-Citterich, Manuela; Linding, Rune; Gibson, Toby J.

2010-01-01

Linear motifs are short segments of multidomain proteins that provide regulatory functions independently of protein tertiary structure. Much of intracellular signalling passes through protein modifications at linear motifs. Many thousands of linear motif instances, most notably phosphorylation sites, have now been reported. Although clearly very abundant, linear motifs are difficult to predict de novo in protein sequences due to the difficulty of obtaining robust statistical assessments. The ELM resource at http://elm.eu.org/ provides an expanding knowledge base, currently covering 146 known motifs, with annotation that includes >1300 experimentally reported instances. ELM is also an exploratory tool for suggesting new candidates of known linear motifs in proteins of interest. Information about protein domains, protein structure and native disorder, cellular and taxonomic contexts is used to reduce or deprecate false positive matches. Results are graphically displayed in a ‘Bar Code’ format, which also displays known instances from homologous proteins through a novel ‘Instance Mapper’ protocol based on PHI-BLAST. ELM server output provides links to the ELM annotation as well as to a number of remote resources. Using the links, researchers can explore the motifs, proteins, complex structures and associated literature to evaluate whether candidate motifs might be worth experimental investigation. PMID:19920119
Diverse activities of viral cis-acting RNA regulatory elements revealed using multicolor, long-term, single-cell imaging

PubMed Central

Pocock, Ginger M.; Zimdars, Laraine L.; Yuan, Ming; Eliceiri, Kevin W.; Ahlquist, Paul; Sherer, Nathan M.

2017-01-01

Cis-acting RNA structural elements govern crucial aspects of viral gene expression. How these structures and other posttranscriptional signals affect RNA trafficking and translation in the context of single cells is poorly understood. Herein we describe a multicolor, long-term (>24 h) imaging strategy for measuring integrated aspects of viral RNA regulatory control in individual cells. We apply this strategy to demonstrate differential mRNA trafficking behaviors governed by RNA elements derived from three retroviruses (HIV-1, murine leukemia virus, and Mason-Pfizer monkey virus), two hepadnaviruses (hepatitis B virus and woodchuck hepatitis virus), and an intron-retaining transcript encoded by the cellular NXF1 gene. Striking behaviors include “burst” RNA nuclear export dynamics regulated by HIV-1’s Rev response element and the viral Rev protein; transient aggregations of RNAs into discrete foci at or near the nuclear membrane triggered by multiple elements; and a novel, pulsiform RNA export activity regulated by the hepadnaviral posttranscriptional regulatory element. We incorporate single-cell tracking and a data-mining algorithm into our approach to obtain RNA element–specific, high-resolution gene expression signatures. Together these imaging assays constitute a tractable, systems-based platform for studying otherwise difficult to access spatiotemporal features of viral and cellular gene regulation. PMID:27903772
Comparison of Insect Kinin Analogs With cis-Peptide Bond Motif 4-Aminopyroglutamate Identifies Optimal Stereochemistry for Diuretic Activity

DTIC Science & Technology

2006-01-01

Amino acid side-chain-protecting groups were Pbf for Arg and Boc for Trp. The coupling of Fmoc-4-aminopyroglutamic acids (Fmoc-aPy-OH, Fmoc-apy-OH...Inc. Biopolymers (Pept Sci) 88:1–7, 2007. Keywords: 4-aminopyroglutamic acid; cis-peptide bond; β-turn mimetic; constrained insect kinin analog...analogs containing three stereochemical variants of the (2S, 4S)-4-aminopyroglutamic acid (APy) component (see Figure 1), a mimic of the cis-peptide
Identification of 15 candidate structured noncoding RNA motifs in fungi by comparative genomics.

PubMed

Li, Sanshu; Breaker, Ronald R

2017-10-13

With the development of rapid and inexpensive DNA sequencing, the genome sequences of more than 100 fungal species have been made available. This dataset provides an excellent resource for comparative genomics analyses, which can be used to discover genetic elements, including noncoding RNAs (ncRNAs). Bioinformatics tools similar to those used to uncover novel ncRNAs in bacteria, likewise, should be useful for searching fungal genomic sequences, and the relative ease of genetic experiments with some model fungal species could facilitate experimental validation studies. We have adapted a bioinformatics pipeline for discovering bacterial ncRNAs to systematically analyze many fungal genomes. This comparative genomics pipeline integrates information on conserved RNA sequence and structural features with alternative splicing information to reveal fungal RNA motifs that are candidate regulatory domains, or that might have other possible functions. A total of 15 prominent classes of structured ncRNA candidates were identified, including variant HDV self-cleaving ribozyme representatives, atypical snoRNA candidates, and possible structured antisense RNA motifs. Candidate regulatory motifs were also found associated with genes for ribosomal proteins, S-adenosylmethionine decarboxylase (SDC), amidase, and HexA protein involved in Woronin body formation. We experimentally confirm that the variant HDV ribozymes undergo rapid self-cleavage, and we demonstrate that the SDC RNA motif reduces the expression of SAM decarboxylase by translational repression. Furthermore, we provide evidence that several other motifs discovered in this study are likely to be functional ncRNA elements. Systematic screening of fungal genomes using a computational discovery pipeline has revealed the existence of a variety of novel structured ncRNAs. Genome contexts and similarities to known ncRNA motifs provide strong evidence for the biological and biochemical functions of some newly found ncRNA motifs.
Identifying direct miRNA-mRNA causal regulatory relationships in heterogeneous data.

PubMed

Zhang, Junpeng; Le, Thuc Duy; Liu, Lin; Liu, Bing; He, Jianfeng; Goodall, Gregory J; Li, Jiuyong

2014-12-01

Discovering the regulatory relationships between microRNAs (miRNAs) and mRNAs is an important problem that interests many biologists and medical researchers. A number of computational methods have been proposed to infer miRNA-mRNA regulatory relationships, and are mostly based on the statistical associations between miRNAs and mRNAs discovered in observational data. The miRNA-mRNA regulatory relationships identified by these methods can be both direct and indirect regulations. However, differentiating direct regulatory relationships from indirect ones is important for biologists in experimental designs. In this paper, we present a causal discovery based framework (called DirectTarget) to infer direct miRNA-mRNA causal regulatory relationships in heterogeneous data, including expression profiles of miRNAs and mRNAs, and miRNA target information. DirectTarget is applied to the Epithelial to Mesenchymal Transition (EMT) datasets. The validation by experimentally confirmed target databases suggests that the proposed method can effectively identify direct miRNA-mRNA regulatory relationships. To explore the upstream regulators of miRNA regulation, we further identify the causal feedforward patterns (CFFPs) of TF-miRNA-mRNA to provide insights into the miRNA regulation in EMT. DirectTarget has the potential to be applied to other datasets to elucidate the direct miRNA-mRNA causal regulatory relationships and to explore the regulatory patterns. Copyright © 2014 Elsevier Inc. All rights reserved.
Arabidopsis Ensemble Reverse-Engineered Gene Regulatory Network Discloses Interconnected Transcription Factors in Oxidative Stress

PubMed Central

Vermeirssen, Vanessa; De Clercq, Inge; Van Parys, Thomas; Van Breusegem, Frank; Van de Peer, Yves

2014-01-01

The abiotic stress response in plants is complex and tightly controlled by gene regulation. We present an abiotic stress gene regulatory network of 200,014 interactions for 11,938 target genes by integrating four complementary reverse-engineering solutions through average rank aggregation on an Arabidopsis thaliana microarray expression compendium. This ensemble performed the most robustly in benchmarking and greatly expands upon the availability of interactions currently reported. Besides recovering 1182 known regulatory interactions, cis-regulatory motifs and coherent functionalities of target genes corresponded with the predicted transcription factors. We provide a valuable resource of 572 abiotic stress modules of coregulated genes with functional and regulatory information, from which we deduced functional relationships for 1966 uncharacterized genes and many regulators. Using gain- and loss-of-function mutants of seven transcription factors grown under control and salt stress conditions, we experimentally validated 141 out of 271 predictions (52% precision) for 102 selected genes and mapped 148 additional transcription factor-gene regulatory interactions (49% recall). We identified an intricate core oxidative stress regulatory network where NAC13, NAC053, ERF6, WRKY6, and NAC032 transcription factors interconnect and function in detoxification. Our work shows that ensemble reverse-engineering can generate robust biological hypotheses of gene regulation in a multicellular eukaryote that can be tested by medium-throughput experimental validation. PMID:25549671
Effects of Four Different Regulatory Mechanisms on the Dynamics of Gene Regulatory Cascades

NASA Astrophysics Data System (ADS)

Hansen, Sabine; Krishna, Sandeep; Semsey, Szabolcs; Lo Svenningsen, Sine

2015-07-01

Gene regulatory cascades (GRCs) are common motifs in cellular molecular networks. A given logical function in these cascades, such as the repression of the activity of a transcription factor, can be implemented by a number of different regulatory mechanisms. The potential consequences for the dynamic performance of the GRC of choosing one mechanism over another have not been analysed systematically. Here, we report the construction of a synthetic GRC in Escherichia coli, which allows us for the first time to directly compare and contrast the dynamics of four different regulatory mechanisms, affecting the transcription, translation, stability, or activity of a transcriptional repressor. We developed a biologically motivated mathematical model which is sufficient to reproduce the response dynamics determined by experimental measurements. Using the model, we explored the potential response dynamics that the constructed GRC can perform. We conclude that dynamic differences between regulatory mechanisms at an individual step in a GRC are often concealed in the overall performance of the GRC, and suggest that the presence of a given regulatory mechanism in a certain network environment does not necessarily mean that it represents a single optimal evolutionary solution.
Extraction of consensus protein patterns in regions containing non-proline cis peptide bonds and their functional assessment.

PubMed

Exarchos, Konstantinos P; Exarchos, Themis P; Rigas, Georgios; Papaloukas, Costas; Fotiadis, Dimitrios I

2011-05-10

In peptides and proteins, only a small percentage of peptide bonds adopts the cis configuration. Especially in the case of amide peptide bonds, the number of cis conformations is quite limited, which until recently hampered systematic studies. However, the growing number of databases with 3D structures of proteins has produced a considerable number of sequences containing non-proline cis formations (cis-nonPro). In our work, we extract regular expression-type patterns that are descriptive of regions surrounding the cis-nonPro formations. For this purpose, three types of pattern discovery are performed: i) exact pattern discovery, ii) pattern discovery using a chemical equivalency set, and iii) pattern discovery using a structural equivalency set. Afterwards, using each pattern as predicate, we search the Eukaryotic Linear Motif (ELM) resource to identify potential functional implications of regions with cis-nonPro peptide bonds. The patterns extracted from each type of pattern discovery are further employed, in order to formulate a pattern-based classifier, which is used to discriminate between cis-nonPro and trans-nonPro formations. In terms of functional implications, we observe a significant association of cis-nonPro peptide bonds with ligand/binding functionalities. As for the pattern-based classification scheme, the best results were obtained using the structural equivalency set, which yielded 70% accuracy, 77% sensitivity and 63% specificity.
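The three performance figures at the end of this abstract are linked by a standard identity: accuracy is the class-weighted mean of sensitivity and specificity. The sketch below assumes equal numbers of cis-nonPro and trans-nonPro examples in the evaluation set, which the abstract does not state; under that assumption the reported values are mutually consistent. The function and parameter names are hypothetical.

```python
def accuracy(sensitivity, specificity, positive_fraction):
    """Overall accuracy as the class-weighted mean of sensitivity and specificity."""
    return sensitivity * positive_fraction + specificity * (1 - positive_fraction)

# Assumed balanced classes (positive_fraction = 0.5).
acc = accuracy(0.77, 0.63, 0.5)
print(f"{acc:.2f}")
```

With a different class balance the same sensitivity and specificity would give a different accuracy, so the 70% figure alone does not fix the composition of the test set.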
Regulatory elements involved in tax-mediated transactivation of the HTLV-I LTR.

PubMed

Seeler, J S; Muchardt, C; Podar, M; Gaynor, R B

1993-10-01

HTLV-I is the etiologic agent of adult T-cell leukemia. In this study, we investigated the regulatory elements and cellular transcription factors which function in modulating HTLV-I gene expression in response to the viral transactivator protein, tax. Transfection of a variety of site-directed mutants of the HTLV-I LTR into Jurkat cells indicated that each of the three motifs A, B, and C within the 21-bp repeats, the binding sites for the Ets family of proteins, and the TATA box all influenced the degree of tax-mediated activation. Tax is also able to activate gene expression of other viral and cellular promoters. Tax activation of the IL-2 receptor and the HIV-1 LTR is mediated through NF-kappa B motifs. Interestingly, sequences in the 21-bp repeat B and C motifs contain significant homology with NF-kappa B regulatory elements. We demonstrated that an NF-kappa B binding protein, PRDII-BF1, but not the rel protein, bound to the B and C motifs in the 21-bp repeat. PRDII-BF1 was also able to stimulate activation of HTLV-I gene expression by tax. The role of the Ets proteins in modulating tax activation was also studied. Ets 1 but not Ets 2 was capable of increasing the degree of tax activation of the HTLV-I LTR. These results suggest that tax activates gene expression by either direct or indirect interaction with several cellular transcription factors that bind to the HTLV-I LTR.
Bayesian Inference of Allele-Specific Gene Expression Indicates Abundant Cis-Regulatory Variation in Natural Flycatcher Populations

PubMed Central

Wang, Mi

2017-01-01

Polymorphism in cis-regulatory sequences can lead to different levels of expression for the two alleles of a gene, providing a starting point for the evolution of gene expression. Little is known about the genome-wide abundance of genetic variation in gene regulation in natural populations but analysis of allele-specific expression (ASE) provides a means for investigating such variation. We performed RNA-seq of multiple tissues from population samples of two closely related flycatcher species and developed a Bayesian algorithm that maximizes data usage by borrowing information from the whole data set and combines several SNPs per transcript to detect ASE. Of 2,576 transcripts analyzed in collared flycatcher, ASE was detected in 185 (7.2%) and a similar frequency was seen in the pied flycatcher. Transcripts with statistically significant ASE commonly showed the major allele in >90% of the reads, reflecting that power was highest when expression was heavily biased toward one of the alleles. This would suggest that the observed frequencies of ASE likely are underestimates. The proportion of ASE transcripts varied among tissues, being lowest in testis and highest in muscle. Individuals often showed ASE of particular transcripts in more than one tissue (73.4%), consistent with a genetic basis for regulation of gene expression. The results suggest that genetic variation in regulatory sequences commonly affects gene expression in natural populations and that it provides a seedbed for phenotypic evolution via divergence in gene expression. PMID:28453623
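The idea of combining several SNPs per transcript in a Bayesian test for allele-specific expression can be illustrated with a textbook Beta-binomial model. This is not the algorithm developed in the study, which also borrows information across the whole data set. The sketch assumes phased read counts (the same allele is listed first at every SNP), a uniform prior, and an arbitrary imbalance threshold `delta`; all names are hypothetical.

```python
import math

def log_beta_pdf(p, a, b):
    """Log density of a Beta(a, b) distribution at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p)

def prob_allelic_imbalance(snp_counts, delta=0.1, prior=(1.0, 1.0), steps=2000):
    """Posterior probability that the allelic ratio differs from 0.5 by more than delta.

    snp_counts: list of (allele1_reads, allele2_reads) pairs, one per SNP,
    pooled under the assumption that allele labels are consistent (phased).
    Requires 0 < delta < 0.5.
    """
    a = prior[0] + sum(c[0] for c in snp_counts)
    b = prior[1] + sum(c[1] for c in snp_counts)
    lo, hi = 0.5 - delta, 0.5 + delta
    width = (hi - lo) / steps
    # Midpoint rule over the "balanced" interval [lo, hi].
    inside = sum(
        math.exp(log_beta_pdf(lo + (i + 0.5) * width, a, b)) for i in range(steps)
    ) * width
    return 1.0 - inside

biased = prob_allelic_imbalance([(48, 2), (47, 3)])      # about 95% major allele
balanced = prob_allelic_imbalance([(26, 24), (24, 26)])  # about 50/50
print(biased, balanced)
```

The strongly biased transcript receives a posterior probability close to 1, while the balanced one stays low, which mirrors the abstract's remark that power is highest when expression is heavily skewed toward one allele.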
Induction of lateral lumens through disruption of a monoleucine-based basolateral-sorting motif in betacellulin

PubMed Central

Singh, Bhuminder; Bogatcheva, Galina; Starchenko, Alina; Sinnaeve, Justine; Lapierre, Lynne A.; Williams, Janice A.; Goldenring, James R.; Coffey, Robert J.

2015-01-01

Directed delivery of EGF receptor (EGFR) ligands to the apical or basolateral surface is a crucial regulatory step in the initiation of EGFR signaling in polarized epithelial cells. Herein, we show that the EGFR ligand betacellulin (BTC) is preferentially sorted to the basolateral surface of polarized MDCK cells. By using sequential truncations and site-directed mutagenesis within the BTC cytoplasmic domain, combined with selective cell-surface biotinylation and immunofluorescence, we have uncovered a monoleucine-based basolateral-sorting motif (EExxxL, specifically 156EEMETL161). Disruption of this sorting motif led to equivalent apical and basolateral localization of BTC. Unlike other EGFR ligands, BTC mistrafficking induced formation of lateral lumens in polarized MDCK cells, and this process was significantly attenuated by inhibition of EGFR. Additionally, expression of a cancer-associated somatic BTC mutation (E156K) led to BTC mistrafficking and induced lateral lumens in MDCK cells. Overexpression of BTC, especially mistrafficking forms, increased the growth of MDCK cells. These results uncover a unique role for BTC mistrafficking in promoting epithelial reorganization. PMID:26272915
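The EExxxL consensus given in this abstract translates directly into a regular expression, which makes it easy to see why the E156K substitution abolishes the motif. The sketch below uses only the six-residue segment quoted in the abstract (156EEMETL161); the function name and the `offset` parameter for residue numbering are assumptions of the example.

```python
import re

# EExxxL: two glutamates, any three residues, then a leucine.
MONOLEUCINE_MOTIF = re.compile(r"EE...L")

def find_sorting_motifs(sequence, offset=1):
    """Return (start, end, match) tuples using 1-based residue numbering.

    offset is the residue number of the first character in `sequence`.
    """
    return [
        (m.start() + offset, m.end() + offset - 1, m.group())
        for m in MONOLEUCINE_MOTIF.finditer(sequence)
    ]

wild_type = "EEMETL"  # residues 156-161 as quoted in the abstract
mutant = "KEMETL"     # E156K substitution

print(find_sorting_motifs(wild_type, offset=156))  # [(156, 161, 'EEMETL')]
print(find_sorting_motifs(mutant, offset=156))     # []
```

A sequence match of this kind only flags candidate sites; whether a given EExxxL occurrence functions as a sorting signal is an experimental question.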
The SRE Motif in the Human PNPLA3 Promoter (-97 to -88 bp) Mediates Transactivational Effects of SREBP-1c.

PubMed

Liang, Hua; Xu, Jing; Xu, Fen; Liu, Hongxia; Yuan, Ding; Yuan, Shuhua; Cai, Mengyin; Yan, Jinhua; Weng, Jianping

2015-09-01

Patatin-like phospholipase domain containing 3 (PNPLA3) is a non-secreted protein primarily expressed in liver and adipose tissue. Recently, numerous genetic studies have shown that PNPLA3 is a major susceptibility gene for nonalcoholic fatty liver disease (NAFLD). However, the mechanism involved in transcriptional regulation of the PNPLA3 gene remains unknown. We performed a detailed analysis of the human PNPLA3 gene promoter and identified two novel cis-acting elements (SRE and NFY binding motifs) located at -97/-88 and -26/-22 bp, respectively. Overexpression of SREBP-1c in HepG2 cells significantly increased PNPLA3 promoter activity. Mutation of either of the putative SRE or NFY binding motifs blocked the transactivation effects of SREBP-1c on the promoter. Overexpression of SREBP-1c and NFY together increased PNPLA3 promoter activity twice as much as that of SREBP-1c or NFY expression alone. This result suggests that SREBP-1c and NFY synergistically transactivate the human PNPLA3 gene. The ability of SREBP-1c and NFY to bind these cis-elements was confirmed using gel shift analysis. Putative SRE and NFY motifs also mediated synergistic insulin-induced transactivation of the PNPLA3 promoter in HepG2 cells. Additionally, the ability of SREBP-1c to bind to the PNPLA3 promoter was increased by insulin in a dose-dependent manner. Moreover, the treatment of HepG2 cells with the PI3K inhibitor LY294002 led to reduced insulin promoter-activating ability accompanied by a decrease in PNPLA3 and SREBP-1c protein expression. These results demonstrate that SREBP-1c is a direct activator of the human PNPLA3 gene and insulin transactivates the PNPLA3 gene via the PI3K-SREBP-1c/NFY pathway in HepG2 cells. © 2015 Wiley Periodicals, Inc.
Cis-Regulatory Variants Affect CHRNA5 mRNA Expression in Populations of African and European Ancestry

PubMed Central

Wang, Jen-Chyong; Spiegel, Noah; Bertelsen, Sarah; Le, Nhung; McKenna, Nicholas; Budde, John P.; Harari, Oscar; Kapoor, Manav; Brooks, Andrew; Hancock, Dana; Tischfield, Jay; Foroud, Tatiana; Bierut, Laura J.; Steinbach, Joe Henry; Edenberg, Howard J.; Traynor, Bryan J.; Goate, Alison M.

2013-01-01

Variants within the gene cluster encoding α3, α5, and β4 nicotinic receptor subunits are major risk factors for substance dependence. The strongest impact on risk is associated with variation in the CHRNA5 gene, where at least two mechanisms are at work: amino acid variation and altered mRNA expression levels. The risk allele of the non-synonymous variant (rs16969968; D398N) primarily occurs on the haplotype containing the low mRNA expression allele. In populations of European ancestry, there are approximately 50 highly correlated variants in the CHRNA5-CHRNA3-CHRNB4 gene cluster and the adjacent PSMA4 gene region that are associated with CHRNA5 mRNA levels. It is not clear which of these variants contribute to the changes in CHRNA5 transcript level. Because populations of African ancestry have reduced linkage disequilibrium among variants spanning this gene cluster, eQTL mapping in subjects of African ancestry could potentially aid in defining the functional variants that affect CHRNA5 mRNA levels. We performed quantitative allele specific gene expression using frontal cortices derived from 49 subjects of African ancestry and 111 subjects of European ancestry. This method measures allele-specific transcript levels in the same individual, which eliminates other biological variation that occurs when comparing expression levels between different samples. This analysis confirmed that substance dependence associated variants have a direct cis-regulatory effect on CHRNA5 transcript levels in human frontal cortices of African and European ancestry and identified 10 highly correlated variants, located in a 9 kb region, that are potential functional variants modifying CHRNA5 mRNA expression levels. PMID:24303001
[Personal motif in art].

PubMed

Gerevich, József

2015-01-01

One of the basic questions of art psychology is whether a personal motif is to be found behind works of art and, if so, how openly or indirectly it appears in the work itself. Analysis of examples and documents from the fine arts and literature allows us to conclude that the personal motif, which the viewer can identify through symbols, at times easily and at others with more difficulty, gives an emotional plus to the artistic product. The personal motif may be found in traumatic experiences, in communication with the model or with other emotionally important persons (mourning, disappointment, revenge, hatred, rivalry, revolt etc.), in self-searching, or in self-analysis. The emotions are expressed in artistic activity either directly or indirectly. The intention nourished by the artist's identity (Kunstwollen) may stand in the way of spontaneous self-expression, channelling it into hidden paths. Under the influence of certain circumstances, the artist may arouse in the viewer, consciously or unconsciously, an illusory, misleading image of himself. An examination of the personal motif is one of the important research areas of art therapy.
Identification and characterization of a cis-regulatory element for zygotic gene expression in Chlamydomonas reinhardtii

DOE PAGES

Hamaji, Takashi; Lopez, David; Pellegrini, Matteo; ...

2016-03-26

Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. Furthermore, we predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes.

Directed evolution of toluene dioxygenase from Pseudomonas putida for improved selectivity toward cis-indandiol during indene bioconversion.

PubMed

Zhang, N; Stewart, B G; Moore, J C; Greasham, R L; Robinson, D K; Buckland, B C; Lee, C

2000-10-01

Toluene dioxygenase (TDO) from Pseudomonas putida F1 converts indene to a mixture of cis-indandiol (racemic), 1-indenol, and 1-indanone. The desired product, cis-(1S,2R)-indandiol, is a potential key intermediate in the chemical synthesis of indinavir sulfate (Crixivan), Merck's HIV-1 protease inhibitor for the treatment of AIDS. To reduce the undesirable byproducts 1-indenol and 1-indanone formed during indene bioconversion, the recombinant TDO expressed in Escherichia coli was evolved by directed evolution using the error-prone polymerase chain reaction (epPCR) method. High-throughput fluorometric and spectrophotometric assays were developed for rapid screening of the mutant libraries in a 96-well format. Mutants with reduced 1-indenol by-product formation were identified, and the individual indene bioconversion product profiles of the selected mutants were confirmed by HPLC. Changes in the amino acid sequence of the mutant enzymes were identified by analyzing the nucleotide sequence of the genes. A mutant with the most desirable product profile from each library, defined as the most reduced 1-indenol concentration and with the highest cis-(1S,2R)-indandiol enantiomeric excess, was used to perform each subsequent round of mutagenesis. After three rounds of mutagenesis and screening, mutant 1C4-3G was identified to have a threefold reduction in 1-indenol formation over the wild type (20% vs 60% of total products) and a 40% increase of product (cis-indandiol) yield.
Computational exploration of cis-regulatory modules in rhythmic expression data using the "Exploration of Distinctive CREs and CRMs" (EDCC) and "CRM Network Generator" (CNG) programs.

PubMed

Bekiaris, Pavlos Stephanos; Tekath, Tobias; Staiger, Dorothee; Danisman, Selahattin

2018-01-01

Understanding the effect of cis-regulatory elements (CRE) and clusters of CREs, which are called cis-regulatory modules (CRM), in eukaryotic gene expression is a challenge of computational biology. We developed two programs that allow simple, fast and reliable analysis of candidate CREs and CRMs that may affect specific gene expression and that determine positional features between individual CREs within a CRM. The first program, "Exploration of Distinctive CREs and CRMs" (EDCC), correlates candidate CREs and CRMs with specific gene expression patterns. For pairs of CREs, EDCC also determines positional preferences of the single CREs in relation to each other and to the transcriptional start site. The second program, "CRM Network Generator" (CNG), prioritizes these positional preferences using a neural network and thus allows unbiased rating of the positional preferences that were determined by EDCC. We tested these programs with data from a microarray study of circadian gene expression in Arabidopsis thaliana. Analyzing more than 1.5 million pairwise CRE combinations, we found 22 candidate combinations, of which several contained known clock promoter elements together with elements that had not been identified as relevant to circadian gene expression before. CNG analysis further identified positional preferences of these CRE pairs, hinting at positional information that may be relevant for circadian gene expression. Future wet lab experiments will have to determine which of these combinations confer daytime specific circadian gene expression.
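The idea of enumerating pairwise CRE combinations across promoters can be illustrated with a small sketch. This is not the EDCC program and does not reproduce its statistics or its positional analysis; it only counts, for every pair of candidate elements, how many promoters contain both. The function name `pair_cooccurrence`, the two example elements, and the toy promoter strings are all assumptions chosen for illustration.

```python
from itertools import combinations

def pair_cooccurrence(promoters: dict[str, str], motifs: list[str]) -> dict[tuple[str, str], int]:
    """Count, for each unordered motif pair, the promoters that contain both motifs."""
    counts = {}
    for m1, m2 in combinations(sorted(motifs), 2):
        counts[(m1, m2)] = sum(
            1 for seq in promoters.values()
            if m1 in seq.upper() and m2 in seq.upper()
        )
    return counts

# Hypothetical toy promoters and candidate elements (illustrative only).
toy_promoters = {
    "gene1": "AAAATATCTGGCACGTGTT",
    "gene2": "CACGTGAAAA",
    "gene3": "TTAAATATCTCC",
}
toy_motifs = ["AAAATATCT", "CACGTG"]
print(pair_cooccurrence(toy_promoters, toy_motifs))
```

A real analysis would also search the reverse strand and test each count against an expression-matched background, which this sketch deliberately leaves out.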
Ciliary dyslexia candidate genes DYX1C1 and DCDC2 are regulated by Regulatory Factor X (RFX) transcription factors through X-box promoter motifs

PubMed Central

Tammimies, Kristiina; Bieder, Andrea; Lauter, Gilbert; Sugiaman-Trapman, Debora; Torchet, Rachel; Hokkanen, Marie-Estelle; Burghoorn, Jan; Castrén, Eero; Kere, Juha; Tapia-Páez, Isabel; Swoboda, Peter

2016-01-01

DYX1C1, DCDC2, and KIAA0319 are three of the most replicated dyslexia candidate genes (DCGs). Recently, these DCGs were implicated in functions at the cilium. Here, we investigate the regulation of these DCGs by Regulatory Factor X transcription factors (RFX TFs), a gene family known for transcriptionally regulating ciliary genes. We identify conserved X-box motifs in the promoter regions of DYX1C1, DCDC2, and KIAA0319 and demonstrate their functionality, as well as the ability to recruit RFX TFs using reporter gene and electrophoretic mobility shift assays. Furthermore, we uncover a complex regulation pattern between RFX1, RFX2, and RFX3 and their significant effect on modifying the endogenous expression of DYX1C1 and DCDC2 in a human retinal pigmented epithelial cell line immortalized with hTERT (hTERT-RPE1). In addition, induction of ciliogenesis increases the expression of RFX TFs and DCGs. At the protein level, we show that endogenous DYX1C1 localizes to the base of the cilium, whereas DCDC2 localizes along the entire axoneme of the cilium, thereby validating earlier localization studies using overexpression models. Our results corroborate the emerging role of DCGs in ciliary function and characterize functional noncoding elements, X-box promoter motifs, in DCG promoter regions, which thus can be targeted for mutation screening in dyslexia and ciliopathies associated with these genes.—Tammimies, K., Bieder, A., Lauter, G., Sugiaman-Trapman, D., Torchet, R., Hokkanen, M.-E., Burghoorn, J., Castrén, E., Kere, J., Tapia-Páez, I., Swoboda, P. Ciliary dyslexia candidate genes DYX1C1 and DCDC2 are regulated by Regulatory Factor (RF) X transcription factors through X-box promoter motifs. PMID:27451412
Ciliary dyslexia candidate genes DYX1C1 and DCDC2 are regulated by Regulatory Factor X (RFX) transcription factors through X-box promoter motifs.

PubMed

Tammimies, Kristiina; Bieder, Andrea; Lauter, Gilbert; Sugiaman-Trapman, Debora; Torchet, Rachel; Hokkanen, Marie-Estelle; Burghoorn, Jan; Castrén, Eero; Kere, Juha; Tapia-Páez, Isabel; Swoboda, Peter

2016-10-01

DYX1C1, DCDC2, and KIAA0319 are three of the most replicated dyslexia candidate genes (DCGs). Recently, these DCGs were implicated in functions at the cilium. Here, we investigate the regulation of these DCGs by Regulatory Factor X transcription factors (RFX TFs), a gene family known for transcriptionally regulating ciliary genes. We identify conserved X-box motifs in the promoter regions of DYX1C1, DCDC2, and KIAA0319 and demonstrate their functionality, as well as the ability to recruit RFX TFs using reporter gene and electrophoretic mobility shift assays. Furthermore, we uncover a complex regulation pattern between RFX1, RFX2, and RFX3 and their significant effect on modifying the endogenous expression of DYX1C1 and DCDC2 in a human retinal pigmented epithelial cell line immortalized with hTERT (hTERT-RPE1). In addition, induction of ciliogenesis increases the expression of RFX TFs and DCGs. At the protein level, we show that endogenous DYX1C1 localizes to the base of the cilium, whereas DCDC2 localizes along the entire axoneme of the cilium, thereby validating earlier localization studies using overexpression models. Our results corroborate the emerging role of DCGs in ciliary function and characterize functional noncoding elements, X-box promoter motifs, in DCG promoter regions, which thus can be targeted for mutation screening in dyslexia and ciliopathies associated with these genes.-Tammimies, K., Bieder, A., Lauter, G., Sugiaman-Trapman, D., Torchet, R., Hokkanen, M.-E., Burghoorn, J., Castrén, E., Kere, J., Tapia-Páez, I., Swoboda, P. Ciliary dyslexia candidate genes DYX1C1 and DCDC2 are regulated by Regulatory Factor (RF) X transcription factors through X-box promoter motifs. © The Author(s).
Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

PubMed Central

Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

2005-01-01

We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085
Sequence and functional characterization of MIRNA164 promoters from Brassica shows copy number dependent regulatory diversification among homeologs.

PubMed

Jain, Aditi; Anand, Saurabh; Singh, Neer K; Das, Sandip

2018-03-12

The impact of polyploidy on functional diversification of cis-regulatory elements is poorly understood. This is primarily on account of lack of well-defined structure of cis-elements and a universal regulatory code. To the best of our knowledge, this is the first report on characterization of sequence and functional diversification of paralogous and homeologous promoter elements associated with MIR164 from Brassica. The availability of whole genome sequence allowed us to identify and isolate a total of 42 homologous copies of MIR164 from diploid species-Brassica rapa (A-genome), Brassica nigra (B-genome), Brassica oleracea (C-genome), and allopolyploids-Brassica juncea (AB-genome), Brassica carinata (BC-genome) and Brassica napus (AC-genome). Additionally, we retrieved homologous sequences based on comparative genomics from Arabidopsis lyrata, Capsella rubella, and Thellungiella halophila, spanning ca. 45 million years of evolutionary history of Brassicaceae. Sequence comparison across Brassicaceae revealed lineage-, karyotype, species-, and sub-genome specific changes providing a snapshot of evolutionary dynamics of miRNA promoters in polyploids. Tree topology of cis-elements associated with MIR164 was found to re-capitulate the species and family evolutionary history. Phylogenetic shadowing identified transcription factor binding sites (TFBS) conserved across Brassicaceae, of which, some are already known as regulators of MIR164 expression. Some of the TFBS were found to be distributed in a sub-genome specific (e.g., SOX specific to promoter of MIR164c from MF2 sub-genome), lineage-specific (YABBY binding motif, specific to C. rubella in MIR164b), or species-specific (e.g., VOZ in A. thaliana MIR164a) manner which might contribute towards genetic and adaptive variation. Reporter activity driven by promoters associated with MIR164 paralogs and homeologs was majorly in agreement with known role of miR164 in leaf shaping, regulation of lateral root development and
Transcriptome landscape of Lactococcus lactis reveals many novel RNAs including a small regulatory RNA involved in carbon uptake and metabolism.

PubMed

van der Meulen, Sjoerd B; de Jong, Anne; Kok, Jan

2016-01-01

RNA sequencing has revolutionized genome-wide transcriptome analyses, and the identification of non-coding regulatory RNAs in bacteria has thus increased concurrently. Here we reveal the transcriptome map of the lactic acid bacterial paradigm Lactococcus lactis MG1363 by employing differential RNA sequencing (dRNA-seq) and a combination of manual and automated transcriptome mining. This resulted in a high-resolution genome annotation of L. lactis and the identification of 60 cis-encoded antisense RNAs (asRNAs), 186 trans-encoded putative regulatory RNAs (sRNAs) and 134 novel small ORFs. Based on the putative targets of asRNAs, a novel classification is proposed. Several transcription factor DNA binding motifs were identified in the promoter sequences of (a)sRNAs, providing insight in the interplay between lactococcal regulatory RNAs and transcription factors. The presence and lengths of 14 putative sRNAs were experimentally confirmed by differential Northern hybridization, including the abundant RNA 6S that is differentially expressed depending on the available carbon source. For another sRNA, LLMGnc_147, functional analysis revealed that it is involved in carbon uptake and metabolism. L. lactis contains 13% leaderless mRNAs (lmRNAs) that, from an analysis of overrepresentation in GO classes, seem predominantly involved in nucleotide metabolism and DNA/RNA binding. Moreover, an A-rich sequence motif immediately following the start codon was uncovered, which could provide novel insight in the translation of lmRNAs. Altogether, this first experimental genome-wide assessment of the transcriptome landscape of L. lactis and subsequent sRNA studies provide an extensive basis for the investigation of regulatory RNAs in L. lactis and related lactococcal species.
NF-Y Binding Site Architecture Defines a C-Fos Targeted Promoter Class

PubMed Central

Haubrock, Martin; Hartmann, Fabian; Wingender, Edgar

2016-01-01

ChIP-seq experiments detect the chromatin occupancy of known transcription factors in a genome-wide fashion. The comparisons of several species-specific ChIP-seq libraries done for different transcription factors have revealed a complex combinatorial and context-specific co-localization behavior for the identified binding regions. In this study we have investigated human derived ChIP-seq data to identify common cis-regulatory principles for the human transcription factor c-Fos. We found that in four different cell lines, c-Fos targeted proximal and distal genomic intervals show prevalences for either AP-1 motifs or CCAAT boxes as known binding motifs for the transcription factor NF-Y, and thereby act in a mutually exclusive manner. For proximal regions of co-localized c-Fos and NF-YB binding, we gathered evidence that a characteristic configuration of repeating CCAAT motifs may be responsible for attracting c-Fos, probably provided by a nearby AP-1 bound enhancer. Our results suggest a novel regulatory function of NF-Y in gene-proximal regions. Specific CCAAT dimer repeats bound by the transcription factor NF-Y define this novel cis-regulatory module. Based on this behavior we propose a new enhancer promoter interaction model based on AP-1 motif defined enhancers which interact with CCAAT-box characterized promoter regions. PMID:27517874
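Locating repeated CCAAT boxes in a promoter interval, as discussed in this abstract, amounts to scanning both strands for the pentamer and looking at the spacing between hits. The sketch below is a minimal illustration under that assumption; the function names and the toy sequence are hypothetical, and exact string matching is a simplification of how NF-Y sites are defined in practice.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq.upper()))

def find_ccaat(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for CCAAT on either strand."""
    seq = seq.upper()
    plus = "CCAAT"
    minus = reverse_complement(plus)  # how a minus-strand CCAAT reads on the plus strand
    hits = []
    for i in range(len(seq) - len(plus) + 1):
        window = seq[i:i + len(plus)]
        if window == plus:
            hits.append((i, "+"))
        elif window == minus:
            hits.append((i, "-"))
    return hits

def spacings(hits: list[tuple[int, str]]) -> list[int]:
    """Start-to-start distances between consecutive hits."""
    starts = [pos for pos, _ in hits]
    return [b - a for a, b in zip(starts, starts[1:])]

# Hypothetical toy sequence, not taken from the ChIP-seq data.
toy = "GGCCAATCAGTTTACGATTGGCTCCAATGA"
hits = find_ccaat(toy)
print(hits, spacings(hits))
```

Start-to-start spacing is only one possible way to describe a repeat configuration; the abstract does not specify which measure the authors used.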
Discovery and validation of information theory-based transcription factor and cofactor binding site motifs.

PubMed

Lu, Ruipeng; Mucaki, Eliseos J; Rogan, Peter K

2017-03-17

Data from ChIP-seq experiments can derive the genome-wide binding specificities of transcription factors (TFs) and other regulatory proteins. We analyzed 765 ENCODE ChIP-seq peak datasets of 207 human TFs with a novel motif discovery pipeline based on recursive, thresholded entropy minimization. This approach, while obviating the need to compensate for skewed nucleotide composition, distinguishes true binding motifs from noise, quantifies the strengths of individual binding sites based on computed affinity and detects adjacent cofactor binding sites that coordinate with the targets of primary, immunoprecipitated TFs. We obtained contiguous and bipartite information theory-based position weight matrices (iPWMs) for 93 sequence-specific TFs, discovered 23 cofactor motifs for 127 TFs and revealed six high-confidence novel motifs. The reliability and accuracy of these iPWMs were determined via four independent validation methods, including the detection of experimentally proven binding sites, explanation of effects of characterized SNPs, comparison with previously published motifs and statistical analyses. We also predict previously unreported TF coregulatory interactions (e.g. TF complexes). These iPWMs constitute a powerful tool for predicting the effects of sequence variants in known binding sites, performing mutation analysis on regulatory SNPs and predicting previously unrecognized binding sites and target genes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
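For readers unfamiliar with information theory-based weight matrices, the per-position information content of a set of aligned binding sites can be computed as 2 bits minus the Shannon entropy of the observed base frequencies. The sketch below shows only that basic calculation, assuming a uniform background and no small-sample correction; it is not the recursive, thresholded entropy-minimization pipeline described in the abstract, and the function name and toy sites are hypothetical.

```python
import math
from collections import Counter

def information_content(sites: list[str]) -> list[float]:
    """Per-position information (bits) for equal-length aligned DNA sites.

    Uses R_i = 2 - H_i, where H_i is the Shannon entropy of base frequencies
    at position i. Assumes a uniform background (maximum of 2 bits) and
    applies no small-sample correction.
    """
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    bits = []
    for i in range(length):
        counts = Counter(s[i].upper() for s in sites)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        bits.append(2.0 - entropy)
    return bits

# Hypothetical aligned sites for illustration only.
toy_sites = ["CCAAT", "CCAAT", "CCAAT", "CCAAG"]
per_position = information_content(toy_sites)
print([round(b, 3) for b in per_position], round(sum(per_position), 3))
```

Fully conserved positions score 2 bits, and the sum over positions gives the total information of the motif under these simplifying assumptions.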
MotifNet: a web-server for network motif analysis.

PubMed

Smoly, Ilan Y; Lerman, Eugene; Ziv-Ukelson, Michal; Yeger-Lotem, Esti

2017-06-15

Network motifs are small topological patterns that recur in a network significantly more often than expected by chance. Their identification emerged as a powerful approach for uncovering the design principles underlying complex networks. However, available tools for network motif analysis typically require download and execution of computationally intensive software on a local computer. We present MotifNet, the first open-access web-server for network motif analysis. MotifNet allows researchers to analyze integrated networks, where nodes and edges may be labeled, and to search for motifs of up to eight nodes. The output motifs are presented graphically and the user can interactively filter them by their significance, number of instances, node and edge labels, and node identities, and view their instances. MotifNet also allows the user to distinguish between motifs that are centered on specific nodes and motifs that recur in distinct parts of the network. MotifNet is freely available at http://netbio.bgu.ac.il/motifnet. The website was implemented using ReactJs and supports all major browsers. The server interface was implemented in Python with data stored on a MySQL database. Contact: estiyl@bgu.ac.il or michaluz@cs.bgu.ac.il. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
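As a concrete instance of a small topological pattern, the feed-forward loop (A regulates B, B regulates C, and A also regulates C) can be enumerated in a directed edge list with a few lines of code. The sketch below is illustrative only and is unrelated to the MotifNet implementation; the function name and toy network are assumptions. It lists instances but does not assess significance, which would require comparing the count against randomized networks.

```python
def feed_forward_loops(edges: list[tuple[str, str]]) -> list[tuple[str, str, str]]:
    """List all (a, b, c) with edges a->b, b->c and a->c on three distinct nodes."""
    adjacency: dict[str, set[str]] = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set())
    loops = []
    for a, a_targets in adjacency.items():
        for b in a_targets:
            if b == a:
                continue  # ignore self-loops
            for c in adjacency[b]:
                if c != a and c != b and c in a_targets:
                    loops.append((a, b, c))
    return sorted(loops)

# Hypothetical toy regulatory network.
toy_edges = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
print(feed_forward_loops(toy_edges))
```

The nested loops are adequate for small networks; searching for larger motifs, such as the eight-node patterns mentioned in the abstract, needs dedicated subgraph-enumeration algorithms.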
Computation of direct and inverse mutations with the SEGM web server (Stochastic Evolution of Genetic Motifs): an application to splice sites of human genome introns.

PubMed

Benard, Emmanuel; Michel, Christian J

2009-08-01

We present here the SEGM web server (Stochastic Evolution of Genetic Motifs) in order to study the evolution of genetic motifs both in the direct evolutionary sense (past-present) and in the inverse evolutionary sense (present-past). The genetic motifs studied can be nucleotides, dinucleotides and trinucleotides. As an example of an application of SEGM and to understand its functionalities, we give an analysis of inverse mutations of splice sites of human genome introns. SEGM is freely accessible at http://lsiit-bioinfo.u-strasbg.fr:8080/webMathematica/SEGM/SEGM.html directly or by the web site http://dpt-info.u-strasbg.fr/~michel/. To our knowledge, this SEGM web server is to date the only computational biology software in this evolutionary approach.
The Membrane-Bound NAC Transcription Factor ANAC013 Functions in Mitochondrial Retrograde Regulation of the Oxidative Stress Response in Arabidopsis

PubMed Central

De Clercq, Inge; Vermeirssen, Vanessa; Van Aken, Olivier; Vandepoele, Klaas; Murcha, Monika W.; Law, Simon R.; Inzé, Annelies; Ng, Sophia; Ivanova, Aneta; Rombaut, Debbie; van de Cotte, Brigitte; Jaspers, Pinja; Van de Peer, Yves; Kangasjärvi, Jaakko; Whelan, James; Van Breusegem, Frank

2013-01-01

Upon disturbance of their function by stress, mitochondria can signal to the nucleus to steer the expression of responsive genes. This mitochondria-to-nucleus communication is often referred to as mitochondrial retrograde regulation (MRR). Although reactive oxygen species and calcium are likely candidate signaling molecules for MRR, the protein signaling components in plants remain largely unknown. Through meta-analysis of transcriptome data, we detected a set of genes that are common and robust targets of MRR and used them as a bait to identify its transcriptional regulators. In the upstream regions of these mitochondrial dysfunction stimulon (MDS) genes, we found a cis-regulatory element, the mitochondrial dysfunction motif (MDM), which is necessary and sufficient for gene expression under various mitochondrial perturbation conditions. Yeast one-hybrid analysis and electrophoretic mobility shift assays revealed that the transmembrane domain–containing NO APICAL MERISTEM/ARABIDOPSIS TRANSCRIPTION ACTIVATION FACTOR/CUP-SHAPED COTYLEDON transcription factors (ANAC013, ANAC016, ANAC017, ANAC053, and ANAC078) bound to the MDM cis-regulatory element. We demonstrate that ANAC013 mediates MRR-induced expression of the MDS genes by direct interaction with the MDM cis-regulatory element and triggers increased oxidative stress tolerance. In conclusion, we characterized ANAC013 as a regulator of MRR upon stress in Arabidopsis thaliana. PMID:24045019
Motifs, modules and games in bacteria.

PubMed

Wolf, Denise M; Arkin, Adam P

2003-04-01

Global explorations of regulatory network dynamics, organization and evolution have become tractable thanks to high-throughput sequencing and molecular measurement of bacterial physiology. From these, a nascent conceptual framework is developing that views the principles of regulation in terms of motifs, modules and games. Motifs are small, repeated, and conserved biological units ranging from molecular domains to small reaction networks. They are arranged into functional modules, genetically dissectible cellular functions such as the cell cycle, or different stress responses. The dynamical functioning of modules defines the organism's strategy to survive in a game, pitting cell against cell, and cell against environment. Placing pathway structure and dynamics into an evolutionary context begins to allow discrimination between those physical and molecular features that particularize a species to its surroundings, and those that provide core physiological function. This approach promises to generate a higher level understanding of cellular design, pathway evolution and cellular bioengineering.
Definition of Cis-Acting Elements Regulating Expression of the Drosophila Melanogaster Ninae Opsin Gene by Oligonucleotide-Directed Mutagenesis

PubMed Central

Mismer, D.; Rubin, G. M.

1989-01-01

We have analyzed the cis-acting regulatory sequences of the Rh1 (ninaE) gene in Drosophila melanogaster by P-element-mediated germline transformation of indicator genes transcribed from mutant ninaE promoter sequences. We have previously shown that a 200-bp region extending from -120 to +67 relative to the transcription start site is sufficient to obtain eye-specific expression from the ninaE promoter. In the present study, 22 different 4-13-bp sequences in the -120/+67 promoter region were altered by oligonucleotide-directed mutagenesis. Several of these sequences were found to be required for proper promoter function; two of these are conserved in the promoter of the homologous gene isolated from the related species Drosophila virilis. Alteration of a conserved 9-bp sequence results in aberrant, low level expression in the body. Alteration of a separate 11-bp sequence, found in the promoter regions of several photoreceptor-specific genes of Drosophila, results in an approximately 15-fold reduction in promoter efficiency but without apparent alteration of tissue-specificity. A protein factor capable of interacting with this 11-bp sequence has been detected by DNaseI footprinting in embryonic nuclear extracts. Finally, we have further characterized two separable enhancer sequences previously shown to be required for normal levels of expression from this promoter. PMID:2521839
BayesMotif: de novo protein sorting motif discovery from impure datasets.

PubMed

Hu, Jianjun; Zhang, Fan

2010-01-18

Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals are amino acid sub-sequences usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms. We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motif in which a highly conserved anchor is present along with a less conserved motif region. A false positive removal procedure was developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence datasets. They also show that the false positive removal procedure can help to identify true motifs even when only 20% of the input sequences contain true motif instances. We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors. Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs which may help to overcome the limitations of
Interconnected network motifs control podocyte morphology and kidney function.

PubMed

Azeloglu, Evren U; Hardy, Simon V; Eungdamrong, Narat John; Chen, Yibang; Jayaraman, Gomathi; Chuang, Peter Y; Fang, Wei; Xiong, Huabao; Neves, Susana R; Jain, Mohit R; Li, Hong; Ma'ayan, Avi; Gordon, Ronald E; He, John Cijiang; Iyengar, Ravi

2014-02-04

Podocytes are kidney cells with specialized morphology that is required for glomerular filtration. Disruption of podocyte foot process morphology by diseases such as diabetes or by drug exposure results in kidney pathophysiology. Proteomic analysis of glomeruli isolated from rats with puromycin-induced kidney disease and control rats indicated that protein kinase A (PKA), which is activated by adenosine 3',5'-monophosphate (cAMP), is a key regulator of podocyte morphology and function. In podocytes, cAMP signaling activates cAMP response element-binding protein (CREB) to enhance expression of the gene encoding a differentiation marker, synaptopodin, a protein that associates with actin and promotes its bundling. We constructed and experimentally verified a β-adrenergic receptor-driven network with multiple feedback and feedforward motifs that controls CREB activity. To determine how the motifs interacted to regulate gene expression, we mapped multicompartment dynamical models, including information about protein subcellular localization, onto the network topology using Petri net formalisms. These computational analyses indicated that the juxtaposition of multiple feedback and feedforward motifs enabled the prolonged CREB activation necessary for synaptopodin expression and actin bundling. Drug-induced modulation of these motifs in diseased rats led to recovery of normal morphology and physiological function in vivo. Thus, analysis of regulatory motifs using network dynamics can provide insights into pathophysiology that enable predictions for drug intervention strategies to treat kidney disease.
Interconnected Network Motifs Control Podocyte Morphology and Kidney Function

PubMed Central

Azeloglu, Evren U.; Hardy, Simon V.; Eungdamrong, Narat John; Chen, Yibang; Jayaraman, Gomathi; Chuang, Peter Y.; Fang, Wei; Xiong, Huabao; Neves, Susana R.; Jain, Mohit R.; Li, Hong; Ma’ayan, Avi; Gordon, Ronald E.; He, John Cijiang; Iyengar, Ravi

2014-01-01

Podocytes are kidney cells with specialized morphology that is required for glomerular filtration. Disruption of podocyte foot process morphology by diseases such as diabetes or by drug exposure results in kidney pathophysiology. Proteomic analysis of glomeruli isolated from rats with puromycin-induced kidney disease and control rats indicated that protein kinase A (PKA), which is activated by adenosine 3′,5′-monophosphate (cAMP), is a key regulator of podocyte morphology and function. In podocytes, cAMP signaling activates cAMP response element–binding protein (CREB) to enhance expression of the gene encoding a differentiation marker, synaptopodin, a protein that associates with actin and promotes its bundling. We constructed and experimentally verified a β-adrenergic receptor–driven network with multiple feedback and feedforward motifs that controls CREB activity. To determine how the motifs interacted to regulate gene expression, we mapped multicompartment dynamical models, including information about protein subcellular localization, onto the network topology using Petri net formalisms. These computational analyses indicated that the juxtaposition of multiple feedback and feedforward motifs enabled the prolonged CREB activation necessary for synaptopodin expression and actin bundling. Drug-induced modulation of these motifs in diseased rats led to recovery of normal morphology and physiological function in vivo. Thus, analysis of regulatory motifs using network dynamics can provide insights into pathophysiology that enable predictions for drug intervention strategies to treat kidney disease. PMID:24497609
Genomic Identification and Analysis of Shared Cis-regulator Elements in a Developmentally Critical homeobox Cluster

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chris Amemiya

2003-04-01

The goals of this project were to isolate, characterize, and sequence the Dlx3/Dlx7 bigene cluster from twelve different species of mammals. The Dlx3 and Dlx7 genes are known to encode homeobox transcription factors involved in patterning of structures in the vertebrate jaw as well as vertebrate limbs. Genomic sequences from the respective taxa will subsequently be compared in order to identify conserved non-coding sequences that are potential cis-regulatory elements. Based on the comparisons they will fashion transgenic mouse experiments to functionally test the strength of the potential cis-regulatory elements. A goal of the project is to attempt to identify those elements that may function in coordinately regulating both Dlx3 and Dlx7 functions.
Transcription Factor Binding Profiles Reveal Cyclic Expression of Human Protein-coding Genes and Non-coding RNAs

PubMed Central

Cheng, Chao; Ung, Matthew; Grant, Gavin D.; Whitfield, Michael L.

2013-01-01

The cell cycle is a complex and highly supervised process that must proceed with regulatory precision to achieve successful cellular division. Despite their wide application, microarray time course experiments have several limitations in identifying cell cycle genes. We thus propose a computational model to predict human cell cycle genes based on transcription factor (TF) binding and regulatory motif information in their promoters. We utilize ENCODE ChIP-seq data and motif information as predictors to discriminate cell cycle genes from non-cell cycle genes. Our results show that both the trans (TF) features and the cis (motif) features are predictive of cell cycle genes, and a combination of the two types of features can further improve prediction accuracy. We apply our model to a complete list of GENCODE promoters to predict novel cell cycle driving promoters for both protein-coding genes and non-coding RNAs such as lincRNAs. We find that a similar percentage of lincRNAs are cell cycle regulated as protein-coding genes, suggesting the importance of non-coding RNAs in the cell cycle. The model we propose here provides not only a practical tool for identifying novel cell cycle genes with high accuracy, but also new insights into cell cycle regulation by TFs and cis-regulatory elements. PMID:23874175
Motifs, modules and games in bacteria

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wolf, Denise M.; Arkin, Adam P.

2003-04-01

Global explorations of regulatory network dynamics, organization and evolution have become tractable thanks to high-throughput sequencing and molecular measurement of bacterial physiology. From these, a nascent conceptual framework is developing that views the principles of regulation in terms of motifs, modules and games. Motifs are small, repeated, and conserved biological units ranging from molecular domains to small reaction networks. They are arranged into functional modules, genetically dissectible cellular functions such as the cell cycle, or different stress responses. The dynamical functioning of modules defines the organism's strategy to survive in a game, pitting cell against cell, and cell against environment. Placing pathway structure and dynamics into an evolutionary context begins to allow discrimination between those physical and molecular features that particularize a species to its surroundings, and those that provide core physiological function. This approach promises to generate a higher level understanding of cellular design, pathway evolution and cellular bioengineering.

Mechanisms and Evolution of Control Logic in Prokaryotic Transcriptional Regulation

PubMed Central

van Hijum, Sacha A. F. T.; Medema, Marnix H.; Kuipers, Oscar P.

2009-01-01

Summary: A major part of organismal complexity and versatility of prokaryotes resides in their ability to fine-tune gene expression to adequately respond to internal and external stimuli. Evolution has been very innovative in creating intricate mechanisms by which different regulatory signals operate and interact at promoters to drive gene expression. The regulation of target gene expression by transcription factors (TFs) is governed by control logic brought about by the interaction of regulators with TF binding sites (TFBSs) in cis-regulatory regions. A factor that in large part determines the strength of the response of a target to a given TF is motif stringency, the extent to which the TFBS fits the optimal TFBS sequence for a given TF. Advances in high-throughput technologies and computational genomics allow reconstruction of transcriptional regulatory networks in silico. To optimize the prediction of transcriptional regulatory networks, i.e., to separate direct regulation from indirect regulation, a thorough understanding of the control logic underlying the regulation of gene expression is required. This review summarizes the state of the art of the elements that determine the functionality of TFBSs by focusing on the molecular biological mechanisms and evolutionary origins of cis-regulatory regions. PMID:19721087
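As a rough illustration of the motif stringency idea in the summary above, the following sketch scores how closely a candidate site fits the optimal site under a position weight matrix. It is not taken from the review: the count matrix `COUNTS`, the pseudocount, the uniform background, and the 0-to-1 normalization in `stringency` are all assumptions chosen for the example.

```python
import math

# Hypothetical count matrix for a 4-bp binding site (one dict per position).
# These numbers are invented for illustration, not taken from any database.
COUNTS = [
    {"A": 9, "C": 0, "G": 1, "T": 0},
    {"A": 0, "C": 8, "G": 1, "T": 1},
    {"A": 1, "C": 1, "G": 7, "T": 1},
    {"A": 0, "C": 0, "G": 0, "T": 10},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert raw counts to log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((n + pseudocount) / total) / background)
            for base, n in column.items()
        })
    return matrix


def site_score(site, matrix):
    """Sum the per-position log-odds for one candidate site."""
    return sum(column[base] for base, column in zip(site, matrix))


def stringency(site, matrix):
    """Rescale the score so the optimal site is 1.0 and the worst site is 0.0."""
    best = sum(max(column.values()) for column in matrix)
    worst = sum(min(column.values()) for column in matrix)
    return (site_score(site, matrix) - worst) / (best - worst)


PWM = log_odds_matrix(COUNTS)
for site in ("ACGT", "ACAT", "TTTA"):
    print(site, round(stringency(site, PWM), 3))
```

Under this assumed scoring, a site identical to the matrix optimum scores 1.0 and each mismatch lowers the value in proportion to how strongly that position is conserved.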
Diverse activities of viral cis-acting RNA regulatory elements revealed using multicolor, long-term, single-cell imaging.

PubMed

Pocock, Ginger M; Zimdars, Laraine L; Yuan, Ming; Eliceiri, Kevin W; Ahlquist, Paul; Sherer, Nathan M

2017-02-01

Cis-acting RNA structural elements govern crucial aspects of viral gene expression. How these structures and other posttranscriptional signals affect RNA trafficking and translation in the context of single cells is poorly understood. Herein we describe a multicolor, long-term (>24 h) imaging strategy for measuring integrated aspects of viral RNA regulatory control in individual cells. We apply this strategy to demonstrate differential mRNA trafficking behaviors governed by RNA elements derived from three retroviruses (HIV-1, murine leukemia virus, and Mason-Pfizer monkey virus), two hepadnaviruses (hepatitis B virus and woodchuck hepatitis virus), and an intron-retaining transcript encoded by the cellular NXF1 gene. Striking behaviors include "burst" RNA nuclear export dynamics regulated by HIV-1's Rev response element and the viral Rev protein; transient aggregations of RNAs into discrete foci at or near the nuclear membrane triggered by multiple elements; and a novel, pulsiform RNA export activity regulated by the hepadnaviral posttranscriptional regulatory element. We incorporate single-cell tracking and a data-mining algorithm into our approach to obtain RNA element-specific, high-resolution gene expression signatures. Together these imaging assays constitute a tractable, systems-based platform for studying otherwise difficult to access spatiotemporal features of viral and cellular gene regulation.
cis-Proline-mediated Ser(P)⁵ Dephosphorylation by the RNA Polymerase II C-terminal Domain Phosphatase Ssu72

DOE Office of Scientific and Technical Information (OSTI.GOV)

Werner-Allen, Jon W.; Lee, Chul-Jin; Liu, Pengda

2012-05-16

RNA polymerase II coordinates co-transcriptional events by recruiting distinct sets of nuclear factors to specific stages of transcription via changes of phosphorylation patterns along its C-terminal domain (CTD). Although it has become increasingly clear that proline isomerization also helps regulate CTD-associated processes, the molecular basis of its role is unknown. Here, we report the structure of the Ser(P)⁵ CTD phosphatase Ssu72 in complex with substrate, revealing a remarkable CTD conformation with the Ser(P)⁵-Pro⁶ motif in the cis configuration. We show that the cis-Ser(P)⁵-Pro⁶ isomer is the minor population in solution and that Ess1-catalyzed cis-trans-proline isomerization facilitates rapid dephosphorylation by Ssu72, providing an explanation for recently discovered in vivo connections between these enzymes and a revised model for CTD-mediated small nuclear RNA termination. This work presents the first structural evidence of a cis-proline-specific enzyme and an unexpected mechanism of isomer-based regulation of phosphorylation, with broad implications for CTD biology.
Murine homeobox-containing gene, Msx-1: analysis of genomic organization, promoter structure, and potential autoregulatory cis-acting elements.

PubMed

Kuzuoka, M; Takahashi, T; Guron, C; Raghow, R

1994-05-01

Detailed molecular organization of the coding and upstream regulatory regions of the murine homeodomain-containing gene, Msx-1, is reported. The protein-encoding portion of the gene is contained in two exons, 590 and 1214 bp in length, separated by a 2107-bp intron; the homeodomain is located in the second exon. The two-exon organization of the murine Msx-1 gene resembles a number of other homeodomain-containing genes. The 5'-(GTAAGT) and 3'-(CCCTAG) splicing junctions and the mRNA polyadenylation signal (UAUAA) of the murine Msx-1 gene are also characteristic of other vertebrate genes. By nuclease protection and primer extension assays, the start of transcription of the Msx-1 gene was located 256 bp upstream of the first AUG. Computer analysis of the promoter proximal 1280-bp sequence revealed a number of potentially important cis-regulatory sequences; these include the recognition elements for Ap-1, Ap-2, Ap-3, Sp-1, a possible binding site for RAR:RXR, and a number of TCF-1 consensus motifs. Importantly, a perfect reverse complement of (C/G)TTAATTG, which was recently shown to be an optimal binding sequence for the homeodomain of Msx-1 protein (K.M. Catron, N. Iler, and C. Abate (1993) Mol. Cell. Biol. 13:2354-2365), was also located in the murine Msx-1 promoter. Binding of bacterially expressed Msx-1 homeodomain polypeptide to Msx-1-specific oligonucleotide was experimentally demonstrated, raising a distinct possibility of autoregulation of this developmentally regulated gene.
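The kind of computer scan described above, looking for the (C/G)TTAATTG site and its reverse complement in a promoter, can be sketched as follows. This is a minimal illustration and not the authors' software: the helper names and the `toy_promoter` sequence are hypothetical, and the IUPAC code S is used here to stand for "C or G".

```python
import re

# Complement table; S (C or G) is its own complement under IUPAC rules.
COMPLEMENT = str.maketrans("ACGTS", "TGCAS")


def reverse_complement(motif):
    """Reverse-complement a motif written with A, C, G, T and S."""
    return motif.translate(COMPLEMENT)[::-1]


def to_regex(motif):
    """Expand the degenerate base S into a regex character class."""
    return motif.replace("S", "[CG]")


def find_sites(sequence, motif):
    """Return sorted (position, strand, matched text) for both orientations."""
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        # A lookahead is used so that overlapping matches are all reported.
        for m in re.finditer(f"(?=({to_regex(pattern)}))", sequence):
            hits.append((m.start(), strand, m.group(1)))
    return sorted(hits)


MSX1_SITE = "STTAATTG"  # (C/G)TTAATTG from the abstract
toy_promoter = "GGACAATTAAGCCTTAGCTTAATTGA"  # made-up sequence for the demo

for position, strand, text in find_sites(toy_promoter, MSX1_SITE):
    print(position, strand, text)
```

Positions are 0-based offsets on the given strand; a "-" hit means the reverse complement of the site was found in the sequence as written.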
Database construction for PromoterCAD: synthetic promoter design for mammals and plants.

PubMed

Nishikata, Koro; Cox, Robert Sidney; Shimoyama, Sayoko; Yoshida, Yuko; Matsui, Minami; Makita, Yuko; Toyoda, Tetsuro

2014-03-21

Synthetic promoters can control a gene's timing, location, and expression level. The PromoterCAD web server (http://promotercad.org) allows the design of synthetic promoters to control plant gene expression, by novel arrangement of cis-regulatory elements. Recently, we have expanded PromoterCAD's scope with additional plant and animal data: (1) PLACE (Plant Cis-acting Regulatory DNA Elements), including various sized sequence motifs; (2) PEDB (Mammalian Promoter/Enhancer Database), including gene expression data for mammalian tissues. The plant PromoterCAD data now contains 22 000 Arabidopsis thaliana genes, 2 200 000 microarray measurements in 20 growth conditions and 79 tissue organs and developmental stages, while the new mammalian PromoterCAD data contains 679 Mus musculus genes and 65 000 microarray measurements in 96 tissue organs and cell types (http://promotercad.org/mammal/). This work presents step-by-step instructions for adding both regulatory motif and gene expression data to PromoterCAD, to illustrate how users can expand PromoterCAD functionality for their own applications and organisms.
Viral infection upregulates myostatin promoter activity in orange-spotted grouper (Epinephelus coioides)

PubMed Central

Chen, Yi-Tien; Lin, Chao-Fen; Chen, Young-Mao; Lo, Chih-En; Chen, Wan-Erh

2017-01-01

Myostatin is a negative regulator of myogenesis and has been suggested to be an important factor in the development of muscle wasting during viral infection. The objective of this study was to characterize the main regulatory element of the grouper myostatin promoter and to study changes in promoter activity due to viral stimulation. In vitro and in vivo experiments indicated that the E-box E6 is a positive cis- and trans-regulation motif and an essential binding site for MyoD. In contrast, the E-box E5 is a dominant negative cis-regulatory element. The characteristics of the grouper myostatin promoter are similar to those of other species in the regulation of muscle growth, but mainly through specific regulatory elements. Based on these results, we conducted a study to investigate the effect of viral infection on myostatin promoter activity and its regulation. Nervous necrosis virus (NNV) treatment significantly induced myostatin promoter activity. The present study is the first report that specific myostatin motifs regulate promoter activity and the response to viral infection. PMID:29036192
Viral infection upregulates myostatin promoter activity in orange-spotted grouper (Epinephelus coioides).

PubMed

Chen, Yi-Tien; Lin, Chao-Fen; Chen, Young-Mao; Lo, Chih-En; Chen, Wan-Erh; Chen, Tzong-Yueh

2017-01-01

Myostatin is a negative regulator of myogenesis and has been suggested to be an important factor in the development of muscle wasting during viral infection. The objective of this study was to characterize the main regulatory element of the grouper myostatin promoter and to study changes in promoter activity due to viral stimulation. In vitro and in vivo experiments indicated that the E-box E6 is a positive cis- and trans-regulation motif and an essential binding site for MyoD. In contrast, the E-box E5 is a dominant negative cis-regulatory element. The characteristics of the grouper myostatin promoter are similar to those of other species in the regulation of muscle growth, but mainly through specific regulatory elements. Based on these results, we conducted a study to investigate the effect of viral infection on myostatin promoter activity and its regulation. Nervous necrosis virus (NNV) treatment significantly induced myostatin promoter activity. The present study is the first report that specific myostatin motifs regulate promoter activity and the response to viral infection.
PlantTFDB 4.0: toward a central hub for transcription factors and regulatory interactions in plants.

PubMed

Jin, Jinpu; Tian, Feng; Yang, De-Chang; Meng, Yu-Qi; Kong, Lei; Luo, Jingchu; Gao, Ge

2017-01-04

With the goal of providing a comprehensive, high-quality resource for both plant transcription factors (TFs) and their regulatory interactions with target genes, we upgraded the plant TF database PlantTFDB to version 4.0 (http://planttfdb.cbi.pku.edu.cn/). In the new version, we identified 320 370 TFs from 165 species, presenting a more comprehensive genomic TF repertoire of green plants. Besides updating the pre-existing abundant functional and evolutionary annotation for identified TFs, we generated three new types of annotation that provide more direct clues for investigating the underlying functional mechanisms: (i) a set of high-quality, non-redundant TF binding motifs derived from experiments; (ii) multiple types of regulatory elements identified from high-throughput sequencing data; (iii) regulatory interactions curated from literature and inferred by combining TF binding motifs and regulatory elements. In addition, we upgraded the previous TF prediction server and set up four novel tools for regulation prediction and functional enrichment analyses. Finally, we set up a novel companion portal PlantRegMap (http://plantregmap.cbi.pku.edu.cn) for users to access the regulation resource and analysis tools conveniently.
Regulatory sequence analysis tools.

PubMed

van Helden, Jacques

2003-07-01

The web resource Regulatory Sequence Analysis Tools (RSAT) (http://rsat.ulb.ac.be/rsat) offers a collection of software tools dedicated to the prediction of regulatory sites in non-coding DNA sequences. These tools include sequence retrieval, pattern discovery, pattern matching, genome-scale pattern matching, feature-map drawing, random sequence generation and other utilities. Alternative formats are supported for the representation of regulatory motifs (strings or position-specific scoring matrices) and several algorithms are proposed for pattern discovery. RSAT currently holds >100 fully sequenced genomes and these data are regularly updated from GenBank.
Identification and role of regulatory non-coding RNAs in Listeria monocytogenes.

PubMed

Izar, Benjamin; Mraheil, Mobarak Abu; Hain, Torsten

2011-01-01

Bacterial regulatory non-coding RNAs control numerous mRNA targets that direct a plethora of biological processes, such as the adaptation to environmental changes, growth and virulence. Recently developed high-throughput techniques, such as genomic tiling arrays and RNA-Seq, have allowed the investigation of prokaryotic cis- and trans-acting regulatory RNAs, including sRNAs, asRNAs, untranslated regions (UTRs) and riboswitches. As a result, we obtained a more comprehensive view of the complexity and plasticity of prokaryotic genome biology. Listeria monocytogenes was utilized as a model system for intracellular pathogenic bacteria in several studies, which revealed the presence of about 180 regulatory RNAs in the listerial genome. A regulatory role of non-coding RNAs in survival, virulence and adaptation mechanisms of L. monocytogenes was confirmed in subsequent experiments, thus providing insight into a multifaceted modulatory function of RNA/mRNA interference. In this review, we discuss the identification of regulatory RNAs by high-throughput techniques and their functional role in L. monocytogenes.
Computational exploration of cis-regulatory modules in rhythmic expression data using the “Exploration of Distinctive CREs and CRMs” (EDCC) and “CRM Network Generator” (CNG) programs

PubMed Central

Staiger, Dorothee

2018-01-01

Understanding the effect of cis-regulatory elements (CRE) and clusters of CREs, which are called cis-regulatory modules (CRM), in eukaryotic gene expression is a challenge of computational biology. We developed two programs that allow simple, fast and reliable analysis of candidate CREs and CRMs that may affect specific gene expression and that determine positional features between individual CREs within a CRM. The first program, “Exploration of Distinctive CREs and CRMs” (EDCC), correlates candidate CREs and CRMs with specific gene expression patterns. For pairs of CREs, EDCC also determines positional preferences of the single CREs in relation to each other and to the transcriptional start site. The second program, “CRM Network Generator” (CNG), prioritizes these positional preferences using a neural network and thus allows unbiased rating of the positional preferences that were determined by EDCC. We tested these programs with data from a microarray study of circadian gene expression in Arabidopsis thaliana. Analyzing more than 1.5 million pairwise CRE combinations, we found 22 candidate combinations, of which several contained known clock promoter elements together with elements that had not been identified as relevant to circadian gene expression before. CNG analysis further identified positional preferences of these CRE pairs, hinting at positional information that may be relevant for circadian gene expression. Future wet lab experiments will have to determine which of these combinations confer daytime specific circadian gene expression. PMID:29298348
A flexible motif search technique based on generalized profiles.

PubMed

Bucher, P; Karplus, K; Moeri, N; Hofmann, K

1996-03-01

A flexible motif search technique is presented which has two major components: (1) a generalized profile syntax serving as a motif definition language; and (2) a motif search method specifically adapted to the problem of finding multiple instances of a motif in the same sequence. The new profile structure, which is the core of the generalized profile syntax, combines the functions of a variety of motif descriptors implemented in other methods, including regular expression-like patterns, weight matrices, previously used profiles, and certain types of hidden Markov models (HMMs). The relationship between generalized profiles and other biomolecular motif descriptors is analyzed in detail, with special attention to HMMs. Generalized profiles are shown to be equivalent to a particular class of HMMs, and conversion procedures in both directions are given. The conversion procedures provide an interpretation for local alignment in the framework of stochastic models, allowing for clear, simple significance tests. A mathematical statement of the motif search problem defines the new method exactly without linking it to a specific algorithmic solution. Part of the definition includes a new definition of disjointness of alignments.
Triadic motifs in the dependence networks of virtual societies.

PubMed

Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

2014-06-10

In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.
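To make the unidirectional loop motif mentioned above concrete, the sketch below counts triples of nodes connected as a one-way cycle (a to b, b to c, c to a, with no reciprocal edges) in a small directed network. It only counts occurrences; judging whether a motif is under- or overrepresented, as the study does, would additionally require comparison against randomized networks, which is not shown. The function name and the toy edge list are hypothetical.

```python
from itertools import combinations


def count_unidirectional_loops(edges):
    """Count node triples forming a one-way 3-cycle with no reciprocal edges."""
    edge_set = set(edges)
    nodes = sorted({node for edge in edge_set for node in edge})

    def one_way(u, v):
        # True when u -> v exists and v -> u does not.
        return (u, v) in edge_set and (v, u) not in edge_set

    count = 0
    for a, b, c in combinations(nodes, 3):
        clockwise = one_way(a, b) and one_way(b, c) and one_way(c, a)
        counter = one_way(a, c) and one_way(c, b) and one_way(b, a)
        if clockwise or counter:
            count += 1
    return count


# Toy dependence network: x -> y -> z -> x is a loop; x, y, w is feedforward.
toy_edges = [("x", "y"), ("y", "z"), ("z", "x"), ("x", "w"), ("y", "w")]
print(count_unidirectional_loops(toy_edges))
```

Enumerating all triples is cubic in the number of nodes, which is acceptable for a small example but would need a neighbor-based search for networks of realistic size.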
Triadic motifs in the dependence networks of virtual societies

NASA Astrophysics Data System (ADS)

Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

2014-06-01

In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.
Triadic motifs in the dependence networks of virtual societies

PubMed Central

Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

2014-01-01

In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs. PMID:24912755
Genomic Features That Predict Allelic Imbalance in Humans Suggest Patterns of Constraint on Gene Expression Variation

PubMed Central

Fédrigo, Olivier; Haygood, Ralph; Mukherjee, Sayan; Wray, Gregory A.

2009-01-01

Variation in gene expression is an important contributor to phenotypic diversity within and between species. Although this variation often has a genetic component, identification of the genetic variants driving this relationship remains challenging. In particular, measurements of gene expression usually do not reveal whether the genetic basis for any observed variation lies in cis or in trans to the gene, a distinction that has direct relevance to the physical location of the underlying genetic variant, and which may also impact its evolutionary trajectory. Allelic imbalance measurements identify cis-acting genetic effects by assaying the relative contribution of the two alleles of a cis-regulatory region to gene expression within individuals. Identification of patterns that predict commonly imbalanced genes could therefore serve as a useful tool and also shed light on the evolution of cis-regulatory variation itself. Here, we show that sequence motifs, polymorphism levels, and divergence levels around a gene can be used to predict commonly imbalanced genes in a human data set. Reduction of this feature set to four factors revealed that only one factor significantly differentiated between commonly imbalanced and nonimbalanced genes. We demonstrate that these results are consistent between the original data set and a second published data set in humans obtained using different technical and statistical methods. Finally, we show that variation in the single allelic imbalance-associated factor is partially explained by the density of genes in the region of a target gene (allelic imbalance is less probable for genes in gene-dense regions), and, to a lesser extent, the evenness of expression of the gene across tissues and the magnitude of negative selection on putative regulatory regions of the gene. These results suggest that the genomic distribution of functional cis-regulatory variants in the human genome is nonrandom, perhaps due to local differences in evolutionary
DLocalMotif: a discriminative approach for discovering local motifs in protein sequences.

PubMed

Mehdi, Ahmed M; Sehgal, Muhammad Shoaib B; Kobe, Bostjan; Bailey, Timothy L; Bodén, Mikael

2013-01-01

Local motifs are patterns of DNA or protein sequences that occur within a sequence interval relative to a biologically defined anchor or landmark. Current protein motif discovery methods do not adequately consider such constraints to identify biologically significant motifs that are only weakly over-represented but spatially confined. Using negatives, i.e. sequences known to not contain a local motif, can further increase the specificity of their discovery. This article introduces the method DLocalMotif that makes use of positional information and negative data for local motif discovery in protein sequences. DLocalMotif combines three scoring functions, measuring degrees of motif over-representation, entropy and spatial confinement, specifically designed to discriminatively exploit the availability of negative data. The method is shown to outperform current methods that use only a subset of these motif characteristics. We apply the method to several biological datasets. The analysis of peroxisomal targeting signals uncovers several novel motifs that occur immediately upstream of the dominant peroxisomal targeting signal-1 signal. The analysis of proline-tyrosine nuclear localization signals uncovers multiple novel motifs that overlap with C2H2 zinc finger domains. We also evaluate the method on classical nuclear localization signals and endoplasmic reticulum retention signals and find that DLocalMotif successfully recovers biologically relevant sequence properties. http://bioinf.scmb.uq.edu.au/dlocalmotif/
Identification of a distant cis-regulatory element controlling pharyngeal arch-specific expression of zebrafish gdf6a/radar

PubMed Central

Reed, Nykolaus P.; Mortlock, Douglas P.

2011-01-01

Skeletal formation is an essential and intricately regulated part of vertebrate development. Humans and mice deficient in Growth and Differentiation Factor 6 (Gdf6) have numerous skeletal abnormalities including joint fusions and cartilage reductions. The expression of Gdf6 is dynamic and in part regulated by distant evolutionarily conserved cis-regulatory elements. radar/gdf6a is a zebrafish ortholog of Gdf6 and has an essential role in embryonic patterning. Here we show that radar is transcribed in the cells surrounding and between the developing cartilages of the ventral pharyngeal arches, similar to mouse Gdf6. A 312 bp evolutionarily conserved region (ECR5), 122 kilobases downstream, drives expression in a pharyngeal arch-specific manner similar to endogenous radar/gdf6a. Deletion analysis identified a 78 bp region within ECR5 that is essential for transgene activity. This work illustrates that radar is regulated in the pharyngeal arches by a distant conserved element and suggests radar has similar functions in skeletal development in fish and mammals. PMID:20201106
La-related protein 1 (LARP1) binds the mRNA cap, blocking eIF4F assembly on TOP mRNAs.

PubMed

Lahr, Roni M; Fonseca, Bruno D; Ciotti, Gabrielle E; Al-Ashtal, Hiba A; Jia, Jian-Jun; Niklaus, Marius R; Blagden, Sarah P; Alain, Tommy; Berman, Andrea J

2017-04-07

The 5'terminal oligopyrimidine (5'TOP) motif is a cis-regulatory RNA element located immediately downstream of the 7-methylguanosine [m7G] cap of TOP mRNAs, which encode ribosomal proteins and translation factors. In eukaryotes, this motif coordinates the synchronous and stoichiometric expression of the protein components of the translation machinery. La-related protein 1 (LARP1) binds TOP mRNAs, regulating their stability and translation. We present crystal structures of the human LARP1 DM15 region in complex with a 5'TOP motif, a cap analog (m7GTP), and a capped cytidine (m7GpppC), resolved to 2.6, 1.8 and 1.7 Å, respectively. Our binding, competition, and immunoprecipitation data corroborate and elaborate on the mechanism of 5'TOP motif binding by LARP1. We show that LARP1 directly binds the cap and adjacent 5'TOP motif of TOP mRNAs, effectively impeding access of eIF4E to the cap and preventing eIF4F assembly. Thus, LARP1 is a specialized TOP mRNA cap-binding protein that controls ribosome biogenesis.
Coordinated transcriptional regulation of two key genes in the lignin branch pathway - CAD and CCR - is mediated through MYB-binding sites

PubMed Central

2010-01-01

Background: Cinnamoyl CoA reductase (CCR) and cinnamyl alcohol dehydrogenase (CAD) catalyze the final steps in the biosynthesis of monolignols, the monomeric units of the phenolic lignin polymers which confer rigidity, imperviousness and resistance to biodegradation to cell walls. We have previously shown that the Eucalyptus gunnii CCR and CAD2 promoters direct similar expression patterns in vascular tissues suggesting that monolignol production is controlled, at least in part, by the coordinated transcriptional regulation of these two genes. Although consensus motifs for MYB transcription factors occur in most gene promoters of the whole phenylpropanoid pathway, functional evidence for their contribution to promoter activity has only been demonstrated for a few of them. Here, in the lignin-specific branch, we studied the functional role of MYB elements as well as other cis-elements identified in the regulatory regions of EgCAD2 and EgCCR promoters, in the transcriptional activity of these gene promoters. Results: By using promoter deletion analysis and in vivo footprinting, we identified an 80 bp regulatory region in the Eucalyptus gunnii EgCAD2 promoter that contains two MYB elements, each arranged in a distinct module with newly identified cis-elements. A directed mutagenesis approach was used to introduce block mutations in all putative cis-elements of the EgCAD2 promoter and in those of the 50 bp regulatory region previously delineated in the EgCCR promoter. We showed that the conserved MYB elements in EgCAD2 and EgCCR promoters are crucial both for the formation of DNA-protein complexes in EMSA experiments and for the transcriptional activation of EgCAD2 and EgCCR promoters in vascular tissues in planta. In addition, a new regulatory cis-element that modulates the balance between two DNA-protein complexes in vitro was found to be important for EgCAD2 expression in the cambial zone. Conclusions: Our assignment of functional roles to the identified cis

13-cis retinoic acid and isomerisation in paediatric oncology--is changing shape the key to success?

PubMed

Armstrong, Jane L; Redfern, Christopher P F; Veal, Gareth J

2005-05-01

Retinoic acid isomers have been used with some success as chemotherapeutic agents, most recently with 13-cis retinoic acid showing impressive clinical efficacy in the paediatric malignancy neuroblastoma. The aim of this commentary is to review the evidence that 13-cis retinoic acid is a pro-drug, and consider the implications of retinoid metabolism and isomerisation for the further development of retinoic acid for cancer therapy. The low binding affinity of 13-cis retinoic acid for retinoic acid receptors, low activity in gene expression assays and the accumulation of the all-trans isomer in cells treated with 13-cis retinoic acid, coupled with the more-favourable pharmacokinetic profile of 13-cis retinoic acid compared to other isomers, suggest that intracellular isomerisation to all-trans retinoic acid is the key process underlying the biological activity of 13-cis retinoic acid. Intracellular metabolism of all-trans retinoic acid by a positive auto-regulatory loop may result in clinical resistance to retinoic acid. Agents that block or reduce the metabolism of all-trans retinoic acid are therefore attractive targets for drug development. Devising strategies to deliver 13-cis retinoic acid to tumour cells and facilitate the intracellular isomerisation of 13-cis retinoic acid, while limiting metabolism of all-trans retinoic acid, may have a major impact on the efficacy of 13-cis retinoic acid in paediatric oncology.
Neutral forces acting on intragenomic variability shape the Escherichia coli regulatory network topology.

PubMed

Ruths, Troy; Nakhleh, Luay

2013-05-07

Cis-regulatory networks (CRNs) play a central role in cellular decision making. Like every other biological system, CRNs undergo evolution, which shapes their properties by a combination of adaptive and nonadaptive evolutionary forces. Teasing apart these forces is an important step toward functional analyses of the different components of CRNs, designing regulatory perturbation experiments, and constructing synthetic networks. Although tests of neutrality and selection based on molecular sequence data exist, no such tests are currently available based on CRNs. In this work, we present a unique genotype model of CRNs that is grounded in a genomic context and demonstrate its use in identifying portions of the CRN with properties explainable by neutral evolutionary forces at the system, subsystem, and operon levels. We leverage our model against experimentally derived data from Escherichia coli. The results of this analysis show statistically significant and substantial neutral trends in properties previously identified as adaptive in origin--degree distribution, clustering coefficient, and motifs--within the E. coli CRN. Our model captures the tightly coupled genome-interactome of an organism and enables analyses of how evolutionary events acting at the genome level, such as mutation, and at the population level, such as genetic drift, give rise to neutral patterns that we can quantify in CRNs.
CstF-64 and 3'-UTR cis-element determine Star-PAP specificity for target mRNA selection by excluding PAPα.

PubMed

Kandala, Divya T; Mohan, Nimmy; A, Vivekanand; A P, Sudheesh; G, Reshmi; Laishram, Rakesh S

2016-01-29

Almost all eukaryotic mRNAs have a poly(A) tail at the 3'-end. Canonical PAPs (PAPα/γ) polyadenylate nuclear pre-mRNAs. The recent identification of the non-canonical Star-PAP revealed specificity of nuclear PAPs for pre-mRNAs, yet the mechanism by which Star-PAP selects mRNA targets is still elusive. Moreover, why Star-PAP target mRNAs carrying the canonical AAUAAA signal are not regulated by PAPα is unclear. We investigate the specificity mechanisms by which Star-PAP selects pre-mRNA targets for polyadenylation. Star-PAP assembles a distinct 3'-end processing complex and controls pre-mRNAs independently of PAPα. We identified a Star-PAP recognition nucleotide motif and showed that a suboptimal DSE on Star-PAP target pre-mRNA 3'-UTRs inhibits CstF-64 binding, thus preventing PAPα recruitment. Altering 3'-UTR cis-elements on a Star-PAP target pre-mRNA can switch the regulatory PAP from Star-PAP to PAPα. Our results suggest a mechanism of poly(A) site selection that has potential implications for the regulation of alternative polyadenylation. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Recurring sequence-structure motifs in (βα)8-barrel proteins and experimental optimization of a chimeric protein designed based on such motifs.

PubMed

Wang, Jichao; Zhang, Tongchuan; Liu, Ruicun; Song, Meilin; Wang, Juncheng; Hong, Jiong; Chen, Quan; Liu, Haiyan

2017-02-01

An interesting way of generating novel artificial proteins is to combine sequence motifs from natural proteins, mimicking the evolutionary path suggested by natural proteins comprising recurring motifs. We analyzed the βα and αβ modules of TIM barrel proteins by structure alignment-based sequence clustering. A number of preferred motifs were identified. A chimeric TIM was designed by using recurring elements as mutually compatible interfaces. The foldability of the designed TIM protein was then significantly improved by six rounds of directed evolution. The melting temperature has been improved by more than 20°C. A variety of characteristics suggested that the resulting protein is well-folded. Our analysis provided a library of peptide motifs that is potentially useful for different protein engineering studies. The protein engineering strategy of using recurring motifs as interfaces to connect partial natural proteins may be applied to other protein folds. Copyright © 2016 Elsevier B.V. All rights reserved.
The heptanucleotide motif GAGACGC is a key component of a cis-acting promoter element that is critical for SnSAG1 expression in Sarcocystis neurona.

PubMed

Gaji, Rajshekhar Y; Howe, Daniel K

2009-07-01

The apicomplexan parasite Sarcocystis neurona undergoes a complex process of intracellular development, during which many genes are temporally regulated. The described study was undertaken to begin identifying the basic promoter elements that control gene expression in S. neurona. Sequence analysis of the 5'-flanking region of five S. neurona genes revealed a conserved heptanucleotide motif GAGACGC that is similar to the WGAGACG motif described upstream of multiple genes in Toxoplasma gondii. The promoter region for the major surface antigen gene SnSAG1, which contains three heptanucleotide motifs within 135 bases of the transcription start site, was dissected by functional analysis using a dual luciferase reporter assay. These analyses revealed that a minimal promoter fragment containing all three motifs was sufficient to drive reporter molecule expression, with the presence and orientation of the 5'-most heptanucleotide motif being absolutely critical for promoter function. Further studies should help to identify additional sequence elements important for promoter function and for controlling gene expression during intracellular development by this apicomplexan pathogen.
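The kind of motif search described in this abstract can be illustrated with a minimal Python sketch. It scans a DNA sequence for the heptanucleotide GAGACGC on both strands and reports each hit with its position and orientation, since the abstract notes that orientation matters for promoter function. The function names and the example sequence are hypothetical and are not taken from the study.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str = "GAGACGC"):
    """List (0-based start, strand) for every exact motif occurrence.

    '+' means the motif reads 5'->3' on the given strand; '-' means
    its reverse complement was found, i.e. the motif lies on the
    opposite strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        if window == motif:
            hits.append((start, "+"))
        elif window == rc:
            hits.append((start, "-"))
    return hits


# Made-up sequence with one forward and one reverse-oriented copy.
print(find_motif("TTGAGACGCAAGCGTCTCTT"))  # [(2, '+'), (11, '-')]
```

Matching the degenerate T. gondii motif WGAGACG (W = A or T) would require expanding the ambiguity code before comparison; that extension is left out of this sketch.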
Unique ζ-chain motifs mediate a direct TCR-actin linkage critical for immunological synapse formation and T-cell activation.

PubMed

Klieger, Yair; Almogi-Hazan, Osnat; Ish-Shalom, Eliran; Pato, Aviad; Pauker, Maor H; Barda-Saad, Mira; Wang, Lynn; Baniyash, Michal

2014-01-01

TCR-mediated activation induces receptor microclusters that evolve to a defined immune synapse (IS). Many studies showed that actin polymerization and remodeling, which create a scaffold critical to IS formation and stabilization, are TCR mediated. However, the mechanisms controlling simultaneous TCR and actin dynamic rearrangement in the IS are yet not fully understood. Herein, we identify two novel TCR ζ-chain motifs, mediating the TCR's direct interaction with actin and inducing actin bundling. While T cells expressing the ζ-chain mutated in these motifs lack cytoskeleton (actin) associated (cska)-TCRs, they express normal levels of non-cska and surface TCRs as cells expressing wild-type ζ-chain. However, such mutant cells are unable to display activation-dependent TCR clustering, IS formation, expression of CD25/CD69 activation markers, or produce/secrete cytokine, effects also seen in the corresponding APCs. We are the first to show a direct TCR-actin linkage, providing the missing gap linking between TCR-mediated Ag recognition, specific cytoskeleton orientation toward the T-cell-APC interacting pole and long-lived IS maintenance. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Columbia River Coordinated Information System (CIS), 1992-1993 Annual Report.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rowe, Mike; Roger, Phillip B.; O'Connor, Dick

1993-11-01

The purposes of this report are to: (1) describe the project to date; (2) document the work and accomplishments of the Coordinated Information System (CIS) project for Fiscal Year 1993; and (3) provide a glimpse of future project direction. The CIS concept is an approach to meeting the growing need for regionally standardized anadromous fish information.
Connections between Transcription Downstream of Genes and cis-SAGe Chimeric RNA.

PubMed

Chwalenia, Katarzyna; Qin, Fujun; Singh, Sandeep; Tangtrongstittikul, Panjapon; Li, Hui

2017-11-22

cis-Splicing between adjacent genes (cis-SAGe) is being recognized as one way to produce chimeric fusion RNAs. However, its detailed mechanism is not clear. A recent study revealed induction of transcription downstream of genes (DoGs) under osmotic stress. Here, we investigated the influence of osmotic stress on cis-SAGe chimeric RNAs and their connection to DoGs. We found an absence of induction of at least some cis-SAGe fusions and/or their corresponding DoGs at early time point(s). In fact, these DoGs and their cis-SAGe fusions are inversely correlated. This negative correlation changed to positive at a later time point. These results suggest a direct competition between the two categories of transcripts when the total pool of readthrough transcripts is limited at an early time point. At a later time point, DoGs and corresponding cis-SAGe fusions are both induced, indicating that total readthrough transcripts become more abundant. Finally, we observed overall enhancement of cis-SAGe chimeric RNAs in KCl-treated samples by RNA-Seq analysis.
A novel swarm intelligence algorithm for finding DNA motifs.

PubMed

Lei, Chengwei; Ruan, Jianhua

2009-01-01

Discovering DNA motifs from co-expressed or co-regulated genes is an important step towards deciphering complex gene regulatory networks and understanding gene functions. Despite significant improvement in the last decade, it still remains one of the most challenging problems in computational molecular biology. In this work, we propose a novel motif finding algorithm that finds consensus patterns using a population-based stochastic optimisation technique called Particle Swarm Optimisation (PSO), which has been shown to be effective in optimising difficult multidimensional problems in continuous domains. We propose to use a word dissimilarity graph to remap the neighborhood structure of the solution space of DNA motifs, and propose a modification of the naive PSO algorithm to accommodate discrete variables. In order to improve efficiency, we also propose several strategies for escaping from local optima and for automatically determining the termination criteria. Experimental results on simulated challenge problems show that our method is both more efficient and more accurate than several existing algorithms. Applications to several sets of real promoter sequences also show that our approach is able to detect known transcription factor binding sites, and outperforms two of the most popular existing algorithms.
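The abstract does not state the objective function that the swarm optimises. As a hedged illustration of what "finding consensus patterns" can mean, the sketch below scores a candidate consensus word by its total minimum Hamming distance to a set of sequences, a common formulation for consensus-based motif finders. It is an assumption that the paper uses this or a similar score, and the function names and toy sequences are hypothetical. A search method such as PSO would then look for the word that minimises this score.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(word: str, seq: str) -> int:
    """Smallest Hamming distance between `word` and any window of `seq`."""
    k = len(word)
    if len(seq) < k:
        raise ValueError("sequence is shorter than the candidate word")
    return min(hamming(word, seq[i:i + k]) for i in range(len(seq) - k + 1))


def total_distance(word: str, seqs: list[str]) -> int:
    """Consensus score: sum of best-window distances over all sequences.

    Lower is better; 0 means the word occurs exactly in every sequence.
    """
    return sum(min_distance(word, s) for s in seqs)


# Toy promoter fragments (illustrative only).
promoters = ["ACGTACGT", "TTACGTTT", "GGGACTTG"]
print(total_distance("ACGT", promoters))  # 1
```

Because the candidate words form a discrete space, a continuous optimiser needs some remapping of that space, which is the role the abstract assigns to the word dissimilarity graph.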
Noncoding RNA danger motifs bridge innate and adaptive immunity and are potent adjuvants for vaccination

PubMed Central

Wang, Lilin; Smith, Dan; Bot, Simona; Dellamary, Luis; Bloom, Amy; Bot, Adrian

2002-01-01

The adaptive immune response is triggered by recognition of T and B cell epitopes and is influenced by “danger” motifs that act via innate immune receptors. This study shows that motifs associated with noncoding RNA are essential features in the immune response reminiscent of viral infection, mediating rapid induction of proinflammatory chemokine expression, recruitment and activation of antigen-presenting cells, modulation of regulatory cytokines, subsequent differentiation of Th1 cells, isotype switching, and stimulation of cross-priming. The heterogeneity of RNA-associated motifs results in differential binding to cellular receptors, and specifically impacts the immune profile. Naturally occurring double-stranded RNA (dsRNA) triggered activation of dendritic cells and enhancement of specific immunity, similar to selected synthetic dsRNA motifs. Based on the ability of specific RNA motifs to block tolerance induction and effectively organize the immune defense during viral infection, we conclude that such RNA species are potent danger motifs. We also demonstrate the feasibility of using selected RNA motifs as adjuvants in the context of novel aerosol carriers for optimizing the immune response to subunit vaccines. In conclusion, RNA-associated motifs produced during viral infection bridge the early response with the late adaptive phase, regulating the activation and differentiation of antigen-specific B and T cells, in addition to a short-term impact on innate immunity. PMID:12393853
The retina visual cycle is driven by cis retinol oxidation in the outer segments of cones

PubMed Central

Sato, Shinya; Frederiksen, Rikard; Cornwall, M. Carter; Kefalov, Vladimir J.

2017-01-01

Vertebrate rod and cone photoreceptors require continuous supply of chromophore for regenerating their visual pigments after photoactivation. Cones, which mediate our daytime vision, demand a particularly rapid supply of 11-cis retinal chromophore in order to maintain their function in bright light. An important contribution to this process is thought to be the chromophore precursor 11-cis retinol, which is supplied to cones from Müller cells in the retina and subsequently oxidized to 11-cis retinal as part of the retina visual cycle. However, the molecular identity of the cis retinol oxidase in cones remains unclear. Here, as a first step in characterizing this enzymatic reaction, we sought to determine the subcellular localization of this activity in salamander red cones. We found that the onset of dark adaptation of isolated salamander red cones was substantially faster when exposing directly their outer vs. their inner segment to 9-cis retinol, an analogue of 11-cis retinol. In contrast, this difference was not observed when treating the outer vs. inner segment with 9-cis retinal, a chromophore analogue which can directly support pigment regeneration. These results suggest, surprisingly, that the cis-retinol oxidation occurs in the outer segments of cone photoreceptors. Confirming this notion, pigment regeneration with exogenously added 9-cis retinol was directly observed in the truncated outer segments of cones, but not in rods. We conclude that the enzymatic machinery required for the oxidation of recycled cis retinol as part of the retina visual cycle is present in the outer segments of cones. PMID:28359344
RNA 3D Structural Motifs: Definition, Identification, Annotation, and Database Searching

NASA Astrophysics Data System (ADS)

Nasalean, Lorena; Stombaugh, Jesse; Zirbel, Craig L.; Leontis, Neocles B.

Structured RNA molecules resemble proteins in the hierarchical organization of their global structures, folding and broad range of functions. Structured RNAs are composed of recurrent modular motifs that play specific functional roles. Some motifs direct the folding of the RNA or stabilize the folded structure through tertiary interactions. Others bind ligands or proteins or catalyze chemical reactions. Therefore, it is desirable, starting from the RNA sequence, to be able to predict the locations of recurrent motifs in RNA molecules. Conversely, the potential occurrence of one or more known 3D RNA motifs may indicate that a genomic sequence codes for a structured RNA molecule. To identify known RNA structural motifs in new RNA sequences, precise structure-based definitions are needed that specify the core nucleotides of each motif and their conserved interactions. By comparing instances of each recurrent motif and applying base pair isostericity relations, one can identify neutral mutations that preserve its structure and function in the contexts in which it occurs.
CstF-64 and 3′-UTR cis-element determine Star-PAP specificity for target mRNA selection by excluding PAPα

PubMed Central

Kandala, Divya T.; Mohan, Nimmy; A, Vivekanand; AP, Sudheesh; G, Reshmi; Laishram, Rakesh S.

2016-01-01

Almost all eukaryotic mRNAs have a poly(A) tail at the 3′-end. Canonical PAPs (PAPα/γ) polyadenylate nuclear pre-mRNAs. The recent identification of the non-canonical Star-PAP revealed specificity of nuclear PAPs for pre-mRNAs, yet the mechanism by which Star-PAP selects mRNA targets is still elusive. Moreover, why Star-PAP target mRNAs carrying the canonical AAUAAA signal are not regulated by PAPα is unclear. We investigate the specificity mechanisms by which Star-PAP selects pre-mRNA targets for polyadenylation. Star-PAP assembles a distinct 3′-end processing complex and controls pre-mRNAs independently of PAPα. We identified a Star-PAP recognition nucleotide motif and showed that a suboptimal DSE on Star-PAP target pre-mRNA 3′-UTRs inhibits CstF-64 binding, thus preventing PAPα recruitment. Altering 3′-UTR cis-elements on a Star-PAP target pre-mRNA can switch the regulatory PAP from Star-PAP to PAPα. Our results suggest a mechanism of poly(A) site selection that has potential implications for the regulation of alternative polyadenylation. PMID:26496945
Direct synthesis of cis-dihalido-bis(NHC) complex of nickel(II) and catalytic application in olefin addition polymerization: effect of halogen co-ligands and density functional theory study.

PubMed

Zhang, Dao; Zhou, Sen; Li, Zhiming; Wang, Quanrui; Weng, Linhong

2013-09-07

Two novel amine-containing N-heterocyclic carbene ligand precursors [H(1a-b)]Br have been prepared in good yield and fully characterized. Direct syntheses of cis- and trans-dihalido-bis(NHC) nickel complexes [Ni(NHC)2X2] (X = Cl, Br) are reported. The solid structures of trans-[Ni(1a-b)2Br2] (2a-b) and cis-[Ni(1a)2Cl2] (3) were determined by single-crystal X-ray analysis and 3 was found to be the first example of cis-configuration coordination of monodentate NHC ligands to a metal center for dihalido-bis(NHC) nickel complexes. DFT calculations were conducted to determine the energy difference between cis- and trans-isomers of complexes 2a and 3 bearing bromide and chloride co-ligands. The cis-[Ni(1a)2Cl2] (cis-3) is 1.77-1.55 kcal mol⁻¹ lower in energy than its trans-isomer in polar solvents including CH2Cl2 and THF, while the trans-[Ni(1a)2Br2] (trans-2a) is more stable than the cis-isomer similarly in the gas phase. The cis nickel complex 3 with two coordinated monodentate NHCs was tested for olefin addition polymerization at standard conditions. It was found that cis-3 was inactive in ethylene polymerization but showed moderate catalytic activities (0.5-3.0 × 10⁶ g of PNB (mol of Ni)⁻¹ h⁻¹) in the addition polymerization of norbornene in the presence of methylaluminoxane (MAO) as cocatalyst.
MicroRNA-mediated regulatory circuits: outlook and perspectives

NASA Astrophysics Data System (ADS)

Cora', Davide; Re, Angela; Caselle, Michele; Bussolino, Federico

2017-08-01

MicroRNAs have been found to be necessary for regulating genes implicated in almost all signaling pathways, and consequently their dysfunction influences many diseases, including cancer. Understanding of the complexity of the microRNA-mediated regulatory network has grown in terms of size, connectivity and dynamics with the development of computational and, more recently, experimental high-throughput approaches for microRNA target identification. Newly developed studies on recurrent microRNA-mediated circuits in regulatory networks, also known as network motifs, have substantially contributed to addressing this complexity, and therefore to helping understand the ways by which microRNAs achieve their regulatory role. This review provides a summarizing view of the state-of-the-art, and perspectives of research efforts on microRNA-mediated regulatory motifs. In this review, we discuss the topological properties characterizing different types of circuits, and the regulatory features theoretically enabled by such properties, with a special emphasis on examples of circuits typifying their biological significance in experimentally validated contexts. Finally, we will consider possible future developments, in particular regarding microRNA-mediated circuits involving long non-coding RNAs and epigenetic regulators.
Drought responsive gene expression regulatory divergence between upland and lowland ecotypes of a perennial C4 grass.

PubMed

Lovell, John T; Schwartz, Scott; Lowry, David B; Shakirov, Eugene V; Bonnette, Jason E; Weng, Xiaoyu; Wang, Mei; Johnson, Jenifer; Sreedasyam, Avinash; Plott, Christopher; Jenkins, Jerry; Schmutz, Jeremy; Juenger, Thomas E

2016-04-01

Climatic adaptation is an example of a genotype-by-environment interaction (G×E) of fitness. Selection upon gene expression regulatory variation can contribute to adaptive phenotypic diversity; however, surprisingly few studies have examined how genome-wide patterns of gene expression G×E are manifested in response to environmental stress and other selective agents that cause climatic adaptation. Here, we characterize drought-responsive expression divergence between upland (drought-adapted) and lowland (mesic) ecotypes of the perennial C4 grass, Panicum hallii, in natural field conditions. Overall, we find that cis-regulatory elements contributed to gene expression divergence across 47% of genes, 7.2% of which exhibit drought-responsive G×E. While less well-represented, we observe 1294 genes (7.8%) with trans effects. Trans-by-environment interactions are weaker and much less common than cis G×E, occurring in only 0.7% of trans-regulated genes. Finally, gene expression heterosis is highly enriched in expression phenotypes with significant G×E. As such, modes of inheritance that drive heterosis, such as dominance or overdominance, may be common among G×E genes. Interestingly, motifs specific to drought-responsive transcription factors are highly enriched in the promoters of genes exhibiting G×E and trans regulation, indicating that expression G×E and heterosis may result from the evolution of transcription factors or their binding sites. P. hallii serves as the genomic model for its close relative and emerging biofuel crop, switchgrass (Panicum virgatum). Accordingly, the results here not only aid in the discovery of the genetic mechanisms that underlie local adaptation but also provide a foundation to improve switchgrass yield under water-limited conditions. © 2016 Lovell et al.; Published by Cold Spring Harbor Laboratory Press.
Drought responsive gene expression regulatory divergence between upland and lowland ecotypes of a perennial C4 grass

PubMed Central

Lovell, John T.; Schwartz, Scott; Lowry, David B.; Shakirov, Eugene V.; Bonnette, Jason E.; Weng, Xiaoyu; Wang, Mei; Johnson, Jenifer; Sreedasyam, Avinash; Plott, Christopher; Jenkins, Jerry; Schmutz, Jeremy; Juenger, Thomas E.

2016-01-01

Climatic adaptation is an example of a genotype-by-environment interaction (G×E) of fitness. Selection upon gene expression regulatory variation can contribute to adaptive phenotypic diversity; however, surprisingly few studies have examined how genome-wide patterns of gene expression G×E are manifested in response to environmental stress and other selective agents that cause climatic adaptation. Here, we characterize drought-responsive expression divergence between upland (drought-adapted) and lowland (mesic) ecotypes of the perennial C4 grass, Panicum hallii, in natural field conditions. Overall, we find that cis-regulatory elements contributed to gene expression divergence across 47% of genes, 7.2% of which exhibit drought-responsive G×E. While less well-represented, we observe 1294 genes (7.8%) with trans effects. Trans-by-environment interactions are weaker and much less common than cis G×E, occurring in only 0.7% of trans-regulated genes. Finally, gene expression heterosis is highly enriched in expression phenotypes with significant G×E. As such, modes of inheritance that drive heterosis, such as dominance or overdominance, may be common among G×E genes. Interestingly, motifs specific to drought-responsive transcription factors are highly enriched in the promoters of genes exhibiting G×E and trans regulation, indicating that expression G×E and heterosis may result from the evolution of transcription factors or their binding sites. P. hallii serves as the genomic model for its close relative and emerging biofuel crop, switchgrass (Panicum virgatum). Accordingly, the results here not only aid in the discovery of the genetic mechanisms that underlie local adaptation but also provide a foundation to improve switchgrass yield under water-limited conditions. PMID:26953271
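The abstract does not describe how cis and trans effects were separated. One widely used approach, assumed here purely for illustration, compares expression between the two parental lines with allele-specific expression in their F1 hybrid: because both alleles share the same trans environment in the hybrid, the allelic ratio there estimates the cis effect, and whatever parental divergence remains is attributed to trans. The function name and the expression values below are hypothetical.

```python
from math import log2


def cis_trans_effects(parent_a: float, parent_b: float,
                      f1_allele_a: float, f1_allele_b: float):
    """Split expression divergence into cis and trans log2 components.

    parent_a, parent_b : expression of the gene in each parental line
    f1_allele_a/b      : allele-specific expression in the F1 hybrid
    Returns (cis, trans), where cis + trans equals the parental log2 ratio.
    """
    values = (parent_a, parent_b, f1_allele_a, f1_allele_b)
    if any(v <= 0 for v in values):
        raise ValueError("expression values must be positive")
    total = log2(parent_a / parent_b)        # overall divergence
    cis = log2(f1_allele_a / f1_allele_b)    # shared trans background
    trans = total - cis                      # remainder
    return cis, trans


# Illustrative values: 4-fold parental difference, 2-fold allelic difference.
print(cis_trans_effects(400, 100, 200, 100))  # (1.0, 1.0)
```

Repeating the decomposition under control and drought conditions and comparing the two estimates would give a rough picture of cis-by-environment and trans-by-environment interactions; real analyses fit statistical models to replicated counts rather than single ratios.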
Cis-regulatory RNA elements that regulate specialized ribosome activity.

PubMed

Xue, Shifeng; Barna, Maria

2015-01-01

Recent evidence has shown that the ribosome itself can play a highly regulatory role in the specialized translation of specific subpools of mRNAs, in particular at the level of ribosomal proteins (RP). However, the mechanism(s) by which this selection takes place has remained poorly understood. In our recent study, we discovered a combination of unique RNA elements in the 5'UTRs of mRNAs that allows for such control by the ribosome. These mRNAs contain a Translation Inhibitory Element (TIE) that inhibits general cap-dependent translation, and an Internal Ribosome Entry Site (IRES) that relies on a specific RP for activation. The unique combination of an inhibitor of general translation and an activator of specialized translation is key to ribosome-mediated control of gene expression. Here we discuss how these RNA regulatory elements provide a new level of control to protein expression and their implications for gene expression, organismal development and evolution.
Entry and Exit Mechanisms at the cis-Face of the Golgi Complex

PubMed Central

Lorente-Rodríguez, Andrés; Barlowe, Charles

2011-01-01

Vesicular transport of protein and lipid cargo from the endoplasmic reticulum (ER) to cis-Golgi compartments depends on coat protein complexes, Rab GTPases, tethering factors, and membrane fusion catalysts. ER-derived vesicles deliver cargo to an ER-Golgi intermediate compartment (ERGIC) that then fuses with and/or matures into cis-Golgi compartments. The forward transport pathway to cis-Golgi compartments is balanced by a retrograde directed pathway that recycles transport machinery back to the ER. How trafficking through the ERGIC and cis-Golgi is coordinated to maintain organelle structure and function is poorly understood and highlights central questions regarding trafficking routes and organization of the early secretory pathway. PMID:21482742
Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas

PubMed Central

Petrov, Anton I.; Zirbel, Craig L.; Leontis, Neocles B.

2013-01-01

The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access. PMID:23970545

ssHMM: extracting intuitive sequence-structure motifs from high-throughput RNA-binding protein data

PubMed Central

Krestel, Ralf; Ohler, Uwe; Vingron, Martin; Marsico, Annalisa

2017-01-01

RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account, or employ models which are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling which fully captures the relationship between RNA sequence and secondary structure preference of a given RBP. Compared to previous methods which output separate logos for sequence and structure, it directly produces a combined sequence-structure motif when trained on a large set of sequences. ssHMM’s model is visualized intuitively as a graph and facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, it recovers known RBP motifs from CLIP-Seq data, and scales linearly on the input size, being considerably faster than MEMERIS and RNAcontext on large datasets while being on par with GraphProt. It is freely available on GitHub and as a Docker image. PMID:28977546
Overlapping ETS and CRE Motifs (G/CCGGAAGTGACGTCA) Preferentially Bound by GABPα and CREB Proteins

PubMed Central

Chatterjee, Raghunath; Zhao, Jianfei; He, Ximiao; Shlyakhtenko, Andrey; Mann, Ishminder; Waterfall, Joshua J.; Meltzer, Paul; Sathyanarayana, B. K.; FitzGerald, Peter C.; Vinson, Charles

2012-01-01

Previously, we identified 8-bps long DNA sequences (8-mers) that localize in human proximal promoters and grouped them into known transcription factor binding sites (TFBS). We now examine split 8-mers consisting of two 4-mers separated by 1-bp to 30-bps (X4-N1-30-X4) to identify pairs of TFBS that localize in proximal promoters at a precise distance. These include two overlapping TFBS: the ETS⇔ETS motif (C/GCCGGAAGCGGAA) and the ETS⇔CRE motif (C/GCGGAAGTGACGTCAC). The nucleotides in bold are part of both TFBS. Molecular modeling shows that the ETS⇔CRE motif can be bound simultaneously by both the ETS and the B-ZIP domains without protein-protein clashes. The electrophoretic mobility shift assay (EMSA) shows that the ETS protein GABPα and the B-ZIP protein CREB preferentially bind to the ETS⇔CRE motif only when the two TFBS overlap precisely. In contrast, the ETS domain of ETV5 and CREB interfere with each other for binding the ETS⇔CRE. The 11-mer (CGGAAGTGACG), the conserved part of the ETS⇔CRE motif, occurs 226 times in the human genome and 83% are in known regulatory regions. In vivo GABPα and CREB ChIP-seq peaks identified the ETS⇔CRE as the most enriched motif occurring in promoters of genes involved in mRNA processing, cellular catabolic processes, and stress response, suggesting that a specific class of genes is regulated by this composite motif. PMID:23050235
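Counting genomic occurrences of a fixed k-mer such as the conserved 11-mer can be sketched as a scan of both strands. This is a minimal illustration, not the authors' pipeline; the function names and the toy sequence are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def count_both_strands(genome, motif):
    """Count motif occurrences (overlaps included) on both strands."""
    # A set avoids counting a palindromic motif twice at the same position.
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return sum(genome[i:i + k] in targets for i in range(len(genome) - k + 1))

ETS_CRE_CORE = "CGGAAGTGACG"  # conserved 11-mer quoted in the abstract
# Hypothetical toy sequence with one site on each strand
toy = "TT" + ETS_CRE_CORE + "AAA" + reverse_complement(ETS_CRE_CORE) + "TT"
print(count_both_strands(toy, ETS_CRE_CORE))
```

For a whole genome the same loop would be run per chromosome, and the hit positions could then be intersected with annotated regulatory regions.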
Generation of oscillating gene regulatory network motifs

NASA Astrophysics Data System (ADS)

van Dorp, M.; Lannoo, B.; Carlon, E.

2013-07-01

Using an improved version of an evolutionary algorithm originally proposed by François and Hakim [Proc. Natl. Acad. Sci. USA 101, 580 (2004), doi:10.1073/pnas.0304532101], we generated small gene regulatory networks in which the concentration of a target protein oscillates in time. These networks may serve as candidates for oscillatory modules to be found in larger regulatory networks and protein interaction networks. The algorithm was run 10^5 times to produce a large set of oscillating modules, which were systematically classified and analyzed. The robustness of the oscillations against variations of the kinetic rates was also determined, to filter out the least robust cases. Furthermore, we show that the set of evolved networks can serve as a database of models whose behavior can be compared to experimentally observed oscillations. The algorithm found three smallest (core) oscillators in which nonlinearities and number of components are minimal. Two of those are two-gene modules: the mixed feedback loop, already discussed in the literature, and an autorepressed gene coupled with a heterodimer. The third one is a single gene module which is competitively regulated by a monomer and a dimer. The evolutionary algorithm also generated larger oscillating networks, which are in part extensions of the three core modules and in part genuinely new modules. The latter includes oscillators which do not rely on feedback induced by transcription factors, but are purely of post-transcriptional type. Analysis of post-transcriptional mechanisms of oscillation may provide useful information for circadian clock research, as recent experiments showed that circadian rhythms are maintained even in the absence of transcription.
Characterization of new regulatory elements within the Drosophila bithorax complex.

PubMed

Pérez-Lluch, Sílvia; Cuartero, Sergi; Azorín, Fernando; Espinàs, M Lluïsa

2008-12-01

The homeotic Abdominal-B (Abd-B) gene expression depends on a modular cis-regulatory region divided into discrete functional domains (iab) that control the expression of the gene in a particular segment of the fly. These domains contain regulatory elements implicated in both initiation and maintenance of homeotic gene expression and elements that separate the different domains. In this paper we have performed an extensive analysis of the iab-6 regulatory region, which regulates Abd-B expression at abdominal segment A6 (PS11), and we have characterized two new polycomb response elements (PREs) within this domain. We report that PREs at Abd-B cis-regulatory domains present a particular chromatin structure which is nuclease accessible all along Drosophila development and both in active and repressed states. We also show that one of these regions contains a dCTCF and CP190 dependent activity in transgenic enhancer-blocking assays, suggesting that it corresponds to the Fab-6 boundary element of the Drosophila bithorax complex.
Metabolism of oral 9-cis-retinoic acid in the human. Identification of 9-cis-retinoyl-beta-glucuronide and 9-cis-4-oxo-retinoyl-beta-glucuronide as urinary metabolites.

PubMed

Sass, J O; Masgrau, E; Saurat, J H; Nau, H

1995-09-01

Data from a number of investigators suggest that the 9-cis isomer of retinoic acid (9-cis-RA) may be a promising agent in chemoprevention and treatment of certain types of cancer. Therefore, clinical studies on this retinoid have been initiated. However, up to now, no information has been published on the metabolism of 9-cis-RA in the human. Herein, we report the first data on retinoid metabolism after multiple administration of 9-cis-RA (20 mg/day po) to human volunteers. After 2 and 12-13 hr, plasma concentrations of 9-cis-RA and its metabolites 9,13-dicis-RA, 13-cis-RA, and all-trans-RA were low. In contrast, dosing with 13-cis-RA yielded much higher plasma retinoid levels. Effects on plasma retinol concentrations did not become obvious after any drug treatment. Several retinoid metabolites were found in the urine of 9-cis-RA-treated individuals, and 9-cis-RAG, as well as 9-cis-4-oxo-RAG, could be identified. After treatment with 9-cis-RA, high concentrations of the administered drug were found in the feces, along with comparably low concentrations of 13-cis-RA, 9,13-dicis-RA, and all-trans-RA. Our report indicates that 9-cis-RA is either eliminated much more rapidly than 13-cis-RA, or it is poorly absorbed, and presents the characterization of two urinary glucuronides.
A dinucleotide motif in oligonucleotides shows potent immunomodulatory activity and overrides species-specific recognition observed with CpG motif.

PubMed

Kandimalla, Ekambar R; Bhagat, Lakshmi; Zhu, Fu-Gang; Yu, Dong; Cong, Yan-Ping; Wang, Daqing; Tang, Jimmy X; Tang, Jin-Yan; Knetter, Cathrine F; Lien, Egil; Agrawal, Sudhir

2003-11-25

Bacterial and synthetic DNAs containing CpG dinucleotides in specific sequence contexts activate the vertebrate immune system through Toll-like receptor 9 (TLR9). In the present study, we used a synthetic nucleoside with a bicyclic heterobase [1-(2'-deoxy-beta-d-ribofuranosyl)-2-oxo-7-deaza-8-methyl-purine; R] to replace the C in CpG, resulting in an RpG dinucleotide. The RpG dinucleotide was incorporated in mouse- and human-specific motifs in oligodeoxynucleotides (oligos) and 3'-3'-linked oligos, referred to as immunomers. Oligos containing the RpG motif induced cytokine secretion in mouse spleen-cell cultures. Immunomers containing RpG dinucleotides showed activity in transfected-HEK293 cells stably expressing mouse TLR9, suggesting direct involvement of TLR9 in the recognition of the RpG motif. In J774 macrophages, RpG motifs activated NF-kappa B and mitogen-activated protein kinase pathways. Immunomers containing the RpG dinucleotide induced high levels of IL-12 and IFN-gamma, but lower IL-6, in a time- and concentration-dependent fashion in mouse spleen-cell cultures costimulated with IL-2. Importantly, immunomers containing GTRGTT and GARGTT motifs were recognized to a similar extent by both mouse and human immune systems. Additionally, both mouse- and human-specific RpG immunomers potently stimulated proliferation of peripheral blood mononuclear cells obtained from diverse vertebrate species, including monkey, pig, horse, sheep, goat, rat, and chicken. An immunomer containing the GTRGTT motif prevented conalbumin-induced and ragweed allergen-induced allergic inflammation in mice. We show that a synthetic bicyclic nucleotide is recognized in the C position of a CpG dinucleotide by immune cells from diverse vertebrate species without bias for flanking sequences, suggesting a divergent nucleotide motif recognition pattern of TLR9.
An Active Insect Kinin Analog with 4-Aminopyroglutamate, A Novel cis-Peptide Bond, Type VI beta-Turn Motif

DTIC Science & Technology

2004-01-01

2004 Wiley Periodicals, Inc. Biopolymers 75: 412–419, 2004. Keywords: 4-aminopyroglutamic acid; cis-peptide bond; β-turn mimetic; constrained insect...biological evaluation of an insect kinin analog containing a novel, (2S,4S)-4-aminopyroglutamic acid (APy) component (Figure 1) that theoretical and...cricket diuretic bioassay system. FIGURE 1 A comparison of the structures of the tetrazole ([CN4], left) and 4-aminopyroglutamic acid (APy; right
Dynamic changes in Sox2 spatio-temporal expression promote the second cell fate decision through Fgf4/Fgfr2 signaling in preimplantation mouse embryos.

PubMed

Mistri, Tapan Kumar; Arindrarto, Wibowo; Ng, Wei Ping; Wang, Choayang; Lim, Leng Hiong; Sun, Lili; Chambers, Ian; Wohland, Thorsten; Robson, Paul

2018-03-20

Oct4 and Sox2 regulate the expression of target genes such as Nanog, Fgf4, and Utf1, by binding to their respective regulatory motifs. Their functional cooperation is reflected in their ability to heterodimerize on adjacent cis regulatory motifs, the composite Sox/Oct motif. Given that Oct4 and Sox2 regulate many developmental genes, a quantitative analysis of their synergistic action on different Sox/Oct motifs would yield valuable insights into the mechanisms of early embryonic development. In the present study, we measured binding affinities of Oct4 and Sox2 to different Sox/Oct motifs using fluorescence correlation spectroscopy. We found that the synergistic binding interaction is driven mainly by the level of Sox2 in the case of the Fgf4 Sox/Oct motif. Taking into account that Sox2 expression levels fluctuate more than those of Oct4, our finding provides an explanation of how Sox2 controls the segregation of the epiblast and primitive endoderm populations within the inner cell mass of the developing rodent blastocyst. © 2018 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society.
A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data

PubMed Central

2014-01-01

ChIP-Seq (chromatin immunoprecipitation sequencing) offers an advantage for motif finding because ChIP-Seq experiments narrow the search to binding site locations. Recent motif finding tools facilitate motif detection by providing user-friendly Web interfaces. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommend that users apply multiple motif finding Web tools that implement different algorithms to obtain significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provide suggestions for the future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data. Reviewers: This article was reviewed by Prof. Sandor Pongor, Dr. Yuriy Gusev, and Dr. Shyam Prabhakar (nominated by Prof. Limsoon Wong). PMID:24555784
La-related protein 1 (LARP1) binds the mRNA cap, blocking eIF4F assembly on TOP mRNAs

PubMed Central

Lahr, Roni M; Fonseca, Bruno D; Ciotti, Gabrielle E; Al-Ashtal, Hiba A; Jia, Jian-Jun; Niklaus, Marius R; Blagden, Sarah P; Alain, Tommy; Berman, Andrea J

2017-01-01

The 5’terminal oligopyrimidine (5’TOP) motif is a cis-regulatory RNA element located immediately downstream of the 7-methylguanosine [m7G] cap of TOP mRNAs, which encode ribosomal proteins and translation factors. In eukaryotes, this motif coordinates the synchronous and stoichiometric expression of the protein components of the translation machinery. La-related protein 1 (LARP1) binds TOP mRNAs, regulating their stability and translation. We present crystal structures of the human LARP1 DM15 region in complex with a 5’TOP motif, a cap analog (m7GTP), and a capped cytidine (m7GpppC), resolved to 2.6, 1.8 and 1.7 Å, respectively. Our binding, competition, and immunoprecipitation data corroborate and elaborate on the mechanism of 5’TOP motif binding by LARP1. We show that LARP1 directly binds the cap and adjacent 5’TOP motif of TOP mRNAs, effectively impeding access of eIF4E to the cap and preventing eIF4F assembly. Thus, LARP1 is a specialized TOP mRNA cap-binding protein that controls ribosome biogenesis. DOI: http://dx.doi.org/10.7554/eLife.24146.001 PMID:28379136
Placing a Disrupted Degradation Motif at the C Terminus of Proteasome Substrates Attenuates Degradation without Impairing Ubiquitylation*

PubMed Central

Alfassy, Omri S.; Cohen, Itamar; Reiss, Yuval; Tirosh, Boaz; Ravid, Tommer

2013-01-01

Protein elimination by the ubiquitin-proteasome system requires the presence of a cis-acting degradation signal. Efforts to discern degradation signals of misfolded proteasome substrates thus far revealed a general mechanism whereby the exposure of cryptic hydrophobic motifs provides a degradation determinant. We have previously characterized such a determinant, employing the yeast kinetochore protein Ndc10 as a model substrate. Ndc10 is essentially a stable protein that is rapidly degraded upon exposure of a hydrophobic motif located at the C-terminal region. The degradation motif comprises two distinct and essential elements: DegA, encompassing two amphipathic helices, and DegB, a hydrophobic sequence within the loosely structured C-terminal tail of Ndc10. Here we show that the hydrophobic nature of DegB is irrelevant for the ubiquitylation of substrates containing the Ndc10 degradation motif, but is essential for proteasomal degradation. Mutant DegB, in which the hydrophobic sequence was disrupted, acted as a dominant degradation inhibitory element when expressed at the C-terminal regions of ubiquitin-dependent and -independent substrates of the 26S proteasome. This mutant stabilized substrates in both yeast and mammalian cells, indicative of a modular recognition moiety. The dominant function of the mutant DegB provides a powerful experimental tool for evaluating the physiological implications of stabilization of specific proteasome substrates in intact cells and for studying the associated pathological effects. PMID:23519465
The nitrogen responsive transcriptome in potato (Solanum tuberosum L.) reveals significant gene regulatory motifs.

PubMed

Gálvez, José Héctor; Tai, Helen H; Lagüe, Martin; Zebarth, Bernie J; Strömvik, Martina V

2016-05-19

Nitrogen (N) is the most important nutrient for the growth of potato (Solanum tuberosum L.). Foliar gene expression in potato plants with and without N supplementation at 180 kg N ha^-1 was compared at mid-season. Genes with consistent differences in foliar expression due to N supplementation over three cultivars and two developmental time points were examined. In total, thirty genes were found to be over-expressed and nine genes were found to be under-expressed with supplemented N. Functional relationships between over-expressed genes were found. The main metabolic pathway represented among differentially expressed genes was amino acid metabolism. The 1000 bp upstream flanking regions of the differentially expressed genes were analysed and nine overrepresented motifs were found using three motif discovery algorithms (Seeder, Weeder and MEME). These results point to coordinated gene regulation at the transcriptional level controlling steady state potato responses to N sufficiency.
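Extracting a fixed-length upstream flanking region before motif discovery can be sketched as follows. The coordinate convention (0-based, half-open), the function names, and the toy sequence are assumptions made for illustration; the abstract does not describe how the regions were extracted.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]

def upstream_region(chrom_seq, gene_start, gene_end, strand, length=1000):
    """Return up to `length` bp upstream of a gene, read 5'->3' on the gene's strand.

    Assumed convention: 0-based, half-open gene interval [gene_start, gene_end).
    """
    if strand == "+":
        # Upstream lies to the left; clamp at the chromosome start.
        return chrom_seq[max(0, gene_start - length):gene_start]
    # For a minus-strand gene, upstream lies to the right of gene_end.
    return reverse_complement(chrom_seq[gene_end:gene_end + length])

toy_chrom = "ACGTACGTTTGGCCAA"  # hypothetical 16 bp "chromosome"
print(upstream_region(toy_chrom, 4, 8, "+", length=3))
print(upstream_region(toy_chrom, 4, 8, "-", length=3))
```

The resulting sequences would typically be written to a FASTA file and passed to a motif discovery tool.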
The nitrogen responsive transcriptome in potato (Solanum tuberosum L.) reveals significant gene regulatory motifs

PubMed Central

Gálvez, José Héctor; Tai, Helen H.; Lagüe, Martin; Zebarth, Bernie J.; Strömvik, Martina V.

2016-01-01

Nitrogen (N) is the most important nutrient for the growth of potato (Solanum tuberosum L.). Foliar gene expression in potato plants with and without N supplementation at 180 kg N ha^-1 was compared at mid-season. Genes with consistent differences in foliar expression due to N supplementation over three cultivars and two developmental time points were examined. In total, thirty genes were found to be over-expressed and nine genes were found to be under-expressed with supplemented N. Functional relationships between over-expressed genes were found. The main metabolic pathway represented among differentially expressed genes was amino acid metabolism. The 1000 bp upstream flanking regions of the differentially expressed genes were analysed and nine overrepresented motifs were found using three motif discovery algorithms (Seeder, Weeder and MEME). These results point to coordinated gene regulation at the transcriptional level controlling steady state potato responses to N sufficiency. PMID:27193058
The CcpA regulon of Streptococcus suis reveals novel insights into the regulation of the streptococcal central carbon metabolism by binding of CcpA to two distinct binding motifs.

PubMed

Willenborg, Jörg; de Greeff, Astrid; Jarek, Michael; Valentin-Weigand, Peter; Goethe, Ralph

2014-04-01

Streptococcus suis (S. suis) is a neglected zoonotic streptococcus causing fatal diseases in humans and in pigs. The transcriptional regulator CcpA (catabolite control protein A) is involved in the metabolic adaptation to different carbohydrate sources and virulence of S. suis and other pathogenic streptococci. In this study, we determined the DNA binding characteristics of CcpA and identified the CcpA regulon during growth of S. suis. Electrophoretic mobility shift analyses showed promiscuous DNA binding of CcpA to cognate cre sites in vitro. In contrast, sequencing of immunoprecipitated chromatin revealed two specific consensus motifs, a pseudo-palindromic cre motif (WWGAAARCGYTTTCWW) and a novel cre2 motif (TTTTYHWDHHWWTTTY), within the regulatory elements of the genes directly controlled by CcpA. Via these elements CcpA regulates expression of genes involved in carbohydrate uptake and conversion, and in addition in important metabolic pathways of the central carbon metabolism, like glycolysis, mixed-acid fermentation, and the fragmentary TCA cycle. Furthermore, our analyses provide evidence that CcpA regulates the genes of the central carbon metabolism by binding either the pseudo-palindromic cre motif or the cre2 motif in a HPr(Ser)∼P independent conformation. © 2014 John Wiley & Sons Ltd.
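The two consensus motifs are written in IUPAC degenerate nucleotide codes (for example W = A or T, R = A or G, Y = C or T). A minimal sketch of scanning a sequence for such a consensus by translating it into a regular expression; the helper names and toy sequences are hypothetical, and only the given strand is scanned.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(consensus):
    """Compile an IUPAC consensus; the lookahead reports overlapping sites too."""
    return re.compile("(?=(" + "".join(IUPAC[c] for c in consensus) + "))")

def find_sites(seq, consensus):
    """Return (start, matched_sequence) for every site on the given strand."""
    return [(m.start(), m.group(1)) for m in iupac_to_regex(consensus).finditer(seq)]

CRE = "WWGAAARCGYTTTCWW"    # pseudo-palindromic cre motif from the abstract
CRE2 = "TTTTYHWDHHWWTTTY"   # cre2 motif from the abstract

toy = "GGATGAAAGCGCTTTCATGG"  # hypothetical sequence containing one cre site
print(find_sites(toy, CRE))
```

Exact consensus matching is strict; position weight matrices are commonly used when mismatches should be tolerated.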
[Prediction of Promoter Motifs in Virophages].

PubMed

Gong, Chaowen; Zhou, Xuewen; Pan, Yingjie; Wang, Yongjie

2015-07-01

Virophages have crucial roles in ecosystems and are the transport vectors of genetic materials. To shed light on regulation and control mechanisms in virophage-host systems as well as evolution between virophages and their hosts, the promoter motifs of virophages were predicted on the upstream regions of start codons using an analytical tool for prediction of promoter motifs: Multiple EM for Motif Elicitation. Seventeen potential promoter motifs were identified based on the E-value, location, number and length of promoters in genomes. Sputnik and zamilon motif 2 with AT-rich regions were distributed widely on genomes, suggesting that these motifs may be associated with regulation of the expression of various genes. Motifs containing the TCTA box were predicted to be late promoter motif in mavirus; motifs containing the ATCT box were the potential late promoter motif in the Ace Lake mavirus. AT-rich regions were identified on motif 2 in the Organic Lake virophage, motif 3 in Yellowstone Lake virophage (YSLV)1 and 2, motif 1 in YSLV3, and motif 1 and 2 in YSLV4, respectively. AT-rich regions were distributed widely on the genomes of virophages. All of these motifs may be promoter motifs of virophages. Our results provide insights into further exploration of temporal expression of genes in virophages as well as associations between virophages and giant viruses.
Initial deployment of the cardiogenic gene regulatory network in the basal chordate, Ciona intestinalis.

PubMed

Woznica, Arielle; Haeussler, Maximilian; Starobinska, Ella; Jemmett, Jessica; Li, Younan; Mount, David; Davidson, Brad

2012-08-01

The complex, partially redundant gene regulatory architecture underlying vertebrate heart formation has been difficult to characterize. Here, we dissect the primary cardiac gene regulatory network in the invertebrate chordate, Ciona intestinalis. The Ciona heart progenitor lineage is first specified by Fibroblast Growth Factor/Map Kinase (FGF/MapK) activation of the transcription factor Ets1/2 (Ets). Through microarray analysis of sorted heart progenitor cells, we identified the complete set of primary genes upregulated by FGF/Ets shortly after heart progenitor emergence. Combinatorial sequence analysis of these co-regulated genes generated a hypothetical regulatory code consisting of Ets binding sites associated with a specific co-motif, ATTA. Through extensive reporter analysis, we confirmed the functional importance of the ATTA co-motif in primary heart progenitor gene regulation. We then used the Ets/ATTA combination motif to successfully predict a number of additional heart progenitor gene regulatory elements, including an intronic element driving expression of the core conserved cardiac transcription factor, GATAa. This work significantly advances our understanding of the Ciona heart gene network. Furthermore, this work has begun to elucidate the precise regulatory architecture underlying the conserved, primary role of FGF/Ets in chordate heart lineage specification. Copyright © 2012 Elsevier Inc. All rights reserved.
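Predicting candidate elements from a combination motif can be sketched as a search for a primary site with a co-motif nearby. The abstract names only the ATTA co-motif; the GGAA core used here for the Ets site and the 30 bp window are assumptions for illustration, and only one strand is scanned.

```python
import re

def motif_positions(seq, motif):
    """Start positions of an exact motif, overlapping matches included."""
    return [m.start() for m in re.finditer("(?=" + re.escape(motif) + ")", seq)]

def paired_sites(seq, primary, co_motif, max_gap=30):
    """Pairs (primary_pos, co_pos) whose start positions lie within max_gap bp."""
    co_hits = motif_positions(seq, co_motif)
    return [(p, c) for p in motif_positions(seq, primary)
            for c in co_hits if abs(p - c) <= max_gap]

# Hypothetical sequence: one GGAA close to ATTA, one GGAA far from it
toy = "CCGGAAGTCCCATTACC" + "C" * 40 + "GGAAC"
print(paired_sites(toy, "GGAA", "ATTA", max_gap=30))
```

Candidates found this way would still need experimental validation, as with the reporter assays described in the abstract.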
Targeting Peripheral-Derived Regulatory T Cells as a Means of Enhancing Immune Responses Directed against Prostate Cancer

DTIC Science & Technology

2017-08-01

Award Number: W81XWH-15-1-0328...1 August 2016 - 31 July 2017...discovered that a subset of regulatory T cells (Tregs), termed peripheral-derived Tregs (pTregs), impair immune responses directed against tumor
A transcription factor collective defines the HSN serotonergic neuron regulatory landscape.

PubMed

Lloret-Fernández, Carla; Maicas, Miren; Mora-Martínez, Carlos; Artacho, Alejandro; Jimeno-Martín, Ángela; Chirivella, Laura; Weinberg, Peter; Flames, Nuria

2018-03-22

Cell differentiation is controlled by individual transcription factors (TFs) that together activate a selection of enhancers in specific cell types. How these combinations of TFs identify and activate their target sequences remains poorly understood. Here, we identify the cis-regulatory transcriptional code that controls the differentiation of serotonergic HSN neurons in Caenorhabditis elegans. Activation of the HSN transcriptome is directly orchestrated by a collective of six TFs. Binding site clusters for this TF collective form a regulatory signature that is sufficient for de novo identification of HSN neuron functional enhancers. Among C. elegans neurons, the HSN transcriptome most closely resembles that of mouse serotonergic neurons. Mouse orthologs of the HSN TF collective also regulate serotonergic differentiation and can functionally substitute for their worm counterparts which suggests deep homology. Our results identify rules governing the regulatory landscape of a critically important neuronal type in two species separated by over 700 million years. © 2018, Lloret-Fernández et al.
A transcription factor collective defines the HSN serotonergic neuron regulatory landscape

PubMed Central

Artacho, Alejandro; Jimeno-Martín, Ángela; Chirivella, Laura; Weinberg, Peter

2018-01-01

Cell differentiation is controlled by individual transcription factors (TFs) that together activate a selection of enhancers in specific cell types. How these combinations of TFs identify and activate their target sequences remains poorly understood. Here, we identify the cis-regulatory transcriptional code that controls the differentiation of serotonergic HSN neurons in Caenorhabditis elegans. Activation of the HSN transcriptome is directly orchestrated by a collective of six TFs. Binding site clusters for this TF collective form a regulatory signature that is sufficient for de novo identification of HSN neuron functional enhancers. Among C. elegans neurons, the HSN transcriptome most closely resembles that of mouse serotonergic neurons. Mouse orthologs of the HSN TF collective also regulate serotonergic differentiation and can functionally substitute for their worm counterparts which suggests deep homology. Our results identify rules governing the regulatory landscape of a critically important neuronal type in two species separated by over 700 million years. PMID:29553368
Organocatalytic, Diastereo- and Enantioselective Synthesis of Nonsymmetric cis-Stilbene Diamines: A Platform for the Preparation of Single-Enantiomer cis-Imidazolines for Protein–Protein Inhibition

PubMed Central

2015-01-01

The finding by scientists at Hoffmann-La Roche that cis-imidazolines could disrupt the protein–protein interaction between p53 and MDM2, thereby inducing apoptosis in cancer cells, raised considerable interest in this scaffold over the past decade. Initial routes to these small molecules (i.e., Nutlin-3) provided only the racemic form, with enantiomers being enriched by chromatographic separation using high-pressure liquid chromatography (HPLC) and a chiral stationary phase. Reported here is the first application of an enantioselective aza-Henry approach to nonsymmetric cis-stilbene diamines and cis-imidazolines. Two novel mono(amidine) organocatalysts (MAM) were discovered to provide high levels of enantioselection (>95% ee) across a broad range of substrate combinations. Furthermore, the versatility of the aza-Henry strategy for preparing nonsymmetric cis-imidazolines is illustrated by a comparison of the roles of aryl nitromethane and aryl aldimine in the key step, which revealed unique substrate electronic effects providing direction for aza-Henry substrate–catalyst matching. This method was used to prepare highly substituted cis-4,5-diaryl imidazolines that project unique aromatic rings, and these were evaluated for MDM2-p53 inhibition in a fluorescence polarization assay. The diversification of access to cis-stilbene diamine-derived imidazolines provided by this platform should streamline their further development as chemical tools for disrupting protein–protein interactions. PMID:25017623

Unveiling combinatorial regulation through the combination of ChIP information and in silico cis-regulatory module detection

PubMed Central

Sun, Hong; Guns, Tias; Fierro, Ana Carolina; Thorrez, Lieven; Nijssen, Siegfried; Marchal, Kathleen

2012-01-01

Computationally retrieving biologically relevant cis-regulatory modules (CRMs) is not straightforward. Because of the large number of candidates and the imperfection of the screening methods, many spurious CRMs are detected that are as high scoring as the biologically true ones. Using ChIP-information allows not only to reduce the regions in which the binding sites of the assayed transcription factor (TF) should be located, but also allows restricting the valid CRMs to those that contain the assayed TF (here referred to as applying CRM detection in a query-based mode). In this study, we show that exploiting ChIP-information in a query-based way makes in silico CRM detection a much more feasible endeavor. To be able to handle the large datasets, the query-based setting and other specificities proper to CRM detection on ChIP-Seq based data, we developed a novel powerful CRM detection method ‘CPModule’. By applying it on a well-studied ChIP-Seq data set involved in self-renewal of mouse embryonic stem cells, we demonstrate how our tool can recover combinatorial regulation of five known TFs that are key in the self-renewal of mouse embryonic stem cells. Additionally, we make a number of new predictions on combinatorial regulation of these five key TFs with other TFs documented in TRANSFAC. PMID:22422841
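The query-based idea, keeping only candidate modules that contain the assayed TF, can be illustrated with a brute-force enumeration over motif hits in ChIP-bound regions. This is not the CPModule algorithm; the function name, TF labels, and thresholds are hypothetical.

```python
from itertools import combinations

def query_based_modules(region_hits, query_tf, min_support=2, max_size=3):
    """Enumerate TF sets that contain query_tf and co-occur in >= min_support regions.

    region_hits maps a region name to the set of TFs with a motif hit there.
    Brute-force illustration only; it does not scale to large TF libraries.
    """
    # Query-based mode: only regions with a hit for the assayed TF are considered.
    regions = [tfs for tfs in region_hits.values() if query_tf in tfs]
    partners = sorted(set().union(*regions) - {query_tf}) if regions else []
    modules = {}
    for size in range(1, max_size):  # number of partner TFs besides the query
        for combo in combinations(partners, size):
            module = frozenset(combo) | {query_tf}
            support = sum(module <= tfs for tfs in regions)
            if support >= min_support:
                modules[module] = support
    return modules

# Hypothetical motif hits in four ChIP-bound regions
hits = {
    "peak1": {"A", "B", "C"},
    "peak2": {"A", "B"},
    "peak3": {"B", "D"},
    "peak4": {"A", "B", "C"},
}
for module, support in query_based_modules(hits, "A").items():
    print(sorted(module), support)
```

Restricting the search to sets that include the query TF shrinks the candidate space, which is the practical benefit the abstract attributes to the query-based setting.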
Single nucleotide resolution RNA-seq uncovers new regulatory mechanisms in the opportunistic pathogen Streptococcus agalactiae.

PubMed

Rosinski-Chupin, Isabelle; Sauvage, Elisabeth; Sismeiro, Odile; Villain, Adrien; Da Cunha, Violette; Caliot, Marie-Elise; Dillies, Marie-Agnès; Trieu-Cuot, Patrick; Bouloc, Philippe; Lartigue, Marie-Frédérique; Glaser, Philippe

2015-05-30

Streptococcus agalactiae, or Group B Streptococcus, is a leading cause of neonatal infections and an increasing cause of infections in adults with underlying diseases. In an effort to reconstruct the transcriptional networks involved in S. agalactiae physiology and pathogenesis, we performed an extensive and robust characterization of its transcriptome through a combination of differential RNA-sequencing in eight different growth conditions or genetic backgrounds and strand-specific RNA-sequencing. Our study identified 1,210 transcription start sites (TSSs) and 655 transcript ends as well as 39 riboswitches and cis-regulatory regions, 39 cis-antisense non-coding RNAs and 47 small RNAs potentially acting in trans. Among these putative regulatory RNAs, ten were differentially expressed in response to acid stress, and two riboswitches sensed the pH change directly or indirectly. Strikingly, 15% of the TSSs identified were associated with the incorporation of pseudo-templated nucleotides, showing that reiterative transcription is a pervasive process in S. agalactiae. In particular, 40% of the TSSs upstream of genes involved in nucleotide metabolism show reiterative transcription potentially regulating gene expression, as exemplified for pyrG and thyA, encoding the CTP synthase and the thymidylate synthase, respectively. This comprehensive map of the transcriptome at single-nucleotide resolution led to the discovery of new regulatory mechanisms in S. agalactiae. It also provides the basis for in-depth analyses of transcriptional networks in S. agalactiae and of the regulatory role of reiterative transcription following variations of intracellular nucleotide pools.
Potential Direct Regulators of the Drosophila yellow Gene Identified by Yeast One-Hybrid and RNAi Screens

PubMed Central

Kalay, Gizem; Lusk, Richard; Dome, Mackenzie; Hens, Korneel; Deplancke, Bart; Wittkopp, Patricia J.

2016-01-01

The regulation of gene expression controls development, and changes in this regulation often contribute to phenotypic evolution. Drosophila pigmentation is a model system for studying evolutionary changes in gene regulation, with differences in expression of pigmentation genes such as yellow that correlate with divergent pigment patterns among species shown to be caused by changes in cis- and trans-regulation. Currently, much more is known about the cis-regulatory component of divergent yellow expression than the trans-regulatory component, in part because very few trans-acting regulators of yellow expression have been identified. This study aims to improve our understanding of the trans-acting control of yellow expression by combining yeast-one-hybrid and RNAi screens for transcription factors binding to yellow cis-regulatory sequences and affecting abdominal pigmentation in adults, respectively. Of the 670 transcription factors included in the yeast-one-hybrid screen, 45 showed evidence of binding to one or more sequence fragments tested from the 5′ intergenic and intronic yellow sequences from D. melanogaster, D. pseudoobscura, and D. willistoni, suggesting that they might be direct regulators of yellow expression. Of the 670 transcription factors included in the yeast-one-hybrid screen, plus another TF previously shown to be genetically upstream of yellow, 125 were also tested using RNAi, and 32 showed altered abdominal pigmentation. Nine transcription factors were identified in both screens, including four nuclear receptors related to ecdysone signaling (Hr78, Hr38, Hr46, and Eip78C). This finding suggests that yellow expression might be directly controlled by nuclear receptors influenced by ecdysone during early pupal development when adult pigmentation is forming. PMID:27527791
Activity of the rat osteocalcin basal promoter in osteoblastic cells is dependent upon homeodomain and CP1 binding motifs.

PubMed

Towler, D A; Bennett, C D; Rodan, G A

1994-05-01

A detailed analysis of the transcriptional machinery responsible for osteoblast-specific gene expression should provide tools useful for understanding osteoblast commitment and differentiation. We have defined three cis-elements important for basal activity of the rat osteocalcin (OC) promoter, located at about -200 to -180, -170 to -138, and -121 to -64 relative to the transcription initiation site. A motif (TCTGATTGTGT) present in the region between -200 and -170 that binds a multisubunit CP1/NFY/CBF-like CAAT factor complex contributes significantly to high level basal activity and presumably functions as the CAAT box for the rat OC promoter. We show that the region -121 to 32 is sufficient to confer osteoblastic cell type specificity in transient transfection assays of cultured cell lines using luciferase as a reporter. The basal promoter is active in rodent osteoblastic cell lines, but not in rodent fibroblastic or muscle cell lines. Although the rat OC box (-100 to -74) contains a CAAT motif, we could not detect CP1-like CAAT factor binding to this region. In fact, we demonstrate that a Msx-1 (Hox 7.1) homeodomain binding motif (ACTAATTG; bottom strand) in the 3'-end of the rat OC box is necessary for high level activity of the rat OC basal promoter in osteoblastic cells. A nuclear factor that recognizes this motif appears to be present in osteoblastic ROS 17/2.8 cells, which produce OC, but not in fibroblastic ROS 25/1 cells, which fail to express OC. This ROS 17/2.8 nuclear factor also recognizes the A/T-rich DNA cognates of the homeodomain-containing POU family of transcription factors. Taken together, these data suggest that a ubiquitous CP1-like CAAT factor and a cell type-restricted homeodomain containing (Msx or POU family) transcription factor interact with the proximal rat OC promoter to direct appropriate basal OC transcription in osteoblastic cells.
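Because the homeodomain motif above is reported on the bottom strand, locating it in a top-strand sequence means searching for its reverse complement. The following is a minimal sketch of a both-strand exact-match search; the test sequence is invented for illustration and is not the rat OC promoter.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_motif_both_strands(seq, motif):
    """Return sorted (0-based top-strand start, strand) pairs for exact matches.

    A palindromic motif would be reported once per strand at the same position.
    """
    hits = []
    for pattern, strand in ((motif, "+"), (reverse_complement(motif), "-")):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical top-strand fragment containing the bottom-strand motif.
toy_sequence = "GGCAATTAGTCC"
print(reverse_complement("ACTAATTG"))
print(find_motif_both_strands(toy_sequence, "ACTAATTG"))
```

Note that the top-strand reading of ACTAATTG begins with CAAT, which is consistent with the abstract's remark that the OC box contains a CAAT motif.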
Two distinct auto-regulatory loops operate at the PU.1 locus in B cells and myeloid cells

PubMed Central

Leddin, Mathias; Perrod, Chiara; Hoogenkamp, Maarten; Ghani, Saeed; Assi, Salam; Heinz, Sven; Wilson, Nicola K.; Follows, George; Schönheit, Jörg; Vockentanz, Lena; Mosammam, Ali M.; Chen, Wei; Tenen, Daniel G.; Westhead, David R.; Göttgens, Berthold

2011-01-01

The transcription factor PU.1 occupies a central role in controlling myeloid and early B-cell development, and its correct lineage-specific expression is critical for the differentiation choice of hematopoietic progenitors. However, little is known of how this tissue-specific pattern is established. We previously identified an upstream regulatory cis element whose targeted deletion in mice decreases PU.1 expression and causes leukemia. We show here that the upstream regulatory cis element alone is insufficient to confer physiologic PU.1 expression in mice but requires the cooperation with other, previously unidentified elements. Using a combination of transgenic studies, global chromatin assays, and detailed molecular analyses we present evidence that PU.1 is regulated by a novel mechanism involving cross talk between different cis elements together with lineage-restricted autoregulation. In this model, PU.1 regulates its expression in B cells and macrophages by differentially associating with cell type–specific transcription factors at one of its cis-regulatory elements to establish differential activity patterns at other elements. PMID:21239694
Modeling protein homopolymeric repeats: possible polyglutamine structural motifs for Huntington's disease.

PubMed

Lathrop, R H; Casale, M; Tobias, D J; Marsh, J L; Thompson, L M

1998-01-01

We describe a prototype system (Poly-X) for assisting an expert user in modeling protein repeats. Poly-X reduces the large number of degrees of freedom required to specify a protein motif in complete atomic detail. The result is a small number of parameters that are easily understood by, and under the direct control of, a domain expert. The system was applied to the polyglutamine (poly-Q) repeat in the first exon of huntingtin, the gene implicated in Huntington's disease. We present four poly-Q structural motifs: two poly-Q beta-sheet motifs (parallel and antiparallel) that constitute plausible alternatives to a similar previously published poly-Q beta-sheet motif, and two novel poly-Q helix motifs (alpha-helix and pi-helix). To our knowledge, helical forms of polyglutamine have not been proposed before. The motifs suggest that there may be several plausible aggregation structures for the intranuclear inclusion bodies which have been found in diseased neurons, and may help in the effort to understand the structural basis for Huntington's disease.
Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks

PubMed Central

Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis

2012-01-01

Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606
Epigenetic functions enriched in transcription factors binding to mouse recombination hotspots.

PubMed

Wu, Min; Kwoh, Chee-Keong; Przytycka, Teresa M; Li, Jing; Zheng, Jie

2012-06-21

The regulatory mechanism of recombination is a fundamental problem in genomics, with wide applications in genome-wide association studies, birth-defect diseases, molecular evolution, cancer research, etc. In mammalian genomes, recombination events cluster into short genomic regions called "recombination hotspots". Recently, a 13-mer motif enriched in hotspots was identified as a candidate cis-regulatory element of human recombination hotspots; moreover, a zinc finger protein, PRDM9, binds to this motif and is associated with variation of recombination phenotype in human and mouse genomes, and is thus a trans-acting regulator of recombination hotspots. However, this pair of cis- and trans-regulators covers only a fraction of hotspots, so other regulators of recombination hotspots remain to be discovered. In this paper, we propose an approach to predicting additional trans-regulators from DNA-binding proteins by comparing their enrichment of binding sites in hotspots. Applying this approach to newly mapped mouse hotspots genome-wide, we confirmed that PRDM9 is a major trans-regulator of hotspots. In addition, a list of top candidate trans-regulators of mouse hotspots is reported. Using GO analysis, we observed that the top genes are enriched for histone modification functions, highlighting the epigenetic regulatory mechanisms of recombination hotspots.
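The abstract does not state which statistic was used to compare binding-site enrichment in hotspots. As a generic stand-in, the sketch below computes a one-sided hypergeometric tail probability; the function name, parameter names and counts are all illustrative assumptions, not values from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_total, n_hotspot, n_bound, n_bound_hotspot):
    """P(X >= n_bound_hotspot) under a hypergeometric null.

    n_total         : all regions considered (hotspots plus background)
    n_hotspot       : regions that are hotspots
    n_bound         : regions carrying a binding site for the protein
    n_bound_hotspot : regions that are both hotspots and bound
    """
    upper = min(n_bound, n_hotspot)
    tail = sum(
        comb(n_hotspot, k) * comb(n_total - n_hotspot, n_bound - k)
        for k in range(n_bound_hotspot, upper + 1)
    )
    return tail / comb(n_total, n_bound)


# Made-up counts: 10 regions, 5 hotspots, 5 bound regions, all 5 in hotspots.
print(hypergeom_enrichment_p(10, 5, 5, 5))
```

Ranking DNA-binding proteins by such a p-value (after multiple-testing correction, which is omitted here) is one simple way to shortlist candidate trans-regulators.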
Epigenetic functions enriched in transcription factors binding to mouse recombination hotspots

PubMed Central

2012-01-01

The regulatory mechanism of recombination is a fundamental problem in genomics, with wide applications in genome-wide association studies, birth-defect diseases, molecular evolution, cancer research, etc. In mammalian genomes, recombination events cluster into short genomic regions called "recombination hotspots". Recently, a 13-mer motif enriched in hotspots was identified as a candidate cis-regulatory element of human recombination hotspots; moreover, a zinc finger protein, PRDM9, binds to this motif and is associated with variation of recombination phenotype in human and mouse genomes, and is thus a trans-acting regulator of recombination hotspots. However, this pair of cis- and trans-regulators covers only a fraction of hotspots, so other regulators of recombination hotspots remain to be discovered. In this paper, we propose an approach to predicting additional trans-regulators from DNA-binding proteins by comparing their enrichment of binding sites in hotspots. Applying this approach to newly mapped mouse hotspots genome-wide, we confirmed that PRDM9 is a major trans-regulator of hotspots. In addition, a list of top candidate trans-regulators of mouse hotspots is reported. Using GO analysis, we observed that the top genes are enriched for histone modification functions, highlighting the epigenetic regulatory mechanisms of recombination hotspots. PMID:22759569
A three-dimensional RNA motif in Potato spindle tuber viroid mediates trafficking from palisade mesophyll to spongy mesophyll in Nicotiana benthamiana.

PubMed

Takeda, Ryuta; Petrov, Anton I; Leontis, Neocles B; Ding, Biao

2011-01-01

Cell-to-cell trafficking of RNA is an emerging biological principle that integrates systemic gene regulation, viral infection, antiviral response, and cell-to-cell communication. A key mechanistic question is how an RNA is specifically selected for trafficking from one type of cell into another type. Here, we report the identification of an RNA motif in Potato spindle tuber viroid (PSTVd) required for trafficking from palisade mesophyll to spongy mesophyll in Nicotiana benthamiana leaves. This motif, called loop 6, has the sequence 5'-CGA-3'...5'-GAC-3' flanked on both sides by cis Watson-Crick G/C and G/U wobble base pairs. We present a three-dimensional (3D) structural model of loop 6 that specifies all non-Watson-Crick base pair interactions, derived by isostericity-based sequence comparisons with 3D RNA motifs from the RNA x-ray crystal structure database. The model is supported by available chemical modification patterns, natural sequence conservation/variations in PSTVd isolates and related species, and functional characterization of all possible mutants for each of the loop 6 base pairs. Our findings and approaches have broad implications for studying the 3D RNA structural motifs mediating trafficking of diverse RNA species across specific cellular boundaries and for studying the structure-function relationships of RNA motifs in other biological processes.
Identification of conserved cis-elements and transcription factors required for sterol-regulated transcription of stearoyl-CoA desaturase 1 and 2.

PubMed

Tabor, D E; Kim, J B; Spiegelman, B M; Edwards, P A

1999-07-16

We previously identified stearoyl-CoA desaturase 2 (SCD2) as a new member of the family of genes that are transcriptionally regulated in response to changing levels of nuclear sterol regulatory element binding proteins (SREBPs) or adipocyte determination and differentiation factor 1 (ADD1). A novel sterol regulatory element (SRE) (5'-AGCAGATTGTG-3') identified in the proximal promoter of the mouse SCD2 gene is required for induction of SCD2 promoter-reporter genes in response to cellular sterol depletion (Tabor, D. E., Kim, J. B., Spiegelman, B. M., and Edwards, P. A. (1998) J. Biol. Chem. 273, 22052-22058). In this report, we demonstrate that this novel SRE is both present in the promoter of the SCD1 gene and is critical for the sterol-dependent transcription of SCD1 promoter-reporter genes. Two conserved cis elements (5'-CCAAT-3') lie 5 and 48 base pairs 3' of the novel SREs in the promoters of both the SCD1 and SCD2 murine genes. Mutation of either of these putative NF-Y binding sites attenuates the transcriptional activation of SCD1 or SCD2 promoter-reporter genes in response to cellular sterol deprivation. Induction of both reporter genes is also attenuated when cells are cotransfected with dominant-negative forms of either NF-Y or SREBP. In addition, we demonstrate that the induction of SCD1 and SCD2 mRNAs that occurs during the differentiation of 3T3-L1 preadipocytes to adipocytes is paralleled by an increase in the levels of ADD1/SREBP-1c and that the SCD1 and SCD2 mRNAs are induced to even higher levels in response to ectopic expression of ADD1/SREBP-1c. We conclude that transcription of both SCD1 and SCD2 genes is responsive to cellular sterol levels and to the levels of nuclear SREBP/ADD1 and that transcriptional induction requires three spatially conserved cis elements, that bind SREBP and NF-Y. Additional studies demonstrate that maximal transcriptional repression of SCD2 reporter genes in response to an exogenous polyunsaturated fatty acid is
Characterization of the mouse junD promoter--high basal level activity due to an octamer motif.

PubMed Central

de Groot, R P; Karperien, M; Pals, C; Kruijer, W

1991-01-01

The product of the junD gene belongs to the Jun/Fos family of nuclear DNA binding transcription factors. This family regulates the expression of TPA responsive genes by binding to the TPA responsive element (TRE). Unlike its counterparts c-jun and junB, junD expression is hardly inducible by growth factors and phorbol esters. In fact, junD is constitutively expressed at high levels in a wide variety of cells. To unravel the molecular mechanisms underlying constitutive junD expression, we have cloned and characterized the mouse junD promoter. We show that the high constitutive expression is caused by multiple cis-acting elements in its promoter, including an SP1 binding site, an octamer motif, a CAAT box, a Zif268 binding site and a TRE-like sequence. The octamer motif is the major determinant of junD promoter activity, while somewhat smaller contributions are made by the TRE and Zif268 binding site. The SP1 and CAAT box are shown to be of minor importance. The junD TRE is in its behavior indistinguishable from previously identified TREs. However, the junD promoter is not TPA inducible due to the presence of the octamer motif. PMID:1714380
Space-related pharma-motifs for fast search of protein binding motifs and polypharmacological targets

PubMed Central

2012-01-01

Background: Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. As the number of protein structures grows exponentially, finding similar structural binding motifs (pharma-motifs) is an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. Results: We have developed a Space-Related Pharmamotifs (SRPmotif) method to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. Conclusions: SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and for drug discovery.
Space-related pharma-motifs for fast search of protein binding motifs and polypharmacological targets.

PubMed

Chiu, Yi-Yuan; Lin, Chun-Yu; Lin, Chih-Ta; Hsu, Kai-Cheng; Chang, Li-Zen; Yang, Jinn-Moon

2012-01-01

Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. As the number of protein structures grows exponentially, finding similar structural binding motifs (pharma-motifs) is an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. We have developed a Space-Related Pharmamotifs (SRPmotif) method to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and for drug discovery.
WebMOTIFS: automated discovery, filtering and scoring of DNA sequence motifs using multiple programs and Bayesian approaches

PubMed Central

Romer, Katherine A.; Kayombya, Guy-Richard; Fraenkel, Ernest

2007-01-01

WebMOTIFS provides a web interface that facilitates the discovery and analysis of DNA-sequence motifs. Several studies have shown that the accuracy of motif discovery can be significantly improved by using multiple de novo motif discovery programs and using randomized control calculations to identify the most significant motifs or by using Bayesian approaches. WebMOTIFS makes it easy to apply these strategies. Using a single submission form, users can run several motif discovery programs and score, cluster and visualize the results. In addition, the Bayesian motif discovery program THEME can be used to determine the class of transcription factors that is most likely to regulate a set of sequences. Input can be provided as a list of gene or probe identifiers. Used with the default settings, WebMOTIFS accurately identifies biologically relevant motifs from diverse data in several species. WebMOTIFS is freely available at http://fraenkel.mit.edu/webmotifs. PMID:17584794
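The abstract mentions randomized control calculations for identifying the most significant motifs but does not describe the procedure. A generic shuffle control, offered only as an assumption about what such a calculation might look like, compares the observed motif count with counts in shuffled versions of the same sequences. The sequences, motif and parameter defaults below are invented for illustration.

```python
import random


def count_occurrences(seqs, motif):
    """Total non-overlapping exact occurrences of motif across all sequences."""
    return sum(seq.count(motif) for seq in seqs)


def empirical_p_value(seqs, motif, n_shuffles=200, seed=0):
    """Fraction of shuffled sequence sets with at least as many motif hits.

    Each sequence is shuffled independently, which preserves its base
    composition but destroys its order. One pseudo-count is added so the
    estimate is never exactly zero.
    """
    rng = random.Random(seed)
    observed = count_occurrences(seqs, motif)
    at_least = 0
    for _ in range(n_shuffles):
        shuffled = ["".join(rng.sample(seq, len(seq))) for seq in seqs]
        if count_occurrences(shuffled, motif) >= observed:
            at_least += 1
    return (at_least + 1) / (n_shuffles + 1)


# Hypothetical input sequences and candidate motif.
toy_seqs = ["TTGACGTCATT", "AAGACGTCAAG", "CCGACGTCACC"]
print(empirical_p_value(toy_seqs, "GACGTCA"))
```

A mononucleotide shuffle is the simplest null model; controls that preserve dinucleotide frequencies are stricter but need more code than fits in a short sketch.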
Biological network motif detection and evaluation

PubMed Central

2011-01-01

Background: Molecular-level biological data can be assembled into system-level data as biological networks. Network motifs are defined as over-represented small connected subgraphs in networks, and they have been used for many biological applications. Since network motif discovery involves computationally challenging processes, previous algorithms have focused on computational efficiency. However, we believe that the biological quality of network motifs is also very important. Results: We define biological network motifs as biologically significant subgraphs, and refer to traditional network motifs as structural network motifs. We develop five algorithms, namely EDGEGO-BNM, EDGEBETWEENNESS-BNM, NMF-BNM, NMFGO-BNM and VOLTAGE-BNM, for efficient detection of biological network motifs, and introduce several evaluation measures, including motifs included in complex, motifs included in functional module, and GO term clustering score. Experimental results show that EDGEGO-BNM and EDGEBETWEENNESS-BNM perform better than existing algorithms and that all of our algorithms can also find structural network motifs. Conclusion: We provide new approaches to finding network motifs in biological networks. Our algorithms efficiently detect biological network motifs and further improve existing algorithms to find high quality structural network motifs, which would be impossible using existing algorithms. The performance of the algorithms is compared based on our new evaluation measures in biological contexts. We believe that our work gives some guidelines for network motif research on biological networks. PMID:22784624
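To make the notion of a structural network motif concrete, the sketch below counts one classic three-node pattern, the feed-forward loop (A→B, B→C, A→C), in a small directed graph. It is a brute-force illustration, not one of the five algorithms named in the abstract, and the edge list is invented.

```python
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count ordered node triples (a, b, c) with edges a->b, b->c and a->c."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    count = 0
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set:
            count += 1
    return count


# Hypothetical regulatory edges (regulator, target).
toy_edges = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
print(count_feed_forward_loops(toy_edges))
```

Counting alone does not establish that a pattern is a motif: over-representation is judged by comparing the count against the same count in randomized networks, a step omitted here for brevity.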
Coupling of tandem Smad ubiquitination regulatory factor (Smurf) WW domains modulates target specificity.

PubMed

Chong, P Andrew; Lin, Hong; Wrana, Jeffrey L; Forman-Kay, Julie D

2010-10-26

Smad ubiquitination regulatory factor 2 (Smurf2) is an E3 ubiquitin ligase that participates in degradation of TGF-β receptors and other targets. Smurf2 WW domains recognize PPXY (PY) motifs on ubiquitin ligase target proteins or on adapters, such as Smad7, that bind to E3 target proteins. We previously demonstrated that the isolated WW3 domain of Smurf2, but not the WW2 domain, can directly bind to a Smad7 PY motif. We show here that the WW2 augments this interaction by binding to the WW3 and making auxiliary contacts with the PY motif and a novel E/D-S/T-P motif, which is N-terminal to all Smad PY motifs. The WW2 likely enhances the selectivity of Smurf2 for the Smad proteins. NMR titrations confirm that Smad1 and Smad2 are bound by Smurf2 with the same coupled WW domain arrangement used to bind Smad7. The analogous WW domains in the short isoform of Smurf1 recognize the Smad7 PY peptide using the same coupled mechanism. However, a longer Smurf1 isoform, which has an additional 26 residues in the inter-WW domain linker, is only partially able to use the coupled WW domain binding mechanism. The longer linker results in a decrease in affinity for the Smad7 peptide. Interdomain coupling of WW domains enhances selectivity and enables the tuning of interactions by isoform switching.
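The PPXY (PY) motif recognized by WW domains is simple enough to scan for with a regular expression, where X stands for any residue. The sketch below is illustrative only; the peptide is a made-up string, not a Smad sequence, and a pattern match says nothing about whether a WW domain actually binds.

```python
import re

# "PP.Y" encodes P-P-X-Y; the lookahead lets overlapping matches be reported.
PY_MOTIF = re.compile(r"(?=(PP.Y))")


def find_py_motifs(protein):
    """Return (0-based start, matched residues) for each PPXY occurrence."""
    return [(m.start(), m.group(1)) for m in PY_MOTIF.finditer(protein)]


# Hypothetical peptide in one-letter amino acid code.
toy_peptide = "AESPPPYSRL"
print(find_py_motifs(toy_peptide))
```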
Coupling of tandem Smad ubiquitination regulatory factor (Smurf) WW domains modulates target specificity

PubMed Central

Chong, P. Andrew; Lin, Hong; Wrana, Jeffrey L.; Forman-Kay, Julie D.

2010-01-01

Smad ubiquitination regulatory factor 2 (Smurf2) is an E3 ubiquitin ligase that participates in degradation of TGF-β receptors and other targets. Smurf2 WW domains recognize PPXY (PY) motifs on ubiquitin ligase target proteins or on adapters, such as Smad7, that bind to E3 target proteins. We previously demonstrated that the isolated WW3 domain of Smurf2, but not the WW2 domain, can directly bind to a Smad7 PY motif. We show here that the WW2 augments this interaction by binding to the WW3 and making auxiliary contacts with the PY motif and a novel E/D-S/T-P motif, which is N-terminal to all Smad PY motifs. The WW2 likely enhances the selectivity of Smurf2 for the Smad proteins. NMR titrations confirm that Smad1 and Smad2 are bound by Smurf2 with the same coupled WW domain arrangement used to bind Smad7. The analogous WW domains in the short isoform of Smurf1 recognize the Smad7 PY peptide using the same coupled mechanism. However, a longer Smurf1 isoform, which has an additional 26 residues in the inter-WW domain linker, is only partially able to use the coupled WW domain binding mechanism. The longer linker results in a decrease in affinity for the Smad7 peptide. Interdomain coupling of WW domains enhances selectivity and enables the tuning of interactions by isoform switching. PMID:20937913
Nencki Genomics Database--Ensembl funcgen enhanced with intersections, user data and genome-wide TFBS motifs.

PubMed

Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal

2013-01-01

We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface.
Nencki Genomics Database—Ensembl funcgen enhanced with intersections, user data and genome-wide TFBS motifs

PubMed Central

Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal

2013-01-01

We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface. Database URL: http://www.nencki-genomics.org. PMID:24089456

Reconstruction and topological characterization of the sigma factor regulatory network of Mycobacterium tuberculosis

PubMed Central

Chauhan, Rinki; Ravi, Janani; Datta, Pratik; Chen, Tianlong; Schnappinger, Dirk; Bassler, Kevin E.; Balázsi, Gábor; Gennaro, Maria Laura

2016-01-01

Accessory sigma factors, which reprogram RNA polymerase to transcribe specific gene sets, activate bacterial adaptive responses to noxious environments. Here we reconstruct the complete sigma factor regulatory network of the human pathogen Mycobacterium tuberculosis by an integrated approach. The approach combines identification of direct regulatory interactions between M. tuberculosis sigma factors in an E. coli model system, validation of selected links in M. tuberculosis, and extensive literature review. The resulting network comprises 41 direct interactions among all 13 sigma factors. Analysis of network topology reveals (i) a three-tiered hierarchy initiating at master regulators, (ii) high connectivity and (iii) distinct communities containing multiple sigma factors. These topological features are likely associated with multi-layer signal processing and specialized stress responses involving multiple sigma factors. Moreover, the identification of overrepresented network motifs, such as autoregulation and coregulation of sigma and anti-sigma factor pairs, provides structural information that is relevant for studies of network dynamics. PMID:27029515
Evolution of UCP1 Transcriptional Regulatory Elements Across the Mammalian Phylogeny

PubMed Central

Gaudry, Michael J.; Campbell, Kevin L.

2017-01-01

Uncoupling protein 1 (UCP1) permits non-shivering thermogenesis (NST) when highly expressed in brown adipose tissue (BAT) mitochondria. Exclusive to placental mammals, BAT has commonly been regarded to be advantageous for thermoregulation in hibernators, small-bodied species, and the neonates of larger species. While numerous regulatory control motifs associated with UCP1 transcription have been proposed for murid rodents, it remains unclear whether these are conserved across the eutherian mammal phylogeny and hence essential for UCP1 expression. To address this shortcoming, we conducted a broad comparative survey of putative UCP1 transcriptional regulatory elements in 139 mammals (135 eutherians). We find no evidence for presence of a UCP1 enhancer in monotremes and marsupials, supporting the hypothesis that this control region evolved in a stem eutherian ancestor. We additionally reveal that several putative promoter elements (e.g., CRE-4, CCAAT) identified in murid rodents are not conserved among BAT-expressing eutherians, and together with the putative regulatory region (PRR) and CpG island do not appear to be crucial for UCP1 expression. The specificity and importance of the upTRE, dnTRE, URE1, CRE-2, RARE-2, NBRE, BRE-1, and BRE-2 enhancer elements first described from rats and mice are moreover uncertain as these motifs differ substantially—but generally remain highly conserved—in other BAT-expressing eutherians. Other UCP1 enhancer motifs (CRE-3, PPRE, and RARE-3) as well as the TATA box are also highly conserved in nearly all eutherian lineages with an intact UCP1. While these transcriptional regulatory motifs are generally also maintained in species where this gene is pseudogenized, the loss or degeneration of key basal promoter (e.g., TATA box) and enhancer elements in other UCP1-lacking lineages make it unlikely that the enhancer region is pleiotropic (i.e., co-regulates additional genes). Importantly, differential losses of (or mutations within
Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns.

PubMed

Gruel, Jérémy; LeBorgne, Michel; LeMeur, Nolwenn; Théret, Nathalie

2011-09-12

Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks.
Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns

PubMed Central

2011-01-01

Background Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Results Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Conclusions Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks. PMID:21910886
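
The pairwise counting step described in this abstract can be illustrated with a simplified Python sketch. It substitutes exact k-mers of one fixed length for the paper's similarity-defined motif groups, so it is a stand-in for the idea rather than the authors' algorithm. The function names, gene names and sequences are hypothetical.

```python
from itertools import combinations

def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_motif_counts(promoters, k=6):
    """Count k-mers shared by each pair of genes.

    promoters maps a gene name to its (conserved) promoter sequence.
    Exact k-mer identity is an assumption of this sketch; the paper
    groups sequences by length and similarity instead.
    """
    sets = {gene: kmers(seq.upper(), k) for gene, seq in promoters.items()}
    return {
        (g1, g2): len(sets[g1] & sets[g2])
        for g1, g2 in combinations(sorted(sets), 2)
    }

# Made-up promoter fragments.
promoters = {
    "geneA": "TTGACCGTCTAAC",
    "geneB": "GGTTGACCGTCTA",
    "geneC": "AAAAAAAAAAAAA",
}
counts = shared_motif_counts(promoters, k=6)
```

Pairs with an unusually high count relative to the background distribution would then be the candidates for a common regulatory pattern.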
A Three-Dimensional RNA Motif in Potato spindle tuber viroid Mediates Trafficking from Palisade Mesophyll to Spongy Mesophyll in Nicotiana benthamiana

PubMed Central

Takeda, Ryuta; Petrov, Anton I.; Leontis, Neocles B.; Ding, Biao

2011-01-01

Cell-to-cell trafficking of RNA is an emerging biological principle that integrates systemic gene regulation, viral infection, antiviral response, and cell-to-cell communication. A key mechanistic question is how an RNA is specifically selected for trafficking from one type of cell into another type. Here, we report the identification of an RNA motif in Potato spindle tuber viroid (PSTVd) required for trafficking from palisade mesophyll to spongy mesophyll in Nicotiana benthamiana leaves. This motif, called loop 6, has the sequence 5′-CGA-3′...5′-GAC-3′ flanked on both sides by cis Watson-Crick G/C and G/U wobble base pairs. We present a three-dimensional (3D) structural model of loop 6 that specifies all non-Watson-Crick base pair interactions, derived by isostericity-based sequence comparisons with 3D RNA motifs from the RNA x-ray crystal structure database. The model is supported by available chemical modification patterns, natural sequence conservation/variations in PSTVd isolates and related species, and functional characterization of all possible mutants for each of the loop 6 base pairs. Our findings and approaches have broad implications for studying the 3D RNA structural motifs mediating trafficking of diverse RNA species across specific cellular boundaries and for studying the structure-function relationships of RNA motifs in other biological processes. PMID:21258006
Allosteric Breakage of the Hydrogen Bond within the Dual-Histidine Motif in the Active Site of Human Pin1 PPIase.

PubMed

Wang, Jing; Tochio, Naoya; Kawasaki, Ryosuke; Tamari, Yu; Xu, Ning; Uewaki, Jun-Ichi; Utsunomiya-Tate, Naoko; Tate, Shin-Ichi

2015-08-25

Intimate cooperativity among active site residues in enzymes is a key factor for regulating elaborate reactions that would otherwise not occur readily. Peptidyl-prolyl cis-trans isomerase NIMA-interacting 1 (Pin1) is the phosphorylation-dependent cis-trans peptidyl-prolyl isomerase (PPIase) that specifically targets phosphorylated Ser/Thr-Pro motifs. Residues C113, H59, H157, and T152 form a hydrogen bond network in the active site, as in the noted connection. Theoretical studies have shown that protonation to thiolate C113 leads to rearrangement of this hydrogen bond network, with switching of the tautomeric states of adjacent histidines (H59 and H157) [Barman, A., and Hamelberg, D. (2014) Biochemistry 53, 3839-3850]. This is called the "dual-histidine motif". Here, C113A and C113S Pin1 mutants were found to alter the protonation states of H59 according to the respective residue type replaced at C113, and the mutations resulted in disruption of the hydrogen bond within the dual-histidine motif. In the C113A mutant, H59 was observed to be in exchange between ε- and δ-tautomers, which widened the entrance of the active site cavity, as seen by an increase in the distance between residues A113 and S154. The C113S mutant caused H59 to exchange between the ε-tautomer and imidazolium while not changing the active site structure. Moreover, the imidazole ring orientations of H59 and H157 were changed in the C113S mutant. These results demonstrated that a mutation at C113 modulates the hydrogen bond network dynamics. Thus, C113 acts as a pivot to drive the concerted function among the residues in the hydrogen bond network, as theoretically predicted.
Enantioselective synthesis of cis-decalins using organocatalysis and sulfonyl Nazarov reagents.

PubMed

Peña, Javier; Silveira-Dorta, Gastón; Moro, Rosalina F; Garrido, Narciso M; Marcos, Isidro S; Sanz, Francisca; Díez, David

2015-04-10

The first organocatalytic synthesis of cis-decalins using sulfonyl Nazarov reagents is reported. The Jørgensen's catalyst directs this highly enantioselective synthesis using different cyclohexenal derivatives.
Mapping cis-Regulatory Domains in the Human Genome Using Multi-Species Conservation of Synteny

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ahituv, Nadav; Prabhakar, Shyam; Poulin, Francis

2005-06-13

Our inability to associate distant regulatory elements with the genes that they regulate has largely precluded their examination for sequence alterations contributing to human disease. One major obstacle is the large genomic space surrounding targeted genes in which such elements could potentially reside. In order to delineate gene regulatory boundaries we used whole-genome human-mouse-chicken (HMC) and human-mouse-frog (HMF) multiple alignments to compile conserved blocks of synteny (CBS), under the hypothesis that these blocks have been kept intact throughout evolution at least in part by the requirement of regulatory elements to stay linked to the genes that they regulate. A total of 2,116 and 1,942 CBS >200 kb were assembled for HMC and HMF respectively, encompassing 1.53 and 0.86 Gb of human sequence. To support the existence of complex long-range regulatory domains within these CBS we analyzed the prevalence and distribution of chromosomal aberrations leading to position effects (disruption of a gene's regulatory environment), observing a clear bias not only for mapping onto CBS but also for longer CBS size. Our results provide a genome-wide data set characterizing the regulatory domains of genes and the conserved regulatory elements within them.
Transcription factor ThWRKY4 binds to a novel WLS motif and a RAV1A element in addition to the W-box to regulate gene expression.

PubMed

Xu, Hongyun; Shi, Xinxin; Wang, Zhibo; Gao, Caiqiu; Wang, Chao; Wang, Yucheng

2017-08-01

WRKY transcription factors play important roles in many biological processes, and mainly bind to the W-box element to regulate gene expression. Previously, we characterized a WRKY gene from Tamarix hispida, ThWRKY4, in response to abiotic stress, and showed that it bound to the W-box motif. However, whether ThWRKY4 could bind to other motifs remains unknown. In this study, we employed a Transcription Factor-Centered Yeast one Hybrid (TF-Centered Y1H) screen to study the motifs recognized by ThWRKY4. In addition to the W-box core cis-element (termed W-box), we identified that ThWRKY4 could bind to two other motifs: the RAV1A element (CAACA) and a novel motif with sequence of GTCTA (W-box like sequence, WLS). The distributions of these motifs were screened in the promoter regions of genes regulated by some WRKYs. The results showed that the W-box, RAV1A, and WLS motifs were all present in high numbers, suggesting that they play key roles in gene expression mediated by WRKYs. Furthermore, five WRKY proteins from different WRKY subfamilies in Arabidopsis thaliana were selected and confirmed to bind to the RAV1A and WLS motifs, indicating that they are recognized commonly by WRKYs. These findings will help to further reveal the functions of WRKY proteins. Copyright © 2017 Elsevier B.V. All rights reserved.
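
Screening promoter regions for the motifs named in this abstract can be sketched in a few lines of Python. The RAV1A (CAACA) and WLS (GTCTA) sequences come from the abstract; the promoter string is made up, and counting hits on both strands with overlaps allowed is an assumption of this sketch rather than the authors' stated procedure.

```python
import re

MOTIFS = {"RAV1A": "CAACA", "WLS": "GTCTA"}  # sequences from the abstract

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def count_motifs(promoter, motifs=MOTIFS):
    """Count (possibly overlapping) motif hits on both strands."""
    promoter = promoter.upper()
    counts = {}
    for name, motif in motifs.items():
        total = 0
        # A set avoids double counting if a motif is its own reverse complement.
        for pattern in {motif, reverse_complement(motif)}:
            # The lookahead lets overlapping occurrences be counted.
            total += len(re.findall(f"(?={pattern})", promoter))
        counts[name] = total
    return counts

promoter = "ACAACAGGTAGACTTTGTTGCC"  # hypothetical sequence
hits = count_motifs(promoter)
```

Here `hits` holds one count per motif name, which could be tabulated across a set of promoters to compare motif frequencies.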
Co-regulation analysis of co-expressed modules under cold and pathogen stress conditions in tomato.

PubMed

Abedini, Davar; Rashidi Monfared, Sajad

2018-06-01

A primary mechanism for controlling the development of multicellular organisms is transcriptional regulation, which is carried out by transcription factors (TFs) that recognize and bind to their binding sites in the promoter region. The distance from the translation start site, order, orientation, and spacing between cis elements are key factors in the concentration of active nuclear TFs and transcriptional regulation of target genes. In this study, overrepresented motifs in cold- and pathogenesis-responsive genes were scanned via the Gibbs sampling method, which detects overrepresented motifs by means of a stochastic optimization strategy that searches for all possible sets of short DNA segments. The identified motifs were then checked against the TRANSFAC, PLACE and Soft Berry databases in order to identify putative TFs that interact with the motifs. Several cis/trans regulatory elements were found using these databases. Moreover, cross-talk between cold- and pathogenesis-responsive genes was confirmed. Statistical analysis was used to determine the distribution of identified motifs in the promoter region. In addition, co-regulation analysis showed that genes in the pathogenesis-responsive module are divided into two main groups. The promoter region was also divided into six subareas in order to draw the pattern of distribution of motifs in promoter subareas. The results showed that the majority of motifs are concentrated in the 700 nucleotides upstream of the translational start site (ATG). In contrast, this result does not hold in another group. In other words, there was no difference between total and compartmentalized regions in cold-responsive genes.
Prediction of transcriptional regulatory elements for plant hormone responses based on microarray data

PubMed Central

2011-01-01

Background Phytohormones organize plant development and environmental adaptation through cell-to-cell signal transduction, and their action involves transcriptional activation. Recent international efforts to establish and maintain public databases of Arabidopsis microarray data have enabled the utilization of this data in the analysis of various phytohormone responses, providing genome-wide identification of promoters targeted by phytohormones. Results We utilized such microarray data for prediction of cis-regulatory elements with an octamer-based approach. Our test prediction of a drought-responsive RD29A promoter with the aid of microarray data for response to drought, ABA and overexpression of DREB1A, a key regulator of cold and drought response, provided reasonable results that fit with the experimentally identified regulatory elements. Following this success, we expanded the prediction to various phytohormone responses, including those for abscisic acid, auxin, cytokinin, ethylene, brassinosteroid, jasmonic acid, and salicylic acid, as well as for hydrogen peroxide, drought and DREB1A overexpression. In total, 622 promoters that are activated by phytohormones were subjected to the prediction. In addition, we have assigned putative functions to 53 octamers of the Regulatory Element Group (REG) that have been extracted as position-dependent cis-regulatory elements with the aid of their feature of preferential appearance in the promoter region. Conclusions Our prediction of Arabidopsis cis-regulatory elements for phytohormone responses provides guidance for experimental analysis of promoters to reveal the basis of the transcriptional network of phytohormone responses. PMID:21349196
The cis-regulatory element CCACGTGG is involved in ABA and water-stress responses of the maize gene rab28.

PubMed

Pla, M; Vilardell, J; Guiltinan, M J; Marcotte, W R; Niogret, M F; Quatrano, R S; Pagès, M

1993-01-01

The maize gene rab28 has been identified as ABA-inducible in embryos and vegetative tissues. It is also induced by water stress in young leaves. The proximal promoter region contains the conserved cis-acting element CCACGTGG (ABRE) reported for ABA induction in other plant genes. Transient expression assays in rice protoplasts indicate that a 134 bp fragment (-194 to -60 containing the ABRE) fused to a truncated cauliflower mosaic virus promoter (35S) is sufficient to confer ABA-responsiveness upon the GUS reporter gene. Gel retardation experiments indicate that nuclear proteins from tissues in which the rab28 gene is expressed can interact specifically with this 134 bp DNA fragment. Nuclear protein extracts from embryo and water-stressed leaves generate specific complexes of different electrophoretic mobility which are stable in the presence of detergent and high salt. However, by DMS footprinting the same guanine-specific contacts with the ABRE in both the embryo and leaf binding activities were detected. These results indicate that the rab28 promoter sequence CCACGTGG is a functional ABA-responsive element, and suggest that distinct regulatory factors with apparent similar affinity for the ABRE sequence may be involved in the hormone action during embryo development and in vegetative tissues subjected to osmotic stress.
Motif formation and industry specific topologies in the Japanese business firm network

NASA Astrophysics Data System (ADS)

Maluck, Julian; Donner, Reik V.; Takayasu, Hideki; Takayasu, Misako

2017-05-01

Motifs and roles are basic quantities for the characterization of interactions among 3-node subsets in complex networks. In this work, we investigate how the distribution of 3-node motifs can be influenced by modifying the rules of an evolving network model while keeping the statistics of simpler network characteristics, such as the link density and the degree distribution, invariant. We exemplify this problem for the special case of the Japanese Business Firm Network, where a well-studied and relatively simple yet realistic evolving network model is available, and compare the resulting motif distribution in the real-world and simulated networks. To better approximate the motif distribution of the real-world network in the model, we introduce both subgraph dependent and global additional rules. We find that a specific rule that allows only for the merging process between nodes with similar link directionality patterns reduces the observed excess of densely connected motifs with bidirectional links. Our study improves the mechanistic understanding of motif formation in evolving network models to better describe the characteristic features of real-world networks with a scale-free topology.
Efficient exact motif discovery.

PubMed

Marschall, Tobias; Rahmann, Sven

2009-06-15

The motif discovery problem consists of finding over-represented patterns in a collection of biosequences. It is one of the classical sequence analysis problems, but still has not been satisfactorily solved in an exact and efficient manner. This is partly due to the large number of possibilities of defining the motif search space and the notion of over-representation. Even for well-defined formalizations, the problem is frequently solved in an ad hoc manner with heuristics that do not guarantee to find the best motif. We show how to solve the motif discovery problem (almost) exactly on a practically relevant space of IUPAC generalized string patterns, using the p-value with respect to an i.i.d. model or a Markov model as the measure of over-representation. In particular, (i) we use a highly accurate compound Poisson approximation for the null distribution of the number of motif occurrences. We show how to compute the exact clump size distribution using a recently introduced device called probabilistic arithmetic automaton (PAA). (ii) We define two p-value scores for over-representation, the first one based on the total number of motif occurrences, the second one based on the number of sequences in a collection with at least one occurrence. (iii) We describe an algorithm to discover the optimal pattern with respect to either of the scores. The method exploits monotonicity properties of the compound Poisson approximation and is by orders of magnitude faster than exhaustive enumeration of IUPAC strings (11.8 h compared with an extrapolated runtime of 4.8 years). (iv) We justify the use of the proposed scores for motif discovery by showing our method to outperform other motif discovery algorithms (e.g. MEME, Weeder) on benchmark datasets. We also propose new motifs on Mycobacterium tuberculosis. The method has been implemented in Java. It can be obtained from http://ls11-www.cs.tu-dortmund.de/people/marschal/paa_md/.
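
The two occurrence statistics that the abstract's p-value scores are built on (the total number of motif occurrences, and the number of sequences with at least one occurrence) can be computed for an IUPAC pattern as in the Python sketch below. It only produces the raw counts; the compound Poisson approximation and the search over patterns are not reproduced. The function names and test sequences are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(pattern):
    """Compile an IUPAC string into a regex that finds overlapping hits."""
    body = "".join(IUPAC[c] for c in pattern.upper())
    return re.compile("(?=" + body + ")")

def occurrence_counts(pattern, sequences):
    """Return (total occurrences, number of sequences with >= 1 occurrence)."""
    regex = iupac_to_regex(pattern)
    per_seq = [len(regex.findall(s.upper())) for s in sequences]
    return sum(per_seq), sum(1 for n in per_seq if n > 0)

seqs = ["TTGACCAATTGACT", "GGGGGG", "ATTGACC"]  # made-up sequences
total, n_seqs = occurrence_counts("TTGACY", seqs)
```

Turning these counts into a significance score would require a null model for the sequences, which is the part the paper addresses.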
PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation

PubMed Central

Portales-Casamar, Elodie; Kirov, Stefan; Lim, Jonathan; Lithwick, Stuart; Swanson, Magdalena I; Ticoll, Amy; Snoddy, Jay; Wasserman, Wyeth W

2007-01-01

PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation with associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR, located at http://www.pazar.info, is open for business. PMID:17916232
PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation.

PubMed

Portales-Casamar, Elodie; Kirov, Stefan; Lim, Jonathan; Lithwick, Stuart; Swanson, Magdalena I; Ticoll, Amy; Snoddy, Jay; Wasserman, Wyeth W

2007-01-01

PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation with associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR, located at http://www.pazar.info, is open for business.
Stability Depends on Positive Autoregulation in Boolean Gene Regulatory Networks

PubMed Central

Pinho, Ricardo; Garcia, Victor; Irimia, Manuel; Feldman, Marcus W.

2014-01-01

Network motifs have been identified as building blocks of regulatory networks, including gene regulatory networks (GRNs). The most basic motif, autoregulation, has been associated with bistability (when positive) and with homeostasis and robustness to noise (when negative), but its general importance in network behavior is poorly understood. Moreover, how specific autoregulatory motifs are selected during evolution and how this relates to robustness is largely unknown. Here, we used a class of GRN models, Boolean networks, to investigate the relationship between autoregulation and network stability and robustness under various conditions. We ran evolutionary simulation experiments for different models of selection, including mutation and recombination. Each generation simulated the development of a population of organisms modeled by GRNs. We found that stability and robustness positively correlate with autoregulation; in all investigated scenarios, stable networks had mostly positive autoregulation. Assuming biological networks correspond to stable networks, these results suggest that biological networks should often be dominated by positive autoregulatory loops. This seems to be the case for most studied eukaryotic transcription factor networks, including those in yeast, flies and mammals. PMID:25375153
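
To make the notion of a Boolean gene regulatory network with autoregulation concrete, here is a minimal Python sketch. It assumes a synchronous threshold update rule and a signed weight matrix in which `weights[i][j]` is the effect of gene j on gene i, so a positive diagonal entry is positive autoregulation. Both the rule and the two-gene example are illustrative assumptions, not the model used in the paper.

```python
def step(state, weights):
    """Synchronous update: gene i is on if its net input is positive."""
    n = len(state)
    return tuple(
        1 if sum(weights[i][j] * state[j] for j in range(n)) > 0 else 0
        for i in range(n)
    )

def find_attractor(state, weights, max_steps=100):
    """Iterate until a state repeats; return the cycle of states.

    A cycle of length 1 is a fixed point (a stable expression pattern).
    Returns None if no repeat is seen within max_steps.
    """
    seen = []
    while state not in seen and len(seen) < max_steps:
        seen.append(state)
        state = step(state, weights)
    if state not in seen:
        return None
    return seen[seen.index(state):]

# Hypothetical network: gene 0 activates itself and gene 1.
positive_auto = [[1, 0],
                 [1, 0]]
attractor = find_attractor((1, 0), positive_auto)
```

In this toy case the trajectory settles into a single state that maps to itself, which is the sense of stability that evolutionary simulations of this kind measure.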
Counting motifs in dynamic networks.

PubMed

Mukherjee, Kingshuk; Hasan, Md Mahmudul; Boucher, Christina; Kahveci, Tamer

2018-04-11

A network motif is a sub-network that occurs frequently in a given network. Detection of such motifs is important since they uncover functions and local properties of the given biological network. Finding motifs is however a computationally challenging task as it requires solving the costly subgraph isomorphism problem. Moreover, the topology of biological networks changes over time. These changing networks are called dynamic biological networks. As the network evolves, the frequency of each motif in the network also changes. Computing the frequency of a given motif from scratch in a dynamic network as the network topology evolves is infeasible, particularly for large and fast evolving networks. In this article, we design and develop a scalable method for counting the number of motifs in a dynamic biological network. Our method incrementally updates the frequency of each motif as the underlying network's topology evolves. Our experiments demonstrate that our method can update the frequency of each motif orders of magnitude faster than counting the motif embeddings every time the network changes. If the network evolves more frequently, the margin by which our method outperforms the existing static methods increases. We evaluated our method extensively using synthetic and real datasets, and show that our method is highly accurate (≥ 96%) and that it can be scaled to large dense networks. The results on real data demonstrate the utility of our method in revealing interesting insights on the evolution of biological processes.
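
The idea of updating a motif count incrementally instead of recounting from scratch can be illustrated with the simplest possible case: triangles in an undirected graph. The Python sketch below is a deliberately reduced stand-in and not the authors' method, which handles general motifs. The class name `TriangleCounter` is hypothetical.

```python
from collections import defaultdict

class TriangleCounter:
    """Maintain the triangle count of an undirected graph under edge updates."""

    def __init__(self):
        self.adj = defaultdict(set)
        self.triangles = 0

    def add_edge(self, u, v):
        if u == v or v in self.adj[u]:
            return  # ignore self-loops and duplicate edges
        # Every common neighbour of u and v closes one new triangle.
        self.triangles += len(self.adj[u] & self.adj[v])
        self.adj[u].add(v)
        self.adj[v].add(u)

    def remove_edge(self, u, v):
        if v not in self.adj[u]:
            return
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # Every common neighbour was part of a triangle that is now broken.
        self.triangles -= len(self.adj[u] & self.adj[v])

counter = TriangleCounter()
for edge in [("a", "b"), ("b", "c"), ("a", "c"),
             ("c", "d"), ("a", "d"), ("b", "d")]:
    counter.add_edge(*edge)
count_full = counter.triangles      # complete graph on four nodes
counter.remove_edge("a", "b")
count_after_removal = counter.triangles
```

Each update touches only the neighbourhoods of the two endpoints, which is why the cost per change stays small compared with a full recount.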
Identification of choriogenin cis-regulatory elements and production of estrogen-inducible, liver-specific transgenic Medaka.

PubMed

Ueno, Tetsuro; Yasumasu, Shigeki; Hayashi, Shinji; Iuchi, Ichiro

2004-07-01

Choriogenins (chg-H, chg-L) are precursor proteins of egg envelope of medaka and synthesized in the spawning female liver in response to estrogen. We linked a gene construct chg-L1.5 kb/GFP (a 1.5 kb 5'-upstream region of the chg-L gene fused with a green fluorescence protein (GFP) gene) to another construct emgb/RFP (a cis-regulatory region of embryonic globin gene fused with an RFP gene), injected the double fusion gene construct into 1- or 2-cell-stage embryos, and selected embryos expressing the RFP in erythroid cells. From the embryos, we established two lines of chg-L1.5 kb/GFP-emgb/RFP-transgenic medaka. The 3-month-old spawning females and estradiol-17beta (E2)-exposed males displayed the liver-specific GFP expression. The E2-dependent GFP expression was detected in the differentiating liver of the stage 37-38 embryos. In addition, RT-PCR and whole-mount in situ hybridization showed that the E2-dependent chg expression was found in the liver of the stage 34 embryos of wild medaka, suggesting that such E2-dependency is achieved shortly after differentiation of the liver. Analysis using serial deletion mutants fused with GFP showed that the region -426 to -284 of the chg-L gene or the region -364 to -265 of the chg-H gene had the ability to promote the E2-dependent liver-specific GFP expression of its downstream gene. Further analyses suggested that an estrogen response element (ERE) at -309, an ERE half-site at -330 and a binding site for C/EBP at -363 of the chg-L gene played important roles in its downstream chg-L gene expression. In addition, this transgenic medaka may be useful as one of the test animals for detecting environmental estrogenic steroids.
Transcriptional regulation of human eosinophil RNases by an evolutionary- conserved sequence motif in primate genome

PubMed Central

Wang, Hsiu-Yu; Chang, Hao-Teng; Pai, Tun-Wen; Wu, Chung-I; Lee, Yuan-Hung; Chang, Yen-Hsin; Tai, Hsiu-Ling; Tang, Chuan-Yi; Chou, Wei-Yao; Chang, Margaret Dah-Tsyr

2007-01-01

Background Human eosinophil-derived neurotoxin (edn) and eosinophil cationic protein (ecp) are members of a subfamily of primate ribonuclease (rnase) genes. Although they were generated by a gene duplication event, distinct edn and ecp expression profiles in various tissues have been reported. Results In this study, we obtained the upstream promoter sequences of several representative primate eosinophil rnases. Bioinformatic analysis revealed the presence of a shared 34-nucleotide (nt) sequence stretch located at -81 to -48 in all edn promoters and the macaque ecp promoter. Such a unique sequence motif constituted a region essential for transactivation of human edn in hepatocellular carcinoma cells. Gel electrophoretic mobility shift assay, transient transfection and scanning mutagenesis experiments allowed us to identify binding sites for two transcription factors, Myc-associated zinc finger protein (MAZ) and SV-40 protein-1 (Sp1), within the 34-nt segment. Subsequent in vitro and in vivo binding assays demonstrated a direct molecular interaction between this 34-nt region and MAZ and Sp1. Interestingly, overexpression of MAZ and Sp1 respectively repressed and enhanced edn promoter activity. The regulatory transactivation motif was mapped to the evolutionarily conserved -74/-65 region of the edn promoter, which was guanidine-rich and critical for recognition by both transcription factors. Conclusion Our results provide the first direct evidence that MAZ and Sp1 play important roles in the transcriptional activation of the human edn promoter through specific binding to a 34-nt segment present in representative primate eosinophil rnase promoters. PMID:17927842

Synthesis of cis-4-trifluoromethyl- and cis-4-difluoromethyl-l-pyroglutamic acids.

PubMed

Qiu, Xiao-Long; Qing, Feng-Ling

2003-05-02

Efforts to synthesize 4-trifluoromethyl- and 4-difluoromethyl-l-pyroglutamic acids are described. After many arduous efforts, we successfully synthesized our target molecules cis-4-trifluoromethyl-l-pyroglutamic acid 25 and cis-4-difluoromethyl-l-pyroglutamic acid 26 from trans-4-hydroxy-l-proline through oxidation of fluorinated prolinates with RuO4.
A prior-based integrative framework for functional transcriptional regulatory network inference

PubMed Central

Siahpirani, Alireza F.

2017-01-01

Transcriptional regulatory networks specify regulatory proteins controlling the context-specific expression levels of genes. Inference of genome-wide regulatory networks is central to understanding gene regulation, but remains an open challenge. Expression-based network inference is among the most popular methods to infer regulatory networks; however, networks inferred from such methods have low overlap with experimentally derived (e.g. ChIP-chip and transcription factor (TF) knockouts) networks. Currently we have a limited understanding of this discrepancy. To address this gap, we first develop a regulatory network inference algorithm, based on probabilistic graphical models, to integrate expression with auxiliary datasets supporting a regulatory edge. Second, we comprehensively analyze our and other state-of-the-art methods on different expression perturbation datasets. Networks inferred by integrating sequence-specific motifs with expression have substantially greater agreement with experimentally derived networks, while remaining more predictive of expression than motif-based networks. Our analysis suggests natural genetic variation as the most informative perturbation for network inference, and identifies core TFs whose targets are predictable from expression. Multiple reasons make the identification of targets of other TFs difficult, including network architecture and insufficient variation of TF mRNA level. Finally, we demonstrate the utility of our inference algorithm to infer stress-specific regulatory networks and for regulator prioritization. PMID:27794550
Regulatory ozone modeling: status, directions, and research needs.

PubMed Central

Georgopoulos, P G

1995-01-01

The Clean Air Act Amendments (CAAA) of 1990 have established selected comprehensive, three-dimensional, Photochemical Air Quality Simulation Models (PAQSMs) as the required regulatory tools for analyzing the urban and regional problem of high ambient ozone levels across the United States. These models are currently applied to study and establish strategies for meeting the National Ambient Air Quality Standard (NAAQS) for ozone in nonattainment areas; State Implementation Plans (SIPs) resulting from these efforts must be submitted to the U.S. Environmental Protection Agency (U.S. EPA) in November 1994. The following presentation provides an overview and discussion of the regulatory ozone modeling process and its implications. First, the PAQSM-based ozone attainment demonstration process is summarized in the framework of the 1994 SIPs. Then, following a brief overview of the representation of physical and chemical processes in PAQSMs, the essential attributes of standard modeling systems currently in regulatory use are presented in a nonmathematical, self-contained format, intended to provide a basic understanding of both model capabilities and limitations. The types of air quality, emission, and meteorological data needed for applying and evaluating PAQSMs are discussed, as well as the sources, availability, and limitations of existing databases. The issue of evaluating a model's performance in order to accept it as a tool for policy making is discussed, and various methodologies for implementing this objective are summarized. Selected interim results from diagnostic analyses, which are performed as a component of the regulatory ozone modeling process for the Philadelphia-New Jersey region, are also presented to provide some specific examples related to the general issues discussed in this work. Finally, research needs related to a) the evaluation and refinement of regulatory ozone modeling, b) the characterization of uncertainty in photochemical modeling, and c
A provisional regulatory gene network for specification of endomesoderm in the sea urchin embryo

NASA Technical Reports Server (NTRS)

Davidson, Eric H.; Rast, Jonathan P.; Oliveri, Paola; Ransick, Andrew; Calestani, Cristina; Yuh, Chiou-Hwa; Minokawa, Takuya; Amore, Gabriele; Hinman, Veronica; Arenas-Mena, Cesar;

2002-01-01

We present the current form of a provisional DNA sequence-based regulatory gene network that explains in outline how endomesodermal specification in the sea urchin embryo is controlled. The model of the network is in a continuous process of revision and growth as new genes are added and new experimental results become available; see http://www.its.caltech.edu/mirsky/endomeso.htm (End-mes Gene Network Update) for the latest version. The network contains over 40 genes at present, many newly uncovered in the course of this work, and most encoding DNA-binding transcriptional regulatory factors. The architecture of the network was approached initially by construction of a logic model that integrated the extensive experimental evidence now available on endomesoderm specification. The internal linkages between genes in the network have been determined functionally, by measurement of the effects of regulatory perturbations on the expression of all relevant genes in the network. Five kinds of perturbation have been applied: (1) use of morpholino antisense oligonucleotides targeted to many of the key regulatory genes in the network; (2) transformation of other regulatory factors into dominant repressors by construction of Engrailed repressor domain fusions; (3) ectopic expression of given regulatory factors, from genetic expression constructs and from injected mRNAs; (4) blockade of the beta-catenin/Tcf pathway by introduction of mRNA encoding the intracellular domain of cadherin; and (5) blockade of the Notch signaling pathway by introduction of mRNA encoding the extracellular domain of the Notch receptor. The network model predicts the cis-regulatory inputs that link each gene into the network. Therefore, its architecture is testable by cis-regulatory analysis. Strongylocentrotus purpuratus and Lytechinus variegatus genomic BAC recombinants that include a large number of the genes in the network have been sequenced and annotated. Tests of the cis-regulatory predictions of

cis-1,2-Dichloroethylene

Integrated Risk Information System (IRIS)

EPA/635/R-09/006F www.epa.gov/iris TOXICOLOGICAL REVIEW OF cis-1,2-DICHLOROETHYLENE and trans-1,2-DICHLOROETHYLENE (CAS Nos. cis: 156-59-2; trans: 156-60-5; mixture: 540-59-0) In Support of Summary Information on the Integrated Risk Information System (IRIS)
Shared regulatory sites are abundant in the human genome and shed light on genome evolution and disease pleiotropy.

PubMed

Tong, Pin; Monahan, Jack; Prendergast, James G D

2017-03-01

Large-scale gene expression datasets are providing an increasing understanding of the location of cis-eQTLs in the human genome and their role in disease. However, little is currently known regarding the extent of regulatory site-sharing between genes. This is despite it having potentially wide-ranging implications, from the determination of the way in which genetic variants may shape multiple phenotypes to the understanding of the evolution of human gene order. By first identifying the location of non-redundant cis-eQTLs, we show that regulatory site-sharing is a relatively common phenomenon in the human genome, with over 10% of non-redundant regulatory variants linked to the expression of multiple nearby genes. We show that these shared, local regulatory sites are linked to high levels of chromatin looping between the regulatory sites and their associated genes. In addition, these co-regulated gene modules are found to be strongly conserved across mammalian species, suggesting that shared regulatory sites have played an important role in shaping human gene order. The association of these shared cis-eQTLs with multiple genes means they also appear to be unusually important in understanding the genetics of human phenotypes and pleiotropy, with shared regulatory sites more often linked to multiple human phenotypes than other regulatory variants. This study shows that regulatory site-sharing is likely an underappreciated aspect of gene regulation and has important implications for the understanding of various biological phenomena, including how the two and three dimensional structures of the genome have been shaped and the potential causes of disease pleiotropy outside coding regions.
The 3’-Jα Region of the TCRα Locus Bears Gene Regulatory Activity in Thymic and Peripheral T Cells

PubMed Central

Kučerová-Levisohn, Martina; Knirr, Stefan; Mejia, Rosa I.; Ortiz, Benjamin D.

2015-01-01

Much progress has been made in understanding the important cis-mediated controls on mouse TCRα gene function, including identification of the Eα enhancer and TCRα locus control region (LCR). Nevertheless, previous data have suggested that other cis-regulatory elements may reside in the locus outside of the Eα/LCR. Based on prior findings, we hypothesized the existence of gene regulatory elements in a 3.9-kb region 5’ of the Cα exons. Using DNase hypersensitivity assays and TCRα BAC reporter transgenes in mice, we detected gene regulatory activity within this 3.9-kb region. This region is active in both thymic and peripheral T cells, and selectively affects upstream, but not downstream, gene expression. Together, these data indicate the existence of a novel cis-acting regulatory complex that contributes to TCRα transgene expression in vivo. The active chromatin sites we discovered within this region would remain in the locus after TCRα gene rearrangement, and thus may contribute to endogenous TCRα gene activity, particularly in peripheral T cells, where the Eα element has been found to be inactive. PMID:26177549
[Cover motifs of the Tidsskrift. A 14-year cavalcade].

PubMed

Nylenna, M

1998-12-10

In 1985 the Journal of the Norwegian Medical Association changed its cover policy, moving the table of contents inside the Journal and introducing cover illustrations. This article provides an analysis of all cover illustrations published over this 14-year period, 420 covers in all. There is a great variation in cover motifs and designs and a development towards more general motifs. The initial emphasis on historical and medical aspects is now less pronounced, while the use of works of art and nature motifs has increased, and the cover now more often has a direct bearing on the specific contents of the issue. Professor of medical history Oivind Larsen has photographed two thirds of the covers and contributed 95% of the inside essay-style reflections on the cover motif. Over the years, he has expanded the role of the historian of medicine disseminating knowledge to include that of the raconteur with a personal tone of voice. The Journal's covers are now one of its most characteristic features, emblematic of the Journal's ambition of standing for quality and timelessness vis-à-vis the news media, and of its aim of bridging the gap between medicine and the humanities.
Identifying cis-mediators for trans-eQTLs across many human tissues using genomic mediation analysis

PubMed Central

Yang, Fan; Wang, Jiebiao; Pierce, Brandon L.; Chen, Lin S.

2017-01-01

The impact of inherited genetic variation on gene expression in humans is well-established. The majority of known expression quantitative trait loci (eQTLs) impact expression of local genes (cis-eQTLs). More research is needed to identify effects of genetic variation on distant genes (trans-eQTLs) and understand their biological mechanisms. One common trans-eQTLs mechanism is “mediation” by a local (cis) transcript. Thus, mediation analysis can be applied to genome-wide SNP and expression data in order to identify transcripts that are “cis-mediators” of trans-eQTLs, including those “cis-hubs” involved in regulation of many trans-genes. Identifying such mediators helps us understand regulatory networks and suggests biological mechanisms underlying trans-eQTLs, both of which are relevant for understanding susceptibility to complex diseases. The multitissue expression data from the Genotype-Tissue Expression (GTEx) program provides a unique opportunity to study cis-mediation across human tissue types. However, the presence of complex hidden confounding effects in biological systems can make mediation analyses challenging and prone to confounding bias, particularly when conducted among diverse samples. To address this problem, we propose a new method: Genomic Mediation analysis with Adaptive Confounding adjustment (GMAC). It enables the search of a very large pool of variables, and adaptively selects potential confounding variables for each mediation test. Analyses of simulated data and GTEx data demonstrate that the adaptive selection of confounders by GMAC improves the power and precision of mediation analysis. Application of GMAC to GTEx data provides new insights into the observed patterns of cis-hubs and trans-eQTL regulation across tissue types. PMID:29021290
Identifying cis-mediators for trans-eQTLs across many human tissues using genomic mediation analysis.

PubMed

Yang, Fan; Wang, Jiebiao; Pierce, Brandon L; Chen, Lin S

2017-11-01

The impact of inherited genetic variation on gene expression in humans is well-established. The majority of known expression quantitative trait loci (eQTLs) impact expression of local genes (cis-eQTLs). More research is needed to identify effects of genetic variation on distant genes (trans-eQTLs) and understand their biological mechanisms. One common trans-eQTLs mechanism is "mediation" by a local (cis) transcript. Thus, mediation analysis can be applied to genome-wide SNP and expression data in order to identify transcripts that are "cis-mediators" of trans-eQTLs, including those "cis-hubs" involved in regulation of many trans-genes. Identifying such mediators helps us understand regulatory networks and suggests biological mechanisms underlying trans-eQTLs, both of which are relevant for understanding susceptibility to complex diseases. The multitissue expression data from the Genotype-Tissue Expression (GTEx) program provides a unique opportunity to study cis-mediation across human tissue types. However, the presence of complex hidden confounding effects in biological systems can make mediation analyses challenging and prone to confounding bias, particularly when conducted among diverse samples. To address this problem, we propose a new method: Genomic Mediation analysis with Adaptive Confounding adjustment (GMAC). It enables the search of a very large pool of variables, and adaptively selects potential confounding variables for each mediation test. Analyses of simulated data and GTEx data demonstrate that the adaptive selection of confounders by GMAC improves the power and precision of mediation analysis. Application of GMAC to GTEx data provides new insights into the observed patterns of cis-hubs and trans-eQTL regulation across tissue types. © 2017 Yang et al.; Published by Cold Spring Harbor Laboratory Press.
Trans‐acting translational regulatory RNA binding proteins

PubMed Central

Harvey, Robert F.; Smith, Tom S.; Mulroney, Thomas; Queiroz, Rayner M. L.; Pizzinga, Mariavittoria; Dezi, Veronica; Villenueva, Eneko; Ramakrishna, Manasa

2018-01-01

The canonical molecular machinery required for global mRNA translation and its control has been well defined, with distinct sets of proteins involved in the processes of translation initiation, elongation and termination. Additionally, noncanonical, trans‐acting regulatory RNA‐binding proteins (RBPs) are necessary to provide mRNA‐specific translation, and these interact with 5′ and 3′ untranslated regions and coding regions of mRNA to regulate ribosome recruitment and transit. Recently it has also been demonstrated that trans‐acting ribosomal proteins direct the translation of specific mRNAs. Importantly, it has been shown that subsets of RBPs often work in concert, forming distinct regulatory complexes upon different cellular perturbation, creating an RBP combinatorial code which, through the translation of specific subsets of mRNAs, dictates cell fate. With the development of new methodologies, a plethora of novel RNA binding proteins have recently been identified, although the function of many of these proteins within mRNA translation is unknown. In this review we will discuss these methodologies and their shortcomings when applied to the study of translation, which need to be addressed to enable a better understanding of trans‐acting translational regulatory proteins. Moreover, we discuss the protein domains that are responsible for RNA binding as well as the RNA motifs to which they bind, and the role of trans‐acting ribosomal proteins in directing the translation of specific mRNAs. This article is categorized under: (1) RNA Interactions with Proteins and Other Molecules > RNA–Protein Complexes; (2) Translation > Translation Regulation; (3) Translation > Translation Mechanisms. PMID:29341429
Transcriptional Regulatory Networks in Saccharomyces cerevisiae

NASA Astrophysics Data System (ADS)

Lee, Tong Ihn; Rinaldi, Nicola J.; Robert, François; Odom, Duncan T.; Bar-Joseph, Ziv; Gerber, Georg K.; Hannett, Nancy M.; Harbison, Christopher T.; Thompson, Craig M.; Simon, Itamar; Zeitlinger, Julia; Jennings, Ezra G.; Murray, Heather L.; Gordon, D. Benjamin; Ren, Bing; Wyrick, John J.; Tagne, Jean-Bosco; Volkert, Thomas L.; Fraenkel, Ernest; Gifford, David K.; Young, Richard A.

2002-10-01

We have determined how most of the transcriptional regulators encoded in the eukaryote Saccharomyces cerevisiae associate with genes across the genome in living cells. Just as maps of metabolic networks describe the potential pathways that may be used by a cell to accomplish metabolic processes, this network of regulator-gene interactions describes potential pathways yeast cells can use to regulate global gene expression programs. We use this information to identify network motifs, the simplest units of network architecture, and demonstrate that an automated process can use motifs to assemble a transcriptional regulatory network structure. Our results reveal that eukaryotic cellular functions are highly connected through networks of transcriptional regulators that regulate other transcriptional regulators.
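As a rough illustration of what a network motif is, the sketch below counts feed-forward loops (one commonly discussed motif, in which X regulates Y and both regulate Z) in a toy regulator-gene edge list. This is not the authors' automated procedure, which the abstract does not detail; the node names, the edge list, and the function `count_feed_forward_loops` are hypothetical.

```python
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count ordered triples (X, Y, Z) with edges X->Y, Y->Z and X->Z."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    count = 0
    # Brute force over ordered triples of distinct nodes; fine for toy graphs.
    for x, y, z in permutations(nodes, 3):
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set:
            count += 1
    return count


# Hypothetical toy network: R1 and R2 are regulators, G1 and G2 are targets.
toy_edges = [("R1", "R2"), ("R2", "G1"), ("R1", "G1"), ("R2", "G2")]
ffl_count = count_feed_forward_loops(toy_edges)
print(ffl_count)
```

The brute-force loop is cubic in the number of nodes, so it is only meant to make the definition concrete, not to scale to a genome-wide network.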
A Predictive Model of the Oxygen and Heme Regulatory Network in Yeast

PubMed Central

Kundaje, Anshul; Xin, Xiantong; Lan, Changgui; Lianoglou, Steve; Zhou, Mei; Zhang, Li; Leslie, Christina

2008-01-01

Deciphering gene regulatory mechanisms through the analysis of high-throughput expression data is a challenging computational problem. Previous computational studies have used large expression datasets in order to resolve fine patterns of coexpression, producing clusters or modules of potentially coregulated genes. These methods typically examine promoter sequence information, such as DNA motifs or transcription factor occupancy data, in a separate step after clustering. We needed an alternative and more integrative approach to study the oxygen regulatory network in Saccharomyces cerevisiae using a small dataset of perturbation experiments. Mechanisms of oxygen sensing and regulation underlie many physiological and pathological processes, and only a handful of oxygen regulators have been identified in previous studies. We used a new machine learning algorithm called MEDUSA to uncover detailed information about the oxygen regulatory network using genome-wide expression changes in response to perturbations in the levels of oxygen, heme, Hap1, and Co2+. MEDUSA integrates mRNA expression, promoter sequence, and ChIP-chip occupancy data to learn a model that accurately predicts the differential expression of target genes in held-out data. We used a novel margin-based score to extract significant condition-specific regulators and assemble a global map of the oxygen sensing and regulatory network. This network includes both known oxygen and heme regulators, such as Hap1, Mga2, Hap4, and Upc2, as well as many new candidate regulators. MEDUSA also identified many DNA motifs that are consistent with previous experimentally identified transcription factor binding sites. Because MEDUSA's regulatory program associates regulators to target genes through their promoter sequences, we directly tested the predicted regulators for OLE1, a gene specifically induced under hypoxia, by experimental analysis of the activity of its promoter. In each case, deletion of the candidate
Allosteric Fine-Tuning of the Binding Pocket Dynamics in the ITK SH2 Domain by a Distal Molecular Switch: An Atomistic Perspective.

PubMed

Momin, Mohamed; Xin, Yao; Hamelberg, Donald

2017-06-29

Although the regulation of function of proteins by allosteric interactions has been identified in many subcellular processes, molecular switches are also known to induce long-range conformational changes in proteins. A less well understood molecular switch involving cis-trans isomerization of a peptidyl-prolyl bond could induce a conformational change directly to the backbone that is propagated to other parts of the protein. However, these switches are elusive and hard to identify because they are intrinsic to biomolecules that are inherently dynamic. Here, we explore the conformational dynamics and free energy landscape of the SH2 domain of interleukin-2-inducible T-cell or tyrosine kinase (ITK) to fully understand the conformational coupling between the distal cis-trans molecular switch and its binding pocket of the phosphotyrosine motif. We use multiple microsecond-long all-atom molecular dynamics simulations in explicit water, totaling over 60 μs. We show that cis-trans isomerization of the Asn286-Pro287 peptidyl-prolyl bond is directly coupled to the dynamics of the binding pocket of the phosphotyrosine motif, in agreement with previous NMR experiments. Unlike the cis state that is localized and less dynamic in a single free energy basin, the trans state samples two distinct conformations of the binding pocket: one that recognizes the phosphotyrosine motif and the other that is somewhat similar to that of the cis state. The results provide an atomic-level description of a less well understood allosteric regulation by a peptidyl-prolyl cis-trans molecular switch that could aid in the understanding of normal and aberrant subcellular processes and the identification of these elusive molecular switches in other proteins.
Deletion of transcription factor binding motifs using the CRISPR/spCas9 system in the β-globin LCR.

PubMed

Kim, Yea Woon; Kim, AeRi

2017-07-20

Transcription factors play roles in gene transcription through direct binding to their motifs in the genome, and inhibiting this binding provides an effective strategy for studying their roles. Here we applied the CRISPR/spCas9 system to mutate the binding motifs of transcription factors. Binding motifs for erythroid specific transcription factors were mutated in the locus control region hypersensitive sites of the human β-globin locus. Guide RNAs targeting binding motifs were cloned into a lentiviral CRISPR vector containing the spCas9 gene, and transduced into MEL/ch11 cells carrying a human chromosome 11. DNA mutations in clonal cells were initially screened by quantitative PCR in genomic DNA and then clarified by sequencing. Mutations in binding motifs reduced occupancy by transcription factors in a chromatin environment. Characterization of mutations revealed that the CRISPR/spCas9 system mainly induced deletions in short regions of <20 bp and preferentially deleted nucleotides around the fifth nucleotide upstream of Protospacer adjacent motifs. These results indicate that the CRISPR/Cas9 system is suitable for mutating the binding motifs of transcription factors and, consequently, would help elucidate the direct roles of transcription factors. ©2017 The Author(s).
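To make the positional statement about protospacer adjacent motifs (PAMs) concrete, the sketch below scans the forward strand of a DNA string for the canonical SpCas9 PAM, 5'-NGG-3', and reports the 20-nt protospacer immediately upstream together with the index of the fifth nucleotide upstream of each PAM. It is an illustration only, not the guide-design or mutation-analysis procedure used in the study; the function name `find_pam_sites`, the tuple layout, and the example sequence are hypothetical.

```python
def find_pam_sites(seq, protospacer_len=20):
    """Return (pam_start, protospacer, pam, fifth_upstream_index) tuples.

    Only the forward strand is scanned. pam_start is the 0-based index of
    the N in NGG; fifth_upstream_index is the 0-based index of the fifth
    nucleotide upstream of the PAM.
    """
    seq = seq.upper()
    sites = []
    # Start far enough in that a full protospacer fits upstream of the PAM.
    for i in range(protospacer_len, len(seq) - 2):
        if seq[i + 1:i + 3] == "GG":
            protospacer = seq[i - protospacer_len:i]
            sites.append((i, protospacer, seq[i:i + 3], i - 5))
    return sites


# Hypothetical example: a 20-nt stretch followed by a TGG PAM.
example_seq = "ACGT" * 5 + "TGG" + "ACGT"
example_sites = find_pam_sites(example_seq)
print(example_sites)
```

A fuller version would also scan the reverse complement, since guides can target either strand.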
Cis-encoded non-coding antisense RNAs in streptococci and other low GC Gram (+) bacterial pathogens

PubMed Central

Cho, Kyu Hong; Kim, Jeong-Ho

2015-01-01

Due to recent advances in bioinformatics and high-throughput sequencing technology, discovery of regulatory non-coding RNAs in bacteria has increased greatly. Riding this wave, many studies have searched intensively for trans-acting small non-coding RNAs in streptococci, especially in the important human pathogens group A and group B streptococci. However, studies of cis-encoded non-coding antisense RNAs in streptococci have been scarce. A recent study shows antisense RNAs are involved in virulence gene regulation in group B streptococcus, S. agalactiae. This suggests antisense RNAs could have important roles in the pathogenesis of streptococcal pathogens. In this review, we describe recent discoveries of chromosomal cis-encoded antisense RNAs in streptococcal pathogens and other low GC Gram (+) bacteria to provide a guide for future studies. PMID:25859258
CombiMotif: A new algorithm for network motifs discovery in protein-protein interaction networks

NASA Astrophysics Data System (ADS)

Luo, Jiawei; Li, Guanghui; Song, Dan; Liang, Cheng

2014-12-01

Discovering motifs in protein-protein interaction networks is becoming a major challenge in computational biology, since the distribution of the number of network motifs can reveal significant systemic differences among species. However, this task can be computationally expensive because it involves graph isomorphism detection. In this paper, we present a new algorithm (CombiMotif) that incorporates combinatorial techniques to count non-induced occurrences of subgraph topologies in the form of trees. The efficiency of our algorithm is demonstrated by comparing the obtained results with the current state-of-the-art subgraph counting algorithms. We also show major differences between unicellular and multicellular organisms. The datasets and source code of CombiMotif are freely available upon request.
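The idea of counting non-induced tree occurrences combinatorially can be illustrated with the smallest non-trivial tree, the 3-node path: every node of degree d is the center of C(d, 2) such paths, so no enumeration or isomorphism test is needed. The sketch below is a toy instance of that idea and is not the CombiMotif algorithm, whose details the abstract does not give; the function name `count_noninduced_paths3` and the example graphs are hypothetical.

```python
from collections import defaultdict
from math import comb


def count_noninduced_paths3(edges):
    """Count non-induced 3-node paths in an undirected simple graph.

    Each path is identified by its center node and an unordered pair of
    that node's neighbors, giving sum over nodes of C(degree, 2).
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    return sum(comb(len(nbrs), 2) for nbrs in neighbors.values())


# A triangle contains three non-induced 3-node paths (and no induced ones).
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
triangle_paths = count_noninduced_paths3(triangle)
print(triangle_paths)
```

The triangle case shows why the non-induced count differs from the induced one: the extra edge closing each path is simply ignored.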
Small-molecule RORγt antagonists inhibit T helper 17 cell transcriptional network by divergent mechanisms.

PubMed

Xiao, Sheng; Yosef, Nir; Yang, Jianfei; Wang, Yonghui; Zhou, Ling; Zhu, Chen; Wu, Chuan; Baloglu, Erkan; Schmidt, Darby; Ramesh, Radha; Lobera, Mercedes; Sundrud, Mark S; Tsai, Pei-Yun; Xiang, Zhijun; Wang, Jinsong; Xu, Yan; Lin, Xichen; Kretschmer, Karsten; Rahl, Peter B; Young, Richard A; Zhong, Zhong; Hafler, David A; Regev, Aviv; Ghosh, Shomir; Marson, Alexander; Kuchroo, Vijay K

2014-04-17

We identified three retinoid-related orphan receptor gamma t (RORγt)-specific inhibitors that suppress T helper 17 (Th17) cell responses, including Th17-cell-mediated autoimmune disease. We systemically characterized RORγt binding in the presence and absence of drugs with corresponding whole-genome transcriptome sequencing. RORγt acts as a direct activator of Th17 cell signature genes and a direct repressor of signature genes from other T cell lineages; its strongest transcriptional effects are on cis-regulatory sites containing the RORα binding motif. RORγt is central in a densely interconnected regulatory network that shapes the balance of T cell differentiation. Here, the three inhibitors modulated the RORγt-dependent transcriptional network to varying extents and through distinct mechanisms. Whereas one inhibitor displaced RORγt from its target loci, the other two inhibitors affected transcription predominantly without removing DNA binding. Our work illustrates the power of a system-scale analysis of transcriptional regulation to characterize potential therapeutic compounds that inhibit pathogenic Th17 cells and suppress autoimmunity. Copyright © 2014 Elsevier Inc. All rights reserved.
Hydroperoxide-dependent cooxidation of 13-cis-retinoic acid by prostaglandin H synthase.

PubMed

Samokyszyn, V M; Marnett, L J

1987-10-15

Reverse phase high pressure liquid chromatography was employed to separate the major products resulting from the hydroperoxide-dependent cooxidation of 13-cis-retinoic acid by microsomal and purified prostaglandin H (PGH) synthase. Several major oxygenated metabolites including 4-hydroxy-, 5,6-epoxy-, and 5,8-oxy-13-cis-retinoic acid were unambiguously identified on the basis of cochromatography with authentic standards, uv spectra, and mass spectral analysis. Identical product profiles were generated regardless of the type of oxidizing substrate employed, and heat-denatured microsomes or enzyme did not support oxidation. In addition, several geometric isomers including all trans-retinoic acid were identified. Isomerization to all trans-retinoic acid in microsomes occurred in the absence of exogenous hydroperoxide, was insensitive to inhibition by antioxidant, and was eliminated when heat-denatured preparations were substituted for intact microsomes. Conversely, isomerization to at least one other isomer required the addition of hydroperoxide and was sensitive to antioxidant inhibition. Addition of antioxidant to microsomal incubation mixtures inhibited the hydroperoxide-dependent generation of 5,6-epoxy- and 5,8-oxy-13-cis-retinoic acid and other oxygenated metabolites but stimulated the formation of 4-hydroxy-13-cis-retinoic acid. Under standard conditions, 77% of the original retinoid was metabolized resulting in products containing 1.25 oxygen atoms/oxygenated metabolite, and two dioxygen molecules were consumed per hydroperoxide reduced. Purified PGH synthase also supported O2 uptake during cooxidation of 13-cis-retinoic acid by H2O2 or 5-phenyl-4-pentenyl-1-hydroperoxide, and the initial velocities of O2 uptake were directly proportional to enzyme concentration. 13-cis-Retinoic acid effectively inhibited peroxidase-dependent cooxidation of guaiacol indicating a direct interaction of retinoid with peroxidase iron-oxo intermediates, and EPR spin trapping
Multiple Linear Regression for Reconstruction of Gene Regulatory Networks in Solving Cascade Error Problems

PubMed Central

Zainudin, Suhaila; Arif, Shereena M.

2017-01-01

Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods has been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, the discussion on specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by the past studies were not specifically geared towards proving the ability of GRN prediction methods in avoiding the occurrences of cascade errors. Hence, this research aims to propose Multiple Linear Regression (MLR) to infer GRN from gene expression data and to avoid wrongly inferring an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations of the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting the random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure that had been proposed in this work. The experiment revealed that the number of cascade errors had been very minimal. Apart from that, the Belsley collinearity test proved that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5. PMID:28250767
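The cascade A → B → C can be sketched on simulated data to show why regressing on several predictors at once helps: C regressed on A alone shows a strong apparent effect, while C regressed on A and B together attributes the effect to B and leaves A's coefficient near zero. This is a toy illustration under assumed effect sizes and noise levels, not the authors' experimental procedure; all function and variable names are hypothetical.

```python
import random


def center(values):
    """Subtract the mean so that no explicit intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def simple_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    x, y = center(x), center(y)
    return dot(x, y) / dot(x, x)


def two_predictor_slopes(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 via the 2x2 normal equations."""
    x1, x2, y = center(x1), center(x2), center(y)
    s11, s12, s22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Simulated cascade with assumed parameters: A -> B -> C, no direct A -> C edge.
random.seed(0)
gene_a = [random.gauss(0, 1) for _ in range(200)]
gene_b = [2 * a + random.gauss(0, 1) for a in gene_a]
gene_c = [3 * b + random.gauss(0, 0.5) for b in gene_b]

marginal_slope = simple_slope(gene_a, gene_c)  # large: looks like A -> C
slope_a, slope_b = two_predictor_slopes(gene_a, gene_b, gene_c)
print(marginal_slope, slope_a, slope_b)
```

With real expression data the number of candidate regulators usually exceeds the number of samples, which is why the abstract describes reducing the predictor set before fitting.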

Cis-Lunar Base Camp

NASA Technical Reports Server (NTRS)

Merrill, Raymond G.; Goodliff, Kandyce E.; Mazanek, Daniel D.; Reeves, John D., Jr.

2012-01-01

Historically, when mounting expeditions into uncharted territories, explorers have established strategically positioned base camps to pre-position required equipment and consumables. These base camps are secure, safe positions from which expeditions can depart when conditions are favorable, at which technology and operations can be tested and validated, and which facilitate timely access to more robust facilities in the event of an emergency. For human exploration missions into deep space, cis-lunar space is well suited to serve as such a base camp. The outer regions of cis-lunar space, such as the Earth-Moon Lagrange points, lie near the edge of Earth's gravity well, allowing equipment and consumables to be aggregated with easy access to deep space and to the lunar surface, as well as more distant destinations, such as near-Earth Asteroids (NEAs) and Mars and its moons. Several approaches to utilizing a cis-lunar base camp for sustainable human exploration, as well as some possible future applications are identified. The primary objective of the analysis presented in this paper is to identify options, show the macro trends, and provide information that can be used as a basis for more detailed mission development. Compared within are the high-level performance and cost of 15 preliminary cis-lunar exploration campaigns that establish the capability to conduct crewed missions of up to one year in duration, and then aggregate mass in cis-lunar space to facilitate an expedition from Cis-Lunar Base Camp. Launch vehicles, chemical propulsion stages, and electric propulsion stages are discussed and parametric sizing values are used to create architectures of in-space transportation elements that extend the existing in-space supply chain to cis-lunar space. The transportation options to cis-lunar space assessed vary in efficiency by almost 50%; from 0.16 to 0.68 kg of cargo in cis-lunar space for every kilogram of mass in Low Earth Orbit (LEO). For the 15 cases, 5-year campaign
Modeling of DNA local parameters predicts encrypted architectural motifs in Xenopus laevis ribosomal gene promoter

PubMed Central

Roux-Rouquie, Magali; Marilley, Monique

2000-01-01

We have modeled local DNA sequence parameters to search for DNA architectural motifs involved in transcription regulation and promotion within the Xenopus laevis ribosomal gene promoter and the intergenic spacer (IGS) sequences. The IGS was found to be shaped into distinct topological domains. First, intrinsic bends split the IGS into domains of common but different helical features. Local parameters at inter-domain junctions exhibit a high variability with respect to intrinsic curvature, bendability and thermal stability. Secondly, the repeated sequence blocks of the IGS exhibit right-handed supercoiled structures which could be related to their enhancer properties. Thirdly, the gene promoter presents both inherent curvature and minor groove narrowing which may be viewed as motifs of a structural code for protein recognition and binding. Such pre-existing deformations could simply be remodeled during the binding of the transcription complex. Alternatively, these deformations could pre-shape the promoter in such a way that further remodeling is facilitated. Mutations shown to abolish promoter curvature as well as intrinsic minor groove narrowing, in a variant which maintained full transcriptional activity, bring circumstantial evidence for structurally-preorganized motifs in relation to transcription regulation and promotion. Using well documented X. laevis rDNA regulatory sequences we showed that computer modeling may be of invaluable assistance in assessing encrypted architectural motifs. The evidence of these DNA topological motifs with respect to the concept of structural code is discussed. PMID:10982860
Modeling of DNA local parameters predicts encrypted architectural motifs in Xenopus laevis ribosomal gene promoter.

PubMed

Roux-Rouquie, M; Marilley, M

2000-09-15

We have modeled local DNA sequence parameters to search for DNA architectural motifs involved in transcription regulation and promotion within the Xenopus laevis ribosomal gene promoter and the intergenic spacer (IGS) sequences. The IGS was found to be shaped into distinct topological domains. First, intrinsic bends split the IGS into domains of common but different helical features. Local parameters at inter-domain junctions exhibit a high variability with respect to intrinsic curvature, bendability and thermal stability. Secondly, the repeated sequence blocks of the IGS exhibit right-handed supercoiled structures which could be related to their enhancer properties. Thirdly, the gene promoter presents both inherent curvature and minor groove narrowing which may be viewed as motifs of a structural code for protein recognition and binding. Such pre-existing deformations could simply be remodeled during the binding of the transcription complex. Alternatively, these deformations could pre-shape the promoter in such a way that further remodeling is facilitated. Mutations shown to abolish promoter curvature as well as intrinsic minor groove narrowing, in a variant which maintained full transcriptional activity, bring circumstantial evidence for structurally-preorganized motifs in relation to transcription regulation and promotion. Using well documented X. laevis rDNA regulatory sequences we showed that computer modeling may be of invaluable assistance in assessing encrypted architectural motifs. The evidence of these DNA topological motifs with respect to the concept of structural code is discussed.
Ultrasensitive response motifs: basic amplifiers in molecular signalling networks

PubMed Central

Zhang, Qiang; Bhattacharya, Sudin; Andersen, Melvin E.

2013-01-01

Multi-component signal transduction pathways and gene regulatory circuits underpin integrated cellular responses to perturbations. A recurring set of network motifs serve as the basic building blocks of these molecular signalling networks. This review focuses on ultrasensitive response motifs (URMs) that amplify small percentage changes in the input signal into larger percentage changes in the output response. URMs generally possess a sigmoid input–output relationship that is steeper than the Michaelis–Menten type of response and is often approximated by the Hill function. Six types of URMs can be commonly found in intracellular molecular networks and each has a distinct kinetic mechanism for signal amplification. These URMs are: (i) positive cooperative binding, (ii) homo-multimerization, (iii) multistep signalling, (iv) molecular titration, (v) zero-order covalent modification cycle and (vi) positive feedback. Multiple URMs can be combined to generate highly switch-like responses. Serving as basic signal amplifiers, these URMs are essential for molecular circuits to produce complex nonlinear dynamics, including multistability, robust adaptation and oscillation. These dynamic properties are in turn responsible for higher-level cellular behaviours, such as cell fate determination, homeostasis and biological rhythm. PMID:23615029
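As a rough illustration of the sigmoid input-output relationship described above, the sketch below compares a Michaelis-Menten response (Hill coefficient 1) with a steeper Hill response. The function names, the half-maximal constant k = 1 and the Hill coefficient n = 4 are assumptions chosen for illustration; they are not values taken from the review.

```python
def hill(x, k=1.0, n=1.0):
    """Fractional response x^n / (k^n + x^n); n = 1 gives Michaelis-Menten."""
    return x**n / (k**n + x**n)


def fold_input_10_to_90(n):
    """Fold change in input needed to move the response from 10% to 90%.

    Solving hill(x) = 0.1 and hill(x) = 0.9 gives x90 / x10 = 81^(1/n),
    so a larger Hill coefficient means a narrower, more switch-like range.
    """
    return 81.0 ** (1.0 / n)


if __name__ == "__main__":
    for n in (1, 4):  # illustrative Hill coefficients (assumed values)
        print(f"n={n}: response at x=0.5 -> {hill(0.5, n=n):.3f}, "
              f"at x=2 -> {hill(2.0, n=n):.3f}, "
              f"10-90% input range -> {fold_input_10_to_90(n):.1f}-fold")
```

With n = 1 the input must change 81-fold to carry the output from 10% to 90% of maximum, whereas with n = 4 a 3-fold change is enough, which is one common way to quantify ultrasensitivity.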
Characterization of the Promoter Region of an Arabidopsis Gene for 9-cis-Epoxycarotenoid Dioxygenase Involved in Dehydration-Inducible Transcription

PubMed Central

Behnam, Babak; Iuchi, Satoshi; Fujita, Miki; Fujita, Yasunari; Takasaki, Hironori; Osakabe, Yuriko; Yamaguchi-Shinozaki, Kazuko; Kobayashi, Masatomo; Shinozaki, Kazuo

2013-01-01

Plants respond to dehydration stress and tolerate water-deficit status through complex physiological and cellular processes. Many genes are induced by water deficit. Abscisic acid (ABA) plays important roles in tolerance to dehydration stress by inducing many stress genes. ABA is synthesized de novo in response to dehydration. Most of the genes involved in ABA biosynthesis have been identified, and they are expressed mainly in leaf vascular tissues. Of the products of such genes, 9-cis-epoxycarotenoid dioxygenase (NCED) is a key enzyme in ABA biosynthesis. One of the five NCED genes in Arabidopsis, AtNCED3, is significantly induced by dehydration. To understand the regulatory mechanism of the early stages of the dehydration stress response, it is important to analyse the transcriptional regulatory systems of AtNCED3. In the present study, we found that an overlapping G-box recognition sequence (5′-CACGTG-3′) at −2248 bp from the transcriptional start site of AtNCED3 is an important cis-acting element in the induction of the dehydration response. We discuss the possible transcriptional regulatory system of dehydration-responsive AtNCED3 expression, and how this may control the level of ABA under water-deficit conditions. PMID:23604098
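To make the idea of locating a cis-acting element relative to the transcriptional start site concrete, here is a minimal sketch that scans a sequence for the G-box core 5'-CACGTG-3' and converts match positions to TSS-relative coordinates. The helper names, the toy sequence and the TSS index are hypothetical; the coordinate convention (+1 is the TSS, no position 0) is an assumption, and the study's own numbering may differ.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """0-based start indices of all forward-strand matches, overlaps included."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]


def relative_to_tss(index, tss_index):
    """Convert a 0-based index to TSS-relative coordinates (+1 = TSS, no 0)."""
    offset = index - tss_index
    return offset if offset < 0 else offset + 1


if __name__ == "__main__":
    promoter = "TTCACGTGAATTGGCCAATG"  # toy sequence, not AtNCED3
    tss_index = 18                     # hypothetical 0-based index of +1
    for start in find_motif(promoter, "CACGTG"):
        print("G-box core at", relative_to_tss(start, tss_index))
```

Because CACGTG is its own reverse complement, scanning one strand is sufficient for this motif; for non-palindromic motifs the reverse complement would be scanned as well.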
Control of Recombination Directionality by the Listeria Phage A118 Protein Gp44 and the Coiled-Coil Motif of Its Serine Integrase.

PubMed

Mandali, Sridhar; Gupta, Kushol; Dawson, Anthony R; Van Duyne, Gregory D; Johnson, Reid C

2017-06-01

The serine integrase of phage A118 catalyzes integrative recombination between attP on the phage and a specific attB locus on the chromosome of Listeria monocytogenes, but it is unable to promote excisive recombination between the hybrid attL and attR sites found on the integrated prophage without assistance by a recombination directionality factor (RDF). We have identified and characterized the phage-encoded RDF Gp44, which activates the A118 integrase for excision and inhibits integration. Gp44 binds to the C-terminal DNA binding domain of integrase, and we have localized the primary binding site to be within the mobile coiled-coil (CC) motif but distinct from the distal tip of the CC that is required for recombination. This interaction is sufficient to inhibit integration, but a second interaction involving the N-terminal end of Gp44 is also required to activate excision. We provide evidence that these two contacts modulate the trajectory of the CC motifs as they extend out from the integrase core in a manner dependent upon the identities of the four att sites. Our results support a model whereby Gp44 shapes the Int-bound complexes to control which att sites can synapse and recombine. IMPORTANCE Serine integrases mediate directional recombination between bacteriophage and bacterial chromosomes. These highly regulated site-specific recombination reactions are integral to the life cycle of temperate phage and, in the case of Listeria monocytogenes lysogenized by A118 family phage, are an essential virulence determinant. Serine integrases are also utilized as tools for genetic engineering and synthetic biology because of their exquisite unidirectional control of the DNA exchange reaction. Here, we identify and characterize the recombination directionality factor (RDF) that activates excision and inhibits integration reactions by the phage A118 integrase. We provide evidence that the A118 RDF binds to and modulates the trajectory of the long coiled-coil motif that
An internal regulatory element controls troponin I gene expression

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yutzey, K.E.; Kline, R.L.; Konieczny, S.F.

1989-04-01

During skeletal myogenesis, approximately 20 contractile proteins and related gene products temporally accumulate as the cells fuse to form multinucleated muscle fibers. In most instances, the contractile protein genes are regulated transcriptionally, which suggests that a common molecular mechanism may coordinate the expression of this diverse and evolutionarily unrelated gene set. Recent studies have examined the muscle-specific cis-acting elements associated with numerous contractile protein genes. All of the identified regulatory elements are positioned in the 5'-flanking regions, usually within 1,500 base pairs of the transcription start site. Surprisingly, a DNA consensus sequence that is common to each contractile protein gene has not been identified. In contrast to the results of these earlier studies, the authors have found that the 5'-flanking region of the quail troponin I (TnI) gene is not sufficient to permit the normal myofiber transcriptional activation of the gene. Instead, the TnI gene utilizes a unique internal regulatory element that is responsible for the correct myofiber-specific expression pattern associated with the TnI gene. This is the first example in which a contractile protein gene has been shown to rely primarily on an internal regulatory element to elicit transcriptional activation during myogenesis. The diversity of regulatory elements associated with the contractile protein genes suggests that the temporal expression of the genes may involve individual cis-trans regulatory components specific for each gene.
An internal regulatory element controls troponin I gene expression.

PubMed Central

Yutzey, K E; Kline, R L; Konieczny, S F

1989-01-01

During skeletal myogenesis, approximately 20 contractile proteins and related gene products temporally accumulate as the cells fuse to form multinucleated muscle fibers. In most instances, the contractile protein genes are regulated transcriptionally, which suggests that a common molecular mechanism may coordinate the expression of this diverse and evolutionarily unrelated gene set. Recent studies have examined the muscle-specific cis-acting elements associated with numerous contractile protein genes. All of the identified regulatory elements are positioned in the 5'-flanking regions, usually within 1,500 base pairs of the transcription start site. Surprisingly, a DNA consensus sequence that is common to each contractile protein gene has not been identified. In contrast to the results of these earlier studies, we have found that the 5'-flanking region of the quail troponin I (TnI) gene is not sufficient to permit the normal myofiber transcriptional activation of the gene. Instead, the TnI gene utilizes a unique internal regulatory element that is responsible for the correct myofiber-specific expression pattern associated with the TnI gene. This is the first example in which a contractile protein gene has been shown to rely primarily on an internal regulatory element to elicit transcriptional activation during myogenesis. The diversity of regulatory elements associated with the contractile protein genes suggests that the temporal expression of the genes may involve individual cis-trans regulatory components specific for each gene. PMID:2725509
Identification of promoter motifs regulating ZmeIF4E expression level involved in maize rough dwarf disease resistance in maize (Zea mays L.).

PubMed

Shi, Liyu; Weng, Jianfeng; Liu, Changlin; Song, Xinyuan; Miao, Hongqin; Hao, Zhuanfang; Xie, Chuanxiao; Li, Mingshun; Zhang, Degui; Bai, Li; Pan, Guangtang; Li, Xinhai; Zhang, Shihuang

2013-04-01

Maize rough dwarf disease (MRDD), a viral disease, causes significant grain yield losses, but its genetic basis is largely unknown. Based on comparative genomics, eukaryotic translation initiation factor 4E (eIF4E) was considered a candidate gene for MRDD resistance; validating it will help clarify the possible genetic mechanism of this disease. ZmeIF4E (the maize ortholog of eIF4E) has five exons and encodes a protein of 218 amino acids, and no variation in the cDNA sequence was identified between the resistant inbred line X178 and the susceptible line Ye478. ZmeIF4E expression differed between plants of the two lines treated with three plant hormones (ethylene, salicylic acid, and jasmonates) at the V3 developmental stage, suggesting that ZmeIF4E is likely to be involved in the regulation of defense gene expression and the induction of local and systemic resistance. Moreover, four cis-acting elements related to plant defense responses (DOFCOREZM, EECCRCAH1, GT1GAMSCAM4, and GT1CONSENSUS) were detected in the ZmeIF4E promoter and harbored sequence variation between the two lines. Association analysis with 163 inbred lines revealed that one SNP in EECCRCAH1 is significantly associated with CSI of MRDD in two environments, explaining 3.33 and 9.04 % of phenotypic variation, respectively. Meanwhile, one SNP in the GT-1 motif was found to affect MRDD resistance in only one of the two environments, explaining 5.17 % of phenotypic variation. Collectively, the regulatory motifs harboring the two significant SNPs in the ZmeIF4E promoter could be involved in the defense process of maize after viral infection. These results contribute to understanding maize defense mechanisms against maize rough dwarf virus.
A Lettuce (Lactuca sativa) Homolog of Human Nogo-B Receptor Interacts with cis-Prenyltransferase and Is Necessary for Natural Rubber Biosynthesis*

PubMed Central

Qu, Yang; Chakrabarty, Romit; Tran, Hue T.; Kwon, Eun-Joo G.; Kwon, Moonhyuk; Nguyen, Trinh-Don; Ro, Dae-Kyun

2015-01-01

Natural rubber (cis-1,4-polyisoprene) is an indispensable biopolymer used to manufacture diverse consumer products. Although a major source of natural rubber is the rubber tree (Hevea brasiliensis), lettuce (Lactuca sativa) is also known to synthesize natural rubber. Here, we report that an unusual cis-prenyltransferase-like 2 (CPTL2) that lacks the conserved motifs of conventional cis-prenyltransferase is required for natural rubber biosynthesis in lettuce. CPTL2, identified from the lettuce rubber particle proteome, displays homology to a human NogoB receptor and is predominantly expressed in latex. Multiple transgenic lettuces expressing CPTL2-RNAi constructs showed that a decrease of CPTL2 transcripts (3–15% CPTL2 expression relative to controls) coincided with the reduction of natural rubber as low as 5%. We also identified a conventional cis-prenyltransferase 3 (CPT3), exclusively expressed in latex. In subcellular localization studies using fluorescent proteins, cytosolic CPT3 was relocalized to endoplasmic reticulum by co-occurrence of CPTL2 in tobacco and yeast at the log phase. Furthermore, yeast two-hybrid data showed that CPTL2 and CPT3 interact. Yeast microsomes containing CPTL2/CPT3 showed enhanced synthesis of short cis-polyisoprenes, but natural rubber could not be synthesized in vitro. Intriguingly, a homologous pair CPTL1/CPT1, which displays ubiquitous expressions in lettuce, showed a potent dolichol biosynthetic activity in vitro. Taken together, our data suggest that CPTL2 is a scaffolding protein that tethers CPT3 on endoplasmic reticulum and is necessary for natural rubber biosynthesis in planta, but yeast-expressed CPTL2 and CPT3 alone could not synthesize high molecular weight natural rubber in vitro. PMID:25477521
Rapid motif compliance scoring with match weight sets.

PubMed

Venezia, D; O'Hara, P J

1993-02-01

Most current implementations of motif matching in biological sequences have sacrificed the generality of weight matrix scoring for shorter runtimes. The program MOTIF incorporates a weight matrix and a rapid, backtracking tree-search algorithm to score motif compliance with greatly enhanced performance while placing no constraints on the motif. In addition, any positions within a motif can be marked as 'inviolate', thereby requiring an exact match. MOTIF allows a choice of regular expression formats and can use both motif and sequence libraries as either targets or queries. Nucleic acid sequences can optionally be translated by MOTIF in any frame(s) and used against peptide motifs.
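The scoring idea in this abstract can be sketched as follows: each motif position carries a table of weights, a window's score is the sum of the weights of its residues, and positions marked as inviolate must match exactly. This is a plain sliding-window sketch, not the backtracking tree-search algorithm the MOTIF program is described as using; the function names, weight values, and threshold are all hypothetical.

```python
def score_window(window, weights, inviolate=None):
    """Sum per-position weights; return None if an inviolate position mismatches.

    weights   -- list of dicts, one per motif position, mapping residue -> weight
    inviolate -- dict mapping position index -> residue that must match exactly
    """
    inviolate = inviolate or {}
    total = 0.0
    for i, residue in enumerate(window):
        if i in inviolate and residue != inviolate[i]:
            return None  # exact match required at this position
        total += weights[i].get(residue, 0.0)  # unlisted residues score 0
    return total


def scan(sequence, weights, threshold, inviolate=None):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(weights)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(sequence[start:start + width], weights, inviolate)
        if score is not None and score >= threshold:
            hits.append((start, score))
    return hits


if __name__ == "__main__":
    # Hypothetical 3-position motif; the middle position must be C.
    weights = [{"A": 2, "G": 1}, {"C": 2}, {"G": 2, "T": 1}]
    print(scan("TACGGCTAAG", weights, threshold=4, inviolate={1: "C"}))
```

A tree search of the kind the abstract mentions would prune candidate windows early instead of scoring every position of every window, which is where the reported speed-up would come from.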
Phosphatidic Acid Sequesters Sec18p from cis-SNARE Complexes to Inhibit Priming.

PubMed

Starr, Matthew L; Hurst, Logan R; Fratti, Rutilio A

2016-10-01

Yeast vacuole fusion requires the activation of cis-SNARE complexes through priming carried out by Sec18p/N-ethylmaleimide sensitive factor and Sec17p/α-SNAP. The association of Sec18p with vacuolar cis-SNAREs is regulated in part by phosphatidic acid (PA) phosphatase production of diacylglycerol (DAG). Inhibition of PA phosphatase activity blocks the transfer of membrane-associated Sec18p to SNAREs. Thus, we hypothesized that Sec18p associates with PA-rich membrane microdomains before transferring to cis-SNARE complexes upon PA phosphatase activity. Here, we examined the direct binding of Sec18p to liposomes containing PA or DAG. We found that Sec18p preferentially bound to liposomes containing PA compared with those containing DAG by approximately fivefold. Additionally, using a specific PA-binding domain blocked Sec18p binding to PA-liposomes and displaced endogenous Sec18p from isolated vacuoles. Moreover, the direct addition of excess PA blocked the priming activity of isolated vacuoles in a manner similar to chemically inhibiting PA phosphatase activity. These data suggest that the conversion of PA to DAG facilitates the recruitment of Sec18p to cis-SNAREs. Purified vacuoles from yeast lacking the PA phosphatase Pah1p showed reduced Sec18p association with cis-SNAREs and complementation with plasmid-encoded PAH1 or recombinant Pah1p restored the interaction. Taken together, this demonstrates that regulating PA concentrations by Pah1p activity controls SNARE priming by Sec18p. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Dual role of K(ATP) channel C-terminal motif in membrane targeting and metabolic regulation.

PubMed

Kline, Crystal F; Kurata, Harley T; Hund, Thomas J; Cunha, Shane R; Koval, Olha M; Wright, Patrick J; Christensen, Matthew; Anderson, Mark E; Nichols, Colin G; Mohler, Peter J

2009-09-29

The coordinated sorting of ion channels to specific plasma membrane domains is necessary for excitable cell physiology. K(ATP) channels, assembled from pore-forming (Kir6.x) and regulatory sulfonylurea receptor subunits, are critical electrical transducers of the metabolic state of excitable tissues, including skeletal and smooth muscle, heart, brain, kidney, and pancreas. Here we show that the C-terminal domain of Kir6.2 contains a motif conferring membrane targeting in primary excitable cells. Kir6.2 lacking this motif displays aberrant channel targeting due to loss of association with the membrane adapter ankyrin-B (AnkB). Moreover, we demonstrate that this Kir6.2 C-terminal AnkB-binding motif (ABM) serves a dual role in K(ATP) channel trafficking and membrane metabolic regulation and dysfunction in these pathways results in human excitable cell disease. Thus, the K(ATP) channel ABM serves as a previously unrecognized bifunctional touch-point for grading K(ATP) channel gating and membrane targeting and may play a fundamental role in controlling excitable cell metabolic regulation.
Piecing together cis-regulatory networks: insights from epigenomics studies in plants.

PubMed

Huang, Shao-Shan C; Ecker, Joseph R

2018-05-01

5-Methylcytosine, a chemical modification of DNA, is a covalent modification found in the genomes of both plants and animals. Epigenetic inheritance of phenotypes mediated by DNA methylation is well established in plants. Most of the known mechanisms of establishing, maintaining and modifying DNA methylation have been worked out in the reference plant Arabidopsis thaliana. Major functions of DNA methylation in plants include regulation of gene expression and silencing of transposable elements (TEs) and repetitive sequences, both of which have parallels in mammalian biology, involve interaction with the transcriptional machinery, and may have profound effects on the regulatory networks in the cell. Methylome and transcriptome dynamics have been investigated in development and environmental responses in Arabidopsis and agriculturally and ecologically important plants, revealing the interdependent relationship among genomic context, methylation patterns, and expression of TE and protein coding genes. Analyses of methylome variation among plant natural populations and species have begun to quantify the extent of genetic control of methylome variation vs. true epimutation, and model the evolutionary forces driving methylome evolution in both short and long time scales. The ability of DNA methylation to positively or negatively modulate binding affinity of transcription factors (TFs) provides a natural link from genome sequence and methylation changes to transcription. Technologies that allow systematic determination of methylation sensitivities of TFs, in native genomic and methylation context without confounding factors such as histone modifications, will provide baseline datasets for building cell-type- and individual-specific regulatory networks that underlie the establishment and inheritance of complex traits. This article is categorized under: Laboratory Methods and Technologies > Genetic/Genomic Methods Biological Mechanisms > Regulatory Biology. © 2017 Wiley
Nucleolin Mediates MicroRNA-directed CSF-1 mRNA Deadenylation but Increases Translation of CSF-1 mRNA*

PubMed Central

Woo, Ho-Hyung; Baker, Terri; Laszlo, Csaba; Chambers, Setsuko K.

2013-01-01

CSF-1 mRNA 3′UTR contains multiple unique motifs, including a common microRNA (miRNA) target in close proximity to a noncanonical G-quadruplex and AU-rich elements (AREs). Using a luciferase reporter system fused to CSF-1 mRNA 3′UTR, disruption of the miRNA target region, G-quadruplex, and AREs together dramatically increased reporter RNA levels, suggesting important roles for these cis-acting regulatory elements in the down-regulation of CSF-1 mRNA. We find that nucleolin, which binds both G-quadruplex and AREs, enhances deadenylation of CSF-1 mRNA, promoting CSF-1 mRNA decay, while having the capacity to increase translation of CSF-1 mRNA. Through interaction with the CSF-1 3′UTR miRNA common target, we find that miR-130a and miR-301a inhibit CSF-1 expression by enhancing mRNA decay. Silencing of nucleolin prevents the miRNA-directed mRNA decay, indicating a requirement for nucleolin in miRNA activity on CSF-1 mRNA. Downstream effects followed by miR-130a and miR-301a inhibition of directed cellular motility of ovarian cancer cells were found to be dependent on nucleolin. The paradoxical effects of nucleolin on miRNA-directed CSF-1 mRNA deadenylation and on translational activation were explored further. The nucleolin protein contains four acidic stretches, four RNA recognition motifs (RRMs), and nine RGG repeats. All three domains in nucleolin regulate CSF-1 mRNA and protein levels. RRMs increase CSF-1 mRNA, whereas the acidic and RGG domains decrease CSF-1 protein levels. This suggests that nucleolin has the capacity to differentially regulate both CSF-1 RNA and protein levels. Our finding that nucleolin interacts with Ago2 indirectly via RNA and with poly(A)-binding protein C (PABPC) directly suggests a nucleolin-Ago2-PABPC complex formation on mRNA. This complex is in keeping with our suggestion that nucleolin may work with PABPC as a double-edged sword on both mRNA deadenylation and translational activation. Our findings underscore the complexity of
Evaluating cis-2,6-Dimethylpiperidide (cis-DMP) as a Base Component in Lithium-Mediated Zincation Chemistry

PubMed Central

Armstrong, David R; Garden, Jennifer A; Kennedy, Alan R; Leenhouts, Sarah M; Mulvey, Robert E; O'Keefe, Philip; O'Hara, Charles T; Steven, Alan

2013-01-01

Most recent advances in metallation chemistry have centred on the bulky secondary amide 2,2,6,6-tetramethylpiperidide (TMP) within mixed metal, often ate, compositions. However, the precursor amine TMP(H) is rather expensive so a cheaper substitute would be welcome. Thus this study was aimed towards developing cheaper non-TMP based mixed-metal bases and, as cis-2,6-dimethylpiperidide (cis-DMP) was chosen as the alternative amide, developing cis-DMP zincate chemistry which has received meagre attention compared to that of its methyl-rich counterpart TMP. A new lithium diethylzincate, [(TMEDA)LiZn(cis-DMP)Et2] (TMEDA=N,N,N′,N′-tetramethylethylenediamine) has been synthesised by co-complexation of Li(cis-DMP), Et2Zn and TMEDA, and characterised by NMR (including DOSY) spectroscopy and X-ray crystallography, which revealed a dinuclear contact ion pair arrangement. By using N,N-diisopropylbenzamide as a test aromatic substrate, the deprotonative reactivity of [(TMEDA)LiZn(cis-DMP)Et2] has been probed and contrasted with that of the known but previously uninvestigated di-tert-butylzincate, [(TMEDA)LiZn(cis-DMP)tBu2]. The former was found to be the superior base (for example, producing the ortho-deuteriated product in respective yields of 78 % and 48 % following D2O quenching of zincated benzamide intermediates). An 88 % yield of 2-iodo-N,N-diisopropylbenzamide was obtained on reaction of two equivalents of the diethylzincate with the benzamide followed by iodination. Comparisons are also drawn using 1,1,1,3,3,3-hexamethyldisilazide (HMDS), diisopropylamide and TMP as the amide component in the lithium amide, Et2Zn and TMEDA system. Under certain conditions, the cis-DMP base system was found to give improved results in comparison to HMDS and diisopropylamide (DA), and comparable results to a TMP system. Two novel complexes isolated from reactions of the di-tert-butylzincate and crystallographically characterised, namely the pre-metallation complex [{(iPr)2N(Ph)C=O}LiZn(cis
Regulatory Divergence between Parental Alleles Determines Gene Expression Patterns in Hybrids

PubMed Central

Combes, Marie-Christine; Hueber, Yann; Dereeper, Alexis; Rialle, Stéphanie; Herrera, Juan-Carlos; Lashermes, Philippe

2015-01-01

Both hybridization and allopolyploidization generate novel phenotypes by conciliating divergent genomes and regulatory networks in the same cellular context. To understand the rewiring of gene expression in hybrids, the total expression of 21,025 genes and the allele-specific expression of over 11,000 genes were quantified in interspecific hybrids and their parental species, Coffea canephora and Coffea eugenioides using RNA-seq technology. Between parental species, cis- and trans-regulatory divergences affected around 32% and 35% of analyzed genes, respectively, with nearly 17% of them showing both. The relative importance of trans-regulatory divergences between both species could be related to their low genetic divergence and perennial habit. In hybrids, among divergently expressed genes between parental species and hybrids, 77% was expressed like one parent (expression level dominance), including 65% like C. eugenioides. Gene expression was shown to result from the expression of both alleles affected by intertwined parental trans-regulatory factors. A strong impact of C. eugenioides trans-regulatory factors on the upregulation of C. canephora alleles was revealed. The gene expression patterns appeared determined by complex combinations of cis- and trans-regulatory divergences. In particular, the observed biased expression level dominance seemed to be derived from the asymmetric effects of trans-regulatory parental factors on regulation of alleles. More generally, this study illustrates the effects of divergent trans-regulatory parental factors on the gene expression pattern in hybrids. The characteristics of the transcriptional response to hybridization appear to be determined by the compatibility of gene regulatory networks and therefore depend on genetic divergences between the parental species and their evolutionary history. PMID:25819221
Statistical tests to compare motif count exceptionalities

PubMed Central

Robin, Stéphane; Schbath, Sophie; Vandewalle, Vincent

2007-01-01

Background: Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Just comparing the motif count p-values in each sequence is indeed not sufficient to decide if this motif is significantly more exceptional in one sequence compared to the other one. A statistical test is required. Results: We develop and analyze two statistical tests, an exact binomial one and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops. Conclusion: The exact binomial test is particularly adapted for small counts. For large counts, we advise using the likelihood ratio test, which is asymptotic but strongly correlated with the exact binomial test and very simple to use. PMID:17346349
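One way to picture the exact binomial test: if the motif counts in the two sequences are modeled as independent Poisson variables, then, conditional on the total count, the count in the first sequence is binomial. Under the null hypothesis of equal exceptionality, its success probability is the first sequence's share of the expected counts. The sketch below assumes the expected counts are supplied by the caller (for example from a sequence composition model) and does not reproduce the paper's handling of overlapping motifs; all names and numbers are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def compare_exceptionality(n1, n2, expected1, expected2):
    """One-sided p-value that the motif is more exceptional in sequence 1.

    n1, n2               -- observed motif counts in the two sequences
    expected1, expected2 -- expected counts under the background model
    """
    p0 = expected1 / (expected1 + expected2)  # null share of sequence 1
    return binomial_upper_tail(n1, n1 + n2, p0)


if __name__ == "__main__":
    # Hypothetical counts: 12 vs 3 observed, with equal expectations.
    print(compare_exceptionality(12, 3, expected1=5.0, expected2=5.0))
```

Summing the exact tail keeps the test valid for the small counts the abstract singles out; for large counts an asymptotic test such as the likelihood ratio test avoids the long sum.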
[Regulatory effect and mechanism of RNA binding motif protein 38 on the expression of progesterone receptor in human breast cancer ZR-75-1 cells].

PubMed

Lou, P P; Li, C L; Xia, T S; Shi, L; Wu, J; Zhou, X J; Wang, Y; Ding, Q

2016-06-23

To investigate the regulatory mechanism of RNA binding motif protein 38 (RNPC1) on the expression of progesterone receptor (PR) in breast cancer cell line ZR-75-1. Lentiviral vector was used to induce overexpression of RNPC1 in ZR-75-1 cells. qRT-PCR and Western blot were used to assess the regulatory effect of RNPC1 on PR expression. Actinomycin was used to detect the regulatory mechanism involved. Immunohistochemical (IHC) staining was used to determine the protein expression of RNPC1 and PR in 80 breast cancer tissues. IHC staining showed that the expression of RNPC1 was significantly higher in the PR positive breast cancer tissues than that in the PR negative breast cancer tissues (P<0.05). The qRT-PCR results showed that overexpression of RNPC1 in ZR-75-1 cells significantly upregulated the mRNA level of PR (1.764±0.028 vs. 1.001±0.037, P<0.01), whereas knockdown of RNPC1 did the opposite (0.579±0.007 vs. 1.000±0.002, P<0.01). The Western blot results also showed that overexpression of RNPC1 up-regulated PR levels, while knockdown of RNPC1 resulted in down-regulation of PR levels in the ZR-75-1 cells. The actinomycin assay showed that overexpression of RNPC1 increased the mRNA stability of PR. The half-life of PR mRNA was increased from 4.0 h to 6.5 h. Knockdown of RNPC1 decreased the mRNA stability of PR and the half-life of PR transcript was decreased from 4.1 h to 3.0 h. RNPC1 plays a crucial role in regulating the expression of PR in breast cancer ZR-75-1 cells.
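To put the reported half-lives in perspective, the sketch below converts a half-life into a decay constant and into the fraction of transcript remaining at a given time after transcription is blocked. It assumes simple first-order (exponential) decay, which the abstract does not state explicitly; the function names and the 6-hour time point are illustrative choices.

```python
import math


def decay_constant(half_life_h):
    """First-order decay constant (per hour) from a half-life in hours."""
    return math.log(2) / half_life_h


def fraction_remaining(t_h, half_life_h):
    """Fraction of mRNA left after t_h hours, assuming exponential decay."""
    return 0.5 ** (t_h / half_life_h)


if __name__ == "__main__":
    # Half-lives reported in the abstract (hours).
    for label, half_life in [("control", 4.0), ("RNPC1 overexpression", 6.5),
                             ("knockdown control", 4.1), ("RNPC1 knockdown", 3.0)]:
        print(f"{label}: k = {decay_constant(half_life):.3f} /h, "
              f"remaining at 6 h = {fraction_remaining(6.0, half_life):.2f}")
```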
The CGTCA sequence motif is essential for biological activity of the vasoactive intestinal peptide gene cAMP-regulated enhancer.

PubMed Central

Fink, J S; Verhave, M; Kasper, S; Tsukada, T; Mandel, G; Goodman, R H

1988-01-01

cAMP-regulated transcription of the human vasoactive intestinal peptide gene is dependent upon a 17-base-pair DNA element located 70 base pairs upstream from the transcriptional initiation site. This element is similar to sequences in other genes known to be regulated by cAMP and to sequences in several viral enhancers. We have demonstrated that the vasoactive intestinal peptide regulatory element is an enhancer that depends upon the integrity of two CGTCA sequence motifs for biological activity. Mutations in either of the CGTCA motifs diminish the ability of the element to respond to cAMP. Enhancers containing the CGTCA motif from the somatostatin and adenovirus genes compete for binding of nuclear proteins from C6 glioma and PC12 cells to the vasoactive intestinal peptide enhancer, suggesting that CGTCA-containing enhancers interact with similar transacting factors. PMID:2842787

footprintDB: a database of transcription factors with annotated cis elements and binding interfaces.

PubMed

Sebastian, Alvaro; Contreras-Moreira, Bruno

2014-01-15

Traditional and high-throughput techniques for determining transcription factor (TF) binding specificities are generating large volumes of data of uneven quality, which are scattered across individual databases. FootprintDB integrates some of the most comprehensive freely available libraries of curated DNA binding sites and systematically annotates the binding interfaces of the corresponding TFs. The first release contains 2422 unique TF sequences, 10 112 DNA binding sites and 3662 DNA motifs. A survey of the included data sources, organisms and TF families was performed together with proprietary database TRANSFAC, finding that footprintDB has a similar coverage of multicellular organisms, while also containing bacterial regulatory data. A search engine has been designed that drives the prediction of DNA motifs for input TFs, or conversely of TF sequences that might recognize input regulatory sequences, by comparison with database entries. Such predictions can also be extended to a single proteome chosen by the user, and results are ranked in terms of interface similarity. Benchmark experiments with bacterial, plant and human data were performed to measure the predictive power of footprintDB searches, which were able to correctly recover 10, 55 and 90% of the tested sequences, respectively. Correctly predicted TFs had a higher interface similarity than the average, confirming its diagnostic value. Web site implemented in PHP, Perl, MySQL and Apache. Freely available from http://floresta.eead.csic.es/footprintdb.
Discriminative motif optimization based on perceptron training

PubMed Central

Patel, Ronak Y.; Stormo, Gary D.

2014-01-01

Motivation: Generating accurate transcription factor (TF) binding site motifs from data generated using next-generation sequencing, especially ChIP-seq, is challenging. The challenge arises because a typical experiment reports a large number of sequences bound by a TF, and the length of each sequence is relatively long. Most traditional motif finders are slow in handling such enormous amounts of data. To overcome this limitation, tools have been developed that trade accuracy for speed by using heuristic discrete search strategies or limited optimization of identified seed motifs. However, such strategies may not fully use the information in input sequences to generate motifs. Such motifs often form good seeds and can be further improved with appropriate scoring functions and rapid optimization. Results: We report a tool named discriminative motif optimizer (DiMO). DiMO takes a seed motif along with a positive and a negative database and improves the motif based on a discriminative strategy. We use area under receiver-operating characteristic curve (AUC) as a measure of discriminating power of motifs and a strategy based on perceptron training that maximizes AUC rapidly in a discriminative manner. Using DiMO, on a large test set of 87 TFs from human, drosophila and yeast, we show that it is possible to significantly improve motifs identified by nine motif finders. The motifs are generated/optimized using training sets and evaluated on test sets. The AUC is improved for almost 90% of the TFs on test sets and the magnitude of increase is up to 39%. Availability and implementation: DiMO is available at http://stormo.wustl.edu/DiMO Contact: rpatel@genetics.wustl.edu, ronakypatel@gmail.com PMID:24369152
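To make the AUC criterion in the abstract above concrete, here is a minimal sketch of how a motif's discriminating power might be measured. It is not DiMO's implementation: the weight-matrix representation (a list of per-position dictionaries), the best-window scoring rule, and both function names are assumptions for illustration, and the perceptron update itself is not shown.

```python
def best_window_score(seq, pwm):
    """Score a sequence as the best-scoring window under a weight matrix.

    pwm is a list of dicts mapping each base to a weight (hypothetical layout).
    Assumes len(seq) >= len(pwm).
    """
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )

def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sequence scores
    higher than a randomly chosen negative one, counting ties as one half.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

Scoring every sequence in a positive and a negative set with `best_window_score` and passing the two score lists to `auc` gives a single number between 0 and 1; a discriminative optimizer would adjust the weights to push that number up.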
Unique ATPase site architecture triggers cis-mediated synchronized ATP binding in heptameric AAA+-ATPase domain of flagellar regulatory protein FlrC.

PubMed

Dey, Sanjay; Biswas, Maitree; Sen, Udayaditya; Dasgupta, Jhimli

2015-04-03

Bacterial enhancer-binding proteins (bEBPs) oligomerize through AAA(+) domains and use ATP hydrolysis-driven energy to isomerize the RNA polymerase-σ(54) complex during transcriptional initiation. Here, we describe the first structure of the central AAA(+) domain of the flagellar regulatory protein FlrC (FlrC(C)), a bEBP that controls flagellar synthesis in Vibrio cholerae. Our results showed that FlrC(C) forms a heptamer both in nucleotide (Nt)-free and -bound states without ATP-dependent subunit remodeling. Unlike the bEBPs such as NtrC1 or PspF, a novel cis-mediated "all or none" ATP binding occurs in the heptameric FlrC(C), because constriction at the ATPase site, caused by loop L3 and helix α7, restricts the proximity of the trans-protomer required for Nt binding. A unique "closed to open" movement of Walker A, assisted by trans-acting "Glu switch" Glu-286, facilitates ATP binding and hydrolysis. Fluorescence quenching and ATPase assays on FlrC(C) and mutants revealed that although Arg-349 of sensor II, positioned by trans-acting Glu-286 and Tyr-290, acts as a key residue to bind and hydrolyze ATP, Arg-319 of α7 anchors ribose and controls the rate of ATP hydrolysis by retarding the expulsion of ADP. The heptameric state of FlrC(C) is restored in solution even with the transition state mimicking ADP·AlF3. Structural results and pulldown assays indicated that L3 renders an in-built geometry to L1 and L2 causing σ(54)-FlrC(C) interaction independent of Nt binding. Collectively, our results underscore a novel mechanism of ATP binding and σ(54) interaction that helps to explain the transcriptional mechanism of the bEBPs, which probably interact directly with the RNA polymerase-σ(54) complex without DNA looping. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Beyond Atg8 binding: The role of AIM/LIR motifs in autophagy.

PubMed

Fracchiolla, Dorotea; Sawa-Makarska, Justyna; Martens, Sascha

2017-05-04

Selective macroautophagy/autophagy mediates the selective delivery of cytoplasmic cargo material via autophagosomes into the lytic compartment for degradation. This selectivity is mediated by cargo receptor molecules that link the cargo to the phagophore (the precursor of the autophagosome) membrane via their simultaneous interaction with the cargo and Atg8 proteins on the membrane. Atg8 proteins are attached to membrane in a conjugation reaction and the cargo receptors bind them via short peptide motifs called Atg8-interacting motifs/LC3-interacting regions (AIMs/LIRs). We have recently shown for the yeast Atg19 cargo receptor that the AIM/LIR motifs also serve to recruit the Atg12-Atg5-Atg16 complex, which stimulates Atg8 conjugation, to the cargo. We could further show in a reconstituted system that the recruitment of the Atg12-Atg5-Atg16 complex is sufficient for cargo-directed Atg8 conjugation. Our results suggest that AIM/LIR motifs could have more general roles in autophagy.
9-Cis-Retinoic Acid Induces Growth Inhibition in Retinoid-Sensitive Breast Cancer and Sea Urchin Embryonic Cells via Retinoid X Receptor α and Replication Factor C3

PubMed Central

Maeng, Sejung; Kim, Gil Jung; Choi, Eun Ju; Yang, Hyun Ok; Lee, Dong-Sup

2012-01-01

There is widespread interest in defining factors and mechanisms that suppress the proliferation of cancer cells. Retinoic acid (RA) is a potent suppressor of mammary cancer and developmental embryonic cell proliferation. However, the molecular mechanisms by which 9-cis-RA signaling induces growth inhibition in RA-sensitive breast cancer and embryonic cells are not apparent. Here, we provide evidence that the inhibitory effect of 9-cis-RA on cell proliferation depends on 9-cis-RA-dependent interaction of retinoid X receptor α (RXRα) with replication factor C3 (RFC3), which is a subunit of the RFC heteropentamer that opens and closes the circular proliferating cell nuclear antigen (PCNA) clamp on DNA. An RFC3 ortholog in a sea urchin cDNA library was isolated by using the ligand-binding domain of RXRα as bait in a yeast two-hybrid screening. The interaction of RFC3 with RXRα depends on 9-cis-RA and bexarotene, but not on all-trans-RA or an RA receptor (RAR)-selective ligand. Truncation and mutagenesis experiments demonstrated that the C-terminal LXXLL motifs in both human and sea urchin RFC3 are critical for the interaction with RXRα. The transient interaction between 9-cis-RA-activated RXRα and RFC3 resulted in reconfiguration of the PCNA-RFC complex. Furthermore, we found that knockdown of RXRα or overexpression of RFC3 impairs the ability of 9-cis-RA to inhibit proliferation of MCF-7 breast cancer cells and sea urchin embryogenesis. Our results indicate that 9-cis-RA-activated RXRα suppresses the growth of RA-sensitive breast cancer and embryonic cells through RFC3. PMID:22949521
Trans-acting translational regulatory RNA binding proteins.

PubMed

Harvey, Robert F; Smith, Tom S; Mulroney, Thomas; Queiroz, Rayner M L; Pizzinga, Mariavittoria; Dezi, Veronica; Villenueva, Eneko; Ramakrishna, Manasa; Lilley, Kathryn S; Willis, Anne E

2018-05-01

The canonical molecular machinery required for global mRNA translation and its control has been well defined, with distinct sets of proteins involved in the processes of translation initiation, elongation and termination. Additionally, noncanonical, trans-acting regulatory RNA-binding proteins (RBPs) are necessary to provide mRNA-specific translation, and these interact with 5' and 3' untranslated regions and coding regions of mRNA to regulate ribosome recruitment and transit. Recently it has also been demonstrated that trans-acting ribosomal proteins direct the translation of specific mRNAs. Importantly, it has been shown that subsets of RBPs often work in concert, forming distinct regulatory complexes upon different cellular perturbation, creating an RBP combinatorial code, which through the translation of specific subsets of mRNAs, dictate cell fate. With the development of new methodologies, a plethora of novel RNA binding proteins have recently been identified, although the function of many of these proteins within mRNA translation is unknown. In this review we will discuss these methodologies and their shortcomings when applied to the study of translation, which need to be addressed to enable a better understanding of trans-acting translational regulatory proteins. Moreover, we discuss the protein domains that are responsible for RNA binding as well as the RNA motifs to which they bind, and the role of trans-acting ribosomal proteins in directing the translation of specific mRNAs. This article is categorized under: RNA Interactions with Proteins and Other Molecules > RNA-Protein Complexes Translation > Translation Regulation Translation > Translation Mechanisms. © 2018 Medical Research Council and University of Cambridge. WIREs RNA published by Wiley Periodicals, Inc.
Development of five digits is controlled by a bipartite long-range cis-regulator.

PubMed

Lettice, Laura A; Williamson, Iain; Devenney, Paul S; Kilanowski, Fiona; Dorin, Julia; Hill, Robert E

2014-04-01

Conservation within intergenic DNA often highlights regulatory elements that control gene expression from a long range. How conservation within a single element relates to regulatory information and how internal composition relates to function is unknown. Here, we examine the structural features of the highly conserved ZRS (also called MFCS1) cis-regulator responsible for the spatiotemporal control of Shh in the limb bud. By systematically dissecting the ZRS, both in transgenic assays and within the endogenous locus, we show that the ZRS is, in effect, composed of two distinct domains of activity: one domain directs spatiotemporal activity but functions predominantly from a short range, whereas a second domain is required to promote long-range activity. We show further that these two domains encode activities that are highly integrated and that the second domain is crucial in promoting the chromosomal conformational changes correlated with gene activity. During limb bud development, these activities encoded by the ZRS are interpreted differently by the fore limbs and the hind limbs; in the absence of the second domain there is no Shh activity in the fore limb, and in the hind limb low levels of Shh lead to a variant digit pattern ranging from two to four digits. Hence, in the embryo, the second domain stabilises the developmental programme providing a buffer for SHH morphogen activity and this ensures that five digits form in both sets of limbs. PMID:24715461
Identification of 4-oxo-13-cis-retinoic acid as the major metabolite of 13-cis-retinoic acid in human blood.

PubMed

Vane, F M; Buggé, C J

1981-01-01

The metabolites of 13-cis-retinoic acid (Accutane) were investigated in blood samples from human volunteers on chronic treatment for dermatological disorders. The major metabolite was isolated by reverse-phase high-pressure liquid chromatography and identified as 4-oxo-13-cis-retinoic acid by comparison of its mass and NMR spectra to the spectra of the reference compound. 4-Oxo-all-trans-retinoic acid was also identified, but the extent to which this compound was a metabolite of 13-cis-retinoic acid or an artifactual isomerization product of the major metabolite is unknown. Chromatographic data suggested that small amounts of 13-cis-retinoic acid, 4-hydroxy-13-cis-retinoic acid, and dioxygenated metabolites of 13-cis-retinoic acid may also be present in the blood. This study indicates that a major metabolic pathway of 13-cis-retinoic acid in humans is oxidation at C4 of the cyclohexenyl group.
RSAT: regulatory sequence analysis tools.

PubMed

Thomas-Chollier, Morgane; Sand, Olivier; Turatsinze, Jean-Valéry; Janky, Rekin's; Defrance, Matthieu; Vervisch, Eric; Brohée, Sylvain; van Helden, Jacques

2008-07-01

The regulatory sequence analysis tools (RSAT, http://rsat.ulb.ac.be/rsat/) is a software suite that integrates a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. The suite includes programs for sequence retrieval, pattern discovery, phylogenetic footprint detection, pattern matching, genome scanning and feature map drawing. Random controls can be performed with random gene selections or by generating random sequences according to a variety of background models (Bernoulli, Markov). Beyond the original word-based pattern-discovery tools (oligo-analysis and dyad-analysis), we recently added a battery of tools for matrix-based detection of cis-acting elements, with some original features (adaptive background models, Markov-chain estimation of P-values) that do not exist in other matrix-based scanning tools. The web server offers an intuitive interface, where each program can be accessed either separately or connected to the other tools. In addition, the tools are now available as web services, enabling their integration in programmatic workflows. Genomes are regularly updated from various genome repositories (NCBI and EnsEMBL) and 682 organisms are currently supported. Since 1998, the tools have been used by several hundreds of researchers from all over the world. Several predictions made with RSAT were validated experimentally and published.
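The random-sequence controls mentioned in the abstract above rely on background models. As a minimal sketch of the Markov case (not RSAT's own code or interface), the snippet below estimates first-order transition probabilities from a reference sequence and samples a random sequence from them. The function names, the pseudocount of 1, and the uniform choice of the first base are illustrative assumptions.

```python
import random

def estimate_markov1(seq, alphabet="ACGT", pseudo=1.0):
    """Estimate first-order transition probabilities P(next base | current base).

    A pseudocount keeps unseen transitions from getting probability zero.
    """
    counts = {a: {b: pseudo for b in alphabet} for a in alphabet}
    for current, nxt in zip(seq, seq[1:]):
        counts[current][nxt] += 1
    return {
        a: {b: c / sum(row.values()) for b, c in row.items()}
        for a, row in counts.items()
    }

def random_sequence(trans, length, rng, alphabet="ACGT"):
    """Sample a sequence of the given length (assumed >= 1) from the model.

    The first base is drawn uniformly; each later base depends on the previous one.
    """
    seq = [rng.choice(alphabet)]
    while len(seq) < length:
        row = trans[seq[-1]]
        seq.append(rng.choices(alphabet, weights=[row[b] for b in alphabet])[0])
    return "".join(seq)
```

A Bernoulli background is the special case in which every row of the transition table is the same, so each base is drawn independently of its predecessor. Passing a seeded `random.Random` instance as `rng` keeps a control set reproducible.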
Identity and functions of CxxC-derived motifs.

PubMed

Fomenko, Dmitri E; Gladyshev, Vadim N

2003-09-30

Two cysteines separated by two other residues (the CxxC motif) are employed by many redox proteins for formation, isomerization, and reduction of disulfide bonds and for other redox functions. The place of the C-terminal cysteine in this motif may be occupied by serine (the CxxS motif), modifying the functional repertoire of redox proteins. Here we found that the CxxC motif may also give rise to a motif, in which the C-terminal cysteine is replaced with threonine (the CxxT motif). Moreover, in contrast to a view that the N-terminal cysteine in the CxxC motif always serves as a nucleophilic attacking group, this residue could also be replaced with threonine (the TxxC motif), serine (the SxxC motif), or other residues. In each of these CxxC-derived motifs, the presence of a downstream alpha-helix was strongly favored. A search for conserved CxxC-derived motif/helix patterns in four complete genomes representing bacteria, archaea, and eukaryotes identified known redox proteins and suggested possible redox functions for several additional proteins. Catalytic sites in peroxiredoxins were major representatives of the TxxC motif, whereas those in glutathione peroxidases represented the CxxT motif. Structural assessments indicated that threonines in these enzymes could stabilize catalytic thiolates, suggesting revisions to previously proposed catalytic triads. Each of the CxxC-derived motifs was also observed in natural selenium-containing proteins, in which selenocysteine was present in place of a catalytic cysteine.
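The motif family described in the abstract above lends itself to a simple pattern search. The sketch below scans a protein sequence, written in one-letter codes, for CxxC and the four derived motifs named in the abstract. It does not check for the downstream alpha-helix the authors found to be favored, and the function name, pattern table, and toy sequence in the usage note are illustrative assumptions.

```python
import re

# Motif names from the abstract; "." stands for any residue.
PATTERNS = {
    "CxxC": "C..C",
    "CxxS": "C..S",
    "CxxT": "C..T",
    "TxxC": "T..C",
    "SxxC": "S..C",
}

def find_motifs(protein):
    """Return (motif name, 0-based start, matched residues), sorted by start."""
    hits = []
    for name, pattern in PATTERNS.items():
        # A lookahead reports overlapping occurrences as well.
        for match in re.finditer(f"(?=({pattern}))", protein):
            hits.append((name, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])
```

On the toy sequence `"MACGPCKT"`, `find_motifs` reports a single CxxC hit, `CGPC`, starting at position 2. A sequence-only scan like this will produce many chance matches, which is presumably why the study combined the motif with a structural criterion.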
Unitary circular code motifs in genomes of eukaryotes.

PubMed

El Soufi, Karim; Michel, Christian J

A set X of 20 trinucleotides was identified in genes of bacteria, eukaryotes, plasmids and viruses, which has on average the highest occurrence in reading frame compared to its two shifted frames (Michel, 2015; Arquès and Michel, 1996). This set X has an interesting mathematical property as X is a circular code (Arquès and Michel, 1996). Thus, the motifs from this circular code X, called X motifs, have the property to always retrieve, synchronize and maintain the reading frame in genes. The origin of this circular code X in genes has been an open problem since its discovery in 1996. Here, we first show that the unitary circular codes (UCC), i.e. sets of one word, make it possible to generate unitary circular code motifs (UCC motifs), i.e. a concatenation of the same motif (simple repeats) leading to low complexity DNA. Three classes of UCC motifs are studied here: repeated dinucleotides (D+ motifs), repeated trinucleotides (T+ motifs) and repeated tetranucleotides (T+ motifs). Thus, the D+, T+ and T+ motifs make it possible to retrieve, synchronize and maintain a frame modulo 2, modulo 3 and modulo 4, respectively, and their shifted frames (1 modulo 2; 1 and 2 modulo 3; 1, 2 and 3 modulo 4 according to the C2, C3 and C4 properties, respectively) in the DNA sequences. The statistical distribution of the D+, T+ and T+ motifs is analyzed in the genomes of eukaryotes. A UCC motif and its complementary UCC motif have the same distribution in the eukaryotic genomes. Furthermore, a UCC motif and its complementary UCC motif have increasing occurrences contrary to their number of hydrogen bonds, very significant with the T+ motifs. The longest D+, T+ and T+ motifs in the studied eukaryotic genomes are also given. Surprisingly, a scarcity of repeated trinucleotides (T+ motifs) in the large eukaryotic genomes is observed compared to the D+ and T+ motifs. This result has been investigated and may be explained by two outcomes. Repeated trinucleotides (T+ motifs) are identified
Concise, stereodivergent and highly stereoselective synthesis of cis- and trans-2-substituted 3-hydroxypiperidines – development of a phosphite-driven cyclodehydration

PubMed Central

Westphal, Julia C

2014-01-01

Summary: A concise (5 to 6 steps), stereodivergent, highly diastereoselective (dr up to >19:1 for both stereoisomers) and scalable synthesis (up to 14 g) of cis- and trans-2-substituted 3-piperidinols, a core motif in numerous bioactive compounds, is presented. This sequence allowed an efficient synthesis of the NK-1 inhibitor L-733,060 in 8 steps. Additionally, a cyclodehydration-realizing simple triethylphosphite as a substitute for triphenylphosphine is developed. Here the stoichiometric oxidized P(V)-byproduct (triethylphosphate) is easily removed during the work-up through saponification, overcoming separation difficulties usually associated with triphenylphosphine oxide. PMID:24605158
A role for circadian evening elements in cold-regulated gene expression in Arabidopsis.

PubMed

Mikkelsen, Michael D; Thomashow, Michael F

2009-10-01

The plant transcriptome is dramatically altered in response to low temperature. The cis-acting DNA regulatory elements and trans-acting factors that regulate the majority of cold-regulated genes are unknown. Previous bioinformatic analysis has indicated that the promoters of cold-induced genes are enriched in the Evening Element (EE), AAAATATCT, a DNA regulatory element that has a role in circadian-regulated gene expression. Here we tested the role of EE and EE-like (EEL) elements in cold-induced expression of two Arabidopsis genes, CONSTANS-like 1 (COL1; At5g54470) and a gene encoding a 27-kDa protein of unknown function that we designated COLD-REGULATED GENE 27 (COR27; At5g42900). Mutational analysis indicated that the EE/EEL elements were required for cold induction of COL1 and COR27, and that their action was amplified through coupling with ABA response element (ABRE)-like (ABREL) motifs. An artificial promoter consisting solely of four EE motifs interspersed with three ABREL motifs was sufficient to impart cold-induced gene expression. Both COL1 and COR27 were found to be regulated by the circadian clock at warm growth temperatures and cold-induction of COR27 was gated by the clock. These results suggest that cold- and clock-regulated gene expression are integrated through regulatory proteins that bind to EE and EEL elements supported by transcription factors acting at ABREL sequences. Bioinformatic analysis indicated that the coupling of EE and EEL motifs with ABREL motifs is highly enriched in cold-induced genes and thus may constitute a DNA regulatory element pair with a significant role in configuring the low-temperature transcriptome.
Interaction between two cis-acting elements, ABRE and DRE, in ABA-dependent expression of Arabidopsis rd29A gene in response to dehydration and high-salinity stresses.

PubMed

Narusaka, Yoshihiro; Nakashima, Kazuo; Shinwari, Zabta K; Sakuma, Yoh; Furihata, Takashi; Abe, Hiroshi; Narusaka, Mari; Shinozaki, Kazuo; Yamaguchi-Shinozaki, Kazuko

2003-04-01

Many abiotic stress-inducible genes contain two cis-acting elements, namely a dehydration-responsive element (DRE; TACCGACAT) and an ABA-responsive element (ABRE; ACGTGG/TC), in their promoter regions. We precisely analyzed the 120 bp promoter region (-174 to -55) of the Arabidopsis rd29A gene whose expression is induced by dehydration, high-salinity, low-temperature, and abscisic acid (ABA) treatments and whose 120 bp promoter region contains the DRE, DRE/CRT-core motif (A/GCCGAC), and ABRE sequences. Deletion and base substitution analyses of this region showed that the DRE-core motif functions as DRE and that the DRE/DRE-core motif could be a coupling element of ABRE. Gel mobility shift assays revealed that DRE-binding proteins (DREB1s/CBFs and DREB2s) bind to both DRE and the DRE-core motif and that ABRE-binding proteins (AREBs/ABFs) bind to ABRE in the 120 bp promoter region. In addition, transactivation experiments using Arabidopsis leaf protoplasts showed that DREBs and AREBs cumulatively transactivate the expression of a GUS reporter gene fused to the 120 bp promoter region of rd29A. These results indicate that DRE and ABRE are interdependent in the ABA-responsive expression of the rd29A gene in response to ABA in Arabidopsis.
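The two element definitions quoted in the abstract above, the DRE/CRT-core motif (A/GCCGAC) and the ABRE (ACGTGG/TC), translate directly into regular expressions. The following sketch scans both strands of a DNA sequence for them. It is only an illustration of the sequence patterns: the function names, the coordinate convention, and the toy promoter in the usage note are assumptions and do not reproduce the rd29A sequence.

```python
import re

# Patterns transcribed from the abstract: A/GCCGAC and ACGTGG/TC.
MOTIFS = {"DRE_core": "[AG]CCGAC", "ABRE": "ACGTG[GT]C"}

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def scan(seq):
    """Return (motif, strand, 0-based start on the forward strand, matched text)."""
    hits = []
    for name, pattern in MOTIFS.items():
        for strand, target in (("+", seq), ("-", revcomp(seq))):
            for match in re.finditer(f"(?=({pattern}))", target):
                site = match.group(1)
                # Convert minus-strand positions back to forward-strand coordinates.
                start = match.start() if strand == "+" else len(seq) - match.start() - len(site)
                hits.append((name, strand, start, site))
    return sorted(hits, key=lambda hit: hit[2])
```

On the toy sequence `"TTTACCGACATTTACGTGGCTT"`, `scan` reports a DRE-core site at position 3 and an ABRE at position 13, both on the forward strand. The distance between such hits is the kind of feature one would inspect when asking whether two elements could act as a coupled pair.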
Persistence of an Oncogenic Papillomavirus Genome Requires cis Elements from the Viral Transcriptional Enhancer

PubMed Central

Van Doorslaer, Koenraad; Chen, Dan; Chapman, Sandra; Khan, Jameela

2017-01-01

ABSTRACT Human papillomavirus (HPV) genomes are replicated and maintained as extrachromosomal plasmids during persistent infection. The viral E2 proteins are thought to promote stable maintenance replication by tethering the viral DNA to host chromatin. However, this has been very difficult to prove genetically, as the E2 protein is involved in transcriptional regulation and initiation of replication, as well as its assumed role in genome maintenance. This makes mutational analysis of viral trans factors and cis elements in the background of the viral genome problematic and difficult to interpret. To circumvent this problem, we have developed a complementation assay in which the complete wild-type HPV18 genome is transfected into primary human keratinocytes along with subgenomic or mutated replicons that contain the minimal replication origin. The wild-type genome provides the E1 and E2 proteins in trans, allowing us to determine additional cis elements that are required for long-term replication and partitioning of the replicon. We found that, in addition to the core replication origin (and the three E2 binding sites located therein), additional sequences from the transcriptional enhancer portion of the URR (upstream regulatory region) are required in cis for long-term genome replication. PMID:29162712
The Molecular Structure of cis-FONO

NASA Technical Reports Server (NTRS)

Lee, Timothy J.; Dateo, Christopher E.; Rice, Julia E.; Langhoff, Stephen R. (Technical Monitor)

1994-01-01

The molecular structure of cis-FONO has been determined with the CCSD(T) correlation method using an spdf quality basis set. In agreement with previous coupled-cluster calculations but in disagreement with density functional theory, cis-FONO is found to exhibit normal bond distances. The quadratic and cubic force fields of cis-FONO have also been determined in order to evaluate the effect of vibrational averaging on the molecular geometry. Vibrational averaging is found to increase bond distances, as expected, but it does not affect the qualitative nature of the bonding. The CCSD(T)/spdf harmonic frequencies of cis-FONO support our previous assertion that a band observed at 1200 cm⁻¹ is a combination band (ν₃ + ν₄), and not a fundamental.
A lettuce (Lactuca sativa) homolog of human Nogo-B receptor interacts with cis-prenyltransferase and is necessary for natural rubber biosynthesis.

PubMed

Qu, Yang; Chakrabarty, Romit; Tran, Hue T; Kwon, Eun-Joo G; Kwon, Moonhyuk; Nguyen, Trinh-Don; Ro, Dae-Kyun

2015-01-23

Natural rubber (cis-1,4-polyisoprene) is an indispensable biopolymer used to manufacture diverse consumer products. Although a major source of natural rubber is the rubber tree (Hevea brasiliensis), lettuce (Lactuca sativa) is also known to synthesize natural rubber. Here, we report that an unusual cis-prenyltransferase-like 2 (CPTL2) that lacks the conserved motifs of conventional cis-prenyltransferase is required for natural rubber biosynthesis in lettuce. CPTL2, identified from the lettuce rubber particle proteome, displays homology to a human NogoB receptor and is predominantly expressed in latex. Multiple transgenic lettuces expressing CPTL2-RNAi constructs showed that a decrease of CPTL2 transcripts (3-15% CPTL2 expression relative to controls) coincided with the reduction of natural rubber as low as 5%. We also identified a conventional cis-prenyltransferase 3 (CPT3), exclusively expressed in latex. In subcellular localization studies using fluorescent proteins, cytosolic CPT3 was relocalized to endoplasmic reticulum by co-occurrence of CPTL2 in tobacco and yeast at the log phase. Furthermore, yeast two-hybrid data showed that CPTL2 and CPT3 interact. Yeast microsomes containing CPTL2/CPT3 showed enhanced synthesis of short cis-polyisoprenes, but natural rubber could not be synthesized in vitro. Intriguingly, a homologous pair CPTL1/CPT1, which displays ubiquitous expressions in lettuce, showed a potent dolichol biosynthetic activity in vitro. Taken together, our data suggest that CPTL2 is a scaffolding protein that tethers CPT3 on endoplasmic reticulum and is necessary for natural rubber biosynthesis in planta, but yeast-expressed CPTL2 and CPT3 alone could not synthesize high molecular weight natural rubber in vitro. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Regulatory divergence between parental alleles determines gene expression patterns in hybrids.

PubMed

Combes, Marie-Christine; Hueber, Yann; Dereeper, Alexis; Rialle, Stéphanie; Herrera, Juan-Carlos; Lashermes, Philippe

2015-03-29

Both hybridization and allopolyploidization generate novel phenotypes by conciliating divergent genomes and regulatory networks in the same cellular context. To understand the rewiring of gene expression in hybrids, the total expression of 21,025 genes and the allele-specific expression of over 11,000 genes were quantified in interspecific hybrids and their parental species, Coffea canephora and Coffea eugenioides using RNA-seq technology. Between parental species, cis- and trans-regulatory divergences affected around 32% and 35% of analyzed genes, respectively, with nearly 17% of them showing both. The relative importance of trans-regulatory divergences between both species could be related to their low genetic divergence and perennial habit. In hybrids, among divergently expressed genes between parental species and hybrids, 77% was expressed like one parent (expression level dominance), including 65% like C. eugenioides. Gene expression was shown to result from the expression of both alleles affected by intertwined parental trans-regulatory factors. A strong impact of C. eugenioides trans-regulatory factors on the upregulation of C. canephora alleles was revealed. The gene expression patterns appeared determined by complex combinations of cis- and trans-regulatory divergences. In particular, the observed biased expression level dominance seemed to be derived from the asymmetric effects of trans-regulatory parental factors on regulation of alleles. More generally, this study illustrates the effects of divergent trans-regulatory parental factors on the gene expression pattern in hybrids. The characteristics of the transcriptional response to hybridization appear to be determined by the compatibility of gene regulatory networks and therefore depend on genetic divergences between the parental species and their evolutionary history. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
HOXB9 induction of mesenchymal-to-epithelial transition in gastric carcinoma is negatively regulated by its hexapeptide motif

PubMed Central

He, Changyu; Zhang, Baogui; Zhang, Jun; Liu, Bingya; Zeng, Naiyan; Zhu, Zhenggang

2015-01-01

HOXB9, a transcription factor, plays an important role in development. While HOXB9 has been implicated in tumorigenesis and metastasis, its mechanisms are variable and its role in gastric carcinoma (GC) remains unclear. In the present study, we demonstrated that the expression of HOXB9 decreased in gastric carcinoma and was associated with malignancy and metastasis. Re-expression of HOXB9 in gastric cell lines resulted in the suppression of cell proliferation, migration, and invasion, which was accompanied by the induction of mesenchymal-to-epithelial transition (MET). Comparative sequence analysis and examination of a HOXB9 structural model indicated that three sites might be involved in MET regulation. The in vitro study of HOXB9 mutants showed that these were unable to inhibit MET induction. However, when a HOXB9 mutant lacking the hexapeptide motif was overexpressed, more potent MET induction and tumor suppression were observed than with the wild type, indicating that the presence of the hexapeptide motif reduced HOXB9 MET induction and tumor suppression activity. Therefore, the results of the present study suggested that HOXB9 is a tumor suppressor in gastric carcinoma, and that its activity is controlled by different regulatory mechanisms, such as the hexapeptide motif, which acts as a “brake” in this case. The results of these regulatory effects could lead to either oncogenic or tumor suppressive roles of HOXB9, depending on the context of the particular type of cancer involved. PMID:26536658

CisLunar Habitat Internal Architecture Design Criteria

NASA Technical Reports Server (NTRS)

Jones, R.; Kennedy, K.; Howard, R.; Whitmore, M.; Martin, C.; Garate, J.

2017-01-01

BACKGROUND: In preparation for human exploration to Mars, there is a need to define the development and test program that will validate deep space operations and systems. In that context, a Proving Grounds CisLunar habitat spacecraft is being defined as the next step towards this goal. This spacecraft will operate differently from the ISS or other spacecraft in human history. The performance envelope of this spacecraft (mass, volume, power, specifications, etc.) is being defined by the Future Capabilities Study Team. This team has recognized the need for a human-centered approach for the internal architecture of this spacecraft and has commissioned a CisLunar Phase-1 Habitat Internal Architecture Study Team to develop a NASA reference configuration, providing the Agency with a "smart buyer" approach for future acquisition. THE CISLUNAR HABITAT INTERNAL ARCHITECTURE STUDY: Overall, the CisLunar Habitat Internal Architecture study will address the most significant questions and risks in the current CisLunar architecture, habitation, and operations concept development. This effort is achieved through definition of design criteria, evaluation criteria and process, design of the CisLunar Habitat Phase-1 internal architecture, and the development and fabrication of internal architecture concepts combined with rigorous and methodical Human-in-the-Loop (HITL) evaluations and testing of the conceptual innovations in a controlled test environment. The vision of the CisLunar Habitat Internal Architecture Study is to design, build, and test a CisLunar Phase-1 Habitat Internal Architecture that will be used for habitation (e.g. habitability and human factors) evaluations. The evaluations will mature CisLunar habitat evaluation tools, guidelines, and standards, and will interface with other projects such as the Advanced Exploration Systems (AES) Program integrated Power, Avionics, Software (iPAS), and Logistics for integrated human-in-the-loop testing. The mission of the Cis
SLAM-seq defines direct gene-regulatory functions of the BRD4-MYC axis.

PubMed

Muhar, Matthias; Ebert, Anja; Neumann, Tobias; Umkehrer, Christian; Jude, Julian; Wieshofer, Corinna; Rescheneder, Philipp; Lipp, Jesse J; Herzog, Veronika A; Reichholf, Brian; Cisneros, David A; Hoffmann, Thomas; Schlapansky, Moritz F; Bhat, Pooja; von Haeseler, Arndt; Köcher, Thomas; Obenauf, Anna C; Popow, Johannes; Ameres, Stefan L; Zuber, Johannes

2018-05-18

Defining direct targets of transcription factors and regulatory pathways is key to understanding their roles in physiology and disease. We combined SLAM-seq [thiol(SH)-linked alkylation for the metabolic sequencing of RNA], a method for direct quantification of newly synthesized messenger RNAs (mRNAs), with pharmacological and chemical-genetic perturbation in order to define regulatory functions of two transcriptional hubs in cancer, BRD4 and MYC, and to interrogate direct responses to BET bromodomain inhibitors (BETis). We found that BRD4 acts as general coactivator of RNA polymerase II-dependent transcription, which is broadly repressed upon high-dose BETi treatment. At doses triggering selective effects in leukemia, BETis deregulate a small set of hypersensitive targets including MYC. In contrast to BRD4, MYC primarily acts as a selective transcriptional activator controlling metabolic processes such as ribosome biogenesis and de novo purine synthesis. Our study establishes a simple and scalable strategy to identify direct transcriptional targets of any gene or pathway. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
A coherent transcriptional feed-forward motif model for mediating auxin-sensitive PIN3 expression during lateral root development

PubMed Central

Chen, Qian; Liu, Yang; Maere, Steven; Lee, Eunkyoung; Van Isterdael, Gert; Xie, Zidian; Xuan, Wei; Lucas, Jessica; Vassileva, Valya; Kitakura, Saeko; Marhavý, Peter; Wabnik, Krzysztof; Geldner, Niko; Benková, Eva; Le, Jie; Fukaki, Hidehiro; Grotewold, Erich; Li, Chuanyou; Friml, Jiří; Sack, Fred; Beeckman, Tom; Vanneste, Steffen

2015-01-01

Multiple plant developmental processes, such as lateral root development, depend on auxin distribution patterns that are in part generated by the PIN-formed family of auxin-efflux transporters. Here we propose that AUXIN RESPONSE FACTOR7 (ARF7) and the ARF7-regulated FOUR LIPS/MYB124 (FLP) transcription factors jointly form a coherent feed-forward motif that mediates the auxin-responsive PIN3 transcription in planta to steer the early steps of lateral root formation. This regulatory mechanism might endow the PIN3 circuitry with a temporal ‘memory’ of auxin stimuli, potentially maintaining and enhancing the robustness of the auxin flux directionality during lateral root development. The cooperative action between canonical auxin signalling and other transcription factors might constitute a general mechanism by which transcriptional auxin-sensitivity can be regulated at a tissue-specific level. PMID:26578065
Emerging principles of regulatory evolution.

PubMed

Prud'homme, Benjamin; Gompel, Nicolas; Carroll, Sean B

2007-05-15

Understanding the genetic and molecular mechanisms governing the evolution of morphology is a major challenge in biology. Because most animals share a conserved repertoire of body-building and -patterning genes, morphological diversity appears to evolve primarily through changes in the deployment of these genes during development. The complex expression patterns of developmentally regulated genes are typically controlled by numerous independent cis-regulatory elements (CREs). It has been proposed that morphological evolution relies predominantly on changes in the architecture of gene regulatory networks and in particular on functional changes within CREs. Here, we discuss recent experimental studies that support this hypothesis and reveal some unanticipated features of how regulatory evolution occurs. From this growing body of evidence, we identify three key operating principles underlying regulatory evolution, that is, how regulatory evolution: (i) uses available genetic components in the form of preexisting and active transcription factors and CREs to generate novelty; (ii) minimizes the penalty to overall fitness by introducing discrete changes in gene expression; and (iii) allows interactions to arise among any transcription factor and downstream CRE. These principles endow regulatory evolution with a vast creative potential that accounts for both relatively modest morphological differences among closely related species and more profound anatomical divergences among groups at higher taxonomical levels.
Quaking and PTB control overlapping splicing regulatory networks during muscle cell differentiation

PubMed Central

Hall, Megan P.; Nagel, Roland J.; Fagg, W. Samuel; Shiue, Lily; Cline, Melissa S.; Perriman, Rhonda J.; Donohue, John Paul; Ares, Manuel

2013-01-01

Alternative splicing contributes to muscle development, but a complete set of muscle-splicing factors and their combinatorial interactions are unknown. Previous work identified ACUAA (“STAR” motif) as an enriched intron sequence near muscle-specific alternative exons such as Capzb exon 9. Mass spectrometry of myoblast proteins selected by the Capzb exon 9 intron via RNA affinity chromatography identifies Quaking (QK), a protein known to regulate mRNA function through ACUAA motifs in 3′ UTRs. We find that QK promotes inclusion of Capzb exon 9 in opposition to repression by polypyrimidine tract-binding protein (PTB). QK depletion alters inclusion of 406 cassette exons whose adjacent intron sequences are also enriched in ACUAA motifs. During differentiation of myoblasts to myotubes, QK levels increase two- to threefold, suggesting a mechanism for QK-responsive exon regulation. Combined analysis of the PTB- and QK-splicing regulatory networks during myogenesis suggests that 39% of regulated exons are under the control of one or both of these splicing factors. This work provides the first evidence that QK is a global regulator of splicing during muscle development in vertebrates and shows how overlapping splicing regulatory networks contribute to gene expression programs during differentiation. PMID:23525800
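
The enrichment of ACUAA motifs near regulated exons rests on counting motif occurrences in intron sequence. The sketch below shows one simple way to do that; the function names, the per-kilobase normalization, and the example sequence are hypothetical and are not taken from the study.

```python
def count_motif(sequence, motif="ACUAA"):
    """Count occurrences of an RNA motif, allowing overlapping matches.

    DNA input is accepted: the sequence is upper-cased and T is read as U.
    """
    rna = sequence.upper().replace("T", "U")
    count = 0
    start = rna.find(motif)
    while start != -1:
        count += 1
        start = rna.find(motif, start + 1)  # step by one so overlaps are kept
    return count


def motif_density(sequence, motif="ACUAA"):
    """Motif occurrences per 1,000 nucleotides (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    return 1000.0 * count_motif(sequence, motif) / len(sequence)


# Made-up intron fragment for illustration.
intron = "GGACUAACUAAGGUUUCACTAACC"
print(count_motif(intron), round(motif_density(intron), 1))
```

Comparing such densities between introns flanking regulated exons and a matched background set would be the next step; the choice of background is an analysis decision that this sketch leaves open.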
Biosynthesis of adipic acid via microaerobic hydrogenation of cis,cis-muconic acid by oxygen-sensitive enoate reductase.

PubMed

Sun, Jing; Raza, Muslim; Sun, Xinxiao; Yuan, Qipeng

2018-06-06

Adipic acid (AA) is an important dicarboxylic acid used for the manufacture of nylon and polyurethane plastics. In this study, a novel adipic acid biosynthetic pathway was designed by extending the cis,cis-muconic acid (MA) biosynthesis through biohydrogenation. Enoate reductase from Clostridium acetobutylicum (CaER), an oxygen-sensitive reductase, was demonstrated to convert cis,cis-muconic acid to adipic acid in vivo under microaerobic conditions. Engineered Escherichia coli strains were constructed to express the whole pathway and accumulated 5.8 ± 0.9 mg/L adipic acid from simple carbon sources. Considering the different oxygen demands of cis,cis-muconic acid biosynthesis and its degradation, a co-culture system was constructed. To improve production, the T7 promoter was used instead of the lac promoter for higher-level expression of the key enzyme CaER, and the titer of adipic acid increased to 18.3 ± 0.6 mg/L. To decrease the oxygen supply to downstream strains expressing CaER, Vitreoscilla hemoglobin (VHb) was introduced into the upstream strains for its ability to capture oxygen. This further improved production by the novel pathway, and 27.6 ± 1.3 mg/L adipic acid accumulated under microaerobic conditions. Copyright © 2018. Published by Elsevier B.V.
Identification and quantitation of all-trans- and 13-cis-retinoic acid and 13-cis-4-oxoretinoic acid in human plasma.

PubMed

Eckhoff, C; Nau, H

1990-08-01

Human plasma was analyzed by high performance liquid chromatography for the presence of retinoic acid and 4-oxoretinoic acid isomers. Peaks that coeluted with the reference compounds all-trans-retinoic acid, 13-cis-retinoic acid, and 13-cis-4-oxoretinoic acid were routinely observed in human plasma. These retinoids were unequivocally identified by the following methods: comigration with reference compounds under several high performance liquid chromatographic conditions; comparison of ultraviolet spectra with those of reference compounds; derivatization with diazomethane and coelution of the methyl esters with reference compounds in a high performance liquid chromatographic system as well as in a gas chromatography system with a mass selective detector. In vitro formation of 13-cis-retinoic acid and 13-cis-4-oxoretinoic acid as artifacts during the analytical procedure was excluded by control experiments. The mean plasma concentrations of the vitamin A metabolites in ten male volunteers were: all-trans-retinoic acid: 1.32 +/- 0.46 ng/ml; 13-cis-retinoic acid: 1.63 +/- 0.85 ng/ml; and 13-cis-4-oxoretinoic acid: 3.68 +/- 0.99 ng/ml. After oral dosing with vitamin A (833 IU/kg body weight) in five male volunteers, mean plasma all-trans-retinoic acid increased to 3.92 +/- 1.40 ng/ml and 13-cis-retinoic acid increased to 9.75 +/- 2.18 ng/ml. Maximal plasma 13-cis-4-oxoretinoic acid concentrations (average 7.60 +/- 1.45 ng/ml) were observed 6 h after dosing which was the last time point in this study. Concentrations of all-trans-4-oxoretinoic acid were low or not detectable. Our findings suggest that, in addition to all-trans-retinoic acid, 13-cis-retinoic acid and 13-cis-4-oxoretinoic acid are present in normal human plasma as metabolites of vitamin A.
oPOSSUM: integrated tools for analysis of regulatory motif over-representation

PubMed Central

Ho Sui, Shannan J.; Fulton, Debra L.; Arenillas, David J.; Kwon, Andrew T.; Wasserman, Wyeth W.

2007-01-01

The identification of over-represented transcription factor binding sites from sets of co-expressed genes provides insights into the mechanisms of regulation for diverse biological contexts. oPOSSUM, an internet-based system for such studies of regulation, has been improved and expanded in this new release. New features include a worm-specific version for investigating binding sites conserved between Caenorhabditis elegans and C. briggsae, as well as a yeast-specific version for the analysis of co-expressed sets of Saccharomyces cerevisiae genes. The human and mouse applications feature improvements in ortholog mapping, sequence alignments and the delineation of multiple alternative promoters. oPOSSUM2, introduced for the analysis of over-represented combinations of motifs in human and mouse genes, has been integrated with the original oPOSSUM system. Analysis using user-defined background gene sets is now supported. The transcription factor binding site models have been updated to include new profiles from the JASPAR database. oPOSSUM is available at http://www.cisreg.ca/oPOSSUM/ PMID:17576675
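
Over-representation of a binding site in a co-expressed gene set is typically judged against a background gene set. The abstract does not state which statistics oPOSSUM applies, so the sketch below uses a one-sided hypergeometric test as one common choice. All names and numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(hits_in_set, set_size, hits_in_background, background_size):
    """P(X >= hits_in_set) under the hypergeometric distribution.

    background_size: number of genes in the background.
    hits_in_background: background genes that contain the binding site.
    set_size: number of co-expressed genes (drawn from the background).
    hits_in_set: co-expressed genes that contain the binding site.
    """
    total = comb(background_size, set_size)
    upper = min(set_size, hits_in_background)
    favourable = sum(
        comb(hits_in_background, i)
        * comb(background_size - hits_in_background, set_size - i)
        for i in range(hits_in_set, upper + 1)
    )
    return favourable / total


# Illustrative numbers: 12 of 40 co-expressed genes carry the site,
# against 500 of 10,000 background genes.
p_value = hypergeom_upper_tail(12, 40, 500, 10000)
print(f"p = {p_value:.3g}")
```

When many transcription factor profiles are tested at once, the resulting p-values would need a multiple-testing correction, which is omitted here.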
The RCAN carboxyl end mediates calcineurin docking-dependent inhibition via a site that dictates binding to substrates and regulators

PubMed Central

Martínez-Martínez, Sara; Genescà, Lali; Rodríguez, Antonio; Raya, Alicia; Salichs, Eulàlia; Were, Felipe; López-Maderuelo, María Dolores; Redondo, Juan Miguel; de la Luna, Susana

2009-01-01

Specificity of signaling kinases and phosphatases toward their targets is usually mediated by docking interactions with substrates and regulatory proteins. Here, we characterize the motifs involved in the physical and functional interaction of the phosphatase calcineurin with a group of modulators, the RCAN protein family. Mutation of key residues within the hydrophobic docking-cleft of the calcineurin catalytic domain impairs binding to all human RCAN proteins and to the calcineurin interacting proteins Cabin1 and AKAP79. A valine-rich region within the RCAN carboxyl region is essential for binding to the docking site in calcineurin. Although a peptide containing this sequence compromises NFAT signaling in living cells, it does not inhibit calcineurin catalytic activity directly. Instead, calcineurin catalytic activity is inhibited by a motif at the extreme C-terminal region of RCAN, which acts in cis with the docking motif. Our results therefore indicate that the inhibitory action of RCAN on calcineurin-NFAT signaling results not only from the inhibition of phosphatase activity but also from competition between NFAT and RCAN for binding to the same docking site in calcineurin. Thus, competition by substrates and modulators for a common docking site appears to be an essential mechanism in the regulation of Ca2+-calcineurin signaling. PMID:19332797
Structure of H/ACA RNP protein Nhp2p reveals cis/trans isomerization of a conserved proline at the RNA and Nop10 binding interface

PubMed Central

Koo, Bon-Kyung; Park, Chin-Ju; Fernandez, Cesar F.; Chim, Nicholas; Ding, Yi; Chanfreau, Guillaume; Feigon, Juli

2011-01-01

H/ACA small nucleolar and Cajal body ribonucleoproteins (RNPs) function in site-specific pseudouridylation of eukaryotic rRNA and snRNA, rRNA processing, and vertebrate telomerase biogenesis. Nhp2, one of four essential protein components of eukaryotic H/ACA RNPs, forms a core trimer with the pseudouridylase Cbf5 and Nop10 that specifically binds to H/ACA RNAs. Crystal structures of archaeal H/ACA RNPs have revealed how the protein components interact with each other and with the H/ACA RNA. However, in place of Nhp2p, archaeal H/ACA RNPs contain L7Ae, which binds specifically to an RNA K-loop motif absent in eukaryotic H/ACA RNPs, while Nhp2 binds a broader range of RNA structures. We report solution NMR studies of S. cerevisiae Nhp2 (Nhp2p), which reveal that Nhp2p exhibits two major conformations in solution due to cis/trans isomerization of the evolutionarily conserved Pro83. The equivalent proline is in the cis conformation in all reported structures of L7Ae and other homologous proteins. Nhp2p has the expected α-β-α fold, but the solution structures of the major conformation of Nhp2p with trans Pro83 and of Nhp2p-S82W with cis Pro83 reveal that Pro83 cis/trans isomerization affects the positions of numerous residues at the Nop10- and RNA-binding interface. An S82W substitution, which stabilizes the cis conformation, also stabilizes the association of Nhp2p with H/ACA snoRNPs in vivo. We propose that Pro83 plays a key role in the assembly of the eukaryotic H/ACA RNP, with the cis conformation locking in a stable Cbf5-Nop10-Nhp2 ternary complex and positioning the protein backbone to interact with the H/ACA RNA. PMID:21708174
Role of the Box C/D Motif in Localization of Small Nucleolar RNAs to Coiled Bodies and Nucleoli

PubMed Central

Narayanan, Aarthi; Speckmann, Wayne; Terns, Rebecca; Terns, Michael P.

1999-01-01

Small nucleolar RNAs (snoRNAs) are a large family of eukaryotic RNAs that function within the nucleolus in the biogenesis of ribosomes. One major class of snoRNAs is the box C/D snoRNAs named for their conserved box C and box D sequence elements. We have investigated the involvement of cis-acting sequences and intranuclear structures in the localization of box C/D snoRNAs to the nucleolus by assaying the intranuclear distribution of fluorescently labeled U3, U8, and U14 snoRNAs injected into Xenopus oocyte nuclei. Analysis of an extensive panel of U3 RNA variants showed that the box C/D motif, comprised of box C′, box D, and the 3′ terminal stem of U3, is necessary and sufficient for the nucleolar localization of U3 snoRNA. Disruption of the elements of the box C/D motif of U8 and U14 snoRNAs also prevented nucleolar localization, indicating that all box C/D snoRNAs use a common nucleolar-targeting mechanism. Finally, we found that wild-type box C/D snoRNAs transiently associate with coiled bodies before they localize to nucleoli and that variant RNAs that lack an intact box C/D motif are detained within coiled bodies. These results suggest that coiled bodies play a role in the biogenesis and/or intranuclear transport of box C/D snoRNAs. PMID:10397754
Improved accuracy of supervised CRM discovery with interpolated Markov models and cross-species comparison.

PubMed

Kazemian, Majid; Zhu, Qiyun; Halfon, Marc S; Sinha, Saurabh

2011-12-01

Despite recent advances in experimental approaches for identifying transcriptional cis-regulatory modules (CRMs, 'enhancers'), direct empirical discovery of CRMs for all genes in all cell types and environmental conditions is likely to remain an elusive goal. Effective methods for computational CRM discovery are thus a critically needed complement to empirical approaches. However, existing computational methods that search for clusters of putative binding sites are ineffective if the relevant TFs and/or their binding specificities are unknown. Here, we provide a significantly improved method for 'motif-blind' CRM discovery that does not depend on knowledge or accurate prediction of TF-binding motifs and is effective when limited knowledge of functional CRMs is available to 'supervise' the search. We propose a new statistical method, based on 'Interpolated Markov Models', for motif-blind, genome-wide CRM discovery. It captures the statistical profile of variable length words in known CRMs of a regulatory network and finds candidate CRMs that match this profile. The method also uses orthologs of the known CRMs from closely related genomes. We perform in silico evaluation of predicted CRMs by assessing whether their neighboring genes are enriched for the expected expression patterns. This assessment uses a novel statistical test that extends the widely used Hypergeometric test of gene set enrichment to account for variability in intergenic lengths. We find that the new CRM prediction method is superior to existing methods. Finally, we experimentally validate 12 new CRM predictions by examining their regulatory activity in vivo in Drosophila; 10 of the tested CRMs were found to be functional, while 6 of the top 7 predictions showed the expected activity patterns. We make our program available as downloadable source code, and as a plugin for a genome browser installed on our servers. © The Author(s) 2011. Published by Oxford University Press.
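
The idea of scoring candidate regions by the statistical profile of variable-length words can be sketched with a toy interpolated Markov model. This is a simplified illustration, not the authors' program: it averages the estimates from context lengths 0 to max_order with equal weights, whereas real interpolated Markov models weight each order by how well it is supported by the training data. The class name, the pseudocount, and the training sequences are hypothetical, and input is assumed to contain only A, C, G and T.

```python
import math

BASES = "ACGT"


class SimpleIMM:
    """Toy interpolated Markov model over DNA with equal weights per order."""

    def __init__(self, max_order=2, pseudocount=1.0):
        self.max_order = max_order
        self.pseudocount = pseudocount
        # counts[order][context][base] = times `base` followed `context`
        self.counts = [{} for _ in range(max_order + 1)]

    def train(self, sequences):
        for seq in sequences:
            seq = seq.upper()
            for i, base in enumerate(seq):
                for order in range(min(self.max_order, i) + 1):
                    context = seq[i - order:i]
                    table = self.counts[order].setdefault(context, {})
                    table[base] = table.get(base, 0) + 1

    def prob(self, context, base):
        """Average the smoothed estimates from order 0 to the longest usable order."""
        usable = min(self.max_order, len(context))
        estimates = []
        for order in range(usable + 1):
            suffix = context[len(context) - order:]
            table = self.counts[order].get(suffix, {})
            total = sum(table.values()) + len(BASES) * self.pseudocount
            estimates.append((table.get(base, 0) + self.pseudocount) / total)
        return sum(estimates) / len(estimates)

    def log_likelihood(self, seq):
        seq = seq.upper()
        return sum(
            math.log(self.prob(seq[max(0, i - self.max_order):i], seq[i]))
            for i in range(len(seq))
        )


def crm_score(window, crm_model, background_model):
    """Log-likelihood ratio: positive values resemble the training CRMs."""
    return crm_model.log_likelihood(window) - background_model.log_likelihood(window)


# Made-up training data for illustration.
crm_model = SimpleIMM()
crm_model.train(["ACGT" * 5])
background_model = SimpleIMM()
background_model.train(["T" * 20])
print(crm_score("ACGTACGT", crm_model, background_model) > 0)
```

A genome-wide scan would slide a fixed-length window along the sequence and rank windows by this score. The cross-species component of the method is not represented in this sketch.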
Improved accuracy of supervised CRM discovery with interpolated Markov models and cross-species comparison

PubMed Central

Kazemian, Majid; Zhu, Qiyun; Halfon, Marc S.; Sinha, Saurabh

2011-01-01

Despite recent advances in experimental approaches for identifying transcriptional cis-regulatory modules (CRMs, ‘enhancers’), direct empirical discovery of CRMs for all genes in all cell types and environmental conditions is likely to remain an elusive goal. Effective methods for computational CRM discovery are thus a critically needed complement to empirical approaches. However, existing computational methods that search for clusters of putative binding sites are ineffective if the relevant TFs and/or their binding specificities are unknown. Here, we provide a significantly improved method for ‘motif-blind’ CRM discovery that does not depend on knowledge or accurate prediction of TF-binding motifs and is effective when limited knowledge of functional CRMs is available to ‘supervise’ the search. We propose a new statistical method, based on ‘Interpolated Markov Models’, for motif-blind, genome-wide CRM discovery. It captures the statistical profile of variable length words in known CRMs of a regulatory network and finds candidate CRMs that match this profile. The method also uses orthologs of the known CRMs from closely related genomes. We perform in silico evaluation of predicted CRMs by assessing whether their neighboring genes are enriched for the expected expression patterns. This assessment uses a novel statistical test that extends the widely used Hypergeometric test of gene set enrichment to account for variability in intergenic lengths. We find that the new CRM prediction method is superior to existing methods. Finally, we experimentally validate 12 new CRM predictions by examining their regulatory activity in vivo in Drosophila; 10 of the tested CRMs were found to be functional, while 6 of the top 7 predictions showed the expected activity patterns. We make our program available as downloadable source code, and as a plugin for a genome browser installed on our servers. PMID:21821659
Regulatory elements of Caenorhabditis elegans ribosomal protein genes

PubMed Central

2012-01-01

Background: Ribosomal protein genes (RPGs) are essential, tightly regulated, and highly expressed during embryonic development and cell growth. Even though their protein sequences are strongly conserved, their mechanism of regulation is not conserved across yeast, Drosophila, and vertebrates. A recent investigation of genomic sequences conserved across both nematode species and associated with different gene groups indicated the existence of several elements in the upstream regions of C. elegans RPGs, providing a new insight regarding the regulation of these genes in C. elegans. Results: In this study, we performed an in-depth examination of C. elegans RPG regulation and found nine highly conserved motifs in the upstream regions of C. elegans RPGs using the motif discovery algorithm DME. Four motifs were partially similar to transcription factor binding sites from C. elegans, Drosophila, yeast, and human. One pair of these motifs was found to co-occur in the upstream regions of 250 transcripts including 22 RPGs. The distance between the two motifs displayed a complex frequency pattern that was related to their relative orientation. We tested the impact of three of these motifs on the expression of rpl-2 using a series of reporter gene constructs and showed that all three motifs are necessary to maintain the high natural expression level of this gene. One of the motifs was similar to the binding site of an orthologue of POP-1, and we showed that RNAi knockdown of pop-1 impacts the expression of rpl-2. We further determined the transcription start site of rpl-2 by 5’ RACE and found that the motifs lie 40–90 bases upstream of the start site. We also found evidence that a noncoding RNA, contained within the outron of rpl-2, is co-transcribed with rpl-2 and cleaved during trans-splicing. Conclusions: Our results indicate that C. elegans RPGs are regulated by a complex novel series of regulatory elements that is evolutionarily distinct from those of all other species
RegNetwork: an integrated database of transcriptional and post-transcriptional regulatory networks in human and mouse

PubMed Central

Liu, Zhi-Ping; Wu, Canglin; Miao, Hongyu; Wu, Hulin

2015-01-01

Transcriptional and post-transcriptional regulation of gene expression is of fundamental importance to numerous biological processes. Nowadays, an increasing amount of gene regulatory relationships have been documented in various databases and literature. However, to more efficiently exploit such knowledge for biomedical research and applications, it is necessary to construct a genome-wide regulatory network database to integrate the information on gene regulatory relationships that are widely scattered in many different places. Therefore, in this work, we build a knowledge-based database, named ‘RegNetwork’, of gene regulatory networks for human and mouse by collecting and integrating the documented regulatory interactions among transcription factors (TFs), microRNAs (miRNAs) and target genes from 25 selected databases. Moreover, we also inferred and incorporated potential regulatory relationships based on transcription factor binding site (TFBS) motifs into RegNetwork. As a result, RegNetwork contains a comprehensive set of experimentally observed or predicted transcriptional and post-transcriptional regulatory relationships, and the database framework is flexibly designed for potential extensions to include gene regulatory networks for other organisms in the future. Based on RegNetwork, we characterized the statistical and topological properties of genome-wide regulatory networks for human and mouse, we also extracted and interpreted simple yet important network motifs that involve the interplays between TF-miRNA and their targets. In summary, RegNetwork provides an integrated resource on the prior information for gene regulatory relationships, and it enables us to further investigate context-specific transcriptional and post-transcriptional regulatory interactions based on domain-specific experimental data. Database URL: http://www.regnetworkweb.org PMID:26424082
Discovery of phosphorylation motif mixtures in phosphoproteomics data

PubMed Central

Ritz, Anna; Shakhnarovich, Gregory; Salomon, Arthur R.; Raphael, Benjamin J.

2009-01-01

Motivation: Modification of proteins via phosphorylation is a primary mechanism for signal transduction in cells. Phosphorylation sites on proteins are determined in part through particular patterns, or motifs, present in the amino acid sequence. Results: We describe an algorithm that simultaneously discovers multiple motifs in a set of peptides that were phosphorylated by several different kinases. Such sets of peptides are routinely produced in proteomics experiments. Our motif-finding algorithm uses the principle of minimum description length to determine a mixture of sequence motifs that distinguish a foreground set of phosphopeptides from a background set of unphosphorylated peptides. We show that our algorithm outperforms existing motif-finding algorithms on synthetic datasets consisting of mixtures of known phosphorylation sites. We also derive a motif specificity score that quantifies whether or not the phosphoproteins containing an instance of a motif have a significant number of known interactions. Application of our motif-finding algorithm to recently published human and mouse proteomic studies recovers several known phosphorylation motifs and reveals a number of novel motifs that are enriched for interactions with a particular kinase or phosphatase. Our tools provide a new approach for uncovering the sequence specificities of uncharacterized kinases or phosphatases. Availability: Software is available at http://cs.brown.edu/people/braphael/software.html. Contact: aritz@cs.brown.edu; braphael@cs.brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:18996944
Helix-packing motifs in membrane proteins.

PubMed

Walters, R F S; DeGrado, W F

2006-09-12

The fold of a helical membrane protein is largely determined by interactions between membrane-imbedded helices. To elucidate recurring helix-helix interaction motifs, we dissected the crystallographic structures of membrane proteins into a library of interacting helical pairs. The pairs were clustered according to their three-dimensional similarity (rmsd motifs whose structural features can be understood in terms of simple principles of helix-helix packing. Thus, the universe of common transmembrane helix-pairing motifs is relatively simple. The largest cluster, which comprises 29% of the library members, consists of an antiparallel motif with left-handed packing angles, and it is frequently stabilized by packing of small side chains occurring every seven residues in the sequence. Right-handed parallel and antiparallel structures show a similar tendency to segregate small residues to the helix-helix interface but spaced at four-residue intervals. Position-specific sequence propensities were derived for the most populated motifs. These structural and sequential motifs should be quite useful for the design and structural prediction of membrane proteins.
A generic motif discovery algorithm for sequential data.

PubMed

Jensen, Kyle L; Styczynski, Mark P; Rigoutsos, Isidore; Stephanopoulos, Gregory N

2006-01-01

Motif discovery in sequential data is a problem of great interest and with many applications. However, previous methods have been unable to combine exhaustive search with complex motif representations and are each typically only applicable to a certain class of problems. Here we present a generic motif discovery algorithm (Gemoda) for sequential data. Gemoda can be applied to any dataset with a sequential character, including both categorical and real-valued data. As we show, Gemoda deterministically discovers motifs that are maximal in composition and length. In addition, the algorithm allows any choice of similarity metric for finding motifs. Finally, Gemoda's output motifs are representation-agnostic: they can be represented using regular expressions, position weight matrices or any number of other models for any type of sequential data. We demonstrate a number of applications of the algorithm, including the discovery of motifs in amino acid sequences, a new solution to the (l,d)-motif problem in DNA sequences and the discovery of conserved protein substructures. Gemoda is freely available at http://web.mit.edu/bamel/gemoda
PDSM, a motif for phosphorylation-dependent SUMO modification

PubMed Central

Hietakangas, Ville; Anckar, Julius; Blomster, Henri A.; Fujimoto, Mitsuaki; Palvimo, Jorma J.; Nakai, Akira; Sistonen, Lea

2006-01-01

SUMO (small ubiquitin-like modifier) modification regulates many cellular processes, including transcription. Although sumoylation often occurs on specific lysines within the consensus tetrapeptide ΨKxE, other modifications, such as phosphorylation, may regulate the sumoylation of a substrate. We have discovered PDSM (phosphorylation-dependent sumoylation motif), composed of a SUMO consensus site and an adjacent proline-directed phosphorylation site (ΨKxExxSP). The highly conserved motif regulates phosphorylation-dependent sumoylation of multiple substrates, such as heat-shock factors (HSFs), GATA-1, and myocyte enhancer factor 2. In fact, the majority of the PDSM-containing proteins are transcriptional regulators. Within the HSF family, PDSM is conserved between two functionally distinct members, HSF1 and HSF4b, whose transactivation capacities are repressed through the phosphorylation-dependent sumoylation. As the first recurrent sumoylation determinant beyond the consensus tetrapeptide, the PDSM provides a valuable tool in predicting new SUMO substrates. PMID:16371476
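The ΨKxExxSP pattern described above lends itself to a simple sequence scan. The following is a minimal sketch, not the authors' method: the set of residues accepted for Ψ (V, I, L, M, F) is an assumption, as are the function name `find_pdsm` and the toy sequence in the usage note.

```python
import re

# Assumption: residues accepted as the large hydrophobic position (Psi).
# The abstract only gives the symbolic pattern, not this exact set.
PSI = "VILMF"

# Lookahead wrapper so that overlapping candidate sites are all reported.
PDSM = re.compile(rf"(?=([{PSI}]K[A-Z]E[A-Z]{{2}}SP))")


def find_pdsm(protein):
    """Return (start_index, matched_text) for every PsiKxExxSP-like site.

    Indices are 0-based; the candidate SUMO-acceptor lysine sits at
    start_index + 1 and the candidate phosphoserine at start_index + 6.
    """
    return [(m.start(), m.group(1)) for m in PDSM.finditer(protein.upper())]
```

For example, `find_pdsm("AAVKEEPPSPAA")` on this made-up peptide reports one candidate site starting at index 2. A hit of this kind only flags a sequence for follow-up; it says nothing about whether the site is modified in vivo.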
The role of 11-cis-retinyl esters in vertebrate cone vision.

PubMed

Babino, Darwin; Perkins, Brian D; Kindermann, Aljoscha; Oberhauser, Vitus; von Lintig, Johannes

2015-01-01

A cycle of cis-to-trans isomerization of the chromophore is intrinsic to vertebrate vision where rod and cone photoreceptors mediate dim- and bright-light vision, respectively. Daylight illumination can greatly exceed the rate at which the photoproduct can be recycled back to the chromophore by the canonical visual cycle. Thus, an additional supply pathway(s) must exist to sustain cone-dependent vision. Two-photon microscopy revealed that the eyes of the zebrafish (Danio rerio) contain high levels of 11-cis-retinyl esters (11-REs) within the retinal pigment epithelium. HPLC analyses demonstrate that 11-REs are bleached by bright light and regenerated in the dark. Pharmacologic treatment with all-trans-retinylamine (Ret-NH2), a potent and specific inhibitor of the trans-to-cis reisomerization reaction of the canonical visual cycle, impeded the regeneration of 11-REs. Intervention with 11-cis-retinol restored the regeneration of 11-REs in the presence of all-trans-Ret-NH2. We used the XOPS:mCFP transgenic zebrafish line with a functional cone-only retina to directly demonstrate that this 11-RE cycle is critical to maintain vision under bright-light conditions. Thus, our analyses reveal that a dark-generated pool of 11-REs helps to supply photoreceptors with the chromophore under the varying light conditions present in natural environments. © FASEB.

Deciphering functional glycosaminoglycan motifs in development.

PubMed

Townley, Robert A; Bülow, Hannes E

2018-03-23

Glycosaminoglycans (GAGs) such as heparan sulfate, chondroitin/dermatan sulfate, and keratan sulfate are linear glycans, which when attached to protein backbones form proteoglycans. GAGs are essential components of the extracellular space in metazoans. Extensive modifications of the glycans such as sulfation, deacetylation and epimerization create structural GAG motifs. These motifs regulate protein-protein interactions and are thereby responsible for many of the essential functions of GAGs. This review focusses on recent genetic approaches to characterize GAG motifs and their function in defined signaling pathways during development. We discuss a coding approach for GAGs that would enable computational analyses of GAG sequences such as alignments and the computation of position weight matrices to describe GAG motifs. Copyright © 2018 Elsevier Ltd. All rights reserved.
Chemical characterization of organosulfates from the hydroxyl radical-initiated oxidation and ozonolysis of cis-3-hexen-1-ol

NASA Astrophysics Data System (ADS)

Barbosa, Thais S.; Riva, Matthieu; Chen, Yuzhi; da Silva, Cleyton M.; Ameida, Jose Claudino S.; Zhang, Zhenfa; Gold, Avram; Arbilla, Graciela; Bauerfeldt, Glauco F.; Surratt, Jason D.

2017-08-01

Cis-3-hexen-1-ol (cis-HXO) is a green leaf volatile emitted from plants under stress and belongs to an important class of biogenic volatile organic compounds. In this study, we have investigated the potential formation of organosulfates (OSs) from the hydroxyl radical (OH)-initiated oxidation and ozonolysis of cis-HXO using either non-acidified or acidified sulfate seed aerosols under different relative humidity (RH) conditions. For selected ozonolysis experiments, an OH scavenger was utilized. Ultra performance liquid chromatography interfaced to high-resolution quadrupole time-of-flight mass spectrometry with electrospray ionization (UPLC/ESI-HR-Q-TOFMS) was used to characterize cis-HXO-derived secondary organic aerosol (SOA) formation. Chemical characterization of cis-HXO-derived SOA products reveals that OSs were generated in significant quantity from multiphase chemistry of gas-phase oxidation products of cis-HXO. Ambient fine aerosol (PM2.5) samples collected from Rio de Janeiro, Brazil, were also analyzed. Seven cis-HXO-derived OSs identified in the lab study with molecular weights 154, 186, 170, 210, 212, 226 and 270 were also found in the PM2.5 samples collected in Brazil. This study provides direct evidence that the oxidation of cis-HXO by OH and O3 yields biogenic SOA through the formation of polar OSs.
Multi-phase back contacts for CIS solar cells

DOEpatents

Rockett, A.A.; Yang, L.C.

1995-12-19

Multi-phase, single layer, non-interdiffusing M-Mo back contact metallized films, where M is selected from Cu, Ga, or mixtures thereof, for CIS cells are deposited by a sputtering process on suitable substrates, preferably glass or alumina, to prevent delamination of the CIS from the back contact layer. Typical CIS compositions include CuXSe2 where X is In or/and Ga. The multi-phase mixture is deposited on the substrate in a manner to provide a columnar microstructure, with micro-vein Cu or/and Ga regions which partially or fully vertically penetrate the entire back contact layer. The CIS semiconductor layer is then deposited by hybrid sputtering and evaporation process. The Cu/Ga-Mo deposition is controlled to produce the single layer two-phase columnar morphology with controllable Cu or Ga vein size less than about 0.01 microns in width. During the subsequent deposition of the CIS layer, the columnar Cu/Ga regions within the molybdenum of the Cu/Ga-Mo back layer tend to partially leach out, and are replaced by columns of CIS. Narrower Cu and/or Ga regions, and those with fewer inner connections between regions, leach out more slowly during the subsequent CIS deposition. This gives a good mechanical and electrical interlock of the CIS layer into the Cu/Ga-Mo back layer. Solar cells employing In-rich CIS semiconductors bonded to the multi-phase columnar microstructure back layer of this invention exhibit vastly improved photo-electrical conversion on the order of 17% greater than Mo alone, improved uniformity of output across the face of the cell, and greater Fill Factor. 15 figs.
Multi-phase back contacts for CIS solar cells

DOEpatents

Rockett, Angus A.; Yang, Li-Chung

1995-01-01

Multi-phase, single layer, non-interdiffusing M-Mo back contact metallized films, where M is selected from Cu, Ga, or mixtures thereof, for CIS cells are deposited by a sputtering process on suitable substrates, preferably glass or alumina, to prevent delamination of the CIS from the back contact layer. Typical CIS compositions include CuXSe2 where X is In or/and Ga. The multi-phase mixture is deposited on the substrate in a manner to provide a columnar microstructure, with micro-vein Cu or/and Ga regions which partially or fully vertically penetrate the entire back contact layer. The CIS semiconductor layer is then deposited by hybrid sputtering and evaporation process. The Cu/Ga-Mo deposition is controlled to produce the single layer two-phase columnar morphology with controllable Cu or Ga vein size less than about 0.01 microns in width. During the subsequent deposition of the CIS layer, the columnar Cu/Ga regions within the molybdenum of the Cu/Ga-Mo back layer tend to partially leach out, and are replaced by columns of CIS. Narrower Cu and/or Ga regions, and those with fewer inner connections between regions, leach out more slowly during the subsequent CIS deposition. This gives a good mechanical and electrical interlock of the CIS layer into the Cu/Ga-Mo back layer. Solar cells employing In-rich CIS semiconductors bonded to the multi-phase columnar microstructure back layer of this invention exhibit vastly improved photo-electrical conversion on the order of 17% greater than Mo alone, improved uniformity of output across the face of the cell, and greater Fill Factor.
Beyond genome-wide scan: Association of a cis-regulatory NCR3 variant with mild malaria in a population living in the Republic of Congo.

PubMed

Baaklini, Sabrina; Afridi, Sarwat; Nguyen, Thy Ngoc; Koukouikila-Koussounda, Felix; Ndounga, Mathieu; Imbert, Jean; Torres, Magali; Pradel, Lydie; Ntoumi, Francine; Rihet, Pascal

2017-01-01

Linkage studies have revealed a linkage of mild malaria to chromosome 6p21 that contains the NCR3 gene encoding a natural killer cell receptor, whereas NCR3-412G>C (rs2736191) located in its promoter region was found to be associated with malaria in Burkina Faso. Here we confirmed the association of rs2736191 with mild malaria in a Congolese cohort and investigated its potential cis-regulatory effect. Luciferase assay results indicated that rs2736191-G allele had a significantly increased promoter activity compared to rs2736191-C allele. Furthermore, EMSAs demonstrated an altered binding of two nuclear protein complexes to the rs2736191-C allele in comparison to rs2736191-G allele. Finally, after in silico identification of transcription factor candidates, pull-down western blot experiments confirmed that both STAT4 and RUNX3 bind the region encompassing rs2736191 with a higher affinity for the G allele. To our knowledge, this is the first report that explored the functional role of rs2736191. These results support the hypothesis that genetic variation within natural killer cell receptors alters malaria resistance in humans.
Characterization of noncoding regulatory DNA in the human genome.

PubMed

Elkon, Ran; Agami, Reuven

2017-08-08

Genetic variants associated with common diseases are usually located in noncoding parts of the human genome. Delineation of the full repertoire of functional noncoding elements, together with efficient methods for probing their biological roles, is therefore of crucial importance. Over the past decade, DNA accessibility and various epigenetic modifications have been associated with regulatory functions. Mapping these features across the genome has enabled researchers to begin to document the full complement of putative regulatory elements. High-throughput reporter assays to probe the functions of regulatory regions have also been developed but these methods separate putative regulatory elements from the chromosome so that any effects of chromatin context and long-range regulatory interactions are lost. Definitive assignment of function(s) to putative cis-regulatory elements requires perturbation of these elements. Genome-editing technologies are now transforming our ability to perturb regulatory elements across entire genomes. Interpretation of high-throughput genetic screens that incorporate genome editors might enable the construction of an unbiased map of functional noncoding elements in the human genome.
Effect of C(60) fullerene on the duplex formation of i-motif DNA with complementary DNA in solution.

PubMed

Jin, Kyeong Sik; Shin, Su Ryon; Ahn, Byungcheol; Jin, Sangwoo; Rho, Yecheol; Kim, Heesoo; Kim, Seon Jeong; Ree, Moonhor

2010-04-15

The structural effects of fullerene on i-motif DNA were investigated by characterizing the structures of fullerene-free and fullerene-bound i-motif DNA, in the presence of cDNA and in solutions of varying pH, using circular dichroism and synchrotron small-angle X-ray scattering. To facilitate a direct structural comparison between the i-motif and duplex structures in response to pH stimulus, we developed atomic scale structural models for the duplex and i-motif DNA structures, and for the C(60)/i-motif DNA hybrid associated with the cDNA strand, assuming that the DNA strands are present in an ideal right-handed helical conformation. We found that fullerene shifted the pH-induced conformational transition between the i-motif and the duplex structure, possibly due to the hydrophobic interactions between the terminal fullerenes and between the terminal fullerenes and an internal TAA loop in the DNA strand. The hybrid structure showed a dramatic reduction in cyclic hysteresis.
A private DNA motif finding algorithm.

PubMed

Chen, Rui; Peng, Yun; Choi, Byron; Xu, Jianliang; Hu, Haibo

2014-08-01

With the increasing availability of genomic sequence data, numerous methods have been proposed for finding DNA motifs. The discovery of DNA motifs serves a critical step in many biological applications. However, the privacy implication of DNA analysis is normally neglected in the existing methods. In this work, we propose a private DNA motif finding algorithm in which a DNA owner's privacy is protected by a rigorous privacy model, known as ∊-differential privacy. It provides provable privacy guarantees that are independent of adversaries' background knowledge. Our algorithm makes use of the n-gram model and is optimized for processing large-scale DNA sequences. We evaluate the performance of our algorithm over real-life genomic data and demonstrate the promise of integrating privacy into DNA motif finding. Copyright © 2014 Elsevier Inc. All rights reserved.
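To make the combination of an n-gram model with ∊-differential privacy more concrete, here is a hedged sketch of a textbook Laplace mechanism applied to n-gram counts. It is not the algorithm of the paper, whose details are not given in the abstract. Assumptions: each individual contributes one sequence, neighbouring datasets differ by adding or removing one sequence, and each sequence contributes at most `cap` distinct n-grams, so the L1 sensitivity of the count vector is `cap`. All function and parameter names are hypothetical.

```python
import itertools
import random
from collections import Counter


def capped_ngrams(seq, n, cap):
    """First `cap` distinct n-grams of `seq` over the alphabet ACGT."""
    seen = []
    for i in range(len(seq) - n + 1):
        gram = seq[i:i + n]
        if set(gram) <= set("ACGT") and gram not in seen:
            seen.append(gram)
            if len(seen) == cap:
                break
    return seen


def private_ngram_counts(sequences, n=3, cap=4, epsilon=1.0, seed=None):
    """Noisy count of how many sequences contain each possible n-gram.

    Laplace noise with scale cap/epsilon is added to every one of the
    4**n cells, including cells whose true count is zero.
    """
    rng = random.Random(seed)
    true_counts = Counter()
    for seq in sequences:
        true_counts.update(capped_ngrams(seq, n, cap))

    scale = cap / epsilon
    noisy = {}
    for gram in map("".join, itertools.product("ACGT", repeat=n)):
        # The difference of two i.i.d. exponentials with mean `scale`
        # is Laplace-distributed with location 0 and scale `scale`.
        noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        noisy[gram] = true_counts[gram] + noise
    return noisy
```

Candidate motifs would then be read off the largest noisy counts. A production system would need a cryptographically sound noise source and care with floating-point effects, neither of which this sketch attempts.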
Biosynthesis of cis,cis-muconic acid and its aromatic precursors, catechol and protocatechuic acid, from renewable feedstocks by Saccharomyces cerevisiae.

PubMed

Weber, Christian; Brückner, Christine; Weinreb, Sheila; Lehr, Claudia; Essl, Christine; Boles, Eckhard

2012-12-01

Adipic acid is a high-value compound used primarily as a precursor for the synthesis of nylon, coatings, and plastics. Today it is produced mainly in chemical processes from petrochemicals like benzene. Because of the strong environmental impact of the production processes and the dependence on fossil resources, biotechnological production processes would provide an interesting alternative. Here we describe the first engineered Saccharomyces cerevisiae strain expressing a heterologous biosynthetic pathway converting the intermediate 3-dehydroshikimate of the aromatic amino acid biosynthesis pathway via protocatechuic acid and catechol into cis,cis-muconic acid, which can be chemically dehydrogenated to adipic acid. The pathway consists of three heterologous microbial enzymes, 3-dehydroshikimate dehydratase, protocatechuic acid decarboxylase composed of three different subunits, and catechol 1,2-dioxygenase. For each heterologous reaction step, we analyzed several potential candidates for their expression and activity in yeast to compose a functional cis,cis-muconic acid synthesis pathway. Carbon flow into the heterologous pathway was optimized by increasing the flux through selected steps of the common aromatic amino acid biosynthesis pathway and by blocking the conversion of 3-dehydroshikimate into shikimate. The recombinant yeast cells finally produced about 1.56 mg/liter cis,cis-muconic acid.
Motif types, motif locations and base composition patterns around the RNA polyadenylation site in microorganisms, plants and animals

PubMed Central

2014-01-01

Background The polyadenylation of RNA is critical for gene functioning, but the conserved sequence motifs (often called signal or signature motifs), motif locations and abundances, and base composition patterns around mRNA polyadenylation [poly(A)] sites are still uncharacterized in most species. The evolutionary tendency for poly(A) site selection is still largely unknown. Results We analyzed the poly(A) site regions of 31 species or phyla. Different groups of species showed different poly(A) signal motifs: UUACUU at the poly(A) site in the parasite Trypanosoma cruzi; UGUAAC (approximately 13 bases upstream of the site) in the alga Chlamydomonas reinhardtii; UGUUUG (or UGUUUGUU) at mainly the fourth base downstream of the poly(A) site in the parasite Blastocystis hominis; and AAUAAA at approximately 16 bases and approximately 19 bases upstream of the poly(A) site in animals and plants, respectively. Polyadenylation signal motifs are usually several hundred times more abundant around poly(A) sites than in whole genomes. These predominant motifs usually had very specific locations, whether upstream of, at, or downstream of poly(A) sites, depending on the species or phylum. The poly(A) site was usually an adenosine (A) in all analyzed species except for B. hominis, and there was weak A predominance in C. reinhardtii. Fungi, animals, plants, and the protist Phytophthora infestans shared a general base abundance pattern (or base composition pattern) of “U-rich—A-rich—U-rich—Poly(A) site—U-rich regions”, or U-A-U-A-U for short, with some variation for each kingdom or subkingdom. Conclusion This study identified the poly(A) signal motifs, motif locations, and base composition patterns around mRNA poly(A) sites in protists, fungi, plants, and animals and provided insight into poly(A) site evolution. PMID:25052519
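The positional analysis summarized above (for example, AAUAAA found roughly 16 bases upstream of the poly(A) site in animals) can be sketched as a small offset tally. This is an illustrative sketch only: the input format, the function name `signal_offsets`, the 40-base window, and the convention of measuring from the first base of the signal to the site index are all assumptions, not details taken from the study.

```python
from collections import Counter


def signal_offsets(records, signal="AAUAAA", window=40):
    """Tally how far upstream of each poly(A) site a signal motif starts.

    records: iterable of (rna_sequence, site_index) pairs, where
    site_index is the 0-based position of the poly(A) site.
    Returns a Counter mapping offset (site_index - motif_start) to count.
    """
    offsets = Counter()
    for seq, site in records:
        start = max(0, site - window)
        region = seq[start:site]          # bases strictly upstream of the site
        pos = region.find(signal)
        while pos != -1:
            offsets[site - (start + pos)] += 1
            pos = region.find(signal, pos + 1)   # allow overlapping hits
    return offsets
```

A sharp peak in the returned tally at one offset is the kind of location specificity the abstract reports; a flat distribution would suggest the motif is not positioned relative to the site.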
Cis-regulation of the amphioxus engrailed gene: insights into evolution of a muscle-specific enhancer.

PubMed

Beaster-Jones, Laura; Schubert, Michael; Holland, Linda Z

2007-08-01

To gain insights into the relation between evolution of cis-regulatory DNA and evolution of gene function, we identified tissue-specific enhancers of the engrailed gene of the basal chordate amphioxus (Branchiostoma floridae) and compared their ability to direct expression in both amphioxus and its nearest chordate relative, the tunicate Ciona intestinalis. In amphioxus embryos, the native engrailed gene is expressed in three domains - the eight most anterior somites, a few cells in the central nervous system (CNS) and a few ectodermal cells. In contrast, in C. intestinalis, in which muscle development is highly divergent, engrailed expression is limited to the CNS. To characterize the tissue-specific enhancers of amphioxus engrailed, we first showed that 7.8kb of upstream DNA of amphioxus engrailed directs expression to all three domains in amphioxus that express the native gene. We then identified the amphioxus engrailed muscle-specific enhancer as the 1.2kb region of upstream DNA with the highest sequence identity to the mouse en-2 jaw muscle enhancer. This amphioxus enhancer directed expression to both the somites in amphioxus and to the larval muscles in C. intestinalis. These results show that even though expression of the native engrailed has apparently been lost in developing C. intestinalis muscles, they express the transcription factors necessary to activate transcription from the amphioxus engrailed enhancer, suggesting that gene networks may not be completely disrupted if an individual component is lost.
Effects of delay and noise in a negative feedback regulatory motif

NASA Astrophysics Data System (ADS)

Palassini, Matteo; Dies, Marta

2009-03-01

The small copy number of the molecules involved in gene regulation can induce nontrivial stochastic phenomena such as noise-induced oscillations. An often neglected aspect of regulation dynamics is the delay involved in transcription and translation. Delays introduce analytical and computational complications because the dynamics is non-Markovian. We study the interplay of noise and delays in a negative feedback model of the p53 core regulatory network. Recent experiments have found pronounced oscillations in the concentrations of proteins p53 and Mdm2 in individual cells subjected to DNA damage. Similar oscillations occur in the Hes-1 and NF-κB systems, and in circadian rhythms. Several mechanisms have been proposed to explain this oscillatory behaviour, such as deterministic limit cycles, with and without delay, or noise-induced excursions in excitable models. We consider a generic delayed Master Equation incorporating the activation of Mdm2 by p53 and the Mdm2-promoted degradation of p53. In the deterministic limit and for large delays, the model shows a Hopf bifurcation. Via exact stochastic simulations, we find strong noise-induced oscillations well outside the limit-cycle region. We propose that this may be a generic mechanism for oscillations in gene regulatory systems.
Sequential visibility-graph motifs

NASA Astrophysics Data System (ADS)

Iacovacci, Jacopo; Lacasa, Lucas

2016-04-01

Visibility algorithms transform time series into graphs and encode dynamical information in their topology, paving the way for graph-theoretical time series analysis as well as building a bridge between nonlinear dynamics and network science. In this work we introduce and study the concept of sequential visibility-graph motifs, smaller substructures of n consecutive nodes that appear with characteristic frequencies. We develop a theory to compute in an exact way the motif profiles associated with general classes of deterministic and stochastic dynamics. We find that this simple property is indeed a highly informative and computationally efficient feature capable of distinguishing among different dynamics and robust against noise contamination. We finally confirm that it can be used in practice to perform unsupervised learning, by extracting motif profiles from experimental heart-rate series and being able, accordingly, to disentangle meditative from other relaxation states. Applications of this general theory include the automatic classification and description of physical, biological, and financial time series.
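As a rough illustration of a motif profile over n consecutive nodes, the sketch below uses the horizontal visibility criterion (two points are linked when every point between them is lower than both). Choosing the horizontal variant, identifying each motif by its edge list rather than by the paper's own motif numbering, and the names `hvg_edges` and `motif_profile` are assumptions made for this example.

```python
from collections import Counter


def hvg_edges(window):
    """Edges of the horizontal visibility graph on a short window.

    Nodes i < j are linked when all values strictly between them lie
    below min(window[i], window[j]); neighbours are always linked.
    """
    n = len(window)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            floor = min(window[i], window[j])
            if all(window[k] < floor for k in range(i + 1, j)):
                edges.append((i, j))
    return tuple(edges)


def motif_profile(series, n=4):
    """Relative frequency of each n-node subgraph along the series.

    Visibility between two nodes of a window depends only on the values
    inside that window, so each window can be processed on its own.
    """
    counts = Counter(
        hvg_edges(series[s:s + n]) for s in range(len(series) - n + 1)
    )
    total = sum(counts.values())
    return {motif: c / total for motif, c in counts.items()}
```

On a strictly increasing series every window yields only the chain of neighbour links, so the profile collapses to a single motif with frequency 1.0; noisier series spread the weight over several motifs, which is what makes the profile usable as a feature vector.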
The Thiamin Pyrophosphate-Motif

NASA Technical Reports Server (NTRS)

Dominiak, P.; Ciszak, E.

2003-01-01

Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits and two catalytic centers. Each catalytic center (PP:PYR) is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core (PP:PYR)2 within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXΦX4(G)ΦXXGQ and GDGX25-30NN in the PP-domain, and the EX4(G)ΦXXGΦ in the PYR-domain, where Φ corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.
The Thiamin Pyrophosphate-Motif

NASA Technical Reports Server (NTRS)

Dominiak, Paulina M.; Ciszak, Ewa M.

2003-01-01

Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits, two catalytic centers, common amino acid sequence, and specific contacts to provide a flip-flop, or alternate site, mechanism of action. Each catalytic center [PP:PYR] is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core [PP:PYR]2 within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXΦX4(G)ΦXXGQ, and GDGX25-30 within the PP-domain, and the EX4(G)ΦXXGΦ within the PYR-domain, where Φ corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.
Occurrence probability of structured motifs in random sequences.

PubMed

Robin, S; Daudin, J-J; Richard, H; Sagot, M-F; Schbath, S

2002-01-01

The problem of extracting from a set of nucleic acid sequences motifs which may have biological function is more and more important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrences of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations.
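A structured motif of the kind described above (two ordered boxes, a variable spacer, substitutions allowed) can be checked by direct scanning, and its occurrence probability in random sequences can be estimated by simulation as a sanity check on any analytical approximation. The sketch below is a brute-force stand-in, not the approximation proposed in the paper. Assumptions: substitutions are limited per box rather than in total, random sequences are i.i.d. uniform over ACGT, and all function names are hypothetical. The boxes in the usage note are the familiar E. coli -35 and -10 consensus hexamers, used purely as an example.

```python
import random


def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def has_structured_motif(seq, box1, box2, dmin, dmax, max_sub=0):
    """True if box1, then a spacer of dmin..dmax bases, then box2 occurs.

    Each box may differ from its pattern in at most max_sub positions.
    """
    n1, n2 = len(box1), len(box2)
    for i in range(len(seq) - n1 + 1):
        if hamming(seq[i:i + n1], box1) > max_sub:
            continue
        for d in range(dmin, dmax + 1):
            j = i + n1 + d
            if j + n2 > len(seq):
                break                      # longer spacers cannot fit either
            if hamming(seq[j:j + n2], box2) <= max_sub:
                return True
    return False


def estimate_probability(box1, box2, dmin, dmax, length,
                         trials=2000, max_sub=0, seed=0):
    """Monte Carlo estimate of P(at least one occurrence) in a random sequence."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        hits += has_structured_motif(seq, box1, box2, dmin, dmax, max_sub)
    return hits / trials
```

For instance, `has_structured_motif(seq, "TTGACA", "TATAAT", 16, 18, max_sub=1)` asks whether a promoter-like arrangement is present, and `estimate_probability` with the same arguments gives a background rate against which a count in real upstream regions can be compared.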
Targeting of >1.5 Mb of Human DNA into the Mouse X Chromosome Reveals Presence of cis-Acting Regulators of Epigenetic Silencing

PubMed Central

Yang, Christine; McLeod, Andrea J.; Cotton, Allison M.; de Leeuw, Charles N.; Laprise, Stéphanie; Banks, Kathleen G.; Simpson, Elizabeth M.; Brown, Carolyn J.

2012-01-01

Regulatory sequences can influence the expression of flanking genes over long distances, and X chromosome inactivation is a classic example of cis-acting epigenetic gene regulation. Knock-ins directed to the Mus musculus Hprt locus offer a unique opportunity to analyze the spread of silencing into different human DNA sequences in the identical genomic environment. X chromosome inactivation of four knock-in constructs, including bacterial artificial chromosome (BAC) integrations of over 195 kb, was demonstrated by both the lack of expression from the inactive X chromosome in females with nonrandom X chromosome inactivation and promoter DNA methylation of the human transgene in females. We further utilized promoter DNA methylation to assess the inactivation status of 74 human reporter constructs comprising >1.5 Mb of DNA. Of the 47 genes examined, only the PHB gene showed female DNA hypomethylation approaching the level seen in males, and escape from X chromosome inactivation was verified by demonstration of expression from the inactive X chromosome. Integration of PHB resulted in lower DNA methylation of the flanking HPRT promoter in females, suggesting the action of a dominant cis-acting escape element. Female-specific DNA hypermethylation of CpG islands not associated with promoters implies a widespread imposition of DNA methylation during X chromosome inactivation; yet transgenes demonstrated differential capacities to accumulate DNA methylation when integrated into the identical location on the inactive X chromosome, suggesting additional cis-acting sequence effects. As only one of the human transgenes analyzed escaped X chromosome inactivation, we conclude that elements permitting ongoing expression from the inactive X are rare in the human genome. PMID:23023002
Development and utilization of complementary communication channels for treatment decision making and survivorship issues among cancer patients: The CIS Research Consortium Experience.

PubMed

Fleisher, Linda; Wen, Kuang Yi; Miller, Suzanne M; Diefenbach, Michael; Stanton, Annette L; Ropka, Mary; Morra, Marion; Raich, Peter C

2015-11-01

Cancer patients and survivors are assuming active roles in decision-making and digital patient support tools are widely used to facilitate patient engagement. As part of Cancer Information Service Research Consortium's randomized controlled trials focused on the efficacy of eHealth interventions to promote informed treatment decision-making for newly diagnosed prostate and breast cancer patients, and post-treatment breast cancer, we conducted a rigorous process evaluation to examine the actual use of and perceived benefits of two complementary communication channels -- print and eHealth interventions. The three Virtual Cancer Information Service (V-CIS) interventions were developed through a rigorous developmental process, guided by self-regulatory theory, informed decision-making frameworks, and health communications best practices. Control arm participants received NCI print materials; experimental arm participants received the additional V-CIS patient support tool. Actual usage data from the web-based V-CIS was also obtained and reported. Print materials were highly used by all groups. About 60% of the experimental group reported using the V-CIS. Those who did use the V-CIS rated it highly on improvements in knowledge, patient-provider communication and decision-making. The findings show that how patients actually use eHealth interventions either singularly or within the context of other communication channels is complex. Integrating rigorous best practices and theoretical foundations is essential and multiple communication approaches should be considered to support patient preferences.
Functional analysis of a viroid RNA motif mediating cell-to-cell movement in Nicotiana benthamiana.

PubMed

Jiang, Dongmei; Wang, Meng; Li, Shifang

2017-01-01

Cell-to-cell trafficking through different cellular layers is a key process for various RNAs including those of plant viruses and viroids, but the regulatory mechanisms involved are still not fully elucidated and good model systems are important. Here, we analyse the function of a simple RNA motif (termed 'loop19') in potato spindle tuber viroid (PSTVd) which is required for trafficking in Nicotiana benthamiana leaves. Northern blotting, reverse transcriptase PCR (RT-PCR) and in situ hybridization analyses demonstrated that unlike wild-type PSTVd, which was present in the nuclei in all cell types, the trafficking-defective loop19 mutants were visible only in the nuclei of upper epidermal and palisade mesophyll cells, which shows that PSTVd loop19 plays a role in mediating RNA trafficking from palisade to spongy mesophyll cells in N. benthamiana leaves. Our findings and approaches have broad implications for studying the RNA motifs mediating trafficking of RNAs across specific cellular boundaries in other biological systems.
Network perturbation by recurrent regulatory variants in cancer

PubMed Central

Cho, Ara; Lee, Insuk; Choi, Jung Kyoon

2017-01-01

Cancer driving genes have been identified as recurrently affected by variants that alter protein-coding sequences. However, a majority of cancer variants arise in noncoding regions, and some of them are thought to play a critical role through transcriptional perturbation. Here we identified putative transcriptional driver genes based on combinatorial variant recurrence in cis-regulatory regions. The identified genes showed high connectivity in the cancer type-specific transcription regulatory network, with high outdegree and many downstream genes, highlighting their causative role during tumorigenesis. In the protein interactome, the identified transcriptional drivers were not as highly connected as coding driver genes but appeared to form a network module centered on the coding drivers. The coding and regulatory variants associated via these interactions between the coding and transcriptional drivers showed exclusive and complementary occurrence patterns across tumor samples. Transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes. PMID:28333928

Chronic treatment with 13-cis-retinoic acid changes aggressive behaviours in the resident-intruder paradigm in rats.

PubMed

Trent, Simon; Drew, Cheney J G; Mitchell, Paul J; Bailey, Sarah J

2009-12-01

Retinoids, vitamin A related compounds, have an established role in the development of the nervous system and are increasingly recognized to play a role in adult brain function. The synthetic retinoid, 13-cis-retinoic acid (13-cis-RA, Roaccutane) is widely used to treat severe acne but has been linked to an increased risk of neuropsychiatric side effects, including depression. Here we report that chronic administration with 13-cis-RA (1 mg/kg i.p. daily, 7-14 days) in adult rats reduced aggression- and increased flight-related behaviours in the resident-intruder paradigm. However, in the forced swim, sucrose consumption and open field tests treatment for up to 6 weeks with 13-cis-RA did not modify behaviour in adult or juvenile animals. The behavioural change observed in the resident-intruder paradigm is directly opposite to that observed with chronic antidepressant administration. These findings indicate that when a suitably sensitive behavioural test is employed then chronic administration of 13-cis-RA in adult rats induces behavioural changes consistent with a pro-depressant action.
Computational methods in sequence and structure prediction

NASA Astrophysics Data System (ADS)

Lang, Caiyi

This dissertation is organized into two parts. In the first part, we will discuss three computational methods for cis-regulatory element recognition in three different gene regulatory networks, as follows: (a) Using a comprehensive "Phylogenetic Footprinting Comparison" method, we will investigate the promoter sequence structures of three enzymes (PAL, CHS and DFR) that catalyze sequential steps in the pathway from phenylalanine to anthocyanins in plants. Our result shows that there exists a putative cis-regulatory element "AC(C/G)TAC(C)" upstream of these enzyme genes. We propose that this cis-regulatory element is responsible for the genetic regulation of these three enzymes and that it might also be the binding site for the MYB class transcription factor PAP1. (b) We will investigate the role of the Arabidopsis gene glutamate receptor 1.1 (AtGLR1.1) in C and N metabolism by utilizing the microarray data we obtained from AtGLR1.1 deficient lines (antiAtGLR1.1). We focus our investigation on the putatively co-regulated transcript profile of 876 genes we have collected in antiAtGLR1.1 lines. By (i) scanning the occurrence of several groups of known abscisic acid (ABA) related cis-regulatory elements in the upstream regions of 876 Arabidopsis genes; and (ii) exhaustively scanning the occurrence of all possible 6-10 bp motifs in the upstream regions of the same set of genes, we are able to make a quantitative estimate of the enrichment level of each of the cis-regulatory element candidates. We finally conclude that one specific cis-regulatory element group, called "ABRE" elements, is statistically highly enriched within the 876-gene group as compared to its occurrence within the genome. (c) We will introduce a new general purpose algorithm, called "fuzzy REDUCE1", which we have developed recently for automated cis-regulatory element identification. In the second part, we will discuss our newly devised protein design framework. With this framework we have developed
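The exhaustive 6-10 bp motif scan described in part (b) of the record above can be illustrated with a minimal Python sketch. This is not the dissertation's code: the function names, the pseudocount, the simple fraction-ratio score and the toy sequences are all assumptions made for illustration, and a real analysis would use a proper statistical test against genome-wide upstream regions.

```python
from collections import Counter


def kmer_sequence_counts(sequences, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(seen)
    return counts


def enrichment(foreground, background, k, pseudocount=1.0):
    """Ratio of foreground to background fractions of sequences containing each k-mer.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for k-mers that are absent from the background set.
    """
    fg = kmer_sequence_counts(foreground, k)
    bg = kmer_sequence_counts(background, k)
    n_fg, n_bg = len(foreground), len(background)
    scores = {}
    for kmer, count in fg.items():
        fg_frac = (count + pseudocount) / (n_fg + pseudocount)
        bg_frac = (bg.get(kmer, 0) + pseudocount) / (n_bg + pseudocount)
        scores[kmer] = fg_frac / bg_frac
    return scores


# Made-up upstream sequences; every foreground sequence carries ACGTGT.
foreground = ["TTACGTGTCA", "GGACGTGTAA", "CCACGTGTTT"]
background = ["TTTTTTTTTT", "GGGGGGGGGG", "CCCCCCCCCC", "AAAAAAAAAA"]

scores = enrichment(foreground, background, k=6)
top = max(scores, key=scores.get)
print(top, scores[top])
```

Running the same function for each k from 6 to 10 would cover the motif lengths mentioned in the abstract.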
RNA motif search with data-driven element ordering.

PubMed

Rampášek, Ladislav; Jimenez, Randi M; Lupták, Andrej; Vinař, Tomáš; Brejová, Broňa

2016-05-18

In this paper, we study the problem of RNA motif search in long genomic sequences. This approach uses a combination of sequence and structure constraints to uncover new distant homologs of known functional RNAs. The problem is NP-hard and is traditionally solved by backtracking algorithms. We have designed a new algorithm for RNA motif search and implemented a new motif search tool RNArobo. The tool enhances the RNAbob descriptor language, allowing insertions in helices, which enables better characterization of ribozymes and aptamers. A typical RNA motif consists of multiple elements and the running time of the algorithm is highly dependent on their ordering. By approaching the element ordering problem in a principled way, we demonstrate more than 100-fold speedup of the search for complex motifs compared to previously published tools. We have developed a new method for RNA motif search that allows for a significant speedup of the search of complex motifs that include pseudoknots. Such speed improvements are crucial at a time when the rate of DNA sequencing outpaces growth in computing. RNArobo is available at http://compbio.fmph.uniba.sk/rnarobo.
Non-B DB: a database of predicted non-B DNA-forming motifs in mammalian genomes.

PubMed

Cer, Regina Z; Bruce, Kevin H; Mudunuri, Uma S; Yi, Ming; Volfovsky, Natalia; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M

2011-01-01

Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.
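As an illustration of what a sequence-level search for one of these motif classes can look like, here is a minimal Python sketch for quadruplex-forming candidates. The pattern (four runs of at least three G separated by loops of 1-7 nt) is a commonly used heuristic and is an assumption here; it is not confirmed to be the criterion used by non-B DB, and the function name and test sequence are hypothetical. Only the G-rich strand is scanned.

```python
import re

# Heuristic: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (assumed, not taken from non-B DB).
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def find_g4_candidates(seq):
    """Return (start, end, matched_sequence) for non-overlapping candidates."""
    return [(m.start(), m.end(), m.group())
            for m in G4_PATTERN.finditer(seq.upper())]


hits = find_g4_candidates("ttGGGAGGGTGGGAGGGtt")
print(hits)
```

Scanning the reverse complement in the same way would pick up C-rich candidates on the opposite strand.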
Metatranscriptomic insights on gene expression and regulatory controls in Candidatus Accumulibacter phosphatis

DOE PAGES

Oyserman, Ben O.; Noguera, Daniel R.; del Rio, Tijana Glavina; ...

2015-11-10

Previous studies on enhanced biological phosphorus removal (EBPR) have focused on reconstructing genomic blueprints for the model polyphosphate-accumulating organism Candidatus Accumulibacter phosphatis. Here, a time series metatranscriptome generated from enrichment cultures of Accumulibacter was used to gain insight into anaerobic/aerobic metabolism and regulatory mechanisms within an EBPR cycle. Co-expressed gene clusters were identified displaying ecologically relevant trends consistent with batch cycle phases. Transcripts displaying increased abundance during anaerobic acetate contact were functionally enriched in energy production and conversion, including upregulation of both cytoplasmic and membrane-bound hydrogenases, demonstrating the importance of transcriptional regulation to manage energy and electron flux during anaerobic acetate contact. We hypothesized and demonstrated hydrogen production after anaerobic acetate contact, a previously unknown strategy for Accumulibacter to maintain redox balance. Genes involved in anaerobic glycine utilization were identified and phosphorus release after anaerobic glycine contact demonstrated, suggesting that Accumulibacter routes diverse carbon sources to acetyl-CoA formation via previously unrecognized pathways. A comparative genomics analysis of sequences upstream of co-expressed genes identified two statistically significant putative regulatory motifs. One palindromic motif was identified upstream of genes involved in PHA synthesis and acetate activation and is hypothesized to be a phaR binding site, hence representing a hypothetical PHA modulon. A second motif was identified ~35 base pairs (bp) upstream of a large and diverse array of genes and hence may represent a sigma factor binding site. As a result, this analysis provides a basis and framework for further investigations into Accumulibacter metabolism and the reconstruction of regulatory networks in uncultured organisms.
Metatranscriptomic insights on gene expression and regulatory controls in Candidatus Accumulibacter phosphatis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oyserman, Ben O.; Noguera, Daniel R.; del Rio, Tijana Glavina

Previous studies on enhanced biological phosphorus removal (EBPR) have focused on reconstructing genomic blueprints for the model polyphosphate-accumulating organism Candidatus Accumulibacter phosphatis. Here, a time series metatranscriptome generated from enrichment cultures of Accumulibacter was used to gain insight into anaerobic/aerobic metabolism and regulatory mechanisms within an EBPR cycle. Co-expressed gene clusters were identified displaying ecologically relevant trends consistent with batch cycle phases. Transcripts displaying increased abundance during anaerobic acetate contact were functionally enriched in energy production and conversion, including upregulation of both cytoplasmic and membrane-bound hydrogenases, demonstrating the importance of transcriptional regulation to manage energy and electron flux during anaerobic acetate contact. We hypothesized and demonstrated hydrogen production after anaerobic acetate contact, a previously unknown strategy for Accumulibacter to maintain redox balance. Genes involved in anaerobic glycine utilization were identified and phosphorus release after anaerobic glycine contact demonstrated, suggesting that Accumulibacter routes diverse carbon sources to acetyl-CoA formation via previously unrecognized pathways. A comparative genomics analysis of sequences upstream of co-expressed genes identified two statistically significant putative regulatory motifs. One palindromic motif was identified upstream of genes involved in PHA synthesis and acetate activation and is hypothesized to be a phaR binding site, hence representing a hypothetical PHA modulon. A second motif was identified ~35 base pairs (bp) upstream of a large and diverse array of genes and hence may represent a sigma factor binding site. As a result, this analysis provides a basis and framework for further investigations into Accumulibacter metabolism and the reconstruction of regulatory networks in uncultured organisms.
ZFPL1, a novel ring finger protein required for cis-Golgi integrity and efficient ER-to-Golgi transport.

PubMed

Chiu, Chi-Fang; Ghanekar, Yashoda; Frost, Laura; Diao, Aipo; Morrison, Daniel; McKenzie, Eddie; Lowe, Martin

2008-04-09

The Golgi apparatus occupies a central position within the secretory pathway, but the molecular mechanisms responsible for its assembly and organization remain poorly understood. We report here the identification of zinc finger protein-like 1 (ZFPL1) as a novel structural component of the Golgi apparatus. ZFPL1 is a conserved and widely expressed integral membrane protein with two predicted zinc fingers at the N-terminus, the second of which is a likely ring domain. ZFPL1 directly interacts with the cis-Golgi matrix protein GM130. Depletion of ZFPL1 results in the accumulation of cis-Golgi matrix proteins in the intermediate compartment (IC) and the tubulation of cis-Golgi and IC membranes. Loss of ZFPL1 function also impairs cis-Golgi assembly following brefeldin A washout and slows the rate of cargo trafficking into the Golgi apparatus. Effects upon Golgi matrix protein localization and cis-Golgi structure can be rescued by wild-type ZFPL1 but not mutants defective in GM130 binding. Together, these data suggest that ZFPL1 has an important function in maintaining the integrity of the cis-Golgi and that it does so through interactions with GM130.
Klyuchevskaya, Volcano, Kamchatka Peninsula, CIS

NASA Technical Reports Server (NTRS)

1991-01-01

Klyuchevskaya, Volcano, Kamchatka Peninsula, CIS (56.0N, 160.5E) is one of several active volcanoes in the CIS and is 15,584 ft. in elevation. Fresh ash fall on the south side of the caldera can be seen as a dirty smudge on the fresh snowfall. Just to the north of the Kamchatka River is Shiveluch, a volcano which had been active a short time previously. There are more than 100 volcanic edifices recognized on Kamchatka, 15 of which are still active.
Identification of sequence motifs significantly associated with antisense activity.

PubMed

McQuisten, Kyle A; Peek, Andrew S

2007-06-07

Predicting the suppression activity of antisense oligonucleotide sequences is the main goal of the rational design of nucleic acids. To create an effective predictive model, it is important to know what properties of an oligonucleotide sequence associate significantly with antisense activity. Also, for the model to be efficient we must know what properties do not associate significantly and can be omitted from the model. This paper will discuss the results of a randomization procedure to find motifs that associate significantly with either high or low antisense suppression activity, analysis of their properties, as well as the results of support vector machine modelling using these significant motifs as features. We discovered 155 motifs that associate significantly with high antisense suppression activity and 202 motifs that associate significantly with low suppression activity. The motifs range in length from 2 to 5 bases, contain several motifs that have been previously discovered as associating highly with antisense activity, and have thermodynamic properties consistent with previous work associating thermodynamic properties of sequences with their antisense activity. Statistical analysis revealed no correlation between a motif's position within an antisense sequence and that sequence's antisense activity. Also, many significant motifs existed as subwords of other significant motifs. Support vector regression experiments indicated that the feature set of significant motifs increased correlation compared to all possible motifs as well as several subsets of the significant motifs. The thermodynamic properties of the significantly associated motifs support existing data correlating the thermodynamic properties of the antisense oligonucleotide with antisense efficiency, reinforcing our hypothesis that antisense suppression is strongly associated with probe/target thermodynamics, as there are no enzymatic mediators to speed the process along like the RNA Induced
The crystal structure of the regulatory domain of the human sodium-driven chloride/bicarbonate exchanger.

PubMed

Alvadia, Carolina M; Sommer, Theis; Bjerregaard-Andersen, Kaare; Damkier, Helle Hasager; Montrasio, Michele; Aalkjaer, Christian; Morth, J Preben

2017-09-21

The sodium-driven chloride/bicarbonate exchanger (NDCBE) is essential for maintaining homeostatic pH in neurons. The crystal structure at 2.8 Å resolution of the regulatory N-terminal domain of human NDCBE represents the first crystal structure of an electroneutral sodium-bicarbonate cotransporter. The crystal structure forms an equivalent dimeric interface as observed for the cytoplasmic domain of Band 3, and thus establishes that the consensus motif VTVLP is the key minimal dimerization motif. The VTVLP motif is highly conserved and likely to be the physiologically relevant interface for all other members of the SLC4 family. A novel conserved Zn2+-binding motif present in the N-terminal domain of NDCBE is identified and characterized in vitro. Cellular studies confirm the Zn2+-dependent transport of two electroneutral bicarbonate transporters, NCBE and NBCn1. The Zn2+ site is mapped to a cluster of histidines close to the conserved ETARWLKFEE motif and likely plays a role in the regulation of this important motif. The combined structural and bioinformatics analysis provides a model that predicts with additional confidence the physiologically relevant interface between the cytoplasmic domain and the transmembrane domain.
Direct repeat sequences are essential for function of the cis-acting locus of transfer (clt) of Streptomyces phaeochromogenes plasmid pJV1.

PubMed

Franco, Bernardo; González-Cerón, Gabriela; Servín-González, Luis

2003-11-01

The functionality of direct and inverted repeat sequences inside the cis-acting locus of transfer (clt) of the Streptomyces plasmid pJV1 was determined by testing the effect of different deletions on plasmid transfer. The results show that the single most important element for pJV1 clt function is a series of evenly spaced 9-bp direct repeats which match the consensus CCGCACA(C/G)(C/G), since their deletion caused a dramatic reduction in plasmid transfer. The presence of these repeats in the absence of any other clt sequences allowed plasmid transfer to occur at a frequency that was at least two orders of magnitude higher than that obtained in the complete absence of clt. A database search revealed regions with a similar organization, and in the same position, in Streptomyces plasmids pSN22 and pSLS, which have transfer proteins homologous to those of pJV1.
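The consensus CCGCACA(C/G)(C/G) quoted above translates directly into a regular expression, which gives a quick way to locate such direct repeats and check their spacing. The following Python sketch is illustrative only: the function names are hypothetical and the test sequence is made up, not taken from pJV1.

```python
import re

# (C/G) in the published consensus becomes the character class [CG].
CLT_REPEAT = re.compile(r"CCGCACA[CG][CG]")


def find_repeats(seq):
    """Return (start, matched_repeat) for each non-overlapping consensus match."""
    return [(m.start(), m.group()) for m in CLT_REPEAT.finditer(seq.upper())]


def spacings(hits):
    """Distances between the start positions of consecutive repeats."""
    starts = [start for start, _ in hits]
    return [b - a for a, b in zip(starts, starts[1:])]


# Made-up sequence with three repeats separated by 5-bp spacers.
toy = "AT" + "CCGCACACG" + "TTTTT" + "CCGCACAGC" + "AAAAA" + "CCGCACACC" + "GG"
hits = find_repeats(toy)
print(hits)
print(spacings(hits))
```

Equal values in the spacing list correspond to the "evenly spaced" arrangement the abstract describes.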
Determination of 13-cis-retinoic acid and its major metabolite, 4-oxo-13-cis-retinoic acid, in human blood by reversed-phase high-performance liquid chromatography.

PubMed

Vane, F M; Stoltenborg, J K; Buggé, C J

1982-02-12

A high-performance liquid chromatography (HPLC) method for the quantitation of 13-cis-retinoic acid (13-cis-RA) and its major metabolite, 4-oxo-13-cis-RA, in human blood has been developed. The method includes extraction of 1 ml of blood with diethyl ether at pH 6 and the analysis of the extract by reversed-phase HPLC with solvent programming and detection at 365 nm. The quantitation ranges for 13-cis-RA and 4-oxo-13-cis-RA are 10-2000 and 50-2000 ng/ml of blood, respectively. The method also provides estimates of the concentrations of all-trans-RA and 4-oxo-all-trans-RA. The mean intra- and inter-assay variabilities for all four compounds were 6% or less. The method separates 13-cis-RA and 4-oxo-13-cis-RA from 9-cis-RA, all-trans-RA, 4-oxo-all-trans-RA, and some other possible metabolites, such as hydroxy and epoxy retinoic acids. The method has been successfully applied to the analyses of over 1200 blood samples from four 13-cis-RA clinical studies.
Characteristic motifs for families of allergenic proteins

PubMed Central

Ivanciuc, Ovidiu; Garcia, Tzintzuni; Torres, Miguel; Schein, Catherine H.; Braun, Werner

2008-01-01

The identification of potential allergenic proteins is usually done by scanning a database of allergenic proteins and locating known allergens with a high sequence similarity. However, there is no universally accepted cut-off value for sequence similarity to indicate potential IgE cross-reactivity. Further, overall sequence similarity may be less important than discrete areas of similarity in proteins with homologous structure. To identify such areas, we first classified all allergens and their subdomains in the Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP/) to their closest protein families as defined in Pfam, and identified conserved physicochemical property motifs characteristic of each group of sequences. Allergens populate only a small subset of all known Pfam families, as all allergenic proteins in SDAP could be grouped to only 130 (of 9318 total) Pfams, and 31 families contain more than four allergens. Conserved physicochemical property motifs for the aligned sequences of the most populated Pfam families were identified with the PCPMer program suite and catalogued in the webserver Motif-Mate (http://born.utmb.edu/motifmate/summary.php). We also determined specific motifs for allergenic members of a family that could distinguish them from non-allergenic ones. These allergen specific motifs should be most useful in database searches for potential allergens. We found that sequence motifs unique to the allergens in three families (seed storage proteins, Bet v 1, and tropomyosin) overlap with known IgE epitopes, thus providing evidence that our motif based approach can be used to assess the potential allergenicity of novel proteins. PMID:18951633
Landscape of histone modifications in a sponge reveals the origin of animal cis-regulatory complexity

PubMed Central

Gaiti, Federico; Jindrich, Katia; Fernandez-Valverde, Selene L; Roper, Kathrein E; Degnan, Bernard M; Tanurdžić, Miloš

2017-01-01

Combinatorial patterns of histone modifications regulate developmental and cell type-specific gene expression and underpin animal complexity, but it is unclear when this regulatory system evolved. By analysing histone modifications in a morphologically-simple, early branching animal, the sponge Amphimedon queenslandica, we show that the regulatory landscape used by complex bilaterians was already in place at the dawn of animal multicellularity. This includes distal enhancers, repressive chromatin and transcriptional units marked by H3K4me3 that vary with levels of developmental regulation. Strikingly, Amphimedon enhancers are enriched in metazoan-specific microsyntenic units, suggesting that their genomic location is extremely ancient and likely to place constraints on the evolution of surrounding genes. These results suggest that the regulatory foundation for spatiotemporal gene expression evolved prior to the divergence of sponges and eumetazoans, and was necessary for the evolution of animal multicellularity. DOI: http://dx.doi.org/10.7554/eLife.22194.001 PMID:28395144
Native characterization of nucleic acid motif thermodynamics via non-covalent catalysis

PubMed Central

Wang, Chunyan; Bae, Jin H.; Zhang, David Yu

2016-01-01

DNA hybridization thermodynamics is critical for accurate design of oligonucleotides for biotechnology and nanotechnology applications, but parameters currently in use are inaccurately extrapolated based on limited quantitative understanding of thermal behaviours. Here, we present a method to measure the ΔG° of DNA motifs at temperatures and buffer conditions of interest, with significantly better accuracy (6- to 14-fold lower s.e.) than prior methods. The equilibrium constant of a reaction with thermodynamics closely approximating that of a desired motif is numerically calculated from directly observed reactant and product equilibrium concentrations; a DNA catalyst is designed to accelerate equilibration. We measured the ΔG° of terminal fluorophores, single-nucleotide dangles and multinucleotide dangles, in temperatures ranging from 10 to 45 °C. PMID:26782977
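The step from observed equilibrium concentrations to ΔG° rests on the standard relation ΔG° = -RT ln K_eq. The Python sketch below shows that arithmetic only; it is not the authors' analysis pipeline. The function names, the assumption of a reaction with equal numbers of reactant and product species (so that K_eq is dimensionless), and the example concentrations are all hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def equilibrium_constant(products, reactants):
    """K_eq as the product of product concentrations over reactant concentrations.

    Assumes equal numbers of product and reactant species, so units cancel.
    """
    return math.prod(products) / math.prod(reactants)


def delta_g_standard(k_eq, temp_celsius):
    """Standard free energy change in kcal/mol from dG = -R * T * ln(K_eq)."""
    temp_kelvin = temp_celsius + 273.15
    return -R_KCAL * temp_kelvin * math.log(k_eq)


# Hypothetical equilibrium concentrations in mol/L for A + B <-> C + D.
k_eq = equilibrium_constant(products=[8e-9, 8e-9], reactants=[2e-9, 2e-9])
print(k_eq, delta_g_standard(k_eq, temp_celsius=25.0))
```

Because ΔG° depends on temperature through both T and K_eq, the calculation has to be repeated at each temperature of interest rather than extrapolated from a single measurement.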
Teratogenicity and transplacental pharmacokinetics of 13-cis-retinoic acid in rabbits.

PubMed

Eckhoff, C; Chari, S; Kromka, M; Staudner, H; Juhasz, L; Rudiger, H; Agnish, N

1994-03-01

No embryotoxic or teratogenic effects, considered to be treatment related, were observed in rabbits after daily oral doses of 3 mg/kg of 13-cis-retinoic acid (13-cis-RA) from Day 8 to Day 11 of gestation. In contrast, treatment with 15 mg/kg/day significantly increased the rate of fetal resorptions (22%) and 13 out of 68 surviving fetuses (16%) were malformed. Pharmacokinetic studies with both dosing regimens of 13-cis-RA in pregnant rabbits showed that on Day 11 of gestation, high concentrations of parent compound, 13-cis-RA, and its major metabolite, 13-cis-4-oxoRA, existed in maternal plasma. Much lower concentrations were found for all-trans-4-oxoRA and all-trans-RA. The area under the concentration-time curve (AUC) of all-trans-RA following the 15 mg/kg/day dosing regimen of 13-cis-RA was only 1.2% that of parent compound 13-cis-RA. At this dose, embryo levels of 13-cis-RA, 13-cis-4-oxoRA, and all-trans-4-oxoRA were 2.5-, 4.7-, and 3.6-fold higher by AUC comparison (24-hr period of Day 11) compared with the dose of 3 mg/kg. However, embryo levels of all-trans-RA were virtually identical at both doses and were, in fact, somewhat lower than endogenous concentrations measured in untreated rabbit embryos. In contrast to mice, where isomerization from 13-cis- to all-trans-RA was suggested to be crucial for the teratogenic action of 13-cis-RA, we found that the teratogenic action of 13-cis-RA (15 mg/kg/day) in rabbits is characterized by increased whole embryo concentrations of 13-cis-RA, 13-cis-4-oxoRA, and all-trans-4-oxoRA, but not of all-trans-RA.
Computational modeling identifies key gene regulatory interactions underlying phenobarbital-mediated tumor promotion

PubMed Central

Luisier, Raphaëlle; Unterberger, Elif B.; Goodman, Jay I.; Schwarz, Michael; Moggs, Jonathan; Terranova, Rémi; van Nimwegen, Erik

2014-01-01

Gene regulatory interactions underlying the early stages of non-genotoxic carcinogenesis are poorly understood. Here, we have identified key candidate regulators of phenobarbital (PB)-mediated mouse liver tumorigenesis, a well-characterized model of non-genotoxic carcinogenesis, by applying a new computational modeling approach to a comprehensive collection of in vivo gene expression studies. We have combined our previously developed motif activity response analysis (MARA), which models gene expression patterns in terms of computationally predicted transcription factor binding sites, with singular value decomposition (SVD) of the inferred motif activities to disentangle the roles that different transcriptional regulators play in specific biological pathways of tumor promotion. Furthermore, transgenic mouse models enabled us to identify which of these regulatory activities were downstream of constitutive androstane receptor and β-catenin signaling, both crucial components of PB-mediated liver tumorigenesis. We propose novel roles for E2F and ZFP161 in PB-mediated hepatocyte proliferation and suggest that PB-mediated suppression of ESR1 activity contributes to the development of a tumor-prone environment. Our study shows that combining MARA with SVD allows for automated identification of independent transcription regulatory programs within a complex in vivo tissue environment and provides novel mechanistic insights into PB-mediated hepatocarcinogenesis. PMID:24464994
Cooxidation of 13-cis-retinoic acid by prostaglandin H synthase.

PubMed

Samokyszyn, V M; Sloane, B F; Honn, K V; Marnett, L J

1984-10-30

Cooxidative metabolism of 13-cis-retinoic acid (13-CIS) via prostaglandin H synthase was investigated employing ram seminal vesicle microsomes. Oxidation of 13-CIS utilizing H2O2, 13-hydroperoxy-9-cis-11-trans-octadecadienoic acid (13-OOH-18:2), or 1-hydroperoxy-5-phenyl-4-pentene was detected by measurement of O2 incorporation. UV spectroscopy and HPLC of extracted incubation mixtures demonstrated that 13-CIS was metabolized to oxidized derivatives. Similar spectral changes and HPLC profiles were obtained with H2O2, 13-OOH-18:2, or arachidonic acid as substrates. 4-Hydroxy-13-cis-retinoic acid and all-trans-retinoic acid were products of cooxidation, as were other polar metabolites. Oxidation was inhibited by the antioxidant butylated hydroxyanisole and the spin trap, nitrosobenzene. These results indicate that 13-cis-retinoic acid is cooxidized by prostaglandin H synthase and suggest a free radical mechanism resembling that of lipid peroxidation.
Functional studies of the Ciona intestinalis myogenic regulatory factor reveal conserved features of chordate myogenesis.

PubMed

Izzi, Stephanie A; Colantuono, Bonnie J; Sullivan, Kelly; Khare, Parul; Meedel, Thomas H

2013-04-15

Ci-MRF is the sole myogenic regulatory factor (MRF) of the ascidian Ciona intestinalis, an invertebrate chordate. In order to investigate its properties we developed a simple in vivo assay based on misexpressing Ci-MRF in the notochord of Ciona embryos. We used this assay to examine the roles of three structural motifs that are conserved among MRFs: an alanine-threonine (Ala-Thr) dipeptide of the basic domain that is known in vertebrates as the myogenic code, a cysteine/histidine-rich (C/H) domain found just N-terminal to the basic domain, and a carboxy-terminal amphipathic α-helix referred to as Helix III. We show that the Ala-Thr dipeptide is necessary for normal Ci-MRF function, and that while eliminating the C/H domain or Helix III individually has no demonstrable effect on Ci-MRF, simultaneous loss of both motifs significantly reduces its activity. Our studies also indicate that direct interaction between Ci-MRF and an essential E-box of Ciona Troponin I is required for the expression of this muscle-specific gene and that multiple classes of MRF-regulated genes exist in Ciona. These findings are consistent with substantial conservation of MRF-directed myogenesis in chordates and demonstrate for the first time that the Ala-Thr dipeptide of the basic domain of an invertebrate MRF behaves as a myogenic code. Copyright © 2013 Elsevier Inc. All rights reserved.
Classification and assessment tools for structural motif discovery algorithms.

PubMed

Badr, Ghada; Al-Turaiki, Isra; Mathkour, Hassan

2013-01-01

Motif discovery is the problem of finding recurring patterns in biological data. Patterns can be sequential, mainly when discovered in DNA sequences. They can also be structural (e.g. when discovering RNA motifs). Finding common structural patterns helps to gain a better understanding of the mechanism of action (e.g. post-transcriptional regulation). Unlike DNA motifs, which are sequentially conserved, RNA motifs exhibit conservation in structure, which may be common even if the sequences are different. Over the past few years, hundreds of algorithms have been developed to solve the sequential motif discovery problem, while less work has been done for the structural case. In this paper, we survey, classify, and compare different algorithms that solve the structural motif discovery problem, where the underlying sequences may be different. We highlight their strengths and weaknesses. We start by proposing a benchmark dataset and a measurement tool that can be used to evaluate different motif discovery approaches. Then, we proceed by proposing our experimental setup. Finally, results are obtained using the proposed benchmark to compare available tools. To the best of our knowledge, this is the first attempt to compare tools solely designed for structural motif discovery. Results show that the accuracy of discovered motifs is relatively low. The results also suggest a complementary behavior among tools where some tools perform well on simple structures, while other tools are better for complex structures. We have classified and evaluated the performance of available structural motif discovery tools. In addition, we have proposed a benchmark dataset with tools that can be used to evaluate newly developed tools.
