Science.gov

Sample records for gene sequences regulatory

  1. Modeling DNA sequence-based cis-regulatory gene networks.

    PubMed

    Bolouri, Hamid; Davidson, Eric H

    2002-06-01

    Gene network analysis requires computationally based models which represent the functional architecture of regulatory interactions, and which provide directly testable predictions. The type of model that is useful is constrained by the particular features of developmentally active cis-regulatory systems. These systems function by processing diverse regulatory inputs, generating novel regulatory outputs. A computational model which explicitly accommodates this basic concept was developed earlier for the cis-regulatory system of the endo16 gene of the sea urchin. This model represents the genetically mandated logic functions that the system executes, but also shows how time-varying kinetic inputs are processed in different circumstances into particular kinetic outputs. The same basic design features can be utilized to construct models that connect the large number of cis-regulatory elements constituting developmental gene networks. The ultimate aim of the network models discussed here is to represent the regulatory relationships among the genomic control systems of the genes in the network, and to state their functional meaning. The target site sequences of the cis-regulatory elements of these genes constitute the physical basis of the network architecture. Useful models for developmental regulatory networks must represent the genetic logic by which the system operates, but must also be capable of explaining the real time dynamics of cis-regulatory response as kinetic input and output data become available. Most importantly, however, such models must display in a direct and transparent manner fundamental network design features such as intra- and intercellular feedback circuitry; the sources of parallel inputs into each cis-regulatory element; gene battery organization; and use of repressive spatial inputs in specification and boundary formation. Successful network models lead to direct tests of key architectural features by targeted cis-regulatory analysis. PMID

  2. Identification of potential regulatory motifs in odorant receptor genes by analysis of promoter sequences

    PubMed Central

    Michaloski, Jussara S.; Galante, Pedro A.F.

    2006-01-01

    Mouse odorant receptors (ORs) are encoded by >1000 genes dispersed throughout the genome. Each olfactory neuron expresses one single OR gene, while the rest of the genes remain silent. The mechanisms underlying OR gene expression are poorly understood. Here, we investigated if OR genes share common cis-regulatory sequences in their promoter regions. We carried out a comprehensive analysis in which the upstream regions of a large number of OR genes were compared. First, using RLM-RACE, we generated cDNAs containing the complete 5′-untranslated regions (5′-UTRs) for a total number of 198 mouse OR genes. Then, we aligned these cDNA sequences to the mouse genome so that the 5′ structure and transcription start sites (TSSs) of the OR genes could be precisely determined. Sequences upstream of the TSSs were retrieved and browsed for common elements. We found DNA sequence motifs that are overrepresented in the promoter regions of the OR genes. Most motifs resemble O/E-like sites and are preferentially localized within 200 bp upstream of the TSSs. Finally, we show that these motifs specifically interact with proteins extracted from nuclei prepared from the olfactory epithelium, but not from brain or liver. Our results show that the OR genes share common promoter elements. The present strategy should provide information on the role played by cis-regulatory sequences in OR gene regulation. PMID:16902085

  3. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity.

    PubMed

    Petrovski, Slavé; Gussow, Ayal B; Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H; Allen, Andrew S; Goldstein, David B

    2015-09-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene's proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene's regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen's Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance, nc

  4. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity

    PubMed Central

    Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H.; Allen, Andrew S.; Goldstein, David B.

    2015-01-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene’s proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene’s regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen’s Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance
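
    The abstract describes ncRVIS as a comparison of observed and predicted levels of standing variation. One way to picture such a score is as a standardized regression residual. The sketch below is only an illustration under that assumption, not the authors' procedure: the function name `residual_scores`, the choice of predictor (total variant count) and the toy numbers are all hypothetical.

```python
from statistics import mean, pstdev

def residual_scores(total_variants, common_variants):
    """Illustrative intolerance-style score (hypothetical helper).

    Fits an ordinary least-squares line predicting the number of common
    variants from the total number of variants per gene, then returns each
    gene's residual divided by the standard deviation of all residuals.
    Negative scores mean less common variation than the fit predicts.
    """
    mx, my = mean(total_variants), mean(common_variants)
    sxx = sum((x - mx) ** 2 for x in total_variants)
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(total_variants, common_variants))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x)
                 for x, y in zip(total_variants, common_variants)]
    sd = pstdev(residuals)
    return [r / sd for r in residuals]

# Toy data (invented for illustration): four genes' regulatory regions.
total = [10, 20, 30, 40]
common = [2, 4, 3, 8]
scores = residual_scores(total, common)
```

    With these toy counts, the third gene carries fewer common variants than the fitted line predicts, so it receives the lowest (most "intolerant") score.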

  5. Phylogenetic Relationships and the Evolution of Regulatory Gene Sequences in the Parrotfishes

    PubMed Central

    Smith, Lydia L.; Fessler, Jennifer L.; Alfaro, Michael E.; Streelman, J. Todd; Westneat, Mark W.

    2008-01-01

    Regulatory genes control the expression of other genes and are key components of developmental processes such as segmentation and embryonic construction of the skull in vertebrates. Here we examine the variability and evolution of three vertebrate regulatory genes, addressing issues of their utility for phylogenetics and comparing the rates of genetic change seen in regulatory loci to the rates seen in other genes in the parrotfishes. The parrotfishes are a diverse group of colorful fishes from coral reefs and seagrasses worldwide and have been placed phylogenetically within the family Labridae. We tested phylogenetic hypotheses among the parrotfishes, with a focus on the genera Chlorurus and Scarus, by analyzing eight gene fragments for 42 parrotfishes and eight outgroup species. We sequenced mitochondrial 12s rRNA (967 bp), 16s rRNA (577 bp), and cytochrome b (477 bp). From the nuclear genome, we sequenced part of the protein-coding genes rag2 (715 bp), tmo4c4 (485 bp), and the developmental regulatory genes otx1 (672 bp), bmp4 (488 bp), and dlx2 (522 bp). Bayesian, likelihood, and parsimony analyses on the resulting 4903 bp of DNA sequence produced similar topologies that confirm the monophyly of the scarines and provide a phylogeny at the species level for portions of the genera Scarus and Chlorurus. Four major clades of Scarus were recovered, with three distributed in the Indo-Pacific and one containing Caribbean/Atlantic taxa. Molecular rates suggest a Miocene origin of the parrotfishes (22 mya) and a recent divergence of species within Scarus and Chlorurus, within the past 5 million years. Developmentally important genes made a significant contribution to phylogenetic structure, and rates of genetic evolution were high in bmp4, similar to other coding nuclear genes, but low in otx1 and the dlx2 exons. Synonymous and nonsynonymous substitution patterns in developmental regulatory genes support the hypothesis of stabilizing selection during the history of

  6. Two Lamprey Hedgehog Genes Share Non-Coding Regulatory Sequences and Expression Patterns with Gnathostome Hedgehogs

    PubMed Central

    Ekker, Marc; Hadzhiev, Yavor; Müller, Ferenc; Casane, Didier; Magdelenat, Ghislaine; Rétaux, Sylvie

    2010-01-01

    Hedgehog (Hh) genes play major roles in animal development and studies of their evolution, expression and function point to major differences among chordates. Here we focused on Hh genes in lampreys in order to characterize the evolution of Hh signalling at the emergence of vertebrates. Screening of a cosmid library of the river lamprey Lampetra fluviatilis and searching the preliminary genome assembly of the sea lamprey Petromyzon marinus indicate that lampreys have two Hh genes, named Hha and Hhb. Phylogenetic analyses suggest that Hha and Hhb are lamprey-specific paralogs closely related to Sonic/Indian Hh genes. Expression analysis indicates that Hha and Hhb are expressed in a Sonic Hh-like pattern. The two transcripts are expressed in largely overlapping but not identical domains in the lamprey embryonic brain, including a newly-described expression domain in the nasohypophyseal placode. Global alignments of genomic sequences and local alignment with known gnathostome regulatory motifs show that lamprey Hhs share conserved non-coding elements (CNE) with gnathostome Hhs albeit with sequences that have significantly diverged and dispersed. Functional assays using zebrafish embryos demonstrate gnathostome-like midline enhancer activity for CNEs contained in intron2. We conclude that lamprey Hh genes are gnathostome Shh-like in terms of expression and regulation. In addition, they show some lamprey-specific features, including duplication and structural (but not functional) changes in the intronic/regulatory sequences. PMID:20967201

  7. Cloning and nucleotide sequence of luxR, a regulatory gene controlling bioluminescence in Vibrio harveyi.

    PubMed Central

    Showalter, R E; Martin, M O; Silverman, M R

    1990-01-01

    Mutagenesis with transposon mini-Mulac was used previously to identify a regulatory locus necessary for expression of bioluminescence genes, lux, in Vibrio harveyi (M. Martin, R. Showalter, and M. Silverman, J. Bacteriol. 171:2406-2414, 1989). Mutants with transposon insertions in this regulatory locus were used to construct a hybridization probe which was used in this study to detect recombinants in a cosmid library containing the homologous DNA. Recombinant cosmids with this DNA stimulated expression of the genes encoding enzymes for luminescence, i.e., the luxCDABE operon, which were positioned in trans on a compatible replicon in Escherichia coli. Transposon mutagenesis and analysis of the DNA sequence of the cloned DNA indicated that regulatory function resided in a single gene of about 0.6-kilobases named luxR. Expression of bioluminescence in V. harveyi and in the fish light-organ symbiont Vibrio fischeri is controlled by density-sensing mechanisms involving the accumulation of small signal molecules called autoinducers, but similarity of the two luminescence systems at the molecular level was not apparent in this study. The amino acid sequence of the LuxR product of V. harveyi, which indicates a structural relationship to some DNA-binding proteins, is not similar to the sequence of the protein that regulates expression of luminescence in V. fischeri. In addition, reconstitution of autoinducer-controlled luminescence in recombinant E. coli, already achieved with lux genes cloned from V. fischeri, was not accomplished with the isolation of luxR from V. harveyi, suggesting a requirement for an additional regulatory component. PMID:2160932

  8. Detecting Functional Divergence after Gene Duplication through Evolutionary Changes in Posttranslational Regulatory Sequences

    PubMed Central

    Nguyen Ba, Alex N.; Strome, Bob; Hua, Jun Jie; Desmond, Jonathan; Gagnon-Arsenault, Isabelle; Weiss, Eric L.; Landry, Christian R.; Moses, Alan M.

    2014-01-01

    Gene duplication is an important evolutionary mechanism that can result in functional divergence in paralogs due to neo-functionalization or sub-functionalization. Consistent with functional divergence after gene duplication, recent studies have shown accelerated evolution in retained paralogs. However, little is known in general about the impact of this accelerated evolution on the molecular functions of retained paralogs. For example, do new functions typically involve changes in enzymatic activities, or changes in protein regulation? Here we study the evolution of posttranslational regulation by examining the evolution of important regulatory sequences (short linear motifs) in retained duplicates created by the whole-genome duplication in budding yeast. To do so, we identified short linear motifs whose evolutionary constraint has relaxed after gene duplication with a likelihood-ratio test that can account for heterogeneity in the evolutionary process by using a non-central chi-squared null distribution. We find that short linear motifs are more likely to show changes in evolutionary constraints in retained duplicates compared to single-copy genes. We examine changes in constraints on known regulatory sequences and show that for the Rck1/Rck2, Fkh1/Fkh2, Ace2/Swi5 paralogs, they are associated with previously characterized differences in posttranslational regulation. Finally, we experimentally confirm our prediction that for the Ace2/Swi5 paralogs, Cbk1 regulated localization was lost along the lineage leading to SWI5 after gene duplication. Our analysis suggests that changes in posttranslational regulation mediated by short regulatory motifs systematically contribute to functional divergence after gene duplication. PMID:25474245

  9. A transcriptional regulatory element in the coding sequence of the human Bcl-2 gene

    PubMed Central

    Lang, Georgina; Gombert, Wendy M; Gould, Hannah J

    2005-01-01

    We investigated the protein-binding sites in a DNase I hypersensitive site associated with bcl-2 gene expression in human B cells. We mapped this hypersensitive site to the coding sequence of exon 2 of the bcl-2 gene in the bcl-2-expressing REH B-cell line. Electrophoretic mobility shift assays (EMSAs) with extracts from REH cells revealed three previously unrecognized B-Myb-binding sites in this sequence. The protein was identified as B-Myb by using a specific antibody and EMSAs. Accordingly, the levels of B-Myb and bcl-2 proteins, and of Myb EMSA activity, were correlated over a wide range of cell lines, representing different stages of B-cell development. Transfection of REH cells with antisense B-myb down-regulated EMSA activity and the level of bcl-2, and led to the apoptosis of REH cells. Transfection of the bcl-2-non-expressing RPMI 8226 cell line with a B-Myb expression vector induced B-Myb EMSA activity and the expression of bcl-2. Reporter assays indicated that the HSS8 sequence containing the three B-Myb sites may act as an enhancer when it is linked to the bcl-2 gene promoter. Interaction of B-Myb with HSS8 may enhance bcl-2 gene expression by co-operating with positive regulatory elements (e.g. previously identified B-Myb response elements) or silencing negative response elements in the bcl-2 gene promoter. PMID:15606792

  10. Regulatory sequences of Arabidopsis drive reporter gene expression in nematode feeding structures.

    PubMed Central

    Barthels, N; van der Lee, F M; Klap, J; Goddijn, O J; Karimi, M; Puzio, P; Grundler, F M; Ohl, S A; Lindsey, K; Robertson, L; Robertson, W M; Van Montagu, M; Gheysen, G; Sijmons, P C

    1997-01-01

    In the quest for plant regulatory sequences capable of driving nematode-triggered effector gene expression in feeding structures, we show that promoter tagging is a valuable tool. A large collection of transgenic Arabidopsis plants was generated. They were transformed with a beta-glucuronidase gene functioning as a promoter tag. Three T-DNA constructs, pGV1047, p delta gusBin19, and pMOG553, were used. Early responses to nematode invasion were of primary interest. Six lines exhibiting beta-glucuronidase activity in syncytia induced by the beet cyst nematode were studied. Reporter gene activation was also identified in galls induced by root knot and ectoparasitic nematodes. Time-course studies revealed that all six tags were differentially activated during the development of the feeding structure. T-DNA-flanking regions responsible for the observed responses after nematode infection were isolated and characterized for promoter activity. PMID:9437858

  11. Origins of Transcriptional Transition: Balance between Upstream and Downstream Regulatory Gene Sequences

    PubMed Central

    Sala, Adrien; Shoaib, Muhammad; Anufrieva, Olga; Mutharasu, Gnanavel; Yli-Harja, Olli

    2015-01-01

    By measuring individual mRNA production at the single-cell level, we investigated the lac promoter’s transcriptional transition during cell growth phases. In exponential phase, variation in transition rates generates two mixed phenotypes, low and high numbers of mRNAs, by modulating their burst frequency and sizes. Independent activation of the regulatory-gene sequence does not produce bimodal populations at the mRNA level, but bimodal populations are produced when the regulatory gene is activated coordinately with the upstream and downstream region promoter sequence (URS and DRS, respectively). Time-lapse microscopy of mRNAs for lac and a variant lac promoter confirms this observation. Activation of the URS/DRS elements of the promoter reveals a counterplay behavior during cell phases. The promoter transition rate coupled with cell phases determines the mRNA and transcriptional noise. We further show that bias in partitioning of RNA does not lead to phenotypic switching. Our results demonstrate that the balance between the URS and the DRS in transcriptional regulation determines population diversity. PMID:25626902

  12. A genome-wide cis-regulatory element discovery method based on promoter sequences and gene co-expression networks

    PubMed Central

    2013-01-01

    Background: Deciphering cis-regulatory networks has become an attractive yet challenging task. This paper presents a simple method for cis-regulatory network discovery which aims to avoid some of the common problems of previous approaches. Results: Using promoter sequences and gene expression profiles as input, rather than clustering the genes by the expression data, our method utilizes co-expression neighborhood information for each individual gene, thereby overcoming the disadvantages of current clustering-based models which may miss specific information for individual genes. In addition, rather than using a motif database as an input, it implements a simple motif count table for each enumerated k-mer for each gene promoter sequence. Thus, it can be used for species where no previous knowledge of cis-regulatory motifs is available and has the potential to discover new transcription factor binding sites. Applications on Saccharomyces cerevisiae and Arabidopsis have shown that our method has a good prediction accuracy and outperforms a phylogenetic footprinting approach. Furthermore, the top ranked gene-motif regulatory clusters are evidently functionally co-regulated, and the regulatory relationships between the motifs and the enriched biological functions can often be confirmed by literature. Conclusions: Since this method is simple and gene-specific, it can be readily utilized for insufficiently studied species or flexibly used as an additional step or data source for previous transcription regulatory networks discovery models. PMID:23368633
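
    The "simple motif count table for each enumerated k-mer for each gene promoter sequence" mentioned in this abstract can be pictured with a short sketch. This is a minimal illustration, not the paper's code: the function names, the example sequences and the decision to skip k-mers containing non-ACGT characters are assumptions made here.

```python
from collections import Counter
from itertools import product

def all_kmers(k):
    """Enumerate every possible k-mer over the DNA alphabet."""
    return ["".join(p) for p in product("ACGT", repeat=k)]

def kmer_count_table(promoters, k):
    """Map each gene to a Counter of overlapping k-mer counts.

    `promoters` is a dict {gene_name: promoter_sequence}. Windows that
    contain characters other than A, C, G or T (for example N) are
    skipped; that filtering is an assumption of this sketch.
    """
    valid = set("ACGT")
    table = {}
    for gene, seq in promoters.items():
        seq = seq.upper()
        windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
        table[gene] = Counter(w for w in windows if set(w) <= valid)
    return table

# Hypothetical promoter sequences, invented for illustration.
promoters = {"geneA": "ACGTACGT", "geneB": "TTTTACGN"}
table = kmer_count_table(promoters, 3)
```

    Looking up `table["geneA"]["ACG"]` gives the number of times that 3-mer occurs in the promoter of `geneA`; a k-mer that never occurs returns 0 because `Counter` defaults missing keys to zero.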

  13. Sequence analysis of the myosin regulatory light chain gene of the vestimentiferan Riftia pachyptila.

    PubMed

    Ravaux, J; Hassanin, A; Deutsch, J; Gaill, F; Markmann-Mulisch, U

    2001-01-24

    We have isolated and characterized a cDNA (DNA complementary to RNA) clone (Rf69) from the vestimentiferan Riftia pachyptila. The cDNA insert consists of 1169 base pairs. The amino acid sequence deduced from the longest reading frame is 193 residues in length, and clearly characterizes it as a myosin regulatory light chain (RLC). The RLC primary structure is described in relation to its function in muscle contraction. The comparison with other RLCs suggested that Riftia myosin is probably regulated through its RLC either by phosphorylation like the vertebrate smooth muscle myosins, and/or by Ca2+-binding like the mollusk myosins. Riftia RLC possesses an N-terminal extension lacking in all other species besides the earthworm Lumbricus terrestris. Amino acid sequence comparisons with a number of RLCs from vertebrates and invertebrates revealed a relatively high identity score (64%) between Riftia RLC and the homologous gene from Lumbricus. The relationships between the members of the myosin RLCs were examined by two phylogenetic methods, i.e. distance matrix and maximum parsimony. The resulting trees depict the grouping of the RLCs according to their role in myosin activity regulation. In all trees, Riftia RLC groups with RLCs that depend on Ca2+-binding for myosin activity regulation. PMID:11223252

  14. Coordinate cytokine regulatory sequences

    DOEpatents

    Frazer, Kelly A.; Rubin, Edward M.; Loots, Gabriela G.

    2005-05-10

    The present invention provides CNS sequences that regulate the cytokine gene expression, expression cassettes and vectors comprising or lacking the CNS sequences, host cells and non-human transgenic animals comprising the CNS sequences or lacking the CNS sequences. The present invention also provides methods for identifying compounds that modulate the functions of CNS sequences as well as methods for diagnosing defects in the CNS sequences of patients.

  15. Different regulatory sequences control creatine kinase-M gene expression in directly injected skeletal and cardiac muscle.

    PubMed Central

    Vincent, C K; Gualberto, A; Patel, C V; Walsh, K

    1993-01-01

    Regulatory sequences of the M isozyme of the creatine kinase (MCK) gene have been extensively mapped in skeletal muscle, but little is known about the sequences that control cardiac-specific expression. The promoter and enhancer sequences required for MCK gene expression were assayed by the direct injection of plasmid DNA constructs into adult rat cardiac and skeletal muscle. A 700-nucleotide fragment containing the enhancer and promoter of the rabbit MCK gene activated the expression of a downstream reporter gene in both muscle tissues. Deletion of the enhancer significantly decreased expression in skeletal muscle but had no detectable effect on expression in cardiac muscle. Further deletions revealed a CArG sequence motif at position -179 within the promoter that was essential for cardiac-specific expression. The CArG element of the MCK promoter bound to the recombinant serum response factor and YY1, transcription factors which control expression from structurally similar elements in the skeletal actin and c-fos promoters. MCK-CArG-binding activities that were similar or identical to serum response factor and YY1 were also detected in extracts from adult cardiac muscle. These data suggest that the MCK gene is controlled by different regulatory programs in adult cardiac and skeletal muscle. PMID:8423791

  16. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions.

    PubMed

    Besemer, J; Lomsadze, A; Borodovsky, M

    2001-06-15

    Improving the accuracy of prediction of gene starts is one of a few remaining open problems in computer prediction of prokaryotic genes. Its difficulty is caused by the absence of relatively strong sequence patterns identifying true translation initiation sites. In the current paper we show that the accuracy of gene start prediction can be improved by combining models of protein-coding and non-coding regions and models of regulatory sites near gene start within an iterative Hidden Markov model based algorithm. The new gene prediction method, called GeneMarkS, utilizes a non-supervised training procedure and can be used for a newly sequenced prokaryotic genome with no prior knowledge of any protein or rRNA genes. The GeneMarkS implementation uses an improved version of the gene finding program GeneMark.hmm, heuristic Markov models of coding and non-coding regions and the Gibbs sampling multiple alignment program. GeneMarkS predicted precisely 83.2% of the translation starts of GenBank annotated Bacillus subtilis genes and 94.4% of translation starts in an experimentally validated set of Escherichia coli genes. We have also observed that GeneMarkS detects prokaryotic genes, in terms of identifying open reading frames containing real genes, with an accuracy matching the level of the best currently used gene detection methods. Accurate translation start prediction, in addition to the refinement of protein sequence N-terminal data, provides the benefit of precise positioning of the sequence region situated upstream to a gene start. Therefore, sequence motifs related to transcription and translation regulatory sites can be revealed and analyzed with higher precision. These motifs were shown to possess a significant variability, the functional and evolutionary connections of which are discussed. PMID:11410670

  17. Nucleotide sequence of the regulatory locus controlling expression of bacterial genes for bioluminescence.

    PubMed Central

    Engebrecht, J; Silverman, M

    1987-01-01

    Production of light by the marine bacterium Vibrio fischeri and by recombinant hosts containing cloned lux genes is controlled by the density of the culture. Density-dependent regulation of lux gene expression has been shown to require a locus consisting of the luxR and luxI genes and two closely linked divergent promoters. As part of a genetic analysis to understand the regulation of bioluminescence, we have sequenced the region of DNA containing this control circuit. Open reading frames corresponding to luxR and luxI were identified; transcription start sites were defined by S1 nuclease mapping and sequences resembling promoter elements were located. PMID:3697093

  18. The Effects of Sequence Variation on Genome-wide NRF2 Binding—New Target Genes and Regulatory SNPs

    PubMed Central

    Kuosmanen, Suvi M.; Viitala, Sari; Laitinen, Tuomo; Peräkylä, Mikael; Pölönen, Petri; Kansanen, Emilia; Leinonen, Hanna; Raju, Suresh; Wienecke-Baldacchino, Anke; Närvänen, Ale; Poso, Antti; Heinäniemi, Merja; Heikkinen, Sami; Levonen, Anna-Liisa

    2016-01-01

    Transcription factor binding specificity is crucial for proper target gene regulation. Motif discovery algorithms identify the main features of the binding patterns, but the accuracy on the lower affinity sites is often poor. Nuclear factor E2-related factor 2 (NRF2) is a ubiquitous redox-activated transcription factor having a key protective role against endogenous and exogenous oxidant and electrophile stress. Herein, we decipher the effects of sequence variation on the DNA binding sequence of NRF2, in order to identify both genome-wide binding sites for NRF2 and disease-associated regulatory SNPs (rSNPs) with drastic effects on NRF2 binding. Interactions between NRF2 and DNA were studied using molecular modelling, and NRF2 chromatin immunoprecipitation-sequence datasets together with protein binding microarray measurements were utilized to study binding sequence variation in detail. The binding model thus generated was used to identify genome-wide binding sites for NRF2, and genomic binding sites with rSNPs that have strong effects on NRF2 binding and reside on active regulatory elements in human cells. As a proof of concept, miR-126–3p and -5p were identified as NRF2 target microRNAs, and a rSNP (rs113067944) residing on NRF2 target gene (Ferritin, light polypeptide, FTL) promoter was experimentally verified to decrease NRF2 binding and result in decreased transcriptional activity. PMID:26826707

  19. Cloning and Characterization of 5′ Flanking Regulatory Sequences of AhLEC1B Gene from Arachis Hypogaea L.

    PubMed Central

    Tang, Guiying; Xu, Pingli; Liu, Wei; Liu, Zhanji; Shan, Lei

    2015-01-01

    LEAFY COTYLEDON1 (LEC1) is a B subunit of Nuclear Factor Y (NF-YB) transcription factor that mainly accumulates during embryo development. We cloned the 5′ flanking regulatory sequence of AhLEC1B gene, a homolog of Arabidopsis LEC1, and analyzed its regulatory elements using online software. To identify the crucial regulatory region, we generated a series of GUS expression frameworks driven by different length promoters with 5′ terminal and/or 3′ terminal deletion. We further characterized the GUS expression patterns in the transgenic Arabidopsis lines. Our results show that both the 65 bp proximal promoter region and the 52 bp 5′ UTR of AhLEC1B contain the key motifs required for the essential promoting activity. Moreover, AhLEC1B is preferentially expressed in the embryo and is co-regulated by binding of its upstream genes with both positive and negative corresponding cis-regulatory elements. PMID:26426444

  20. RSAT: regulatory sequence analysis tools.

    PubMed

    Thomas-Chollier, Morgane; Sand, Olivier; Turatsinze, Jean-Valéry; Janky, Rekin's; Defrance, Matthieu; Vervisch, Eric; Brohée, Sylvain; van Helden, Jacques

    2008-07-01

    The regulatory sequence analysis tools (RSAT, http://rsat.ulb.ac.be/rsat/) is a software suite that integrates a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. The suite includes programs for sequence retrieval, pattern discovery, phylogenetic footprint detection, pattern matching, genome scanning and feature map drawing. Random controls can be performed with random gene selections or by generating random sequences according to a variety of background models (Bernoulli, Markov). Beyond the original word-based pattern-discovery tools (oligo-analysis and dyad-analysis), we recently added a battery of tools for matrix-based detection of cis-acting elements, with some original features (adaptive background models, Markov-chain estimation of P-values) that do not exist in other matrix-based scanning tools. The web server offers an intuitive interface, where each program can be accessed either separately or connected to the other tools. In addition, the tools are now available as web services, enabling their integration in programmatic workflows. Genomes are regularly updated from various genome repositories (NCBI and EnsEMBL) and 682 organisms are currently supported. Since 1998, the tools have been used by several hundreds of researchers from all over the world. Several predictions made with RSAT were validated experimentally and published. PMID:18495751
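
    The matrix-based scanning mentioned in the abstract above can be illustrated with a minimal sketch. This is not RSAT's implementation: the function names, the pseudocount scheme and the threshold are assumptions chosen for illustration, and the background is the simplest Bernoulli model (independent base probabilities).

```python
import math

def log_odds_matrix(counts, background, pseudocount=1.0):
    """Turn per-position base counts into log-odds weights.

    counts: list of dicts {base: count}, one dict per motif position.
    background: dict {base: probability} (Bernoulli background model).
    The pseudocount is spread over bases in proportion to the background.
    """
    matrix = []
    for column in counts:
        total = sum(column.values()) + pseudocount
        weights = {}
        for base, bg in background.items():
            freq = (column.get(base, 0) + pseudocount * bg) / total
            weights[base] = math.log(freq / bg)
        matrix.append(weights)
    return matrix

def scan(sequence, matrix, threshold):
    """Slide the matrix along the sequence; report windows scoring >= threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Skip windows containing symbols the matrix does not cover (e.g. N).
        if any(base not in matrix[0] for base in window):
            continue
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

# Hypothetical toy motif: ten aligned sites that all read TATAAA.
toy_counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}, {"A": 10}, {"A": 10}]
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
toy_matrix = log_odds_matrix(toy_counts, uniform)
toy_hits = scan("GGTATAAAGG", toy_matrix, threshold=5.0)
```

    A Markov background would replace the single-base probabilities with probabilities conditioned on the preceding bases; the sketch keeps the Bernoulli case to stay short.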

  1. RefNetBuilder: a platform for construction of integrated reference gene regulatory networks from expressed sequence tags

    PubMed Central

    2011-01-01

    Background Gene Regulatory Networks (GRNs) provide integrated views of gene interactions that control biological processes. Many public databases contain biological interactions extracted from experimentally validated literature reports, but most furnish only information for a few genetic model organisms. In order to provide a bioinformatic tool for researchers who work with non-model organisms, we developed RefNetBuilder, a new platform that allows construction of putative reference pathways or GRNs from expressed sequence tags (ESTs). Results RefNetBuilder was designed to have the flexibility to extract and archive pathway or GRN information from public databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG). It features sequence alignment tools such as BLAST to allow mapping ESTs to pathways and GRNs in model organisms. A scoring algorithm was incorporated to rank and select the best match for each query EST. We validated RefNetBuilder using DNA sequences of Caenorhabditis elegans, a model organism having manually curated KEGG pathways. Using the earthworm Eisenia fetida as an example, we demonstrated the functionalities and features of RefNetBuilder. Conclusions The RefNetBuilder provides a standalone application for building reference GRNs for non-model organisms on a number of operating system platforms with standard desktop computer hardware. As a new bioinformatic tool aimed for constructing putative GRNs for non-model organisms that have only ESTs available, RefNetBuilder is especially useful to explore pathway- or network-related information in these organisms. PMID:22166047
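
    The abstract above does not describe the scoring algorithm itself, so the following is only a stand-in sketch of the general idea of ranking alignments and keeping the best match per query EST. The tuple layout (query, target, bit score, E-value) and the tie-breaking rule are assumptions, not RefNetBuilder's actual method.

```python
def best_hits(alignments):
    """Keep one best-scoring target per query.

    alignments: iterable of (query_id, target_id, bit_score, e_value).
    A higher bit score wins; ties are broken by the lower E-value.
    """
    best = {}
    for query, target, bit_score, e_value in alignments:
        current = best.get(query)
        if current is None or (bit_score, -e_value) > (current[1], -current[2]):
            best[query] = (target, bit_score, e_value)
    return best

# Hypothetical alignment records for two ESTs.
example_alignments = [
    ("EST1", "geneA", 50.0, 1e-5),
    ("EST1", "geneB", 80.0, 1e-10),
    ("EST2", "geneC", 30.0, 1e-2),
]
example_best = best_hits(example_alignments)
```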

  2. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  3. Oxytocin receptor gene sequences in owl monkeys and other primates show remarkable interspecific regulatory and protein coding variation.

    PubMed

    Babb, Paul L; Fernandez-Duque, Eduardo; Schurr, Theodore G

    2015-10-01

    The oxytocin (OT) hormone pathway is involved in numerous physiological processes, and one of its receptor genes (OXTR) has been implicated in pair bonding behavior in mammalian lineages. This observation is important for understanding social monogamy in primates, which occurs in only a small subset of taxa, including Azara's owl monkey (Aotus azarae). To examine the potential relationship between social monogamy and OXTR variation, we sequenced its 5' regulatory (4936bp) and coding (1167bp) regions in 25 owl monkeys from the Argentinean Gran Chaco, and examined OXTR sequences from 1092 humans from the 1000 Genomes Project. We also assessed interspecific variation of OXTR in 25 primate and rodent species that represent a set of phylogenetically and behaviorally disparate taxa. Our analysis revealed substantial variation in the putative 5' regulatory region of OXTR, with marked structural differences across primate taxa, particularly for humans and chimpanzees, which exhibited unique patterns of large motifs of dinucleotide A+T repeats upstream of the OXTR 5' UTR. In addition, we observed a large number of amino acid substitutions in the OXTR CDS region among New World primate taxa that distinguish them from Old World primates. Furthermore, primate taxa traditionally defined as socially monogamous (e.g., gibbons, owl monkeys, titi monkeys, and saki monkeys) all exhibited different amino acid motifs for their respective OXTR protein coding sequences. These findings support the notion that monogamy has evolved independently in Old World and New World primates, and that it has done so through different molecular mechanisms, not exclusively through the oxytocin pathway. PMID:26025428

  4. Transcriptional activation of the fra-1 gene by AP-1 is mediated by regulatory sequences in the first intron.

    PubMed Central

    Bergers, G; Graninger, P; Braselmann, S; Wrighton, C; Busslinger, M

    1995-01-01

    Constitutive expression of c-Fos, FosB, Fra-1, or c-Jun in rat fibroblasts leads to up-regulation of the immediate-early gene fra-1. Using the posttranslational FosER induction system, we demonstrate that this AP-1-dependent stimulation of fra-1 expression is rapid, depends on a functional DNA-binding domain of FosER, and is a general phenomenon observed in different cell types. In vitro mutagenesis and functional analysis of the rat fra-1 gene in stably transfected Rat-1A-FosER fibroblasts indicated that basal and AP-1-regulated expression of the fra-1 gene depends on regulatory sequences in the first intron which comprise a consensus AP-1 site and two AP-1-like elements. We have also investigated the transactivating and transforming properties of the Fra-1 protein to address the significance of fra-1 up-regulation. The entire Fra-1 protein fused to the DNA-binding domain of Gal4 is shown to lack any transactivation function, and yet it possesses oncogenic potential, as overexpression of Fra-1 in established rat fibroblasts results in anchorage-independent growth in vitro and tumor development in athymic mice. fra-1 is therefore not only induced by members of the Fos family, but its gene product may also contribute to cellular transformation by these proteins. Together, these data identify fra-1 as a unique member of the fos gene family which is under positive control by AP-1 activity. PMID:7791782

  5. Identification of lignin genes and regulatory sequences involved in secondary cell wall formation in Acacia auriculiformis and Acacia mangium via de novo transcriptome sequencing

    PubMed Central

    2011-01-01

    useful markers for population genetics studies and marker-assisted selection. Conclusion We have produced the first comprehensive transcriptome-wide analysis in A. auriculiformis and A. mangium using de novo assembly techniques. Our high quality and comprehensive assemblies allowed the identification of many genes in the lignin biosynthesis and secondary cell wall formation in Acacia hybrids. Our results demonstrated that Next Generation Sequencing is a cost-effective method for gene discovery, identification of regulatory sequences, and informative markers in a non-model plant. PMID:21729267

  6. Plant nitrogen regulatory P-PII genes

    DOEpatents

    Coruzzi, Gloria M.; Lam, Hon-Ming; Hsieh, Ming-Hsiun

    2001-01-01

    The present invention generally relates to plant nitrogen regulatory PII gene (hereinafter P-PII gene), a gene involved in regulating plant nitrogen metabolism. The invention provides P-PII nucleotide sequences, expression constructs comprising said nucleotide sequences, and host cells and plants having said constructs and, optionally expressing the P-PII gene from said constructs. The invention also provides substantially pure P-PII proteins. The P-PII nucleotide sequences and constructs of the

  7. [Cloning and function identification of gene 'admA' and up-stream regulatory sequence related to antagonistic activity of Enterobacter cloacae B8].

    PubMed

    Zhu, Jun-Li; Li, De-Bao; Yu, Xu-Ping

    2012-04-01

    To reveal the antagonistic mechanism of strain B8 against Xanthomonas oryzae pv. oryzae, transposon tagging and chromosome walking were used to clone antagonism-related fragments around the Tn5 insertion site in the mutant strain B8B. The function of the upstream regulatory sequence of gene 'admA' involved in the antagonistic activity was further identified by gene knockout. An antagonism-related fragment to the left of the Tn5 insertion site, 2,608 bp in length, was obtained by tagging with the Kan resistance gene of Tn5. A 2,354 bp fragment to the right of the Tn5 insertion site was amplified with 2 rounds of chromosome walking. The B contig around the Tn5 insertion site was 4,611 bp long and contained 7 open reading frames (ORFs). Bioinformatic analysis revealed that these ORFs corresponded to the partial coding regions of glyceraldehyde-3-phosphate dehydrogenase, two LysR family transcriptional regulators, hypothetical protein VSWAT3-20465 of Vibrionales, and admA, admB, and a partial sequence of the admC gene of the Pantoea agglomerans biosynthetic gene cluster, respectively. In B8B, Tn5 was inserted 200 bp upstream of the sequence corresponding to the anrP ORF and 894 bp upstream of the admA gene. The B-1 and B-2 mutants that lost antagonistic activity were selected by homologous recombination using the knockout plasmid pMB-BG. These results suggested that the transcription and expression of the anrP gene might be disrupted as a result of the disruption of the upstream regulatory sequence by Tn5 in strain B8B, in turn affecting the regulation of biosynthesis by the antagonism-related gene cluster. Thus, the antagonism-related genes in strain B8 form a gene family similar to the andrimid biosynthetic gene cluster, and the upstream regulatory region appears to be critical for antibiotic biosynthesis. PMID:22522167

  8. The upstream regulatory sequence of the light harvesting complex Lhcf2 gene of the marine diatom Phaeodactylum tricornutum enhances transcription in an orientation- and distance-independent fashion.

    PubMed

    Russo, Monia Teresa; Annunziata, Rossella; Sanges, Remo; Ferrante, Maria Immacolata; Falciatore, Angela

    2015-12-01

    Diatoms are a key phytoplankton group in the contemporary ocean, showing extraordinary adaptation capacities to rapidly changing environments. The recent availability of whole genome sequences from representative species has revealed distinct features in their genomes, like novel combinations of genes encoding distinct metabolisms and a significant number of diatom-specific genes. However, the regulatory mechanisms driving diatom gene expression are still largely uncharacterized. Considering the wide variety of fields of study orbiting diatoms, ranging from ecology and evolutionary biology to biotechnology, it is thus essential to increase our understanding of fundamental gene regulatory processes such as transcriptional regulation. To this aim, we explored the functional properties of the 5'-flanking region of the Phaeodactylum tricornutum Lhcf2 gene, encoding a member of the Light Harvesting Complex superfamily, and we showed that this region enhances transcription of a GUS reporter gene in an orientation- and distance-independent fashion. This represents the first example of a cis-regulatory sequence with enhancer-like features discovered in diatoms and it is instrumental for the generation of novel genetic tools and diatom exploitation in different areas of study. PMID:26117181

  9. Epithelial and endothelial expression of the green fluorescent protein reporter gene under the control of bovine prion protein (PrP) gene regulatory sequences in transgenic mice

    NASA Astrophysics Data System (ADS)

    Lemaire-Vieille, Catherine; Schulze, Tobias; Podevin-Dimster, Valérie; Follet, Jérome; Bailly, Yannick; Blanquet-Grossard, Françoise; Decavel, Jean-Pierre; Heinen, Ernst; Cesbron, Jean-Yves

    2000-05-01

    The expression of the cellular form of the prion protein (PrPc) gene is required for prion replication and neuroinvasion in transmissible spongiform encephalopathies. The identification of the cell types expressing PrPc is necessary to understanding how the agent replicates and spreads from peripheral sites to the central nervous system. To determine the nature of the cell types expressing PrPc, a green fluorescent protein reporter gene was expressed in transgenic mice under the control of 6.9 kb of the bovine PrP gene regulatory sequences. It was shown that the bovine PrP gene is expressed as two populations of mRNA differing by alternative splicing of one 115-bp 5' untranslated exon in 17 different bovine tissues. The analysis of transgenic mice showed reporter gene expression in some cells that have been identified as expressing PrP, such as cerebellar Purkinje cells, lymphocytes, and keratinocytes. In addition, expression of green fluorescent protein was observed in the plexus of the enteric nervous system and in a restricted subset of cells not yet clearly identified as expressing PrP: the epithelial cells of the thymic medullary and the endothelial cells of both the mucosal capillaries of the intestine and the renal capillaries. These data provide valuable information on the distribution of PrPc at the cellular level and argue for roles of the epithelial and endothelial cells in the spread of infection from the periphery to the brain. Moreover, the transgenic mice described in this paper provide a model that will allow for the study of the transcriptional activity of the PrP gene promoter in response to scrapie infection.

  10. Cloning and sequencing of the blood meal-induced late trypsin gene from the mosquito Aedes aegypti and characterization of the upstream regulatory region.

    PubMed

    Barillas-Mury, C; Wells, M A

    1993-01-01

    A 4.1 kb genomic clone of the late trypsin gene from the mosquito Aedes aegypti was isolated, mapped and subcloned. A 1.6 kb subclone, corresponding to 1.1 kb of upstream regulatory region and 0.5 kb of coding region, was sequenced. The gene has no introns within the coding region. The 5' end of the mature mRNA was mapped using primer extension analysis. A TATA box consensus sequence (TATAAA) was found at position -31 from the 5' end of the mature mRNA. A cluster of five repeat sequences homologous to the yeast GCN4 DNA binding site was found within 200 nucleotides upstream of the cap site. GCN4 is required for derepression mediated control of general amino acid biosynthesis in response to amino acid starvation in yeast. It activates the transcription of at least twenty different genes coding for enzymes involved in amino acid biosynthesis. The presence of this cluster of consensus sequences suggests that a protein similar to GCN4 might regulate expression of the late trypsin gene in the mosquito. Southern blot analysis of genomic DNA indicates that late trypsin is a single copy gene. PMID:9087537
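
    The positional statement above (a TATAAA consensus at -31 relative to the 5' end of the mature mRNA) can be made concrete with a small sketch. The coordinate convention is an assumption: the last base of the upstream sequence is taken as -1, and a motif is reported by the position of its first base. The sequence used here is a made-up toy, not the Aedes aegypti sequence.

```python
def motif_positions(upstream, motif):
    """Return positions of every motif occurrence in an upstream sequence.

    Positions are relative to the transcription start: the final base of
    `upstream` is -1, and each hit is reported at its first base.
    Overlapping occurrences are included.
    """
    upstream = upstream.upper()
    motif = motif.upper()
    length = len(upstream)
    positions = []
    index = upstream.find(motif)
    while index != -1:
        positions.append(index - length)
        index = upstream.find(motif, index + 1)
    return positions

# Hypothetical 41-base upstream region with one TATAAA starting at -31.
toy_upstream = "G" * 10 + "TATAAA" + "C" * 25
toy_positions = motif_positions(toy_upstream, "TATAAA")
```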

  11. Comparisons of Ribosomal Protein Gene Promoters Indicate Superiority of Heterologous Regulatory Sequences for Expressing Transgenes in Phytophthora infestans

    PubMed Central

    Khachatoorian, Careen; Judelson, Howard S.

    2015-01-01

    Molecular genetics approaches in Phytophthora research can be hampered by the limited number of known constitutive promoters for expressing transgenes and the instability of transgene activity. We have therefore characterized genes encoding the cytoplasmic ribosomal proteins of Phytophthora and studied their suitability for expressing transgenes in P. infestans. Phytophthora spp. encode a standard complement of 79 cytoplasmic ribosomal proteins. Several genes are duplicated, and two appear to be pseudogenes. Half of the genes are expressed at similar levels during all stages of asexual development, and we discovered that the majority share a novel promoter motif named the PhRiboBox. This sequence is enriched in genes associated with transcription, translation, and DNA replication, including tRNA and rRNA biogenesis. Promoters from the three P. infestans genes encoding ribosomal proteins S9, L10, and L23 and their orthologs from P. capsici were tested for their ability to drive transgenes in stable transformants of P. infestans. Five of the six promoters yielded strong expression of a GUS reporter, but the stability of expression was higher using the P. capsici promoters. With the RPS9 and RPL10 promoters of P. infestans, about half of transformants stopped making GUS over two years of culture, while their P. capsici orthologs conferred stable expression. Since cross-talk between native and transgene loci may trigger gene silencing, we encourage the use of heterologous promoters in transformation studies. PMID:26716454

  12. Inactivation, sequence, and lacZ fusion analysis of a regulatory locus required for repression of nitrogen fixation genes in Rhodobacter capsulatus.

    PubMed Central

    Kranz, R G; Pace, V M; Caldicott, I M

    1990-01-01

    Transcription of the genes that code for proteins involved in nitrogen fixation in free-living diazotrophs is typically repressed by high internal oxygen concentrations or exogenous fixed nitrogen. The DNA sequence of a regulatory locus required for repression of Rhodobacter capsulatus nitrogen fixation genes was determined. It was shown that this locus, defined by Tn5 insertions and by ethyl methanesulfonate-derived mutations, is homologous to the glnB gene of other organisms. The R. capsulatus glnB gene was upstream of glnA, the gene for glutamine synthetase, in a glnBA operon. beta-Galactosidase expression from an R. capsulatus glnBA-lacZ translational fusion was increased twofold in cells induced by nitrogen limitation relative to that in cells under nitrogen-sufficient conditions. R. capsulatus nifR1, a gene that was previously shown to be homologous to ntrC and that is required for transcription of nitrogen fixation genes, was responsible for approximately 50% of the transcriptional activation of this glnBA fusion in cells induced under nitrogen-limiting conditions. R. capsulatus GLNB, NIFR1, and NIFR2 (a protein homologous to NTRB) were proposed to transduce the nitrogen status in the cell into repression or activation of other R. capsulatus nif genes. Repression of nif genes in response to oxygen was still present in R. capsulatus glnB mutants and must have occurred at a different level of control in the regulatory circuit. PMID:2152916

  13. Transgenic LacZ under control of Hec-6st regulatory sequences recapitulates endogenous gene expression on high endothelial venules

    PubMed Central

    Liao, Shan; Bentley, Kevin; Lebrun, Marielle; Lesslauer, Werner; Ruddle, Frank H.; Ruddle, Nancy H.

    2007-01-01

    Hec-6st is a highly specific high endothelial venule (HEV) gene that is crucial for regulating lymphocyte homing to lymph nodes (LN). The enzyme is also expressed in HEV-like vessels in tertiary lymphoid organs that form in chronic inflammation in autoimmunity, graft rejection, and microbial infection. Understanding the molecular nature of Hec-6st regulation is crucial for elucidating its function in development and disease. However, studies of HEV are limited because of the difficulties in isolating and maintaining the unique characteristics of these vessels in vitro. The novel pClasper yeast homologous recombination technique was used to isolate from a BAC clone a 60-kb DNA fragment that included the Hec-6st (Chst4) gene with flanking sequences. Transgenic mice were generated with the β-galactosidase (LacZ) reporter gene inserted in-frame in exon II of Hec-6st within the isolated BAC DNA fragment. LacZ was expressed specifically on HEV in LN, as indicated by its colocalization with peripheral node vascular addressin. LacZ was increased in nasal-associated lymphoid tissue during development and was reduced in LN and nasal-associated lymphoid tissue by LTβR-Ig (lymphotoxin-β receptor human Ig fusion protein) treatment in a manner identical to the endogenous gene. The transgene was expressed at high levels in lymphoid accumulations with characteristics of tertiary lymphoid organs in the salivary glands of aged mice. Thus, the Hec-6st-LacZ construct faithfully reproduces Hec-6st tissue-specific expression and can be used in further studies to drive expression of reporter or effector genes, which could visualize or inhibit HEV in autoimmunity. PMID:17360566

  14. Building Developmental Gene Regulatory Networks

    PubMed Central

    Li, Enhu; Davidson, Eric H.

    2009-01-01

    Animal development is an elaborate process programmed by genomic regulatory instructions. Regulatory genes encode transcription factors and signal molecules, and their expression is under the control of cis-regulatory modules that define the logic of transcriptional responses to the inputs of other regulatory genes. The functional linkages amongst regulatory genes constitute the gene regulatory networks (GRNs) that govern cell specification and patterning in development. Constructing such networks requires identification of the regulatory genes involved and characterization of their temporal and spatial expression patterns. Interactions (activation/repression) among transcription factors or signals can be investigated by large-scale perturbation analysis, in which the function of each gene is specifically blocked. Resultant expression changes are then integrated to identify direct linkages, and to reveal the structure of the GRN. Predicted GRN linkages can be tested and verified by cis-regulatory analysis. The explanatory power of the GRN was shown in the lineage specification of sea urchin endomesoderm. Acquiring such networks is essential for a systematic and mechanistic understanding of the developmental process. PMID:19530131

  15. Gene regulatory mechanisms underpinning prostate cancer susceptibility.

    PubMed

    Whitington, Thomas; Gao, Ping; Song, Wei; Ross-Adams, Helen; Lamb, Alastair D; Yang, Yuehong; Svezia, Ilaria; Klevebring, Daniel; Mills, Ian G; Karlsson, Robert; Halim, Silvia; Dunning, Mark J; Egevad, Lars; Warren, Anne Y; Neal, David E; Grönberg, Henrik; Lindberg, Johan; Wei, Gong-Hong; Wiklund, Fredrik

    2016-04-01

    Molecular characterization of genome-wide association study (GWAS) loci can uncover key genes and biological mechanisms underpinning complex traits and diseases. Here we present deep, high-throughput characterization of gene regulatory mechanisms underlying prostate cancer risk loci. Our methodology integrates data from 295 prostate cancer chromatin immunoprecipitation and sequencing experiments with genotype and gene expression data from 602 prostate tumor samples. The analysis identifies new gene regulatory mechanisms affected by risk locus SNPs, including widespread disruption of ternary androgen receptor (AR)-FOXA1 and AR-HOXB13 complexes and competitive binding mechanisms. We identify 57 expression quantitative trait loci at 35 risk loci, which we validate through analysis of allele-specific expression. We further validate predicted regulatory SNPs and target genes in prostate cancer cell line models. Finally, our integrated analysis can be accessed through an interactive visualization tool. This analysis elucidates how genome sequence variation affects disease predisposition via gene regulatory mechanisms and identifies relevant genes for downstream biomarker and drug development. PMID:26950096

  16. Variations in the coding and regulatory sequences of the angiogenin (ANG) gene are not associated to ALS (amyotrophic lateral sclerosis) in the Italian population.

    PubMed

    Corrado, Lucia; Battistini, Stefania; Penco, Silvana; Bergamaschi, Laura; Testa, Lucia; Ricci, Claudia; Giannini, Fabio; Greco, Giuseppe; Patrosso, Maria Cristina; Pileggi, Simona; Causarano, Renzo; Mazzini, Letizia; Momigliano-Richiardi, Patricia; D'Alfonso, Sandra

    2007-07-15

    Potentially causative missense variations in the ANG gene and a positive association with the synonymous rs11701-G substitution were detected mainly in Irish and Scottish ALS patients. We screened 262 Italian SOD1-negative ALS patients (250 sporadic) and 415 matched controls for sequence variations in the coding, 3'/5' UTR and 5' flanking (642 bp) regions of the ANG gene. We identified 53 sequence variations, of which 46 were new, 20 had a minor allele frequency (MAF) ≥0.01, and only three were localised in the coding sequence, namely the missense I46V, identified in one patient and two controls, and the synonymous G86G and T97T, corresponding to rs11701 and rs2228653. None of the detected SNPs or of their haplotypic combinations was significantly associated with ALS susceptibility or clinical features. In conclusion, we did not detect the association with rs11701-G or with any other newly detected variation in the ANG regulatory region. Furthermore, we did not identify potentially causal mutations in the coding region. PMID:17462671

  17. Vision from next generation sequencing: multi-dimensional genome-wide analysis for producing gene regulatory networks underlying retinal development, aging and disease.

    PubMed

    Yang, Hyun-Jin; Ratnapriya, Rinki; Cogliati, Tiziana; Kim, Jung-Woong; Swaroop, Anand

    2015-05-01

    Genomics and genetics have invaded all aspects of biology and medicine, opening uncharted territory for scientific exploration. The definition of "gene" itself has become ambiguous, and the central dogma is continuously being revised and expanded. Computational biology and computational medicine are no longer intellectual domains of the chosen few. Next generation sequencing (NGS) technology, together with novel methods of pattern recognition and network analyses, has revolutionized the way we think about fundamental biological mechanisms and cellular pathways. In this review, we discuss NGS-based genome-wide approaches that can provide deeper insights into retinal development, aging and disease pathogenesis. We first focus on gene regulatory networks (GRNs) that govern the differentiation of retinal photoreceptors and modulate adaptive response during aging. Then, we discuss NGS technology in the context of retinal disease and develop a vision for therapies based on network biology. We should emphasize that basic strategies for network construction and analyses can be transported to any tissue or cell type. We believe that specific and uniform guidelines are required for generation of genome, transcriptome and epigenome data to facilitate comparative analysis and integration of multi-dimensional data sets, and for constructing networks underlying complex biological processes. As cellular homeostasis and organismal survival are dependent on gene-gene and gene-environment interactions, we believe that network-based biology will provide the foundation for deciphering disease mechanisms and discovering novel drug targets for retinal neurodegenerative diseases. PMID:25668385

  18. Organization, regulatory sequences, and alternatively spliced transcripts of the mucosal addressin cell adhesion molecule-1 (MAdCAM-1) gene

    SciTech Connect

    Sampaio, S.O.; Mei, C.; Butcher, E.C.

    1995-09-01

    The mucosal addressin cell adhesion molecule-1 (MAdCAM-1) is expressed selectively at venular sites of lymphocyte extravasation into mucosal lymphoid tissues and lamina propria, where it directs local lymphocyte trafficking. MAdCAM-1 is a multifunctional type I transmembrane adhesion molecule comprising two distal Ig domains involved in α4β7 integrin binding, a mucin-like region able to display L-selectin-binding carbohydrates, and a membrane-proximal Ig domain homologous to IgA. We show in this work that the MAdCAM-1 gene is located on chromosome 10 and contains five exons. The signal peptide and each one of the three Ig domains are encoded by a distinct exon, whereas the transmembrane, cytoplasmic tail, and 3′-untranslated region of MAdCAM-1 are combined on a single exon. The mucin-like region and the third Ig domain are encoded together on exon 4. An alternatively spliced MAdCAM-1 mRNA is identified that lacks the mucin/IgA-homologous exon 4-encoded sequences. This short variant of MAdCAM-1 may be specialized to support α4β7-dependent adhesion strengthening, independent of carbohydrate-presenting function. Sequences 5′ of the transcription start site include tandem nuclear factor-κB sites; AP-1, AP-2, and signal peptide-1 binding sites; and an estrogen response element. Our findings reinforce the correspondence between the multidomain structure and versatile functions of this vascular addressin, and suggest an additional level of regulation of carbohydrate-presenting capability, and thus of its importance in lectin-mediated vs. α4β7-dependent adhesive events in lymphocyte trafficking. 46 refs., 6 figs., 1 tab.

  19. Chicken interferon consensus sequence-binding protein (ICSBP) and interferon regulatory factor (IRF) 1 genes reveal evolutionary conservation in the IRF gene family.

    PubMed Central

    Jungwirth, C; Rebbert, M; Ozato, K; Degen, H J; Schultz, U; Dawid, I B

    1995-01-01

    Members of the IRF family mediate transcriptional responses to interferons (IFNs) and to virus infection. So far, proteins of this family have been studied only among mammalian species. Here we report the isolation of cDNA clones encoding two members of this family from chicken, interferon consensus sequence-binding protein (ICSBP) and IRF-1. The predicted chicken ICSBP and IRF-1 proteins show high levels of sequence similarity to their corresponding human and mouse counterparts. Sequence identities in the putative DNA-binding domains of chicken and human ICSBP and IRF-1 were 97% and 89%, respectively, whereas the C-terminal regions showed identities of 64% and 51%; sequence relationships with mouse ICSBP and IRF-1 are very similar. Chicken ICSBP was found to be expressed in several embryonic tissues, and both chicken IRF-1 and ICSBP were strongly induced in chicken fibroblasts by IFN treatment, supporting the involvement of these factors in IFN-regulated gene expression. The presence of proteins homologous to mammalian IRF family members, together with earlier observations on the occurrence of functionally homologous IFN-responsive elements in chicken and mammalian genes, highlights the conservation of transcriptional mechanisms in the IFN system, a finding that contrasts with the extensive sequence and functional divergence of the IFNs. PMID:7536924

  20. Vision from next generation sequencing: Multi-dimensional genome-wide analysis for producing gene regulatory networks underlying retinal development, aging and disease

    PubMed Central

    Yang, Hyun-Jin; Ratnapriya, Rinki; Cogliati, Tiziana; Kim, Jung-Woong; Swaroop, Anand

    2015-01-01

    Genomics and genetics have invaded all aspects of biology and medicine, opening uncharted territory for scientific exploration. The definition of “gene” itself has become ambiguous, and the central dogma is continuously being revised and expanded. Computational biology and computational medicine are no longer intellectual domains of the chosen few. Next generation sequencing (NGS) technology, together with novel methods of pattern recognition and network analyses, has revolutionized the way we think about fundamental biological mechanisms and cellular pathways. In this review, we discuss NGS-based genome-wide approaches that can provide deeper insights into retinal development, aging and disease pathogenesis. We first focus on gene regulatory networks (GRNs) that govern the differentiation of retinal photoreceptors and modulate adaptive response during aging. Then, we discuss NGS technology in the context of retinal disease and develop a vision for therapies based on network biology. We should emphasize that basic strategies for network construction and analyses can be transported to any tissue or cell type. We believe that specific and uniform guidelines are required for generation of genome, transcriptome and epigenome data to facilitate comparative analysis and integration of multi-dimensional data sets, and for constructing networks underlying complex biological processes. As cellular homeostasis and organismal survival are dependent on gene-gene and gene-environment interactions, we believe that network-based biology will provide the foundation for deciphering disease mechanisms and discovering novel drug targets for retinal neurodegenerative diseases. PMID:25668385

  1. Identification of DVA Interneuron Regulatory Sequences in Caenorhabditis elegans

    PubMed Central

    Puckett Robinson, Carmie; Schwarz, Erich M.; Sternberg, Paul W.

    2013-01-01

    Background: The identity of each neuron is determined by the expression of a distinct group of genes comprising its terminal gene battery. The regulatory sequences that control the expression of such terminal gene batteries in individual neurons are largely unknown. The existence of a complete genome sequence for C. elegans and draft genomes of other nematodes let us use comparative genomics to identify regulatory sequences directing expression in the DVA interneuron. Methodology/Principal Findings: Using phylogenetic comparisons of multiple Caenorhabditis species, we identified conserved non-coding sequences in 3 of 10 genes (fax-1, nmr-1, and twk-16) that direct expression of reporter transgenes in DVA and other neurons. The conserved region and flanking sequences in an 85-bp intronic region of the twk-16 gene direct highly restricted expression in DVA. Mutagenesis of this 85-bp region shows that it has at least four regions. The central 53-bp region contains a 29-bp region that represses expression and a 24-bp region that drives broad neuronal expression. Two short flanking regions restrict expression of the twk-16 gene to DVA. A shared GA-rich motif was identified in three of these genes but had opposite effects on expression when mutated in the nmr-1 and twk-16 DVA regulatory elements. Conclusions/Significance: Using multi-species conservation, we identified regulatory regions within three genes that direct expression in the DVA neuron. We identified four contiguous regions of sequence of the twk-16 gene enhancer with positive and negative effects on expression, which combined to restrict expression to the DVA neuron. For this neuron, a single binding site may thus not achieve sufficient specificity for cell-specific expression. One of the positive elements, an 8-bp sequence required for expression, was identified in silico by sequence comparisons of seven nematode species, demonstrating the potential resolution of expanded multi-species phylogenetic comparisons.
PMID

  2. The kil-kor regulon of broad-host-range plasmid RK2: nucleotide sequence, polypeptide product, and expression of regulatory gene korC.

    PubMed Central

    Kornacki, J A; Burlage, R S; Figurski, D H

    1990-01-01

    Broad-host-range plasmid RK2 encodes several kil operons (kilA, kilB, kilC, kilE) whose expression is potentially lethal to Escherichia coli host cells. The kil operons and the RK2 replication initiator gene (trfA) are coregulated by various combinations of kor genes (korA, korB, korC, korE). This regulatory network is called the kil-kor regulon. Presented here are studies on the structure, product, and expression of korC. Genetic mapping revealed the precise location of korC in a region near transposon Tn1. We determined the nucleotide sequence of this region and identified the korC structural gene by analysis of korC mutants. Sequence analysis predicts the korC product to be a polypeptide of 85 amino acids with a molecular mass of 9,150 daltons. The KorC polypeptide was identified in vivo by expressing wild-type and mutant korC alleles from a bacteriophage T7 RNA polymerase-dependent promoter. The predicted structure of KorC polypeptide has a net positive charge and a helix-turn-helix region similar to those of known DNA-binding proteins. These properties are consistent with the repressorlike function of KorC protein, and we discuss the evidence that KorA and KorC proteins act as corepressors in the control of the kilC and kilE operons. Finally, we show that korC is expressed from the bla promoters within the upstream transposon Tn1, suggesting that insertion of Tn1 interrupted a plasmid operon that may have originally included korC and kilC. PMID:2160936

  3. Comparative inter-strain sequence analysis of the putative regulatory region of murine psychostimulant-regulated gene GNB1 (G protein beta 1 subunit gene).

    PubMed

    Kitanaka, Nobue; Kitanaka, Junichi; Walther, Donna; Wang, Xiao-Bing; Uhl, George R

    2003-08-01

    We isolated a cDNA clone from a murine genomic library of C57BL/6 strain, carrying 13.8 kb of nucleotides including exon 1 of heterotrimeric GTP-binding protein beta 1 subunit gene (genetic symbol, GNB1) and 10.6 kb of the 5' flanking region. Sequence comparison with GNB1 gene locus from 129Sv strain revealed a 0.2% divergence in a 13.2 kb common region between these two strains. The divergence consisted of eight single nucleotide polymorphisms, three insertions and one deletion, with 129Sv used as the reference. The exon 1 and the putative regulation elements, such as cyclic AMP response element, AP1, AP2, Sp1 and nuclear factor-kappa B recognition sites, were perfectly conserved. The expression of GNB1 mRNA was significantly increased in mouse striatum 2 h after single methamphetamine administration with an approximately 150% expression level compared with the basal level. In contrast, no change in the expression level was observed in the cerebral cortex. After the chronic methamphetamine treatment regimen, the expression level of GNB1 mRNA did not change in any brain regions examined. These results suggest (1) that the 5' flanking nucleotide sequence of GNB1 gene was strictly conserved for its possible contribution to the same change in the expression level between the mouse strains in response to psychostimulants and (2) that the initial process of development of behavioral sensitization appeared to occur parallel to the significant increase in the expression level of GNB1 gene in the mouse striatum. PMID:14631649

  4. Plant Evolution: Evolving Antagonistic Gene Regulatory Networks.

    PubMed

    Cooper, Endymion D

    2016-06-20

    Developing a structurally complex phenotype requires a complex regulatory network. A new study shows how gene duplication provides a potential source of antagonistic interactions, an important component of gene regulatory networks. PMID:27326708

  5. Distinct gene expression patterns in skeletal and cardiac muscle are dependent on common regulatory sequences in the MLC1/3 locus.

    PubMed Central

    McGrew, M J; Bogdanova, N; Hasegawa, K; Hughes, S H; Kitsis, R N; Rosenthal, N

    1996-01-01

    The myosin light-chain 1/3 locus (MLC1/3) is regulated by two promoters and a downstream enhancer element which produce two protein isoforms in fast skeletal muscle at distinct stages of mouse embryogenesis. We have analyzed the expression of transcripts from the internal MLC3 promoter and determined that it is also expressed in the atria of the heart. Expression from the MLC3 promoter in these striated muscle lineages is differentially regulated during development. In transgenic mice, the MLC3 promoter is responsible for cardiac-specific reporter gene expression while the downstream enhancer augments expression in skeletal muscle. Examination of the methylation status of endogenous and transgenic promoter and enhancer elements indicates that the internal promoter is not regulated in a manner similar to that of the MLC1 promoter or the downstream enhancer. A GATA protein consensus sequence in the proximal MLC3 promoter but not the MLC1 promoter binds with high affinity to GATA-4, a cardiac muscle- and gut-specific transcription factor. Mutation of either the MEF2 or GATA motifs in the MLC3 promoter attenuates its activity in both heart and skeletal muscles, demonstrating that MLC3 expression in these two diverse muscle types is dependent on common regulatory elements. PMID:8754853

  6. RSAT 2015: Regulatory Sequence Analysis Tools.

    PubMed

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-07-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632

  7. RSAT 2015: Regulatory Sequence Analysis Tools

    PubMed Central

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A.; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M.; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-01-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632

  8. Deep transcriptome sequencing reveals the expression of key functional and regulatory genes involved in the abiotic stress signaling pathways in rice

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Drought, salt and cold are the major abiotic stresses that limit the rice production and cause serious threat to food security. The identification of the key functional and regulatory genes in the abiotic stress signaling pathways is important for understanding the molecular basis of abiotic stress ...

  9. Complementation of nitrogen-regulatory (ntr-like) mutations in Rhodobacter capsulatus by an Escherichia coli gene: cloning and sequencing of the gene and characterization of the gene product.

    PubMed Central

    Allibert, P; Willison, J C; Vignais, P M

    1987-01-01

    In vivo genetic engineering by R' plasmid formation was used to isolate an Escherichia coli gene that restored the Ntr+ phenotype to Ntr- mutants of the photosynthetic bacterium Rhodobacter capsulatus (formerly Rhodopseudomonas capsulata; J. F. Imhoff, H. G. Trüper, and N. Pfennig, Int. J. Syst. Bacteriol. 34:340-343, 1984). Nucleotide sequencing of the gene revealed no homology to the ntr genes of Klebsiella pneumoniae. Furthermore, hybridization experiments between the cloned gene and different F' plasmids indicated that the gene is located between 34 and 39 min on the E. coli genetic map and is therefore unlinked to the known ntr genes. The molecular weight of the gene product, deduced from the nucleotide sequence, was 30,563. After the gene was cloned in an expression vector, the gene product was purified. It was shown to have a pI of 5.8 and to behave as a dimer during gel filtration and on sucrose density gradients. Antibodies raised against the purified protein revealed the presence of this protein in R. capsulatus strains containing the E. coli gene, but not in other strains. Moreover, elimination of the plasmid carrying the E. coli gene from complemented strains resulted in the loss of the Ntr+ phenotype. Complementation of the R. capsulatus mutations by the E. coli gene therefore occurs in trans and results from the synthesis of a functional gene product. PMID:3025172

  10. The complete sequence of the human CD79b (Igβ/B29) gene: Identification of a conserved exon/intron organization, immunoglobulin-like regulatory regions, and allelic polymorphism

    SciTech Connect

    Hashimoto, S.; Chiorazzi, N.; Gregersen, P.K.

    1994-12-31

    We determined the complete genomic sequence of the human CD79b (Igβ/B29) gene. The CD79b gene product is associated with the membrane immunoglobulin signaling complex which is composed of immunoglobulin (Ig) itself, associated in a noncovalent fashion with CD79b and a second polypeptide chain, CD79a (Igα/mb1). The sequence and exon/intron organization of the human and mouse CD79b genes are highly similar. The gene organization suggests that some variant forms of CD79b may arise by virtue of alternative splicing of mRNA. In addition, a number of conserved regulatory sequences commonly found in Ig genes are present in sequences which flank the human CD79b gene. Some of these sequences are distinct from those found in the CD79a promoter. These differences may explain why transcription of CD79b, but not CD79a, is observed in plasma cells. A new Taq 1 restriction fragment length polymorphism is described that is not associated with any structural polymorphisms of the expressed CD79b polypeptide. 13 refs., 3 figs., 1 tab.

  11. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    NASA Astrophysics Data System (ADS)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  12. Evolution of Cis-Regulatory Elements and Regulatory Networks in Duplicated Genes of Arabidopsis

    PubMed Central

    Guo, Xu Qiu; Adams, Keith L.

    2015-01-01

    Plant genomes contain large numbers of duplicated genes that contribute to the evolution of new functions. Following duplication, genes can exhibit divergence in their coding sequence and their expression patterns. Changes in the cis-regulatory element landscape can result in changes in gene expression patterns. High-throughput methods developed recently can identify potential cis-regulatory elements on a genome-wide scale. Here, we use a recent comprehensive data set of DNase I sequencing-identified cis-regulatory binding sites (footprints) at single-base-pair resolution to compare binding sites and network connectivity in duplicated gene pairs in Arabidopsis (Arabidopsis thaliana). We found that duplicated gene pairs vary greatly in their cis-regulatory element architecture, resulting in changes in regulatory network connectivity. Whole-genome duplicates (WGDs) have approximately twice as many footprints in their promoters left by potential regulatory proteins than do tandem duplicates (TDs). The WGDs have a greater average number of footprint differences between paralogs than TDs. The footprints, in turn, result in more regulatory network connections between WGDs and other genes, forming denser, more complex regulatory networks than shown by TDs. When comparing regulatory connections between duplicates, WGDs had more pairs in which the two genes are either partially or fully diverged in their network connections, but fewer genes with no network connections than the TDs. There is evidence of younger TDs and WGDs having fewer unique connections compared with older duplicates. This study provides insights into cis-regulatory element evolution and network divergence in duplicated genes. PMID:26474639

  13. Transcription factor trapping by RNA in gene regulatory elements.

    PubMed

    Sigova, Alla A; Abraham, Brian J; Ji, Xiong; Molinie, Benoit; Hannett, Nancy M; Guo, Yang Eric; Jangi, Mohini; Giallourakis, Cosmas C; Sharp, Phillip A; Young, Richard A

    2015-11-20

    Transcription factors (TFs) bind specific sequences in promoter-proximal and -distal DNA elements to regulate gene transcription. RNA is transcribed from both of these DNA elements, and some DNA binding TFs bind RNA. Hence, RNA transcribed from regulatory elements may contribute to stable TF occupancy at these sites. We show that the ubiquitously expressed TF Yin-Yang 1 (YY1) binds to both gene regulatory elements and their associated RNA species across the entire genome. Reduced transcription of regulatory elements diminishes YY1 occupancy, whereas artificial tethering of RNA enhances YY1 occupancy at these elements. We propose that RNA makes a modest but important contribution to the maintenance of certain TFs at gene regulatory elements and suggest that transcription of regulatory elements produces a positive-feedback loop that contributes to the stability of gene expression programs. PMID:26516199

  14. Transcription factor trapping by RNA in gene regulatory elements

    PubMed Central

    Sigova, Alla A.; Abraham, Brian J.; Ji, Xiong; Molinie, Benoit; Hannett, Nancy M.; Guo, Yang Eric; Jangi, Mohini; Giallourakis, Cosmas C.; Sharp, Phillip A.; Young, Richard A.

    2016-01-01

    Transcription factors (TFs) bind specific sequences in promoter-proximal and distal DNA elements in order to regulate gene transcription. RNA is transcribed from both of these DNA elements, and some DNA-binding TFs bind RNA. Hence, RNA transcribed from regulatory elements may contribute to stable TF occupancy at these sites. We show that the ubiquitously expressed TF YY1 binds to both gene regulatory elements and also to their associated RNA species genome-wide. Reduced transcription of regulatory elements diminishes YY1 occupancy whereas artificial tethering of RNA enhances YY1 occupancy at these elements. We propose that RNA makes a modest but important contribution to the maintenance of certain TFs at gene regulatory elements and suggest that transcription of regulatory elements produces a positive feedback loop that contributes to the stability of gene expression programs. PMID:26516199

  15. Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

    SciTech Connect

    Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

    2003-12-31

    Because of their high degree of conservation, functional regions in noncoding DNA can be identified by comparing DNA sequences among evolutionarily distantly related genomes. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.

  16. Interrogating Transcriptional Regulatory Sequences in Tol2-Mediated Xenopus Transgenics

    PubMed Central

    Loots, Gabriela G.; Bergmann, Anne; Hum, Nicholas R.; Oldenburg, Catherine E.; Wills, Andrea E.; Hu, Na; Ovcharenko, Ivan; Harland, Richard M.

    2013-01-01

    Identifying gene regulatory elements and their target genes in vertebrates remains a significant challenge. It is now recognized that transcriptional regulatory sequences are critical in orchestrating dynamic controls of tissue-specific gene expression during vertebrate development and in adult tissues, and that these elements can be positioned at great distances in relation to the promoters of the genes they control. While significant progress has been made in mapping DNA binding regions by combining chromatin immunoprecipitation and next generation sequencing, functional validation remains a limiting step in improving our ability to correlate in silico predictions with biological function. We recently developed a computational method that synergistically combines genome-wide gene-expression profiling, vertebrate genome comparisons, and transcription factor binding-site analysis to predict tissue-specific enhancers in the human genome. We applied this method to 270 genes highly expressed in skeletal muscle and predicted 190 putative cis-regulatory modules. Furthermore, we optimized Tol2 transgenic constructs in Xenopus laevis to interrogate 20 of these elements for their ability to function as skeletal muscle-specific transcriptional enhancers during embryonic development. We found that 45% of these elements were expressed only in the fast muscle fibers that are oriented in highly organized chevrons in the Xenopus laevis tadpole. Transcription factor binding site analysis identified >2 Mef2/MyoD sites within ∼200 bp regions in 6 of the validated enhancers, and systematic mutagenesis of these sites revealed that they are critical for the enhancer function. The data described herein introduce a new reporter system suitable for interrogating tissue-specific cis-regulatory elements, which allows monitoring of enhancer activity in real time, throughout early stages of embryonic development, in Xenopus. PMID:23874664

  17. Characterization of DNA sequences that mediate nuclear protein binding to the regulatory region of the Pisum sativum (pea) chlorophyll a/b binding protein gene AB80: identification of a repeated heptamer motif.

    PubMed

    Argüello, G; García-Hernández, E; Sánchez, M; Gariglio, P; Herrera-Estrella, L; Simpson, J

    1992-05-01

    Two protein factors binding to the regulatory region of the pea chlorophyll a/b binding protein gene AB80 have been identified. One of these factors is found only in green tissue and not in etiolated or root tissue. The second factor (designated ABF-2) binds to a DNA sequence element that contains a direct heptamer repeat, TCTCAAA. The presence of both repeats was found to be essential for binding. ABF-2 is present in green and etiolated tissue and in roots, and factors analogous to ABF-2 are present in several plant species. Computer analysis showed that the TCTCAAA motif is present in the regulatory region of several plant genes. PMID:1303797
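
    The kind of computer search mentioned in this abstract can be illustrated with a minimal sketch. It scans one strand of a sequence for two copies of the heptamer TCTCAAA in direct orientation. The function name, the `max_gap` spacing limit and the example sequence are assumptions made for illustration; the abstract does not state the spacing between the repeats or the software the authors used.

```python
import re

MOTIF = "TCTCAAA"  # heptamer reported in the abstract


def find_direct_repeats(seq, motif=MOTIF, max_gap=10):
    """Return (first_start, second_start) pairs of motif copies on the
    given strand that are separated by at most max_gap bases.

    max_gap is a hypothetical parameter, not a value from the abstract.
    """
    seq = seq.upper()
    # A lookahead match reports every start position, including overlaps.
    starts = [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]
    pairs = []
    for i, first in enumerate(starts):
        for second in starts[i + 1:]:
            gap = second - (first + len(motif))
            if 0 <= gap <= max_gap:
                pairs.append((first, second))
    return pairs


# Made-up example sequence with two copies separated by two bases.
example = "GGTCTCAAAACTCTCAAAGG"
print(find_direct_repeats(example))
```

    Only the forward strand is scanned here; a fuller search would also check the reverse complement.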

  18. Two regulatory proteins that bind to the basic transcription element (BTE), a GC box sequence in the promoter region of the rat P-4501A1 gene.

    PubMed Central

    Imataka, H; Sogawa, K; Yasumoto, K; Kikuchi, Y; Sasano, K; Kobayashi, A; Hayami, M; Fujii-Kuriyama, Y

    1992-01-01

    The cDNAs for two DNA binding proteins of BTE, a GC box sequence in the promoter region of the P-450IA1(CYP1A1) gene, have been isolated from a rat liver cDNA library by using the BTE sequence as a binding probe. While one is for the rat equivalent to human Sp1, the other encodes a primary structure of 244 amino acids, a novel DNA binding protein designated BTEB. Both proteins contain a zinc finger domain of Cys-Cys/His-His motif that is repeated three times with sequence similarity of 72% to each other, otherwise they share little or no similarity. The function of BTEB was analysed by transfection of plasmids expressing BTEB and/or Sp1 with appropriate reporter plasmids into a monkey cell line CV-1 and compared with Sp1. BTEB and Sp1 activated the expression of genes with repeated GC box sequences in promoters such as the simian virus 40 early promoter and the human immunodeficiency virus-1 long terminal repeat promoter. In contrast, BTEB repressed the activity of a promoter containing BTE, a single GC box of the CYP1A1 gene that is stimulated by Sp1. When the BTE sequence was repeated five times, however, BTEB turned out to be an activator of the promoter. RNA blot analysis showed that mRNAs for BTEB and Sp1 were expressed in all tissues tested, but their concentrations varied independently in tissues. The former mRNA was rich in the brain, kidney, lung and testis, while the latter was relatively abundant in the thymus and spleen. (ABSTRACT TRUNCATED AT 250 WORDS) PMID:1356762

  19. Modeling of hysteresis in gene regulatory networks.

    PubMed

    Hu, J; Qin, K R; Xiang, C; Lee, T H

    2012-08-01

    Hysteresis, observed in many gene regulatory networks, has a pivotal impact on biological systems, enhancing the robustness of cell functions. In this paper, a general model is proposed to describe the hysteretic gene regulatory network by combining the hysteresis component and the transient dynamics. The Bouc-Wen hysteresis model is modified to describe the hysteresis component in mammalian gene regulatory networks. Rigorous mathematical analysis of the dynamical properties of the model is presented to ensure bounded-input-bounded-output (BIBO) stability, and it demonstrates that the original Bouc-Wen model can only generate a clockwise hysteresis loop while the modified model can describe both clockwise and counterclockwise hysteresis loops. Simulation studies have shown that the hysteresis loops from our model are consistent with the experimental observations in three mammalian gene regulatory networks and two E. coli gene regulatory networks, which demonstrates the ability and accuracy of the mathematical model to emulate natural gene expression behavior with hysteresis. A comparison study has also been conducted to show that this model fits the experimental data significantly better than previous ones in the literature. The successful modeling of the hysteresis in all five hysteretic gene regulatory networks suggests that the new model has the potential to be a unified framework for modeling hysteresis in gene regulatory networks and to provide a better understanding of the general mechanism that drives the hysteretic function. PMID:22588784
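
    For orientation, the classical (unmodified) Bouc-Wen hysteresis equation, dz/dt = A·dx/dt − β·|dx/dt|·|z|^(n−1)·z − γ·dx/dt·|z|^n, can be integrated in a few lines. The sketch below is not the modified model proposed in the paper, whose form is not given in the abstract. The parameter values, the sinusoidal input and the forward-Euler scheme are assumptions chosen only to make the loop visible.

```python
import math


def bouc_wen_loop(A=1.0, beta=0.5, gamma=0.5, n=1.0, steps=2000, cycles=2):
    """Integrate the classical Bouc-Wen equation with forward Euler.

    The equation is used in the equivalent incremental form
        dz = dx * (A - (beta * sign(dx * z) + gamma) * |z|**n)
    for an assumed input x(t) = sin(2*pi*t). Returns the lists (xs, zs).
    All default values are illustrative, not taken from the paper.
    """
    xs, zs = [0.0], [0.0]
    for k in range(1, steps * cycles + 1):
        x = math.sin(2.0 * math.pi * k / steps)
        dx = x - xs[-1]
        z = zs[-1]
        # sign(dx * z), with 0 when either factor is zero
        sign = math.copysign(1.0, dx * z) if dx * z != 0 else 0.0
        z_next = z + dx * (A - (beta * sign + gamma) * abs(z) ** n)
        xs.append(x)
        zs.append(z_next)
    return xs, zs


xs, zs = bouc_wen_loop()
# The output z differs at the same input x depending on the sweep direction.
print(zs[2000], zs[3000])  # x is ~0 at both indices (rising vs. falling sweep)
```

    With these settings the input crosses zero at indices 2000 (rising) and 3000 (falling), and the output takes different values there, which is the hysteresis loop.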

  20. Evolving Robust Gene Regulatory Networks

    PubMed Central

    Noman, Nasimul; Monjo, Taku; Moscato, Pablo; Iba, Hitoshi

    2015-01-01

    Design and implementation of robust network modules is essential for construction of complex biological systems through hierarchical assembly of ‘parts’ and ‘devices’. The robustness of gene regulatory networks (GRNs) is ascribed chiefly to the underlying topology. The automatic designing capability of GRN topology that can exhibit robust behavior can dramatically change the current practice in synthetic biology. A recent study shows that Darwinian evolution can gradually develop higher topological robustness. Subsequently, this work presents an evolutionary algorithm that simulates natural evolution in silico, for identifying network topologies that are robust to perturbations. We present a Monte Carlo based method for quantifying topological robustness and designed a fitness approximation approach for efficient calculation of topological robustness which is computationally very intensive. The proposed framework was verified using two classic GRN behaviors: oscillation and bistability, although the framework is generalized for evolving other types of responses. The algorithm identified robust GRN architectures which were verified using different analysis and comparison. Analysis of the results also shed light on the relationship among robustness, cooperativity and complexity. This study also shows that nature has already evolved very robust architectures for its crucial systems; hence simulation of this natural process can be very valuable for designing robust biological systems. PMID:25616055
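
    The general idea of a Monte Carlo robustness estimate can be sketched as the fraction of randomly perturbed parameter sets for which a network keeps its target behavior. The sketch below is a toy illustration under stated assumptions, not the authors' method: the one-gene self-activation model dx/dt = a·x²/(K² + x²) − d·x, the bistability criterion a > 2·d·K derived from it, the log-uniform perturbation, and all function and parameter names are hypothetical.

```python
import random


def is_bistable(a, d, K):
    """Toy behavior test for dx/dt = a*x**2/(K**2 + x**2) - d*x.

    Besides x = 0, steady states solve d*x**2 - a*x + d*K**2 = 0, which has
    two distinct positive roots (so a second stable state) iff a > 2*d*K.
    """
    return a > 2.0 * d * K


def estimate_robustness(nominal, behaves, n_samples=1000, fold=1.5, seed=0):
    """Fraction of perturbed parameter sets that still show the behavior.

    Each parameter is multiplied by a log-uniform factor in [1/fold, fold].
    The sampling scheme is an assumption made for this illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        sample = {name: value * fold ** rng.uniform(-1.0, 1.0)
                  for name, value in nominal.items()}
        if behaves(**sample):
            hits += 1
    return hits / n_samples


print(estimate_robustness({"a": 10.0, "d": 1.0, "K": 1.0}, is_bistable))
print(estimate_robustness({"a": 2.0, "d": 1.0, "K": 1.0}, is_bistable))
```

    A design far from the bistability boundary keeps its behavior under every sampled perturbation, while one sitting on the boundary keeps it only part of the time.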

  1. Comparative studies of gene regulatory mechanisms.

    PubMed

    Pai, Athma A; Gilad, Yoav

    2014-12-01

    It has become increasingly clear that changes in gene regulation have played an important role in adaptive evolution both between and within species. Over the past five years, comparative studies have moved beyond simple characterizations of differences in gene expression levels within and between species to studying variation in regulatory mechanisms. We still know relatively little about the precise chain of events that lead to most regulatory adaptations, but we have taken significant steps towards understanding the relative importance of changes in different mechanisms of gene regulatory evolution. In this review, we first discuss insights from comparative studies in model organisms, where the available experimental toolkit is extensive. We then focus on a few recent comparative studies in primates, where the limited feasibility of experimental manipulation dictates the approaches that can be used to study gene regulatory evolution. PMID:25215415

  2. Genetic relatedness of Clostridium difficile isolates from various origins determined by triple-locus sequence analysis based on toxin regulatory genes tcdC, tcdR, and cdtR.

    PubMed

    Bouvet, Philippe J M; Popoff, Michel R

    2008-11-01

    A triple-locus nucleotide sequence analysis based on toxin regulatory genes tcdC, tcdR and cdtR was initiated to assess the sequence variability of these genes among Clostridium difficile isolates and to study the genetic relatedness between isolates. A preliminary investigation of the variability of the tcdC gene was done with 57 clinical and veterinary isolates. Twenty-three isolates representing nine main clusters were selected for tcdC, tcdR, and cdtR analysis. The numbers of alleles found for tcdC, tcdR and cdtR were nine, six, and five, respectively. All strains possessed the cdtR gene except toxin A-negative toxin B-positive variants. All but one binary toxin CDT-positive isolate harbored a deletion (>1 bp) in the tcdC gene. The combined analyses of the three genes allowed us to distinguish five lineages correlated with the different types of deletion in tcdC, i.e., 18 bp (associated or not with a deletion at position 117), 36 bp, 39 bp, and 54 bp, and with the wild-type tcdC (no deletion). The tcdR and tcdC genes, though located within the same pathogenicity locus, were found to have evolved separately. Coevolution of the three genes was noted only with strains harboring a 39-bp or a 54-bp deletion in tcdC that formed two homogeneous, separate divergent clusters. Our study supported the existence of the known clones (PCR ribotype 027 isolates and toxin A-negative toxin B-positive C. difficile variants) and evidence for clonality of isolates with a 39-bp deletion (toxinotype V, PCR ribotype 078) that are frequently isolated worldwide from human infections and from food animals. PMID:18832125

  3. On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

    NASA Astrophysics Data System (ADS)

    Tarpine, Ryan; Istrail, Sorin

    The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.

  4. Dynamic chromatin: the regulatory domain organization of eukaryotic gene loci.

    PubMed

    Bonifer, C; Hecht, A; Saueressig, H; Winter, D M; Sippel, A E

    1991-10-01

    It is hypothesized that nuclear DNA is organized in topologically constrained loop domains defining basic units of higher order chromatin structure. Our studies are performed in order to investigate the functional relevance of this structural subdivision of eukaryotic chromatin for the control of gene expression. We used the chicken lysozyme gene locus as a model to examine the relation between chromatin structure and gene function. Several structural features of the lysozyme locus are known: the extension of the region of general DNAase I sensitivity of the active gene, the location of DNA-sequences with high affinity for the nuclear matrix in vitro, and the position of DNAase I hypersensitive chromatin sites (DHSs). The pattern of DHSs changes depending on the transcriptional status of the gene. Functional studies demonstrated that DHSs mark the position of cis-acting regulatory elements. Additionally, we discovered a novel cis-activity of the border regions of the DNAase I sensitive domain (A-elements). By eliminating the position effect on gene expression usually observed when genes are randomly integrated into the genome after transfection, A-elements possibly serve as punctuation marks for a regulatory chromatin domain. Experiments using transgenic mice confirmed that the complete structurally defined lysozyme gene domain behaves as an independent regulatory unit, expressing the gene in a tissue specific and position independent manner. These expression features were lost in transgenic mice carrying a construct, in which the A-elements as well as an upstream enhancer region were deleted, indicating the lack of a locus activation function on this construct. Experiments are designed in order to uncover possible hierarchical relationships between the different cis-acting regulatory elements for stepwise gene activation during cell differentiation. We are aiming at the definition of the basic structural and functional requirements for position independent and high

  5. Combinatorial Gene Regulatory Functions Underlie Ultraconserved Elements in Drosophila.

    PubMed

    Warnefors, Maria; Hartmann, Britta; Thomsen, Stefan; Alonso, Claudio R

    2016-09-01

    Ultraconserved elements (UCEs) are discrete genomic elements conserved across large evolutionary distances. Although UCEs have been linked to multiple facets of mammalian gene regulation, their extreme evolutionary conservation remains largely unexplained. Here, we apply a computational approach to investigate this question in Drosophila, exploring the molecular functions of more than 1,500 UCEs shared across the genomes of 12 Drosophila species. Our data indicate that Drosophila UCEs are hubs for gene regulatory functions and suggest that UCE sequence invariance originates from their combinatorial roles in gene control. We also note that the gene regulatory roles of intronic and intergenic UCEs (iUCEs) are distinct from those found in exonic UCEs (eUCEs). In iUCEs, transcription factor (TF) and epigenetic factor binding data strongly support iUCE roles in transcriptional and epigenetic regulation. In contrast, analyses of eUCEs indicate that they are two orders of magnitude more likely than expected to simultaneously include protein-coding sequence, TF-binding sites, splice sites, and RNA editing sites but have reduced roles in transcriptional or epigenetic regulation. Furthermore, we use a Drosophila cell culture system and transgenic Drosophila embryos to validate the notion of UCE combinatorial regulatory roles using an eUCE within the Hox gene Ultrabithorax and show that its protein-coding region also contains alternative splicing regulatory information. Taken together, our experiments indicate that UCEs emerge as a result of combinatorial gene regulatory roles and highlight common features in mammalian and insect UCEs, implying that similar processes might underlie ultraconservation in diverse animal taxa. PMID:27247329

  6. Combinatorial Gene Regulatory Functions Underlie Ultraconserved Elements in Drosophila

    PubMed Central

    Warnefors, Maria; Hartmann, Britta; Thomsen, Stefan; Alonso, Claudio R.

    2016-01-01

    Ultraconserved elements (UCEs) are discrete genomic elements conserved across large evolutionary distances. Although UCEs have been linked to multiple facets of mammalian gene regulation, their extreme evolutionary conservation remains largely unexplained. Here, we apply a computational approach to investigate this question in Drosophila, exploring the molecular functions of more than 1,500 UCEs shared across the genomes of 12 Drosophila species. Our data indicate that Drosophila UCEs are hubs for gene regulatory functions and suggest that UCE sequence invariance originates from their combinatorial roles in gene control. We also note that the gene regulatory roles of intronic and intergenic UCEs (iUCEs) are distinct from those found in exonic UCEs (eUCEs). In iUCEs, transcription factor (TF) and epigenetic factor binding data strongly support iUCE roles in transcriptional and epigenetic regulation. In contrast, analyses of eUCEs indicate that they are two orders of magnitude more likely than expected to simultaneously include protein-coding sequence, TF-binding sites, splice sites, and RNA editing sites but have reduced roles in transcriptional or epigenetic regulation. Furthermore, we use a Drosophila cell culture system and transgenic Drosophila embryos to validate the notion of UCE combinatorial regulatory roles using an eUCE within the Hox gene Ultrabithorax and show that its protein-coding region also contains alternative splicing regulatory information. Taken together, our experiments indicate that UCEs emerge as a result of combinatorial gene regulatory roles and highlight common features in mammalian and insect UCEs, implying that similar processes might underlie ultraconservation in diverse animal taxa. PMID:27247329

  7. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    PubMed Central

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  8. The distribution of SNPs in human gene regulatory regions

    PubMed Central

    Guo, Yongjian; Jamison, D Curtis

    2005-01-01

    Background: As a result of high-throughput genotyping methods, millions of human genetic variants have been reported in recent years. To efficiently identify those with significant biological functions, a practical strategy is to concentrate on variants located in important sequence regions such as gene regulatory regions. Results: Analysis of the most common type of variant, single nucleotide polymorphisms (SNPs), shows that in gene promoter regions more SNPs occur in close proximity to transcriptional start sites than in regions further upstream, and a disproportionate number of those SNPs represent nucleotide transversions. Additionally, the number of SNPs found in the predicted transcription factor binding sites is higher than in non-binding site sequences. Conclusion: Current information about transcription factor binding site sequence patterns may not be exhaustive, and SNPs may be actively involved in influencing gene expression by affecting the transcription factor binding sites. PMID:16209714
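
    The transversion observation above rests on classifying each SNP as a transition (purine to purine or pyrimidine to pyrimidine) or a transversion (purine to pyrimidine or the reverse). The following is a minimal sketch of that classification, binned by distance upstream of the transcriptional start site. The function names, the 100-bp bin size, and the input tuple layout are assumptions made for illustration; they are not taken from the paper.

```python
# Illustrative sketch only: names, bin size and input layout are assumptions.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_snp(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    # A substitution that stays within one chemical class is a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def transversion_fraction_by_distance(snps, bin_size=100):
    """Fraction of transversions per distance bin.

    snps: iterable of (distance_upstream_of_tss, ref_base, alt_base).
    Returns {bin_start: fraction_of_transversions}.
    """
    counts = {}
    for distance, ref, alt in snps:
        bin_start = (distance // bin_size) * bin_size
        total, transversions = counts.get(bin_start, (0, 0))
        is_tv = classify_snp(ref, alt) == "transversion"
        counts[bin_start] = (total + 1, transversions + int(is_tv))
    return {b: tv / total for b, (total, tv) in sorted(counts.items())}
```

    Called on a made-up list such as [(10, "A", "C"), (50, "A", "G"), (150, "C", "T")], the second function reports one fraction per 100-bp bin, which is the kind of summary needed to compare SNPs near the start site with those further upstream.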

  9. Bioinformatic identification of novel regulatory DNA sequence motifs in Streptomyces coelicolor

    PubMed Central

    Studholme, David J; Bentley, Stephen D; Kormanec, Jan

    2004-01-01

    Background: Streptomyces coelicolor is a bacterium with a vast repertoire of metabolic functions and complex systems of cellular development. Its genome sequence is rich in genes that encode regulatory proteins to control these processes in response to its changing environment. We wished to apply a recently published bioinformatic method for identifying novel regulatory sequence signals to gain new insights into regulation in S. coelicolor. Results: The method involved production of position-specific weight matrices from alignments of over-represented words of DNA sequence. We generated 2497 weight matrices, each representing a candidate regulatory DNA sequence motif. We scanned the genome sequence of S. coelicolor against each of these matrices. A DNA sequence motif represented by one of the matrices was found preferentially in non-coding sequences immediately upstream of genes involved in polysaccharide degradation, including several that encode chitinases. This motif (TGGTCTAGACCA) was also found upstream of genes encoding components of the phosphoenolpyruvate phosphotransfer system (PTS). We hypothesise that this DNA sequence motif represents a regulatory element that is responsive to availability of carbon-sources. Other motifs of potential biological significance were found upstream of genes implicated in secondary metabolism (TTAGGTtAGgCTaACCTAA), sigma factors (TGACN19TGAC), DNA replication and repair (ttgtCAGTGN13TGGA), nucleotide conversions (CTACgcNCGTAG), and ArsR (TCAGN12TCAG). A motif found upstream of genes involved in chromosome replication (TGTCagtgcN7Tagg) was similar to a previously described motif found in UV-responsive promoters. Conclusions: We successfully applied a recently published in silico method to identify conserved sequence motifs in S. coelicolor that may be biologically significant as regulatory elements. Our data are broadly consistent with and further extend data from previously published studies. We invite experimental testing of
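
    As a rough illustration of the weight-matrix step described above, the sketch below builds a log-odds position-specific weight matrix from a set of aligned words and slides it along a sequence. It assumes a uniform background (0.25 per base), a pseudocount of 1, and forward-strand scanning only; the function names are hypothetical and the abstract does not say how the published method weights or scores its matrices.

```python
import math

BASES = "ACGT"


def build_pwm(aligned_words, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix (one dict per column) from aligned words.

    Assumptions for this sketch: equal-length words, uniform background.
    """
    length = len(aligned_words[0])
    if any(len(word) != length for word in aligned_words):
        raise ValueError("all words must have the same length")
    pwm = []
    for i in range(length):
        column = [word[i].upper() for word in aligned_words]
        total = len(column) + pseudocount * len(BASES)
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; returns a list of (start, score)."""
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(ch not in BASES for ch in window):
            continue  # skip windows containing ambiguous bases
        hits.append((start, sum(pwm[i][ch] for i, ch in enumerate(window))))
    return hits
```

    For example, a matrix built from a few copies of the reported motif TGGTCTAGACCA (plus made-up variants) gives its highest score to the window where that motif sits in a test sequence. A genome-wide scan would additionally need the reverse strand and a score threshold, both of which are left out here.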

  10. Consensus gene regulatory networks: combining multiple microarray gene expression datasets

    NASA Astrophysics Data System (ADS)

    Peeling, Emma; Tucker, Allan

    2007-09-01

    In this paper we present a method for modelling gene regulatory networks by forming a consensus Bayesian network model from multiple microarray gene expression datasets. Our method is based on combining Bayesian network graph topologies and does not require any special pre-processing of the datasets, such as re-normalisation. We evaluate our method on a synthetic regulatory network and part of the yeast heat-shock response regulatory network using publicly available yeast microarray datasets. Results are promising; the consensus networks formed provide a broader view of the potential underlying network, obtaining an increased true positive rate over networks constructed from a single data source.
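
    One simple way to picture combining graph topologies is edge voting: learn one network per dataset, then keep the directed edges that recur across enough of them. The sketch below shows that idea only. The voting rule, the `min_support` threshold, and the function name are assumptions for illustration, not the consensus procedure of the paper, which the abstract does not specify.

```python
from collections import Counter


def consensus_edges(networks, min_support=0.5):
    """Keep directed edges that appear in at least `min_support` of the networks.

    networks: list of collections of (parent, child) edges, one per dataset.
    Note: simple voting does not guarantee that the result is acyclic, so a
    real Bayesian-network consensus would need an extra acyclicity check.
    """
    if not networks:
        return set()
    # Count each edge at most once per network.
    votes = Counter(edge for network in networks for edge in set(network))
    needed = min_support * len(networks)
    return {edge for edge, count in votes.items() if count >= needed}
```

    With three toy networks that all contain the edge ("g1", "g2") but disagree elsewhere, a threshold of 0.5 keeps only the shared edge, while a lower threshold admits the edges seen in a single dataset.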

  11. Massive contribution of transposable elements to mammalian regulatory sequences.

    PubMed

    Rayan, Nirmala Arul; Del Rosario, Ricardo C H; Prabhakar, Shyam

    2016-09-01

    Barbara McClintock discovered the existence of transposable elements (TEs) in the late 1940s and initially proposed that they contributed to the gene regulatory program of higher organisms. This controversial idea gained acceptance only much later in the 1990s, when the first examples of TE-derived promoter sequences were uncovered. It is now known that half of the human genome is recognizably derived from TEs. It is thus important to understand the scope and nature of their contribution to gene regulation. Here, we provide a timeline of major discoveries in this area and discuss how transposons have revolutionized our understanding of mammalian genomes, with a special emphasis on the massive contribution of TEs to primate evolution. Our analysis of primate-specific functional elements supports a simple model for the rate at which new functional elements arise in unique and TE-derived DNA. Finally, we discuss some of the challenges and unresolved questions in the field, which need to be addressed in order to fully characterize the impact of TEs on gene regulation, evolution and disease processes. PMID:27174439

  12. Latent phenotypes pervade gene regulatory circuits

    PubMed Central

    2014-01-01

    Background: Latent phenotypes are non-adaptive byproducts of adaptive phenotypes. They exist in biological systems as different as promiscuous enzymes and genome-scale metabolic reaction networks, and can give rise to evolutionary adaptations and innovations. We know little about their prevalence in the gene expression phenotypes of regulatory circuits, important sources of evolutionary innovations. Results: Here, we study a space of more than sixteen million three-gene model regulatory circuits, where each circuit is represented by a genotype, and has one or more functions embodied in one or more gene expression phenotypes. We find that the majority of circuits with single functions have latent expression phenotypes. Moreover, the set of circuits with a given spectrum of functions has a repertoire of latent phenotypes that is much larger than that of any one circuit. Most of this latent repertoire can be easily accessed through a series of small genetic changes that preserve a circuit’s main functions. Both circuits and gene expression phenotypes that are robust to genetic change are associated with a greater number of latent phenotypes. Conclusions: Our observations suggest that latent phenotypes are pervasive in regulatory circuits, and may thus be an important source of evolutionary adaptations and innovations involving gene regulation. PMID:24884746

  13. Developmental cis-regulatory analysis of the cyclin D gene in the sea urchin Strongylocentrotus purpuratus

    PubMed Central

    McCarty, Christopher M.

    2013-01-01

    Cyclin D genes regulate the cell cycle, growth and differentiation in response to intercellular signaling. While the promoters of vertebrate cyclin D genes have been analyzed, the cis-regulatory sequences across an entire cyclin D locus have not. Doing so would increase understanding of how cyclin D genes respond to the regulatory states established by developmental gene regulatory networks, linking cell cycle and growth control to the ontogenetic program. Therefore, we conducted a cis-regulatory analysis on the cyclin D gene, SpcycD, of the sea urchin, Strongylocentrotus purpuratus, during embryogenesis, identifying upstream and intronic sequences, located within six defined regions bearing one or more cis-regulatory modules each. PMID:24090975

  14. RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

    PubMed

    Brule, C E; Dean, K M; Grayhack, E J

    2016-01-01

    The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. PMID:27241757

  15. Organization and sequence of the human alpha-lactalbumin gene.

    PubMed Central

    Hall, L; Emery, D C; Davies, M S; Parker, D; Craig, R K

    1987-01-01

    A recombinant bacteriophage containing the entire alpha-lactalbumin gene was isolated from a human genomic library constructed in bacteriophage lambda L47. Within this recombinant the 2.5 kb alpha-lactalbumin gene is flanked by about 5 kb of sequence on either side. The complete nucleotide sequence of the gene and its immediate flanking sequences were determined and compared with those of the rat alpha-lactalbumin gene. These studies showed that the size, organization and sequence of the exons have been highly conserved, whereas the introns have diverged considerably. In particular, the first intron of the human gene was found to contain an Alu repetitive sequence not present in the rat. A high degree of homology (67%) was also observed in the 5' flanking regions, extending as far as 655 nucleotide residues upstream of the transcriptional initiation site. Comparison of the 5' flanking sequences of these two alpha-lactalbumin genes with those of five casein genes has revealed the presence of a highly conserved region [consensus sequence: RGAAGRAAA(N)TGGACAGAAATCAA(CG)TTTCTA], extending from position -140 to -110 in all seven sequences examined, suggesting a possible regulatory role in the hormonal control or tissue-specific expression of milk protein genes in the mammary gland. PMID:2954544

  16. Gene regulatory networks and the underlying biology of developmental toxicity

    EPA Science Inventory

    Embryonic cells are specified by large-scale networks of functionally linked regulatory genes. Knowledge of the relevant gene regulatory networks is essential for understanding phenotypic heterogeneity that emerges from disruption of molecular functions, cellular processes or sig...

  17. Marine organism cell biology and regulatory sequence discovery in comparative functional genomics.

    PubMed

    Barnes, David W; Mattingly, Carolyn J; Parton, Angela; Dowell, Lori M; Bayne, Christopher J; Forrest, John N

    2004-10-01

    The use of bioinformatics to integrate phenotypic and genomic data from mammalian models is well established as a means of understanding human biology and disease. Beyond direct biomedical applications of these approaches in predicting structure-function relationships between coding sequences and protein activities, comparative studies also promote understanding of molecular evolution and the relationship between genomic sequence and morphological and physiological specialization. Recently recognized is the potential of comparative studies to identify functionally significant regulatory regions and to generate experimentally testable hypotheses that contribute to understanding mechanisms that regulate gene expression, including transcriptional activity, alternative splicing and transcript stability. Functional tests of hypotheses generated by computational approaches require experimentally tractable in vitro systems, including cell cultures. Comparative sequence analysis strategies that use genomic sequences from a variety of evolutionarily diverse organisms are critical for identifying conserved regulatory motifs in the 5'-upstream, 3'-downstream and introns of genes. Genomic sequences and gene orthologues in the first aquatic vertebrate and protovertebrate organisms to be fully sequenced (Fugu rubripes, Ciona intestinalis, Tetraodon nigroviridis, Danio rerio) as well as in the elasmobranchs, spiny dogfish shark (Squalus acanthias) and little skate (Raja erinacea), and marine invertebrate models such as the sea urchin (Strongylocentrotus purpuratus) are valuable in the prediction of putative genomic regulatory regions. Cell cultures have been derived for these and other model species. Data and tools resulting from these kinds of studies will contribute to understanding transcriptional regulation of biomedically important genes and provide new avenues for medical therapeutics and disease prevention. PMID:19003267

  18. Mutational Robustness of Gene Regulatory Networks

    PubMed Central

    van Dijk, Aalt D. J.; van Mourik, Simon; van Ham, Roeland C. H. J.

    2012-01-01

    Mutational robustness of gene regulatory networks refers to their ability to generate constant biological output upon mutations that change network structure. Such networks contain regulatory interactions (transcription factor – target gene interactions) but often also protein-protein interactions between transcription factors. Using computational modeling, we study factors that influence robustness and we infer several network properties governing it. These include the type of mutation, i.e. whether a regulatory interaction or a protein-protein interaction is mutated, and in the case of mutation of a regulatory interaction, the sign of the interaction (activating vs. repressive). In addition, we analyze the effect of combinations of mutations and we compare networks containing monomeric with those containing dimeric transcription factors. Our results are consistent with available data on biological networks, for example based on evolutionary conservation of network features. As a novel and remarkable property, we predict that networks are more robust against mutations in monomer than in dimer transcription factors, a prediction for which analysis of conservation of DNA binding residues in monomeric vs. dimeric transcription factors provides indirect evidence. PMID:22295094

  19. Autonomous Boolean modeling of gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Socolar, Joshua; Sun, Mengyang; Cheng, Xianrui

    2014-03-01

    In cases where the dynamical properties of gene regulatory networks are important, a faithful model must include three key features: a network topology; a functional response of each element to its inputs; and timing information about the transmission of signals across network links. Autonomous Boolean network (ABN) models are efficient representations of these elements and are amenable to analysis. We present an ABN model of the gene regulatory network governing cell fate specification in the early sea urchin embryo, which must generate three bands of distinct tissue types after several cell divisions, beginning from an initial condition with only two distinct cell types. Analysis of the spatial patterning problem and the dynamics of a network constructed from available experimental results reveals that a simple mechanism is at work in this case. Supported by NSF Grant DMS-10-68602

  20. Stabilizing gene regulatory networks through feedforward loops

    NASA Astrophysics Data System (ADS)

    Kadelka, C.; Murrugarra, D.; Laubenbacher, R.

    2013-06-01

    The global dynamics of gene regulatory networks are known to show robustness to perturbations in the form of intrinsic and extrinsic noise, as well as mutations of individual genes. One molecular mechanism underlying this robustness has been identified as the action of so-called microRNAs that operate via feedforward loops. We present results of a computational study, using the modeling framework of stochastic Boolean networks, which explores the role that such network motifs play in stabilizing global dynamics. The paper introduces a new measure for the stability of stochastic networks. The results show that certain types of feedforward loops do indeed buffer the network against stochastic effects.

  1. Inference of Splicing Regulatory Activities by Sequence Neighborhood Analysis

    PubMed Central

    Stadler, Michael B; Shomron, Noam; Yeo, Gene W; Schneider, Aniket; Xiao, Xinshu; Burge, Christopher B

    2006-01-01

    Sequence-specific recognition of nucleic-acid motifs is critical to many cellular processes. We have developed a new and general method called Neighborhood Inference (NI) that predicts sequences with activity in regulating a biochemical process based on the local density of known sites in sequence space. Applied to the problem of RNA splicing regulation, NI was used to predict hundreds of new exonic splicing enhancer (ESE) and silencer (ESS) hexanucleotides from known human ESEs and ESSs. These predictions were supported by cross-validation analysis, by analysis of published splicing regulatory activity data, by sequence-conservation analysis, and by measurement of the splicing regulatory activity of 24 novel predicted ESEs, ESSs, and neutral sequences using an in vivo splicing reporter assay. These results demonstrate the ability of NI to accurately predict splicing regulatory activity and show that the scope of exonic splicing regulatory elements is substantially larger than previously anticipated. Analysis of orthologous exons in four mammals showed that the NI score of ESEs, a measure of function, is much more highly conserved above background than ESE primary sequence. This observation indicates a high degree of selection for ESE activity in mammalian exons, with surprisingly frequent interchangeability between ESE sequences. PMID:17121466
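
    The idea of scoring a sequence by the local density of known sites around it in sequence space can be sketched as follows: count the known enhancers and known silencers within a small Hamming distance of a candidate hexanucleotide and compare the two counts. This is a deliberately simplified stand-in. The radius of one mismatch, the normalisation to the range -1 to 1, and the function names are assumptions, and the published NI score is not reproduced here.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def neighborhood_score(candidate, enhancers, silencers, radius=1):
    """Toy neighbourhood score in [-1, 1].

    +1 means every known site within `radius` mismatches is an enhancer,
    -1 means every such site is a silencer, and 0 means a balanced or
    empty neighbourhood.
    """
    near_enh = sum(1 for site in enhancers if hamming(candidate, site) <= radius)
    near_sil = sum(1 for site in silencers if hamming(candidate, site) <= radius)
    total = near_enh + near_sil
    return 0.0 if total == 0 else (near_enh - near_sil) / total
```

    Given made-up training sets, a hexamer one mismatch away from several known enhancers and far from every silencer scores 1.0, which is the sense in which a candidate inherits a predicted activity from its neighbours.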

  2. Automated Identification of Core Regulatory Genes in Human Gene Regulatory Networks

    PubMed Central

    Singhal, Amit; Kumar, Pavanish; de Libero, Gennaro; Poidinger, Michael; Monterola, Christopher

    2015-01-01

    Human gene regulatory networks (GRN) can be difficult to interpret due to a tangle of edges interconnecting thousands of genes. We constructed a general human GRN from extensive transcription factor and microRNA target data obtained from public databases. In a subnetwork of this GRN that is active during estrogen stimulation of MCF-7 breast cancer cells, we benchmarked automated algorithms for identifying core regulatory genes (transcription factors and microRNAs). Among these algorithms, we identified K-core decomposition, PageRank, and betweenness centrality as the most effective for discovering core regulatory genes in the network, evaluated on the basis of the previously known roles of these genes in MCF-7 biology and their ability to explain the up or down expression status of up to 70% of the remaining genes. Finally, we validated the use of the K-core algorithm for organizing the GRN in an easier-to-interpret layered hierarchy where more influential regulatory genes percolate towards the inner layers. The integrated human gene and miRNA network and software used in this study are provided as supplementary materials (S1 Data) accompanying this manuscript. PMID:26393364
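
    K-core decomposition assigns each node the largest k for which it still belongs to a subgraph where every node has at least k neighbours, which is what produces the layered hierarchy described above. The sketch below uses the standard peeling procedure (repeatedly remove a node of minimum remaining degree). Treating the regulatory network as undirected and the function name `core_numbers` are assumptions of this example; the authors' own software is not reproduced.

```python
def core_numbers(adjacency):
    """Core number of every node in an undirected graph.

    adjacency: dict mapping node -> iterable of neighbouring nodes.
    Edges are symmetrised and self-loops ignored (assumptions of this sketch).
    """
    neighbours = {node: set() for node in adjacency}
    for node, linked in adjacency.items():
        for other in linked:
            if other == node:
                continue
            neighbours[node].add(other)
            neighbours.setdefault(other, set()).add(node)

    degree = {node: len(linked) for node, linked in neighbours.items()}
    remaining = set(neighbours)
    core = {}
    k = 0
    while remaining:
        # Peel off a node with the smallest remaining degree.
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for other in neighbours[node]:
            if other in remaining:
                degree[other] -= 1
    return core
```

    On a triangle with one extra node hanging off it, the three triangle nodes get core number 2 and the pendant node gets 1, so the higher-numbered nodes form the inner layer. The linear scan for the minimum makes this quadratic in the number of nodes, which is acceptable for a sketch but not for networks of thousands of genes.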

  3. Repetitive sequence environment distinguishes housekeeping genes

    PubMed Central

    Eller, C. Daniel; Regelson, Moira; Merriman, Barry; Nelson, Stan; Horvath, Steve; Marahrens, York

    2007-01-01

    Housekeeping genes are expressed across a wide variety of tissues. Since repetitive sequences have been reported to influence the expression of individual genes, we employed a novel approach to determine whether housekeeping genes can be distinguished from tissue-specific genes by their repetitive sequence context. We show that Alu elements are more highly concentrated around housekeeping genes while various longer (>400-bp) repetitive sequences ("repeats"), including Long Interspersed Nuclear Element 1 (LINE-1) elements, are excluded from these regions. We further show that isochore membership does not distinguish housekeeping genes from tissue-specific genes and that repetitive sequence environment distinguishes housekeeping genes from tissue-specific genes in every isochore. The distinct repetitive sequence environment, in combination with other previously published sequence properties of housekeeping genes, was used to develop a method of predicting housekeeping genes on the basis of DNA sequence alone. Using expression across tissue types as a measure of success, we demonstrate that repetitive sequence environment is by far the most important sequence feature identified to date for distinguishing housekeeping genes. PMID:17141428

  4. Beyond antioxidant genes in the ancient NRF2 regulatory network

    PubMed Central

    Lacher, Sarah E.; Lee, Joslynn S.; Wang, Xuting; Campbell, Michelle R.; Bell, Douglas A.; Slattery, Matthew

    2016-01-01

    NRF2, a basic leucine zipper transcription factor encoded by the gene NFE2L2, is a master regulator of the transcriptional response to oxidative stress. NRF2 is structurally and functionally conserved from insects to humans, and it heterodimerizes with the small MAF transcription factors to bind a consensus DNA sequence (the antioxidant response element, or ARE) and regulate gene expression. We have used genome-wide chromatin immunoprecipitation (ChIP-seq) and gene expression data to identify direct NRF2 target genes in Drosophila and humans. These data have allowed us to construct the deeply conserved ancient NRF2 regulatory network – target genes that are conserved from Drosophila to human. The ancient network consists of canonical antioxidant genes, as well as genes related to proteasomal pathways, metabolism, and a number of less expected genes. We have also used enhancer reporter assays and electrophoretic mobility shift assays to confirm NRF2-mediated regulation of ARE (antioxidant response element) activity at a number of these novel target genes. Interestingly, the ancient network also highlights a prominent negative feedback loop; this, combined with the finding that NRF2-mediated regulatory output is tightly linked to the quality of the ARE it is targeting, suggests that precise regulation of nuclear NRF2 concentration is necessary to achieve proper quantitative regulation of distinct gene sets. Together, these findings highlight the importance of balance in the NRF2-ARE pathway, and indicate that NRF2-mediated regulation of xenobiotic metabolism, glucose metabolism, and proteostasis have been central to this pathway since its inception. PMID:26163000

  5. Integrating heterogeneous gene expression data for gene regulatory network modelling.

    PubMed

    Sîrbu, Alina; Ruskin, Heather J; Crane, Martin

    2012-06-01

    Gene regulatory networks (GRNs) are complex biological systems that have a large impact on protein levels, so that discovering network interactions is a major objective of systems biology. Quantitative GRN models have been inferred, to date, from time series measurements of gene expression, but at small scale, and with limited application to real data. Time series experiments are typically short (number of time points of the order of ten), whereas regulatory networks can be very large (containing hundreds of genes). This creates an under-determination problem, which negatively influences the results of any inferential algorithm. Presented here is an integrative approach to model inference, which has not been previously discussed to the authors' knowledge. Multiple heterogeneous expression time series are used to infer the same model, and results are shown to be more robust to noise and parameter perturbation. Additionally, a wavelet analysis shows that these models display limited noise over-fitting within the individual datasets. PMID:21948152

  6. Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes

    SciTech Connect

    Wang, Xuting; Tomso, Daniel J.; Liu, Xuemei; Bell, Douglas A.

    2005-09-01

    Single nucleotide polymorphisms (SNPs) in the human genome are DNA sequence variations that can alter an individual's response to environmental exposure. SNPs in gene coding regions can lead to changes in the biological properties of the encoded protein. In contrast, SNPs in non-coding gene regulatory regions may affect gene expression levels in an allele-specific manner, and these functional polymorphisms represent an important but relatively unexplored class of genetic variation. The main challenge in analyzing these SNPs is a lack of robust computational and experimental methods. Here, we first outline mechanisms by which genetic variation can impact gene regulation, and review recent findings in this area; then, we describe a methodology for bioinformatic discovery and functional analysis of regulatory SNPs in cis-regulatory regions using the assembled human genome sequence and databases on sequence polymorphism and gene expression. Our method integrates SNP and gene databases and uses a set of computer programs that allow us to: (1) select SNPs, from among the >9 million human SNPs in the NCBI dbSNP database, that are similar to cis-regulatory element (RE) consensus sequences; (2) map the selected dbSNP entries to the human genome assembly in order to identify polymorphic REs near gene start sites; and (3) prioritize the candidate polymorphic RE-containing genes by searching the existing genotype and gene expression data sets. The applicability of this system has been demonstrated through studies on p53 responsive elements and is being extended to additional pathways and environmentally responsive genes.
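
    Step (1) above, selecting SNPs that resemble a regulatory element consensus, can be illustrated with a minimal sketch. It is not the authors' software: the function names are hypothetical, matching is exact against an IUPAC consensus rather than similarity-scored, and the example uses the widely cited p53 half-site consensus RRRCWWGYYY as an assumed input. A SNP is treated as a candidate polymorphic RE when some, but not all, of its alleles give a match that overlaps the variant position.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def alleles_matching(left_flank, alleles, right_flank, consensus):
    """Return the alleles for which a consensus match overlaps the SNP.

    Hypothetical helper: flanks are the sequences immediately 5' and 3'
    of a single-base variant, on the same strand as the consensus.
    """
    pattern = consensus_to_regex(consensus)
    k = len(consensus)
    snp_pos = len(left_flank)
    hits = []
    for allele in alleles:
        seq = (left_flank + allele + right_flank).upper()
        first = max(0, snp_pos - k + 1)
        last = min(snp_pos, len(seq) - k)
        # Only windows that cover the SNP position are examined.
        for start in range(first, last + 1):
            if pattern.fullmatch(seq, start, start + k):
                hits.append(allele)
                break
    return hits


def is_polymorphic_re(left_flank, alleles, right_flank, consensus):
    """True when some but not all alleles create a consensus match."""
    hits = alleles_matching(left_flank, alleles, right_flank, consensus)
    return 0 < len(hits) < len(alleles)


if __name__ == "__main__":
    # Made-up flanking sequence; only the C allele completes the half-site.
    print(alleles_matching("GGGA", ["C", "T"], "ATGCCC", "RRRCWWGYYY"))
```

    A real pipeline would also scan the reverse complement and score partial matches; both are omitted here to keep the sketch short.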

  7. A Provisional Gene Regulatory Atlas for Mouse Heart Development

    PubMed Central

    Chen, Hailin; VanBuren, Vincent

    2014-01-01

    Congenital Heart Disease (CHD) is one of the most common birth defects. Elucidating the molecular mechanisms underlying normal cardiac development is an important step towards early identification of abnormalities during the developmental program and towards the creation of early intervention strategies. We developed a novel computational strategy for leveraging high-content data sets, including a large selection of microarray data associated with mouse cardiac development, mouse genome sequence, ChIP-seq data of selected mouse transcription factors and Y2H data of mouse protein-protein interactions, to infer the active transcriptional regulatory network of mouse cardiac development. We identified phase-specific expression activity for 765 overlapping gene co-expression modules that were defined for obtained cardiac lineage microarray data. For each co-expression module, we identified the phase of cardiac development where gene expression for that module was higher than other phases. Co-expression modules were found to be consistent with biological pathway knowledge in Wikipathways, and met expectations for enrichment of pathways involved in heart lineage development. Over 359,000 transcription factor-target relationships were inferred by analyzing the promoter sequences within each gene module for overrepresentation against the JASPAR database of Transcription Factor Binding Site (TFBS) motifs. The provisional regulatory network will provide a framework for studying the genetic basis of CHD. PMID:24421884
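
    The abstract does not say which statistic was used to call a motif overrepresented in a module's promoters. One common choice, shown here purely as an assumption, is a one-sided hypergeometric test: given N promoters in the background, K of which contain the motif, how likely is it that a module of n promoters contains the motif k or more times by chance? The function name and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n promoters are drawn without replacement from N,
    of which K contain the motif (exact, integer arithmetic)."""
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Made-up counts: 40 of 1000 background promoters carry the motif,
    # and a 50-gene module contains it 8 times.
    print(hypergeom_upper_tail(8, 40, 50, 1000))
```

    With many motifs tested per module, the resulting p-values would need a multiple-testing correction before being interpreted.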

  8. Classification of Arabidopsis thaliana gene sequences: clustering of coding sequences into two groups according to codon usage improves gene prediction.

    PubMed

    Mathé, C; Peresetsky, A; Déhais, P; Van Montagu, M; Rouzé, P

    1999-02-01

    While genomic sequences are accumulating, finding the location of the genes remains a major issue that can be solved only for about a half of them by homology searches. Prediction methods are thus required, but unfortunately are not fully satisfying. Most prediction methods implicitly assume a unique model for genes. This is an oversimplification as demonstrated by the possibility to group coding sequences into several classes in Escherichia coli and other genomes. As no classification existed for Arabidopsis thaliana, we classified genes according to the statistical features of their coding sequences. A clustering algorithm using a codon usage model was developed and applied to coding sequences from A. thaliana, E. coli, and a mixture of both. By using it, Arabidopsis sequences were clustered into two classes. The CU1 and CU2 classes differed essentially by the choice of pyrimidine bases at the codon silent sites: CU2 genes often use C whereas CU1 genes prefer T. This classification discriminated the Arabidopsis genes according to their expressiveness, highly expressed genes being clustered in CU2 and genes expected to have a lower expression, such as the regulatory genes, in CU1. The algorithm separated the sequences of the Escherichia-Arabidopsis mixed data set into five classes according to the species, except for one class. This mixed class contained 89 % Arabidopsis genes from CU1 and 11 % E. coli genes, mostly horizontally transferred. Interestingly, most genes encoding organelle-targeted proteins, except the photosynthetic and photoassimilatory ones, were clustered in CU1. By tailoring the GeneMark CDS prediction algorithm to the observed coding sequence classes, its quality of prediction was greatly improved. Similar improvement can be expected with other prediction systems. PMID:9925779
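
    The CU1/CU2 distinction rests on which pyrimidine is preferred at silent codon sites. The sketch below is not the authors' clustering algorithm; it only computes a simple summary statistic, the share of C among the pyrimidines (C or T) found at third codon positions. It assumes the input is an in-frame coding sequence and treats every third position as a silent site, which is a simplification. The function name is hypothetical.

```python
def third_position_c_fraction(cds):
    """Fraction of C among C/T bases at third codon positions.

    Assumes `cds` starts in frame 0; returns None when no pyrimidine
    occurs at a third position.
    """
    thirds = cds.upper()[2::3]  # third base of every complete codon
    c_count = thirds.count("C")
    t_count = thirds.count("T")
    if c_count + t_count == 0:
        return None
    return c_count / (c_count + t_count)


if __name__ == "__main__":
    # Codons ATG GCC TTT GAC have third bases G, C, T, C.
    print(third_position_c_fraction("ATGGCCTTTGAC"))
```

    On this statistic, a value near 1 would point towards the C-preferring pattern described for CU2 and a value near 0 towards the T-preferring pattern described for CU1; any threshold between them would have to be chosen from data.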

  9. Generation of oscillating gene regulatory network motifs

    NASA Astrophysics Data System (ADS)

    van Dorp, M.; Lannoo, B.; Carlon, E.

    2013-07-01

    Using an improved version of an evolutionary algorithm originally proposed by François and Hakim [Proc. Natl. Acad. Sci. USA 101, 580 (2004)], we generated small gene regulatory networks in which the concentration of a target protein oscillates in time. These networks may serve as candidates for oscillatory modules to be found in larger regulatory networks and protein interaction networks. The algorithm was run 105 times to produce a large set of oscillating modules, which were systematically classified and analyzed. The robustness of the oscillations against variations of the kinetic rates was also determined, to filter out the least robust cases. Furthermore, we show that the set of evolved networks can serve as a database of models whose behavior can be compared to experimentally observed oscillations. The algorithm found three smallest (core) oscillators in which nonlinearities and number of components are minimal. Two of those are two-gene modules: the mixed feedback loop, already discussed in the literature, and an autorepressed gene coupled with a heterodimer. The third one is a single gene module which is competitively regulated by a monomer and a dimer. The evolutionary algorithm also generated larger oscillating networks, which are in part extensions of the three core modules and in part genuinely new modules. The latter includes oscillators which do not rely on feedback induced by transcription factors, but are purely of post-transcriptional type. Analysis of post-transcriptional mechanisms of oscillation may provide useful information for circadian clock research, as recent experiments showed that circadian rhythms are maintained even in the absence of transcription.

  10. Reverse engineering of gene regulatory networks.

    PubMed

    Cho, K H; Choo, S M; Jung, S H; Kim, J R; Choi, H S; Kim, J

    2007-05-01

    Systems biology is a multi-disciplinary approach to the study of the interactions of various cellular mechanisms and cellular components. Owing to the development of new technologies that simultaneously measure the expression of genetic information, systems biological studies involving gene interactions are increasingly prominent. In this regard, reconstructing gene regulatory networks (GRNs) forms the basis for the dynamical analysis of gene interactions and related effects on cellular control pathways. Various approaches of inferring GRNs from gene expression profiles and biological information, including machine learning approaches, have been reviewed, with a brief introduction of DNA microarray experiments as typical tools for measuring levels of messenger ribonucleic acid (mRNA) expression. In particular, the inference methods are classified according to the required input information, and the main idea of each method is elucidated by comparing its advantages and disadvantages with respect to the other methods. In addition, recent developments in this field are introduced and discussions on the challenges and opportunities for future research are provided. PMID:17591174

  11. The 5' regulatory sequence of the PMP22 in the patients with Charcot-Marie-Tooth disease.

    PubMed

    Sinkiewicz-Darol, Elena; Kabzińska, Dagmara; Moszyńska, Izabela; Kochański, Andrzej

    2010-01-01

    Little is known about the molecular background of clinical variability of Charcot-Marie-Tooth type 1A (CMT1A) disease and hereditary neuropathy with liability to pressure palsies (HNPP). The CMT1A and HNPP disorders result from duplication and deletion of the PMP22 gene, respectively. In a series of studies performed on affected animal transgenic models of CMT1A disease, expression of the PMP22 gene (gene dosage) was shown to correlate with severity of CMT course (gene dosage effect). In this study we hypothesized that single nucleotide polymorphisms (SNPs) located within the 5' regulatory sequence of the PMP22 gene may be responsible for the CMT1A/HNPP clinical variability. We have sequenced the PMP22 5' upstream regulatory sequence in a group of 45 CMT1A/HNPP patients harboring the PMP22 duplication (37) or deletion (8). We have identified five SNPs in the regulatory sequence of the PMP22 gene. Three of them, i.e. -819C>T, -4785G>T, and -4800C>T, were detected both in the patients and in the control group. Thus, their pathogenic role in the regulation of the expression of the PMP22 gene seems not to be significant. Two SNPs, i.e. -4210T>C and -4759T>A, were found only in the CMT patients. Their role in the regulation of PMP22 gene expression cannot be excluded. Additionally, in one patient we detected the Thr118Met variant in exon 4 of the PMP22 gene, which was previously reported by other authors. We conclude that the 5' regulatory sequence of the PMP22 gene is conserved at the nucleotide level; however, rarely occurring SNP variants in the PMP22 regulatory sequence may be associated with the gene dosage effect. PMID:20842290

  12. [Identification and mapping of cis-regulatory elements within long genomic sequences].

    PubMed

    Akopov, S B; Chernov, I P; Vetchinova, A S; Bulanenkova, S S; Nikolaev, L G

    2007-01-01

    The publication of the human and other metazoan genome sequences opened up the possibility for mapping and analysis of genomic regulatory elements. Unfortunately, experimental data on genomic positions of such sequences as enhancers, silencers, insulators, transcription terminators, and replication origins are very limited, especially at the whole genome level. As most genomic regulatory elements (e.g., enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements in silico is often ambiguous. Therefore, the development of high-throughput experimental approaches for identification and mapping of genomic functional elements is highly desirable. In this review we discuss novel approaches to high-throughput experimental identification of cis-regulatory elements in mammalian genomes, which is a necessary step toward complete genome annotation. PMID:18240562

  13. Synthetic muscle promoters: activities exceeding naturally occurring regulatory sequences

    NASA Technical Reports Server (NTRS)

    Li, X.; Eastman, E. M.; Schwartz, R. J.; Draghia-Akli, R.

    1999-01-01

    Relatively low levels of expression from naturally occurring promoters have limited the use of muscle as a gene therapy target. Myogenic restricted gene promoters display complex organization usually involving combinations of several myogenic regulatory elements. By random assembly of E-box, MEF-2, TEF-1, and SRE sites into synthetic promoter recombinant libraries, and screening of hundreds of individual clones for transcriptional activity in vitro and in vivo, several artificial promoters were isolated whose transcriptional potencies greatly exceed those of natural myogenic and viral gene promoters.

  14. Gene Regulatory Networks Elucidating Huanglongbing Disease Mechanisms

    PubMed Central

    Martinelli, Federico; Reagan, Russell L.; Uratsu, Sandra L.; Phu, My L.; Albrecht, Ute; Zhao, Weixiang; Davis, Cristina E.; Bowman, Kim D.; Dandekar, Abhaya M.

    2013-01-01

    Next-generation sequencing was exploited to gain deeper insight into the response to infection by Candidatus Liberibacter asiaticus (CaLas), especially the immune dysregulation and metabolic dysfunction caused by source-sink disruption. Previous fruit transcriptome data were compared with additional RNA-Seq data in three tissues: immature fruit, and young and mature leaves. Four categories of orchard trees were studied: symptomatic, asymptomatic, apparently healthy, and healthy. Principal component analysis found distinct expression patterns between immature and mature fruits and leaf samples for all four categories of trees. A predicted protein-protein interaction network identified HLB-regulated genes for sugar transporters playing key roles in the overall plant responses. Gene set and pathway enrichment analyses highlight the role of sucrose and starch metabolism in disease symptom development in all tissues. HLB-regulated genes (glucose-phosphate-transporter, invertase, starch-related genes) would likely determine the source-sink relationship disruption. In infected leaves, transcriptomic changes were observed for light reactions genes (downregulation), sucrose metabolism (upregulation), and starch biosynthesis (upregulation). In parallel, symptomatic fruits over-expressed genes involved in photosynthesis, sucrose and raffinose metabolism, and downregulated starch biosynthesis. We visualized gene networks between tissues inducing a source-sink shift. CaLas alters the hormone crosstalk, resulting in weak and ineffective tissue-specific plant immune responses necessary for bacterial clearance. Accordingly, expression of WRKYs (including WRKY70) was higher in fruits than in leaves. Systemic acquired responses were inadequately activated in young leaves, generally considered the sites where most new infections occur. PMID:24086326

  15. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence

    PubMed Central

    Kinney, Justin B.; Murugan, Anand; Callan, Curtis G.; Cox, Edward C.

    2010-01-01

    Cells use protein-DNA and protein-protein interactions to regulate transcription. A biophysical understanding of this process has, however, been limited by the lack of methods for quantitatively characterizing the interactions that occur at specific promoters and enhancers in living cells. Here we show how such biophysical information can be revealed by a simple experiment in which a library of partially mutated regulatory sequences is partitioned according to their in vivo transcriptional activities and then sequenced en masse. Computational analysis of the sequence data produced by this experiment can provide precise quantitative information about how the regulatory proteins at a specific arrangement of binding sites work together to regulate transcription. This ability to reliably extract precise information about regulatory biophysics in the face of experimental noise is made possible by a recently identified relationship between likelihood and mutual information. Applying our experimental and computational techniques to the Escherichia coli lac promoter, we demonstrate the ability to identify regulatory protein binding sites de novo, determine the sequence-dependent binding energy of the proteins that bind these sites, and, importantly, measure the in vivo interaction energy between RNA polymerase and a DNA-bound transcription factor. Our approach provides a generally applicable method for characterizing the biophysical basis of transcriptional regulation by a specified regulatory sequence. The principles of our method can also be applied to a wide range of other problems in molecular biology. PMID:20439748
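
    The quantity at the heart of this analysis is the mutual information between sequence and measured activity. The sketch below is not the authors' model-fitting procedure; it only shows, under simple assumptions, how mutual information can be estimated between the base at one position of each mutated sequence and the activity bin that sequence was sorted into. It uses a plain plug-in (frequency-count) estimator, which is biased upward for small samples. Function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of mutual information, in bits, between two
    discrete variables given a list of (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def positional_information(sequences, bins, position):
    """Mutual information between the base at `position` and the bin."""
    return mutual_information(
        [(seq[position], b) for seq, b in zip(sequences, bins)]
    )


if __name__ == "__main__":
    # Toy data: position 0 determines the bin, position 1 does not.
    seqs = ["AA", "AC", "CA", "CC"]
    bins = [0, 0, 1, 1]
    print(positional_information(seqs, bins, 0))
    print(positional_information(seqs, bins, 1))
```

    Positions whose bases carry information about the activity bin are candidates for lying within a binding site, which is the intuition behind locating sites de novo from such data.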

  16. A provisional regulatory gene network for specification of endomesoderm in the sea urchin embryo

    NASA Technical Reports Server (NTRS)

    Davidson, Eric H.; Rast, Jonathan P.; Oliveri, Paola; Ransick, Andrew; Calestani, Cristina; Yuh, Chiou-Hwa; Minokawa, Takuya; Amore, Gabriele; Hinman, Veronica; Arenas-Mena, Cesar; Otim, Ochan; Brown, C. Titus; Livi, Carolina B.; Lee, Pei Yun; Revilla, Roger; Schilstra, Maria J.; Clarke, Peter J C.; Rust, Alistair G.; Pan, Zhengjun; Arnone, Maria I.; Rowen, Lee; Cameron, R. Andrew; McClay, David R.; Hood, Leroy; Bolouri, Hamid

    2002-01-01

    We present the current form of a provisional DNA sequence-based regulatory gene network that explains in outline how endomesodermal specification in the sea urchin embryo is controlled. The model of the network is in a continuous process of revision and growth as new genes are added and new experimental results become available; see http://www.its.caltech.edu/mirsky/endomeso.htm (End-mes Gene Network Update) for the latest version. The network contains over 40 genes at present, many newly uncovered in the course of this work, and most encoding DNA-binding transcriptional regulatory factors. The architecture of the network was approached initially by construction of a logic model that integrated the extensive experimental evidence now available on endomesoderm specification. The internal linkages between genes in the network have been determined functionally, by measurement of the effects of regulatory perturbations on the expression of all relevant genes in the network. Five kinds of perturbation have been applied: (1) use of morpholino antisense oligonucleotides targeted to many of the key regulatory genes in the network; (2) transformation of other regulatory factors into dominant repressors by construction of Engrailed repressor domain fusions; (3) ectopic expression of given regulatory factors, from genetic expression constructs and from injected mRNAs; (4) blockade of the beta-catenin/Tcf pathway by introduction of mRNA encoding the intracellular domain of cadherin; and (5) blockade of the Notch signaling pathway by introduction of mRNA encoding the extracellular domain of the Notch receptor. The network model predicts the cis-regulatory inputs that link each gene into the network. Therefore, its architecture is testable by cis-regulatory analysis. Strongylocentrotus purpuratus and Lytechinus variegatus genomic BAC recombinants that include a large number of the genes in the network have been sequenced and annotated. 
Tests of the cis-regulatory predictions of

  17. Genome-wide identification of conserved regulatory function in diverged sequences

    PubMed Central

    Taher, Leila; McGaughey, David M.; Maragh, Samantha; Aneas, Ivy; Bessling, Seneca L.; Miller, Webb; Nobrega, Marcelo A.; McCallion, Andrew S.; Ovcharenko, Ivan

    2011-01-01

    Plasticity of gene regulatory encryption can permit DNA sequence divergence without loss of function. Functional information is preserved through conservation of the composition of transcription factor binding sites (TFBS) in a regulatory element. We have developed a method that can accurately identify pairs of functional noncoding orthologs at evolutionarily diverged loci by searching for conserved TFBS arrangements. With an estimated 5% false-positive rate (FPR) in approximately 3000 human and zebrafish syntenic loci, we detected approximately 300 pairs of diverged elements that are likely to share common ancestry and have similar regulatory activity. By analyzing a pool of experimentally validated human enhancers, we demonstrated that 7/8 (88%) of their predicted functional orthologs retained in vivo regulatory control. Moreover, in 5/7 (71%) of assayed enhancer pairs, we observed concordant expression patterns. We argue that TFBS composition is often necessary to retain and sufficient to predict regulatory function in the absence of overt sequence conservation, revealing an entire class of functionally conserved, evolutionarily diverged regulatory elements that we term “covert.” PMID:21628450

  18. Regulatory gene networks and the properties of the developmental process

    NASA Technical Reports Server (NTRS)

    Davidson, Eric H.; McClay, David R.; Hood, Leroy

    2003-01-01

    Genomic instructions for development are encoded in arrays of regulatory DNA. These specify large networks of interactions among genes producing transcription factors and signaling components. The architecture of such networks both explains and predicts developmental phenomenology. Although network analysis is yet in its early stages, some fundamental commonalities are already emerging. Two such are the use of multigenic feedback loops to ensure the progressivity of developmental regulatory states and the prevalence of repressive regulatory interactions in spatial control processes. Gene regulatory networks make it possible to explain the process of development in causal terms and eventually will enable the redesign of developmental regulatory circuitry to achieve different outcomes.

  19. Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA.

    PubMed

    Turner, Tychele N; Hormozdiari, Fereydoun; Duyzend, Michael H; McClymont, Sarah A; Hook, Paul W; Iossifov, Ivan; Raja, Archana; Baker, Carl; Hoekzema, Kendra; Stessman, Holly A; Zody, Michael C; Nelson, Bradley J; Huddleston, John; Sandstrom, Richard; Smith, Joshua D; Hanna, David; Swanson, James M; Faustman, Elaine M; Bamshad, Michael J; Stamatoyannopoulos, John; Nickerson, Deborah A; McCallion, Andrew S; Darnell, Robert; Eichler, Evan E

    2016-01-01

    We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism. PMID:26749308

  20. Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA

    PubMed Central

    Turner, Tychele N.; Hormozdiari, Fereydoun; Duyzend, Michael H.; McClymont, Sarah A.; Hook, Paul W.; Iossifov, Ivan; Raja, Archana; Baker, Carl; Hoekzema, Kendra; Stessman, Holly A.; Zody, Michael C.; Nelson, Bradley J.; Huddleston, John; Sandstrom, Richard; Smith, Joshua D.; Hanna, David; Swanson, James M.; Faustman, Elaine M.; Bamshad, Michael J.; Stamatoyannopoulos, John; Nickerson, Deborah A.; McCallion, Andrew S.; Darnell, Robert; Eichler, Evan E.

    2016-01-01

    We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism. PMID:26749308

  1. Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks

    PubMed Central

    Sîrbu, Alina; Crane, Martin; Ruskin, Heather J.

    2015-01-01

    Microarray technologies have been the basis of numerous important findings regarding gene expression in the last few decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.

  2. Conserved Noncoding Sequences Highlight Shared Components of Regulatory Networks in Dicotyledonous Plants

    PubMed Central

    Baxter, Laura; Jironkin, Aleksey; Hickman, Richard; Moore, Jay; Barrington, Christopher; Krusche, Peter; Dyer, Nigel P.; Buchanan-Wollaston, Vicky; Tiskin, Alexander; Beynon, Jim; Denby, Katherine; Ott, Sascha

    2012-01-01

    Conserved noncoding sequences (CNSs) in DNA are reliable pointers to regulatory elements controlling gene expression. Using a comparative genomics approach with four dicotyledonous plant species (Arabidopsis thaliana, papaya [Carica papaya], poplar [Populus trichocarpa], and grape [Vitis vinifera]), we detected hundreds of CNSs upstream of Arabidopsis genes. Distinct positioning, length, and enrichment for transcription factor binding sites suggest these CNSs play a functional role in transcriptional regulation. The enrichment of transcription factors within the set of genes associated with CNS is consistent with the hypothesis that together they form part of a conserved transcriptional network whose function is to regulate other transcription factors and control development. We identified a set of promoters where regulatory mechanisms are likely to be shared between the model organism Arabidopsis and other dicots, providing areas of focus for further research. PMID:23110901

  3. Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

    SciTech Connect

    Hong, R. L.; Hamaguchi, L.; Busch, M. A.; Weigel, D.

    2003-06-01

    In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.

  4. Regulatory Elements of the Floral Homeotic Gene AGAMOUS Identified by Phylogenetic Footprinting and Shadowing

    PubMed Central

    Hong, Ray L.; Hamaguchi, Lynn; Busch, Maximilian A.; Weigel, Detlef

    2003-01-01

    In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3-kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae species, several other motifs, but not the LFY and WUS binding sites identified previously, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for the activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection but also demonstrate that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites. PMID:12782724

  5. Preservation of Gene Duplication Increases the Regulatory Spectrum of Ribosomal Protein Genes and Enhances Growth under Stress.

    PubMed

    Parenteau, Julie; Lavoie, Mathieu; Catala, Mathieu; Malik-Ghulam, Mustafa; Gagnon, Jules; Abou Elela, Sherif

    2015-12-22

    In baker's yeast, the majority of ribosomal protein genes (RPGs) are duplicated, and it was recently proposed that such duplications are preserved via the functional specialization of the duplicated genes. However, the origin and nature of duplicated RPGs' (dRPGs) functional specificity remain unclear. In this study, we show that differences in dRPG functions are generated by variations in the modality of gene expression and, to a lesser extent, by protein sequence. Analysis of the sequence and expression patterns of non-intron-containing RPGs indicates that each dRPG is controlled by specific regulatory sequences modulating its expression levels in response to changing growth conditions. Homogenization of dRPG sequences reduces cell tolerance to growth under stress without changing the number of expressed genes. Together, the data reveal a model where duplicated genes provide a means for modulating the expression of ribosomal proteins in response to stress. PMID:26686636

  6. Sequence and regulation of the porcine FSHR gene promoter.

    PubMed

    Wu, Wangjun; Han, Jing; Cao, Rui; Zhang, Jinbi; Li, Bojiang; Liu, Zequn; Liu, Kaiqing; Li, Qifa; Pan, Zengxiang; Chen, Jie; Liu, Honglin

    2015-03-01

    Follicle-stimulating hormone (FSH) plays a crucial role in animal reproduction and exerts its physiological functions by interacting with the FSH receptor (FSHR). The FSHR is exclusively expressed in granulosa cells in the ovary, and its expression level is closely related to granulosa cell differentiation and follicle maturation. In mammals, most follicles undergo atresia, which is mainly caused by granulosa cell apoptosis. However, knowledge of the transcriptional regulatory mechanisms of the porcine FSHR gene in granulosa cells is still limited. In this study, approximately 2.1 kb of the proximal promoter sequence of the porcine FSHR gene was obtained by genome walking, and the regulatory elements and transcription factors in the porcine FSHR promoter sequence were predicted. Furthermore, the core promoter region (-1195/-598) of the porcine FSHR gene was identified using a luciferase assay. Subsequently, the relationship between expression levels of the porcine FSHR gene and histone H3K9 acetylation levels around the core promoter region (-787/-572) in vivo and in vitro was analyzed. Our results showed that an increased FSHR gene expression level was accompanied by an increase in histone H3K9 acetylation levels, suggesting that histone H3K9 acetylation could regulate the expression of the porcine FSHR gene. PMID:25599592

  7. Close Sequence Comparisons are Sufficient to Identify Human cis-Regulatory Elements

    SciTech Connect

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.
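    For orientation, the Karlin-Altschul statistics mentioned above are usually written in the form below. This is the standard textbook expression for local-alignment scores; the abstract does not say how Gumby sets the parameters, so the symbols here ($m$, $n$ for the sizes of the compared sequences or search space, $S$ for the score of a conserved region, and $K$, $\lambda$ for constants determined by the scoring scheme and background composition) are assumptions about notation, not a description of Gumby itself.

```latex
% Expected number of regions scoring at least S by chance
E = K \, m \, n \, e^{-\lambda S}

% Probability of observing at least one such region by chance
P = 1 - e^{-E}
```

    Under this form, ranking conserved elements by P-value is equivalent to ranking them by score within one comparison, while making scores from different comparisons (for example primate versus mammalian) comparable on a common scale.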

  8. Deduced products of C4-dicarboxylate transport regulatory genes of Rhizobium leguminosarum are homologous to nitrogen regulatory gene products.

    PubMed Central

    Ronson, C W; Astwood, P M; Nixon, B T; Ausubel, F M

    1987-01-01

    We have sequenced two genes, dctB and dctD, required for the activation of the C4-dicarboxylate transport structural gene dctA in free-living Rhizobium leguminosarum. The hydropathic profile of the dctB gene product (DctB) suggested that its N-terminal region may be located in the periplasm and its C-terminal region in the cytoplasm. The C-terminal region of DctB was strongly conserved with similar regions of the products of several regulatory genes that may act as environmental sensors, including ntrB, envZ, virA, phoR, cpxA, and phoM. The N-terminal region of the dctD gene product (DctD) was strongly conserved with the N-terminal domains of the products of several regulatory genes thought to be transcriptional activators, including ntrC, ompR, virG, phoB and sfrA. In addition, the central and C-terminal regions of DctD were strongly conserved with the products of ntrC and nifA, transcriptional activators that require the alternate sigma factor rpoN (ntrA) as co-activator. The central region of DctD also contained a potential ATP-binding domain. These results are consistent with recent results that show that the rpoN product is required for dctA activation, and suggest that DctB plus DctD-mediated transcriptional activation of dctA may be mechanistically similar to NtrB plus NtrC-mediated activation of glnA in E. coli. PMID:3671068

  9. C. elegans Metabolic Gene Regulatory Networks Govern the Cellular Economy

    PubMed Central

    Watson, Emma; Walhout, Albertha J.M.

    2014-01-01

    Diet greatly impacts metabolism in health and disease. In response to the presence or absence of specific nutrients, metabolic gene regulatory networks sense the metabolic state of the cell and regulate metabolic flux accordingly, for instance by the transcriptional control of metabolic enzymes. Here we discuss recent insights regarding metazoan metabolic regulatory networks using the nematode Caenorhabditis elegans as a model, including the modular organization of metabolic gene regulatory networks, the prominent impact of diet on the transcriptome and metabolome, specialized roles of nuclear hormone receptors in responding to dietary conditions, regulation of metabolic genes and metabolic regulators by microRNAs, and feedback between metabolic genes and their regulators. PMID:24731597

  10. Rapid evolution of cis-regulatory sequences via local point mutations

    NASA Technical Reports Server (NTRS)

    Stone, J. R.; Wray, G. A.

    2001-01-01

    Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
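    The idea of binding sites arising through local point mutations can be illustrated with a very small simulation. The sketch below is not the authors' model (the abstract describes a population-level simulation with fixation); it follows a single neutrally evolving sequence and counts generations until a chosen motif first appears. The function name, the motif, and the mutation rate are all hypothetical choices made for illustration.

```python
import random

BASES = "ACGT"


def generations_until_site(seq, motif, mu, rng, max_generations=10_000):
    """Follow one lineage under neutral point mutation.

    Each generation, every position mutates to a different base with
    probability `mu`. Returns the number of generations elapsed when
    `motif` is first present anywhere in the sequence, or None if it
    has not appeared after `max_generations` generations.
    """
    seq = list(seq)
    for generation in range(max_generations + 1):
        if motif in "".join(seq):
            return generation
        for i, base in enumerate(seq):
            if rng.random() < mu:
                # Point mutation: replace with one of the three other bases.
                seq[i] = rng.choice([b for b in BASES if b != base])
    return None
```

    Calling `generations_until_site(start, "TATA", 0.01, random.Random(1))` on a random 100-bp `start` sequence gives one waiting time; repeating this over many seeds gives a distribution. The per-site rate used here is deliberately far higher than realistic values so that the loop finishes quickly.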

  11. Genomic aberrations frequently alter chromatin regulatory genes in chordoma.

    PubMed

    Wang, Lu; Zehir, Ahmet; Nafa, Khedoudja; Zhou, Nengyi; Berger, Michael F; Casanova, Jacklyn; Sadowska, Justyna; Lu, Chao; Allis, C David; Gounder, Mrinal; Chandhanayingyong, Chandhanarat; Ladanyi, Marc; Boland, Patrick J; Hameed, Meera

    2016-07-01

    Chordoma is a rare primary bone neoplasm that is resistant to standard chemotherapies. Despite aggressive surgical management, local recurrence and metastasis are not uncommon. To identify the specific genetic aberrations that play key roles in chordoma pathogenesis, we utilized a genome-wide high-resolution SNP-array and next generation sequencing (NGS)-based molecular profiling platform to study 24 patient samples with typical histopathologic features of chordoma. Matching normal tissues were available for 16 samples. SNP-array analysis revealed nonrandom copy number losses across the genome, frequently involving 3, 9p, 1p, 14, 10, and 13. In contrast, copy number gain is uncommon in chordomas. Two minimum deleted regions were observed on 3p within a ∼8 Mb segment at 3p21.1-p21.31, which overlaps SETD2, BAP1 and PBRM1. The minimum deleted region on 9p was mapped to the CDKN2A locus at 9p21.3, and homozygous deletion of CDKN2A was detected in 5/22 chordomas (∼23%). NGS-based molecular profiling demonstrated an extremely low mutation rate in chordomas, with an average of 0.5 mutations per sample for the 16 cases with matched normal. When the mutated genes were grouped based on molecular functions, many of the mutation events (∼40%) were found in chromatin regulatory genes. The combined copy number and mutation profiling revealed that SETD2 is the single gene affected most frequently in chordomas, either by deletion or by mutations. Our study demonstrated that chordoma belongs to the C-class (copy number changes) tumors whose oncogenic signature is non-random multiple copy number losses across the genome, and that genomic aberrations frequently alter chromatin regulatory genes. © 2016 Wiley Periodicals, Inc. PMID:27072194

  12. Intersecting transcription networks constrain gene regulatory evolution.

    PubMed

    Sorrells, Trevor R; Booth, Lauren N; Tuch, Brian B; Johnson, Alexander D

    2015-07-16

    Epistasis, the non-additive interactions between different genetic loci, constrains evolutionary pathways, blocking some and permitting others. For biological networks such as transcription circuits, the nature of these constraints and their consequences are largely unknown. Here we describe the evolutionary pathways of a transcription network that controls the response to mating pheromone in yeast. A component of this network, the transcription regulator Ste12, has evolved two different modes of binding to a set of its target genes. In one group of species, Ste12 binds to specific DNA binding sites, while in another lineage it occupies DNA indirectly, relying on a second transcription regulator to recognize DNA. We show, through the construction of various possible evolutionary intermediates, that evolution of the direct mode of DNA binding was not directly accessible to the ancestor. Instead, it was contingent on a lineage-specific change to an overlapping transcription network with a different function, the specification of cell type. These results show that analysing and predicting the evolution of cis-regulatory regions requires an understanding of their positions in overlapping networks, as this placement constrains the available evolutionary pathways. PMID:26153861

  13. Intersecting transcription networks constrain gene regulatory evolution

    PubMed Central

    Sorrells, Trevor R; Booth, Lauren N; Tuch, Brian B; Johnson, Alexander D

    2015-01-01

    Epistasis, the non-additive interactions between different genetic loci, constrains evolutionary pathways, blocking some and permitting others. For biological networks such as transcription circuits, the nature of these constraints and their consequences are largely unknown. Here we describe the evolutionary pathways of a transcription network that controls the response to mating pheromone in yeasts. A component of this network, the transcription regulator Ste12, has evolved two different modes of binding to a set of its target genes. In one group of species, Ste12 binds to specific DNA binding sites, while in another lineage it occupies DNA indirectly, relying on a second transcription regulator to recognize DNA. We show, through the construction of various possible evolutionary intermediates, that evolution of the direct mode of DNA binding was not directly accessible to the ancestor. Instead, it was contingent on a lineage-specific change to an overlapping transcription network with a different function, the specification of cell type. These results show that analyzing and predicting the evolution of cis-regulatory regions requires an understanding of their positions in overlapping networks, as this placement constrains the available evolutionary pathways. PMID:26153861

  14. The structure of the human peripherin gene (PRPH) and identification of potential regulatory elements

    SciTech Connect

    Foley, J.; Ley, C.A.; Parysek, L.M.

    1994-07-15

    The authors determined the complete nucleotide sequence of the coding region of the human peripherin gene (PRPH), as well as 742 bp 5′ to the cap site and 584 bp 3′ to the stop codon, and compared its structure and sequence to the rat and mouse genes. The overall structure of 9 exons separated by 8 introns is conserved among these three mammalian species. The nucleotide sequences of the human peripherin gene exons were 90% identical to the rat gene sequences, and the predicted human peripherin protein differed from rat peripherin at only 18 of 475 amino acid residues. Comparison of the 5′ flanking regions of the human peripherin gene and rodent genes revealed extensive areas of high homology. Additional conserved segments were found in introns 1 and 2. Within the 5′ region, potential regulatory sequences, including a nerve growth factor negative regulatory element, a Hox protein binding site, and a heat shock element, were identified in all peripherin genes. The positional conservation of each element suggests that they may be important in the tissue-specific, developmental-specific, and injury-specific expression of the peripherin gene. 24 refs., 2 figs., 1 tab.

  15. Gene Sequence Homology of Chemokines Across Species

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The abundance of expressed gene and protein sequences available in the biological information databases facilitates comparison of protein homologies. A high degree of sequence similarity typically implies homology regarding structure and function and may provide clues to antibody cross-reactivities...

  16. GENE SEQUENCE HOMOLOGY OF CHEMOKINES ACROSS SPECIES

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The abundance of expressed gene and protein sequences available in the biological information databases facilitates comparison of protein homologies. A high degree of sequence similarity typically implies homology regarding structure and function and may provide clues to antibody cross-react...

  17. Gene Discovery through Expressed Sequence Tag Sequencing in Trypanosoma cruzi

    PubMed Central

    Verdun, Ramiro E.; Di Paolo, Nelson; Urmenyi, Turan P.; Rondinelli, Edson; Frasch, Alberto C. C.; Sanchez, Daniel O.

    1998-01-01

    Analysis of expressed sequence tags (ESTs) constitutes a useful approach for gene identification that, in the case of human pathogens, might result in the identification of new targets for chemotherapy and vaccine development. As part of the Trypanosoma cruzi genome project, we have partially sequenced the 5′ ends of 1,949 clones to generate ESTs. The clones were randomly selected from a normalized CL Brener epimastigote cDNA library. A total of 14.6% of the clones were homologous to previously identified T. cruzi genes, while 18.4% had significant matches to genes from other organisms in the database. A total of 67% of the ESTs had no matches in the database, and thus, some of them might be T. cruzi-specific genes. Functional groups of those sequences with matches in the database were constructed according to their putative biological functions. The two largest categories were protein synthesis (23.3%) and cell surface molecules (10.8%). The information reported in this paper should be useful for researchers in the field to analyze genes and proteins of their own interest. PMID:9784549

  18. Stress-induced endogenous siRNAs targeting regulatory intron sequences in Brachypodium

    PubMed Central

    Wang, Hsiao-Lin V.; Dinwiddie, Brandon L.; Lee, Herman

    2015-01-01

    Exposure to abiotic stresses triggers global changes in the expression of thousands of eukaryotic genes at the transcriptional and post-transcriptional levels. Small RNA (smRNA) pathways and splicing both function as crucial mechanisms regulating stress-responsive gene expression. However, examples of smRNAs regulating gene expression remain largely limited to effects on mRNA stability, translation, and epigenetic regulation. Also, our understanding of the networks controlling plant gene expression in response to environmental changes, and examples of these regulatory pathways intersecting, remains limited. Here, to investigate the role of smRNAs in stress responses we examined smRNA transcriptomes of Brachypodium distachyon plants subjected to various abiotic stresses. We found that exposure to different abiotic stresses specifically induced a group of novel, endogenous small interfering RNAs (stress-induced, UTR-derived siRNAs, or sutr-siRNAs) that originate from the 3′ UTRs of a subset of coding genes. Our bioinformatics analyses predicted that sutr-siRNAs have potential regulatory functions and that over 90% of sutr-siRNAs target intronic regions of many mRNAs in trans. Importantly, a subgroup of these sutr-siRNAs target the important intron regulatory regions, such as branch point sequences, that could affect splicing. Our study indicates that in Brachypodium, sutr-siRNAs may affect splicing by masking or changing accessibility of specific cis-elements through base-pairing interactions to mediate gene expression in response to stresses. We hypothesize that this mode of regulation of gene expression may also serve as a general mechanism for regulation of gene expression in plants and potentially in other eukaryotes. PMID:25480817

  19. Phenotype accessibility and noise in random threshold gene regulatory networks.

    PubMed

    Pinho, Ricardo; Garcia, Victor; Feldman, Marcus W

    2014-01-01

    Evolution requires phenotypic variation in a population of organisms for selection to function. Gene regulatory processes involved in organismal development affect the phenotypic diversity of organisms. Since only a fraction of all possible phenotypes are predicted to be accessed by the end of development, organisms may evolve strategies to use environmental cues and noise-like fluctuations to produce additional phenotypic diversity, and hence to enhance the speed of adaptation. We used a generic model of organismal development, gene regulatory networks, to investigate how different levels of noise on gene expression states (i.e. phenotypes) may affect access to new, unique phenotypes, thereby affecting phenotypic diversity. We studied additional strategies that organisms might adopt to attain larger phenotypic diversity: either by augmenting their genome or the number of gene expression states. This was done for different types of gene regulatory networks that allow for distinct levels of regulatory influence on gene expression or are more likely to give rise to stable phenotypes. We found that if gene expression is binary, increasing noise levels generally decreases phenotype accessibility for all network types studied. If more gene expression states are considered, noise can moderately enhance the speed of discovery if three or four gene expression states are allowed, and if there are enough distinct regulatory networks in the population. These results were independent of the network types analyzed, and were robust to different implementations of noise. Hence, for noise to increase the number of accessible phenotypes in gene regulatory networks, very specific conditions need to be satisfied. If the number of distinct regulatory networks involved in organismal development is large enough, and the acquisition of more genes or fine tuning of their expression states proves costly to the organism, noise can be useful in allowing access to more unique phenotypes.
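    A binary threshold network with expression noise can be sketched in a few lines. The abstract does not give the update rule, the noise implementation, or the definition of a phenotype, so everything below is an assumption: genes are 0/1, a gene switches on when its weighted input exceeds a threshold, noise flips each gene independently with probability `flip_prob` during an initial noisy phase, and the phenotype is the fixed point reached afterwards. All function and parameter names are hypothetical.

```python
import random


def step(state, weights, threshold=0.0):
    """One synchronous update: gene i is on if its weighted input exceeds the threshold."""
    n = len(state)
    return tuple(
        1 if sum(weights[i][j] * state[j] for j in range(n)) > threshold else 0
        for i in range(n)
    )


def add_noise(state, flip_prob, rng):
    """Flip each gene's expression state independently with probability flip_prob."""
    return tuple(1 - s if rng.random() < flip_prob else s for s in state)


def develop(initial, weights, flip_prob, rng, noisy_steps=10, settle_steps=50):
    """Run noisy development, then settle without noise.

    Returns the fixed-point expression state (the phenotype), or None
    if no fixed point is reached within settle_steps updates.
    """
    state = initial
    for _ in range(noisy_steps):
        state = step(add_noise(state, flip_prob, rng), weights)
    for _ in range(settle_steps):
        nxt = step(state, weights)
        if nxt == state:
            return state
        state = nxt
    return None


def accessible_phenotypes(weights, flip_prob, rng, trials=200):
    """Collect the distinct phenotypes reached from random initial states."""
    n = len(weights)
    found = set()
    for _ in range(trials):
        initial = tuple(rng.randint(0, 1) for _ in range(n))
        phenotype = develop(initial, weights, flip_prob, rng)
        if phenotype is not None:
            found.add(phenotype)
    return found
```

    Comparing `len(accessible_phenotypes(w, 0.0, rng))` with the same call at a nonzero `flip_prob`, over many random weight matrices `w`, is one simple way to ask whether noise widens or narrows the set of reachable phenotypes in this toy setting.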

  20. Phenotype Accessibility and Noise in Random Threshold Gene Regulatory Networks

    PubMed Central

    Feldman, Marcus W.

    2015-01-01

    Evolution requires phenotypic variation in a population of organisms for selection to function. Gene regulatory processes involved in organismal development affect the phenotypic diversity of organisms. Since only a fraction of all possible phenotypes are predicted to be accessed by the end of development, organisms may evolve strategies to use environmental cues and noise-like fluctuations to produce additional phenotypic diversity, and hence to enhance the speed of adaptation. We used a generic model of organismal development, gene regulatory networks, to investigate how different levels of noise on gene expression states (i.e. phenotypes) may affect access to new, unique phenotypes, thereby affecting phenotypic diversity. We studied additional strategies that organisms might adopt to attain larger phenotypic diversity: either by augmenting their genome or the number of gene expression states. This was done for different types of gene regulatory networks that allow for distinct levels of regulatory influence on gene expression or are more likely to give rise to stable phenotypes. We found that if gene expression is binary, increasing noise levels generally decreases phenotype accessibility for all network types studied. If more gene expression states are considered, noise can moderately enhance the speed of discovery if three or four gene expression states are allowed, and if there are enough distinct regulatory networks in the population. These results were independent of the network types analyzed, and were robust to different implementations of noise. Hence, for noise to increase the number of accessible phenotypes in gene regulatory networks, very specific conditions need to be satisfied. If the number of distinct regulatory networks involved in organismal development is large enough, and the acquisition of more genes or fine tuning of their expression states proves costly to the organism, noise can be useful in allowing access to more unique phenotypes.

  1. BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations.

    PubMed

    Wang, Junbai; Batmanov, Kirill

    2015-12-01

    Sequence variations in regulatory DNA regions are known to cause functionally important consequences for gene expression. DNA sequence variations may have an essential role in determining phenotypes and may be linked to disease; however, their identification through analysis of massive genome-wide sequencing data is a great challenge. In this work, a new computational pipeline, a Bayesian method for protein-DNA interaction with binding affinity ranking (BayesPI-BAR), is proposed for quantifying the effect of sequence variations on protein binding. BayesPI-BAR uses biophysical modeling of protein-DNA interactions to predict single nucleotide polymorphisms (SNPs) that cause significant changes in the binding affinity of a regulatory region for transcription factors (TFs). The method includes two new parameters (TF chemical potentials or protein concentrations and direct TF binding targets) that are neglected by previous methods. The new method is verified on 67 known human regulatory SNPs, of which 47 (70%) have predicted true TFs ranked in the top 10. Importantly, the performance of BayesPI-BAR, which uses principal component analysis to integrate multiple predictions from various TF chemical potentials, is found to be better than that of existing programs, such as sTRAP and is-rSNP, when evaluated on the same SNPs. BayesPI-BAR is a publicly available tool and is able to carry out parallelized computation, which helps to investigate a large number of TFs or SNPs and to detect disease-associated regulatory sequence variations in the sea of genome-wide noncoding regions. PMID:26202972
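    The role of a TF chemical potential in scoring a regulatory SNP can be illustrated with a generic biophysical occupancy model. This is a minimal sketch, not the BayesPI-BAR implementation: it assumes a position-specific energy matrix (in units of kT) and the common Fermi-Dirac form for binding probability, and it scores a variant by the log ratio of the best-site occupancy between the alternate and reference sequences. The function names and the toy matrix are hypothetical, and the principal component analysis step described above is not reproduced.

```python
import math


def binding_energy(site, energy_matrix):
    """Sum the per-position energy contributions (in kT) for one candidate site."""
    return sum(energy_matrix[pos][base] for pos, base in enumerate(site))


def occupancy(site, energy_matrix, mu):
    """Probability that the site is bound, given chemical potential mu (in kT)."""
    return 1.0 / (1.0 + math.exp(binding_energy(site, energy_matrix) - mu))


def best_occupancy(seq, energy_matrix, mu):
    """Highest occupancy over all windows of the motif width in seq."""
    width = len(energy_matrix)
    return max(
        occupancy(seq[i:i + width], energy_matrix, mu)
        for i in range(len(seq) - width + 1)
    )


def snp_effect(ref_seq, alt_seq, energy_matrix, mu):
    """Log ratio of best-site occupancy, alternate over reference.

    Negative values mean the variant weakens predicted binding.
    """
    return math.log(
        best_occupancy(alt_seq, energy_matrix, mu)
        / best_occupancy(ref_seq, energy_matrix, mu)
    )


# Toy two-position energy matrix preferring the site "AC" (illustrative values only).
TOY_MATRIX = [
    {"A": 0.0, "C": 2.0, "G": 2.0, "T": 2.0},
    {"A": 2.0, "C": 0.0, "G": 2.0, "T": 2.0},
]
```

    Evaluating `snp_effect` at several values of `mu` shows why the chemical potential matters: the same variant can have a large predicted effect at low TF concentration and almost none when the site is saturated.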

  2. Distinct Functional Constraints Partition Sequence Conservation in a cis-Regulatory Element

    PubMed Central

    Ruvinsky, Ilya

    2011-01-01

    Different functional constraints contribute to different evolutionary rates across genomes. To understand why some sequences evolve faster than others in a single cis-regulatory locus, we investigated function and evolutionary dynamics of the promoter of the Caenorhabditis elegans unc-47 gene. We found that this promoter consists of two distinct domains. The proximal promoter is conserved and is largely sufficient to direct appropriate spatial expression. The distal promoter displays little if any conservation between several closely related nematodes. Despite this divergence, sequences from all species confer robustness of expression, arguing that this function does not require substantial sequence conservation. We showed that even unrelated sequences have the ability to promote robust expression. A prominent feature shared by all of these robustness-promoting sequences is an AT-enriched nucleotide composition consistent with nucleosome depletion. Because general sequence composition can be maintained despite sequence turnover, our results explain how different functional constraints can lead to vastly disparate rates of sequence divergence within a promoter. PMID:21655084

  3. Robustness and Accuracy in Sea Urchin Developmental Gene Regulatory Networks

    PubMed Central

    Ben-Tabou de-Leon, Smadar

    2016-01-01

    Developmental gene regulatory networks robustly control the timely activation of regulatory and differentiation genes. The structure of these networks underlies their capacity to buffer intrinsic and extrinsic noise and maintain embryonic morphology. Here I illustrate how the use of specific architectures by the sea urchin developmental regulatory networks enables the robust control of cell fate decisions. The Wnt-β-catenin signaling pathway patterns the primary embryonic axis while the BMP signaling pathway patterns the secondary embryonic axis in the sea urchin embryo and across bilateria. Interestingly, in the sea urchin in both cases, the signaling pathway that defines the axis directly controls the expression of a set of downstream regulatory genes. I propose that this direct activation of a set of regulatory genes enables a uniform regulatory response and a clear-cut cell fate decision in the endoderm and in the dorsal ectoderm. The specification of the mesodermal pigment cell lineage is activated by Delta signaling that initiates a triple positive feedback loop that locks down the pigment specification state. I propose that the use of compound positive feedback circuitry provides the endodermal cells enough time to turn off mesodermal genes and ensures correct mesoderm vs. endoderm fate decision. Thus, I argue that understanding the control properties of repeatedly used regulatory architectures illuminates their role in embryogenesis and provides possible explanations for their resistance to evolutionary change. PMID:26913048

  4. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

    PubMed Central

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2005-01-01

    We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085

  5. Fungal Genes in Context: Genome Architecture Reflects Regulatory Complexity and Function

    PubMed Central

    Noble, Luke M.; Andrianopoulos, Alex

    2013-01-01

    Gene context determines gene expression, with local chromosomal environment most influential. Comparative genomic analysis is often limited in scope to conserved or divergent gene and protein families, and fungi are well suited to this approach with low functional redundancy and relatively streamlined genomes. We show here that one aspect of gene context, the amount of potential upstream regulatory sequence maintained through evolution, is highly predictive of both molecular function and biological process in diverse fungi. Orthologs with large upstream intergenic regions (UIRs) are strongly enriched in information processing functions, such as signal transduction and sequence-specific DNA binding, and, in the genus Aspergillus, include the majority of experimentally studied, high-level developmental and metabolic transcriptional regulators. Many uncharacterized genes are also present in this class and, by implication, may be of similar importance. Large intergenic regions also share two novel sequence characteristics, currently of unknown significance: they are enriched for plus-strand polypyrimidine tracts and an information-rich, putative regulatory motif that was present in the last common ancestor of the Pezizomycotina. Systematic consideration of gene UIR in comparative genomics, particularly for poorly characterized species, could help reveal organisms’ regulatory priorities. PMID:23699226

  6. DNA sequence of the yeast transketolase gene.

    PubMed

    Fletcher, T S; Kwee, I L; Nakada, T; Largman, C; Martin, B M

    1992-02-18

    Transketolase (EC 2.2.1.1) is the enzyme that, together with aldolase, forms a reversible link between the glycolytic and pentose phosphate pathways. We have cloned and sequenced the transketolase gene from yeast (Saccharomyces cerevisiae). This is the first transketolase gene of the pentose phosphate shunt to be sequenced from any source. The molecular mass of the proposed translated protein is 73,976 daltons, in good agreement with the observed molecular mass of about 75,000 daltons. The 5'-nontranslated region of the gene is similar to other yeast genes. There is no evidence of 5'-splice junctions or branch points in the sequence. The 3'-nontranslated region contains the polyadenylation signal (AATAAA), 80 base pairs downstream from the termination codon. A high degree of homology is found between yeast transketolase and dihydroxyacetone synthase (formaldehyde transketolase) from the yeast Hansenula polymorpha. The overall sequence identity between these two proteins is 37%, with four regions of much greater similarity. The regions from amino acid residues 98-131, 157-182, 410-433, and 474-489 have sequence identities of 74%, 66%, 83%, and 82%, respectively. One of these regions (157-182) includes a possible thiamin pyrophosphate (TPP) binding domain, and another (410-433) may contain the catalytic domain. PMID:1737042
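
    The percent-identity figures quoted above can be illustrated with a small calculation over two pre-aligned sequences. This is a minimal sketch, not the authors' procedure: the function name, the gap convention (columns that are gaps in both sequences are skipped, all other columns count toward the denominator), and the example strings in the usage note are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns of two equal-length strings.

    Assumption: '-' marks a gap; columns where both sequences have a gap
    are ignored, and a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

    For example, the hypothetical aligned fragments "MKT-LV" and "MRTALV" share 4 of 6 compared columns. Other identity definitions (for instance, dividing by the shorter ungapped length) would give different numbers for the same alignment.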

  7. The nucleotide sequence of the mouse immunoglobulin epsilon gene: comparison with the human epsilon gene sequence.

    PubMed Central

    Ishida, N; Ueda, S; Hayashida, H; Miyata, T; Honjo, T

    1982-01-01

    We have determined the nucleotide sequence of the immunoglobulin epsilon gene cloned from newborn mouse DNA. The epsilon gene sequence allows prediction of the amino acid sequence of the constant region of the epsilon chain and comparison of it with sequences of the human epsilon and other mouse immunoglobulin genes. The epsilon gene was shown to be under the weakest selection pressure at the protein level among the immunoglobulin genes although the divergence at the synonymous position is similar. Our results suggest that the epsilon gene may be dispensable, which is in accord with the fact that IgE has only obscure roles in the immune defense system but has an undesirable role as a mediator of hypersensitivity. The sequence data suggest that the human and murine epsilon genes were derived from different ancestors duplicated a long time ago. The amino acid sequence of the epsilon chain is more homologous to those of the gamma chains than the other mouse heavy chains. Two membrane exons, separated by an 80-base intron, were identified 1.7 kb 3' to the CH4 domain of the epsilon gene and shown to conserve a hydrophobic portion similar to those of other heavy chain genes. RNA blot hybridization showed that the epsilon membrane exons are transcribed into two species of mRNA in an IgE hybridoma. PMID:6329728

  8. The molecular and gene regulatory signature of a neuron

    PubMed Central

    Hobert, Oliver; Carrera, Inés; Stefanakis, Nikolaos

    2010-01-01

    Neuron-type specific gene batteries define the morphological and functional diversity of cell types in the nervous system. Here, we discuss the composition of neuron-type specific gene batteries and illustrate gene regulatory strategies employed by distinct organisms from C. elegans to higher vertebrates, which are instrumental in determining the unique gene expression profile and molecular composition of individual neuronal cell types. Based on principles learned from prokaryotic gene regulation, we argue that neuronal, terminal gene batteries are functionally grouped into parallel acting “regulons”. The theoretical concepts discussed here provide testable hypotheses for future experimental analysis into the exact gene regulatory mechanisms that are employed in the generation of neuronal diversity and identity. PMID:20663572

  9. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    SciTech Connect

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  10. Gene regulatory networks modelling using a dynamic evolutionary hybrid

    PubMed Central

    2010-01-01

    Background: Inference of gene regulatory networks is a key goal in the quest for understanding fundamental cellular processes and revealing underlying relations among genes. With the availability of gene expression data, computational methods aiming at regulatory network reconstruction are facing challenges posed by the data's high dimensionality, temporal dynamics or measurement noise. We propose an approach based on a novel multi-layer evolutionary trained neuro-fuzzy recurrent network (ENFRN) that is able to select potential regulators of target genes and describe their regulation type. Results: The recurrent, self-organizing structure and evolutionary training of our network yield an optimized pool of regulatory relations, while its fuzzy nature avoids noise-related problems. Furthermore, we are able to assign scores for each regulation, highlighting the confidence in the retrieved relations. The approach was tested by applying it to several benchmark datasets of yeast, managing to acquire biologically validated relations among genes. Conclusions: The results demonstrate the effectiveness of the ENFRN in retrieving biologically valid regulatory relations and providing meaningful insights for better understanding the dynamics of gene regulatory networks. The algorithms and methods described in this paper have been implemented in a Matlab toolbox and are available from: http://bioserver-1.bioacademy.gr/DataRepository/Project_ENFRN_GRN/. PMID:20298548

  11. Understanding the Role of Housekeeping and Stress-Related Genes in Transcription-Regulatory Networks

    NASA Astrophysics Data System (ADS)

    Heath, Allison; Kavraki, Lydia; Balázsi, Gábor

    2008-03-01

    Despite the increasing number of completely sequenced genomes, much remains to be learned about how living cells process environmental information and respond to changes in their surroundings. Accumulating evidence indicates that eukaryotic and prokaryotic genes can be classified in two distinct categories that we will call class I and class II. Class I genes are housekeeping genes, often characterized by stable, noise resistant expression levels. In contrast, class II genes are stress-related genes and often have noisy, unstable expression levels. In this work we analyze the large scale transcription-regulatory networks (TRN) of E. coli and S. cerevisiae and preliminary data on H. sapiens. We find that stable, housekeeping genes (class I) are preferentially utilized as transcriptional inputs while stress related, unstable genes (class II) are utilized as transcriptional integrators. This might be the result of convergent evolution that placed the appropriate genes in the appropriate locations within transcriptional networks according to some fundamental principles that govern cellular information processing.

  12. Time-Delayed Models of Gene Regulatory Networks

    PubMed Central

    Parmar, K.; Blyuss, K. B.; Kyrychko, Y. N.; Hogan, S. J.

    2015-01-01

    We discuss different mathematical models of gene regulatory networks as relevant to the onset and development of cancer. After discussion of alternative modelling approaches, we use a paradigmatic two-gene network to focus on the role played by time delays in the dynamics of gene regulatory networks. We contrast the dynamics of the reduced model arising in the limit of fast mRNA dynamics with that of the full model. The review concludes with the discussion of some open problems. PMID:26576197
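
    To make the role of a time delay concrete, the following sketch integrates a generic two-gene feedback loop with a fixed delay using the forward Euler method. It is not the model from the review above: the equations (gene x activates y, y represses x through a Hill function, both after the same delay tau), all parameter names and default values, and the constant initial history are assumptions chosen only for illustration.

```python
def simulate_delayed_feedback(tau=2.0, beta=1.0, alpha=1.0, gamma=0.5,
                              K=0.5, n=4, dt=0.01, t_end=50.0,
                              x0=0.1, y0=0.1):
    """Forward-Euler integration of a hypothetical delayed two-gene loop:

        dx/dt = beta / (1 + (y(t - tau) / K)**n) - gamma * x(t)
        dy/dt = alpha * x(t - tau)               - gamma * y(t)

    Returns two lists (x, y) sampled every dt from t = 0 to t = t_end.
    """
    lag = int(round(tau / dt))      # delay expressed in time steps
    steps = int(round(t_end / dt))
    # Assumed history: both variables are constant on [-tau, 0].
    x = [x0] * (lag + 1)
    y = [y0] * (lag + 1)
    for _ in range(steps):
        x_now, y_now = x[-1], y[-1]
        x_del, y_del = x[-1 - lag], y[-1 - lag]   # values tau ago
        dx = beta / (1.0 + (y_del / K) ** n) - gamma * x_now
        dy = alpha * x_del - gamma * y_now
        x.append(x_now + dt * dx)
        y.append(y_now + dt * dy)
    return x[lag:], y[lag:]
```

    Running the function with different values of tau and plotting the returned trajectories is one simple way to explore how the delay changes the dynamics. A dedicated delay-differential-equation solver would be preferable for quantitative work.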

  13. Bayesian Nonlinear Model Selection for Gene Regulatory Networks

    PubMed Central

    Ni, Yang; Stingo, Francesco C.; Baladandayuthapani, Veerabhadran

    2015-01-01

    Summary: Gene regulatory networks represent the regulatory relationships between genes and their products and are important for exploring and defining the underlying biological processes of cellular systems. We develop a novel framework to recover the structure of nonlinear gene regulatory networks using semiparametric spline-based directed acyclic graphical models. Our use of splines allows the model to have both flexibility in capturing nonlinear dependencies as well as control of overfitting via shrinkage, using mixed model representations of penalized splines. We propose a novel discrete mixture prior on the smoothing parameter of the splines that allows for simultaneous selection of both linear and nonlinear functional relationships as well as inducing sparsity in the edge selection. Using simulation studies, we demonstrate the superior performance of our methods in comparison with several existing approaches in terms of network reconstruction and functional selection. We apply our methods to a gene expression dataset in glioblastoma multiforme, which reveals several interesting and biologically relevant nonlinear relationships. PMID:25854759

  14. Nemertean Toxin Genes Revealed through Transcriptome Sequencing

    PubMed Central

    Whelan, Nathan V.; Kocot, Kevin M.; Santos, Scott R.; Halanych, Kenneth M.

    2014-01-01

    Nemerteans are one of few animal groups that have evolved the ability to utilize toxins for both defense and subduing prey, but little is known about specific nemertean toxins. In particular, no study has identified specific toxin genes even though peptide toxins are known from some nemertean species. Information about toxin genes is needed to better understand evolution of toxins across animals and possibly provide novel targets for pharmaceutical and industrial applications. We sequenced and annotated transcriptomes of two free-living and one commensal nemertean and annotated an additional six publicly available nemertean transcriptomes to identify putative toxin genes. Approximately 63–74% of predicted open reading frames in each transcriptome were annotated with gene names, and all species had similar percentages of transcripts annotated with each higher-level GO term. Every nemertean analyzed possessed genes with high sequence similarities to known animal toxins including those from stonefish, cephalopods, and sea anemones. One toxin-like gene found in all nemerteans analyzed had high sequence similarity to Plancitoxin-1, a DNase II hepatotoxin that may function well at low pH, which suggests that the acidic body walls of some nemerteans could work to enhance the efficacy of protein toxins. The highest number of toxin-like genes found in any one species was seven and the lowest was three. The diversity of toxin-like nemertean genes found here is greater than previously documented, and these animals are likely an ideal system for exploring toxin evolution and industrial applications of toxins. PMID:25432940

  15. Nemertean toxin genes revealed through transcriptome sequencing.

    PubMed

    Whelan, Nathan V; Kocot, Kevin M; Santos, Scott R; Halanych, Kenneth M

    2014-12-01

    Nemerteans are one of few animal groups that have evolved the ability to utilize toxins for both defense and subduing prey, but little is known about specific nemertean toxins. In particular, no study has identified specific toxin genes even though peptide toxins are known from some nemertean species. Information about toxin genes is needed to better understand evolution of toxins across animals and possibly provide novel targets for pharmaceutical and industrial applications. We sequenced and annotated transcriptomes of two free-living and one commensal nemertean and annotated an additional six publicly available nemertean transcriptomes to identify putative toxin genes. Approximately 63-74% of predicted open reading frames in each transcriptome were annotated with gene names, and all species had similar percentages of transcripts annotated with each higher-level GO term. Every nemertean analyzed possessed genes with high sequence similarities to known animal toxins including those from stonefish, cephalopods, and sea anemones. One toxin-like gene found in all nemerteans analyzed had high sequence similarity to Plancitoxin-1, a DNase II hepatotoxin that may function well at low pH, which suggests that the acidic body walls of some nemerteans could work to enhance the efficacy of protein toxins. The highest number of toxin-like genes found in any one species was seven and the lowest was three. The diversity of toxin-like nemertean genes found here is greater than previously documented, and these animals are likely an ideal system for exploring toxin evolution and industrial applications of toxins. PMID:25432940

  16. Functional effects of a natural polymorphism in the transcriptional regulatory sequence of HLA-DQB1.

    PubMed Central

    Beaty, J S; West, K A; Nepom, G T

    1995-01-01

    DNA sequence polymorphism in the genes encoding HLA class II proteins accounts for allelic diversity in antigen recognition and presentation and, thus, in the role of these cell surface glycoproteins as determinants of the scope of the T-cell repertoire. In addition, sequence polymorphism in the promoter-proximal transcriptional regulatory regions of these genes has been described, particularly for the HLA-DQB1 locus, where these differences may contribute to variation in locus- and allele-specific expression. In this study, we measured the effect of such regulatory sequence polymorphism on the expression of endogenous alleles of DQB1 in heterozygous cells. Quantitative reverse transcriptase-mediated PCR analysis showed that expression of the DQB1*0301 allele responded more rapidly to gamma interferon induction than that of DQB1*0302. We have analyzed functional effects of a prominent allelic polymorphism that consists of a TG dinucleotide present between the W and X1 consensus elements in the DQB1*0302 allele but missing in the DQB1*0301 allele. The dominant effect of this polymorphism was to introduce a variation in the spacing between the W and X1 elements of these two alleles. A secondary compensatory effect was specific for the TG dinucleotide itself, which was essential for the binding of a nuclear protein complex to the *0302 regulatory region immediately 5' of the X1 element. Derivatives of the DQB1 5' regulatory region were used to drive expression of the chloramphenicol acetyltransferase gene in transient transfections of human B-lymphoblastoid and gamma interferon-treated melanoma cell lines, demonstrating that the additional spacing between the W and X1 elements caused by the presence of the TG dinucleotide in the *0302 allele resulted in reduced expression compared with that driven by the *0301 fragment; this difference overshadowed an up-regulating effect on expression which corresponded to the binding of the TG-dependent nuclear protein complex. 

  17. A multistep bioinformatic approach detects putative regulatory elements in gene promoters

    PubMed Central

    Bortoluzzi, Stefania; Coppe, Alessandro; Bisognin, Andrea; Pizzi, Cinzia; Danieli, Gian Antonio

    2005-01-01

    Background: Searching for approximate patterns in large promoter sequences frequently produces an exceedingly high number of results. Our aim was to exploit biological knowledge for definition of a sheltered search space and of appropriate search parameters, in order to develop a method for identification of a tractable number of sequence motifs. Results: Novel software (COOP) was developed for extraction of sequence motifs, based on clustering of exact or approximate patterns according to the frequency of their overlapping occurrences. Genomic sequences of 1 kb upstream of 91 genes differentially expressed and/or encoding proteins with relevant function in adult human retina were analyzed. Methodology and results were tested by analysing 1,000 groups of putatively unrelated sequences, randomly selected among 17,156 human gene promoters. When applied to a sample of human promoters, the method identified 279 putative motifs frequently occurring in retina promoter sequences. Most of them are localized in the proximal portion of promoters, less variable in central region than in lateral regions and similar to known regulatory sequences. COOP software and reference manual are freely available upon request to the Authors. Conclusion: The approach described in this paper seems effective for identifying a tractable number of sequence motifs with putative regulatory role. PMID:15904489

  18. Cotyledon nuclear proteins bind to DNA fragments harboring regulatory elements of phytohemagglutinin genes.

    PubMed Central

    Riggs, C D; Voelker, T A; Chrispeels, M J

    1989-01-01

    The effects of deleting DNA sequences upstream from the phytohemagglutinin-L gene of Phaseolus vulgaris have been examined with respect to the level of gene product produced in the seeds of transgenic tobacco. Our studies indicate that several upstream regions quantitatively modulate expression. Between -1000 and -675, a negative regulatory element reduces expression approximately threefold relative to shorter deletion mutants that do not contain this region. Positive regulatory elements lie between -550 and -125 and, compared with constructs containing only 125 base pairs of upstream sequences (-125), the presence of these two regions can be correlated with a 25-fold and a 200-fold enhancement of phytohemagglutinin-L levels. These experiments were complemented by gel retardation assays, which demonstrated that two of the three regions bind cotyledon nuclear proteins from mid-mature seeds. One of the binding sites maps near a DNA sequence that is highly homologous to protein binding domains located upstream from the soybean seed lectin and Kunitz trypsin inhibitor genes. Competition experiments demonstrated that the upstream regions of a bean beta-phaseolin gene, the soybean seed lectin gene, and an oligonucleotide from the upstream region of the trypsin inhibitor gene can compete differentially for factor binding. We suggest that these legume genes may be regulated in part by evolutionarily conserved protein/DNA interactions. PMID:2535513

  19. Highly recurring sequence elements identified in eukaryotic DNAs by computer analysis are often homologous to regulatory sequences or protein binding sites.

    PubMed Central

    Bodnar, J W; Ward, D C

    1987-01-01

    We have used computer assisted dot matrix and oligonucleotide frequency analyses to identify highly recurring sequence elements of 7-11 base pairs in eukaryotic genes and viral DNAs. Such elements are found much more frequently than expected, often with an average spacing of a few hundred base pairs. Furthermore, the most abundant repetitive elements observed in the ovalbumin locus, the beta-globin gene cluster, the metallothionein gene and the viral genomes of SV40, polyoma, Herpes simplex-1 and Mouse Mammary Tumor Virus were sequences shown previously to be protein binding sites or sequences important for regulating gene expression. These sequences were present in both exons and introns as well as promoter regions. These observations suggest that such sequences are often highly overrepresented within the specific gene segments with which they are associated. Computer analysis of other genetic units, including viral genomes and oncogenes, has identified a number of highly recurring sequence elements that could serve similar regulatory or protein-binding functions. A model for the role of such reiterated sequence elements in DNA organization and function is presented. PMID:3822840
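
    The oligonucleotide frequency analysis described above can be approximated by counting overlapping k-mers and comparing each count with a simple expectation. This is a minimal sketch rather than the authors' program: the function names, the uniform-base-composition null model, and the `min_ratio` threshold are assumptions introduced here.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count every overlapping k-mer in seq, skipping windows that
    contain characters other than A, C, G, or T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def overrepresented(seq, k, min_ratio=5.0):
    """Return {k-mer: count} for k-mers observed at least min_ratio times
    more often than expected if all four bases were equally likely."""
    counts = count_kmers(seq, k)
    n_windows = max(len(seq) - k + 1, 0)
    expected = n_windows / (4 ** k)   # expected count per k-mer
    if expected == 0:
        return {}
    return {kmer: c for kmer, c in counts.items()
            if c / expected >= min_ratio}
```

    Looping `k` over 7 to 11 would cover the element lengths mentioned in the abstract. A realistic analysis would replace the uniform null model with one that reflects the actual base or dinucleotide composition of the sequence.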

  20. 'In silico expression analysis', a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences.

    PubMed

    Bolívar, Julio C; Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated 'in silico expression analysis' was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the 'in silico expression analysis' resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the 'in silico expression analysis' predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. DATABASE URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366
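
    The first step of such a tool, finding which genes carry a submitted cis-sequence in their promoter, can be sketched as a search on both strands. This is only an illustration and not the PathoPlant implementation: the function names and the dictionary-of-promoters input format are assumptions, and the expression-data comparison step is not shown.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string made of A, C, G, and T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_motif(promoter, motif):
    """Return sorted (start, strand) hits of motif in promoter.

    start is a 0-based position on the given (forward) strand;
    strand is '+' for the motif itself and '-' for its reverse complement.
    """
    promoter = promoter.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()),
                            ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)  # allow overlaps
    return sorted(hits)


def genes_with_motif(promoters, motif):
    """Names of genes (keys of the promoters dict) with at least one hit."""
    return sorted(gene for gene, seq in promoters.items()
                  if find_motif(seq, motif))
```

    With the WT-box CGACTTTT from the abstract as the motif, the reverse-complement pattern searched on the minus strand is AAAAGTCG. A palindromic motif would be reported once per strand at the same position.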

  1. ‘In silico expression analysis’, a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences

    PubMed Central

    Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated ‘in silico expression analysis’ was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the ‘in silico expression analysis’ resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the ‘in silico expression analysis’ predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. Database URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366

  2. Regulatory links between imprinted genes: evolutionary predictions and consequences

    PubMed Central

    Patten, Manus M.; Cowley, Michael; Oakey, Rebecca J.; Feil, Robert

    2016-01-01

    Genomic imprinting is essential for development and growth and plays diverse roles in physiology and behaviour. Imprinted genes have traditionally been studied in isolation or in clusters with respect to cis-acting modes of gene regulation, both from a mechanistic and evolutionary point of view. Recent studies in mammals, however, reveal that imprinted genes are often co-regulated and are part of a gene network involved in the control of cellular proliferation and differentiation. Moreover, a subset of imprinted genes acts in trans on the expression of other imprinted genes. Numerous studies have modulated levels of imprinted gene expression to explore phenotypic and gene regulatory consequences. Increasingly, the applied genome-wide approaches highlight how perturbation of one imprinted gene may affect other maternally or paternally expressed genes. Here, we discuss these novel findings and consider evolutionary theories that offer a rationale for such intricate interactions among imprinted genes. An evolutionary view of these trans-regulatory effects provides a novel interpretation of the logic of gene networks within species and has implications for the origin of reproductive isolation between species. PMID:26842569

  3. Full-length minor ampullate spidroin gene sequence.

    PubMed

    Chen, Gefei; Liu, Xiangqin; Zhang, Yunlong; Lin, Senzhu; Yang, Zijiang; Johansson, Jan; Rising, Anna; Meng, Qing

    2012-01-01

    Spider silk includes seven protein based fibers and glue-like substances produced by glands in the spider's abdomen. Minor ampullate silk is used to make the auxiliary spiral of the orb-web and also for wrapping prey, has a high tensile strength and does not supercontract in water. So far, only partial cDNA sequences have been obtained for minor ampullate spidroins (MiSps). Here we describe the first MiSp full-length gene sequence from the spider species Araneus ventricosus, using a multidimensional PCR approach. Comparative analysis of the sequence reveals regulatory elements, as well as unique spidroin gene and protein architecture including the presence of an unusually large intron. The spliced full-length transcript of MiSp gene is 5440 bp in size and encodes 1766 amino acid residues organized into conserved nonrepetitive N- and C-terminal domains and a central predominantly repetitive region composed of four units that are iterated in a non regular manner. The repeats are more conserved within A. ventricosus MiSp than compared to repeats from homologous proteins, and are interrupted by two nonrepetitive spacer regions, which have 100% identity even at the nucleotide level. PMID:23251707

  4. Full-Length Minor Ampullate Spidroin Gene Sequence

    PubMed Central

    Chen, Gefei; Liu, Xiangqin; Zhang, Yunlong; Lin, Senzhu; Yang, Zijiang; Johansson, Jan; Rising, Anna; Meng, Qing

    2012-01-01

    Spider silk includes seven protein based fibers and glue-like substances produced by glands in the spider's abdomen. Minor ampullate silk is used to make the auxiliary spiral of the orb-web and also for wrapping prey, has a high tensile strength and does not supercontract in water. So far, only partial cDNA sequences have been obtained for minor ampullate spidroins (MiSps). Here we describe the first MiSp full-length gene sequence from the spider species Araneus ventricosus, using a multidimensional PCR approach. Comparative analysis of the sequence reveals regulatory elements, as well as unique spidroin gene and protein architecture including the presence of an unusually large intron. The spliced full-length transcript of MiSp gene is 5440 bp in size and encodes 1766 amino acid residues organized into conserved nonrepetitive N- and C-terminal domains and a central predominantly repetitive region composed of four units that are iterated in a non regular manner. The repeats are more conserved within A. ventricosus MiSp than compared to repeats from homologous proteins, and are interrupted by two nonrepetitive spacer regions, which have 100% identity even at the nucleotide level. PMID:23251707

  5. The Inferred Cardiogenic Gene Regulatory Network in the Mammalian Heart

    PubMed Central

    Li, Xing; Thiagarajan, Raghuram; Nelson, Timothy J.; Tomita-Mitchell, Aoy; Beard, Daniel A.

    2014-01-01

    Cardiac development is a complex, multiscale process encompassing cell fate adoption, differentiation and morphogenesis. To elucidate pathways underlying this process, a recently developed algorithm to reverse engineer gene regulatory networks was applied to time-course microarray data obtained from the developing mouse heart. Approximately 200 genes of interest were input into the algorithm to generate putative network topologies that are capable of explaining the experimental data via model simulation. To cull specious network interactions, thousands of putative networks are merged and filtered to generate scale-free, hierarchical networks that are statistically significant and biologically relevant. The networks are validated with known gene interactions and used to predict regulatory pathways important for the developing mammalian heart. Area under the precision-recall curve and receiver operating characteristic curve are 9% and 58%, respectively. Of the top 10 ranked predicted interactions, 4 have already been validated. The algorithm is further tested using a network enriched with known interactions and another depleted of them. The inferred networks contained more interactions for the enriched network versus the depleted network. In all test cases, maximum performance of the algorithm was achieved when the purely data-driven method of network inference was combined with a data-independent, functional-based association method. Lastly, the network generated from the list of approximately 200 genes of interest was expanded using gene-profile uniqueness metrics to include approximately 900 additional known mouse genes and to form the most likely cardiogenic gene regulatory network. The resultant network supports known regulatory interactions and contains several novel cardiogenic regulatory interactions. The method outlined herein provides an informative approach to network inference and leads to clear testable hypotheses related to gene regulation. PMID:24971943
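
    The abstract above scores inferred interactions by area under the precision-recall and ROC curves. The sketch below illustrates how such scores can be computed for a ranked list of predicted edges; it is not the authors' code. The function names and the toy scores are hypothetical, and average precision is used as a common stand-in for the area under the precision-recall curve.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based ROC AUC: chance that a known edge outranks a non-edge.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Mean of the precision values at the rank of each known edge."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


# Hypothetical confidence scores for five predicted edges;
# label 1 marks an edge that matches a known interaction.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 0, 1, 0, 0]
```

    Known interactions are usually a small fraction of all candidate edges, so the precision-recall score is the more demanding of the two. That may help explain why the reported 9% and 58% differ so much.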

  6. Gene regulatory networks elucidating Huanglongbing disease mechanisms

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next-generation sequencing was exploited to gain deeper insight into the response to infection by Candidatus Liberibacter asiaticus (CaLas), especially the immune dysregulation and metabolic dysfunction caused by source-sink disruption. Previous fruit transcriptome data were compared with additional...

  7. Gene Regulatory Evolution During Speciation in a Songbird.

    PubMed

    Davidson, John H; Balakrishnan, Christopher N

    2016-01-01

    Over the last decade, tremendous progress has been made toward a comparative understanding of gene regulatory evolution. However, we know little about how gene regulation evolves in birds, and how divergent genomes interact in their hybrids. Because of the unique features of birds - female heterogamety, a highly conserved karyotype, and the slow evolution of reproductive incompatibilities - an understanding of regulatory evolution in birds is critical to a comprehensive understanding of regulatory evolution and its implications for speciation. Using a novel complement of analyses of replicated RNA-seq libraries, we demonstrate abundant divergence in brain gene expression between zebra finch (Taeniopygia guttata) subspecies. By comparing parental populations and their F1 hybrids, we also show that gene misexpression is relatively rare among brain-expressed transcripts in male birds. If this pattern is consistent across tissues and sexes, it may partially explain the slow buildup of postzygotic reproductive isolation observed in birds relative to other taxa. Although we expected that the action of genetic drift on the island-dwelling zebra finch subspecies would be manifested in a higher rate of trans regulatory divergence, we found that most divergence was in cis regulation, following a pattern commonly observed in other taxa. Thus, our study highlights both unique and shared features of avian regulatory evolution. PMID:26976438

  8. Gene Regulatory Evolution During Speciation in a Songbird

    PubMed Central

    Davidson, John H.; Balakrishnan, Christopher N.

    2016-01-01

    Over the last decade, tremendous progress has been made toward a comparative understanding of gene regulatory evolution. However, we know little about how gene regulation evolves in birds, and how divergent genomes interact in their hybrids. Because of the unique features of birds – female heterogamety, a highly conserved karyotype, and the slow evolution of reproductive incompatibilities – an understanding of regulatory evolution in birds is critical to a comprehensive understanding of regulatory evolution and its implications for speciation. Using a novel complement of analyses of replicated RNA-seq libraries, we demonstrate abundant divergence in brain gene expression between zebra finch (Taeniopygia guttata) subspecies. By comparing parental populations and their F1 hybrids, we also show that gene misexpression is relatively rare among brain-expressed transcripts in male birds. If this pattern is consistent across tissues and sexes, it may partially explain the slow buildup of postzygotic reproductive isolation observed in birds relative to other taxa. Although we expected that the action of genetic drift on the island-dwelling zebra finch subspecies would be manifested in a higher rate of trans regulatory divergence, we found that most divergence was in cis regulation, following a pattern commonly observed in other taxa. Thus, our study highlights both unique and shared features of avian regulatory evolution. PMID:26976438

  9. Pl-Bh, an Anthocyanin Regulatory Gene of Maize That Leads to Variegated Pigmentation

    PubMed Central

    Cocciolone, S. M.; Cone, K. C.

    1993-01-01

    Anthocyanins are purple pigments that can be produced in virtually all parts of the maize plant. The spatial distribution of anthocyanin synthesis is dictated by the organ-specific expression of a few regulatory genes that control the transcription of the structural genes. The regulatory genes are grouped into families based on functional identity and DNA sequence similarity. The C1/Pl gene family consists of C1, which controls pigmentation of the kernel, and Pl, which controls pigmentation of the vegetative and floral organs. We have determined the relationship of another gene, Blotched (Bh), to the C1 gene family. Bh was originally described as a gene that conditions blotches of pigmentation in kernels homozygous for recessive c1, suggesting that Bh could functionally replace C1 in the kernel. Our genetic and molecular analyses indicate that Bh is an allele of Pl, that we designate Pl-Bh. Pl-Bh differs from wild-type Pl alleles in two respects. In contrast to the uniform pigmentation observed in plants carrying Pl, the pattern of pigmentation in plants carrying Pl-Bh is variegated. Pl-Bh leads to variegated pigmentation in virtually all tissues of the plant, including the kernel, an organ not pigmented by other Pl alleles. To address the molecular basis for the unusual pattern of expression of Pl-Bh, we cloned and sequenced the gene. The nucleotide sequence of Pl-Bh showed only a single base-pair difference from that of Pl. However, genomic DNA sequences associated with Pl-Bh were found to be hypermethylated relative to the same sequences around the wild-type Pl allele. The methylation was inversely correlated with Pl mRNA levels in variegated plant tissues. Thus, we conclude that DNA methylation may play a role in regulating Pl-Bh expression. PMID:7694886

  10. Functional Evolution of cis-Regulatory Modules at a Homeotic Gene in Drosophila

    PubMed Central

    Schiller, Benjamin J.; Bae, Esther; Tran, Diana A.; Shur, Andrey S.; Allen, John M.; Rau, Christoph; Bender, Welcome; Fisher, William W.; Celniker, Susan E.; Drewell, Robert A.

    2009-01-01

    It is a long-held belief in evolutionary biology that the rate of molecular evolution for a given DNA sequence is inversely related to the level of functional constraint. This belief holds true for the protein-coding homeotic (Hox) genes originally discovered in Drosophila melanogaster. Expression of the Hox genes in Drosophila embryos is essential for body patterning and is controlled by an extensive array of cis-regulatory modules (CRMs). How the regulatory modules functionally evolve in different species is not clear. A comparison of the CRMs for the Abdominal-B gene from different Drosophila species reveals relatively low levels of overall sequence conservation. However, embryonic enhancer CRMs from other Drosophila species direct transgenic reporter gene expression in the same spatial and temporal patterns during development as their D. melanogaster orthologs. Bioinformatic analysis reveals the presence of short conserved sequences within defined CRMs, representing gap and pair-rule transcription factor binding sites. One predicted binding site for the gap transcription factor KRUPPEL in the IAB5 CRM was found to be altered in Superabdominal (Sab) mutations. In Sab mutant flies, the third abdominal segment is transformed into a copy of the fifth abdominal segment. A model for KRUPPEL-mediated repression at this binding site is presented. These findings challenge our current understanding of the relationship between sequence evolution at the molecular level and functional activity of a CRM. While the overall sequence conservation at Drosophila CRMs is not distinctive from neighboring genomic regions, functionally critical transcription factor binding sites within embryonic enhancer CRMs are highly conserved. These results have implications for understanding mechanisms of gene expression during embryonic development, enhancer function, and the molecular evolution of eukaryotic regulatory modules. PMID:19893611

  11. Gene Regulatory Networks Activated during Chronic Tuberculosis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Chronic tuberculosis represents a burden for most of the world’s population. Several genes were found to be up-regulated at the late stage of chronic tuberculosis when a DNA microarray protocol was used to analyze murine tuberculosis. Rv0348 is a potential transcriptional regulator that is highly expresse...

  12. Cloning, sequencing, and expression of bacteriophage BF23 late genes 24 and 25 encoding tail proteins.

    PubMed Central

    Nakayama, S; Kaneko, T; Ishimaru, H; Moriwaki, H; Mizobuchi, K

    1994-01-01

    Two bacteriophage BF23 late genes, genes 24 and 25, were isolated on a 7.4-kb PstI fragment from the phage DNA, and their nucleotide sequences were determined. Gene 24 encodes a minor tail protein with the expected M(r) of 34,309, and gene 25 located 4 bp upstream of gene 24 encodes a major tail protein with the expected M(r) of 50,329. When total cellular RNA isolated from either phage-infected cells or cells bearing the cloned genes was analyzed by the primer extension method using the primers specific to either gene 25 or gene 24, we identified a possible late gene promoter, designated P25, in the 5'-flanking region of gene 25. This promoter was similar in structure to Escherichia coli promoters for sigma 70. Studies of the translational gene 25- and gene 24-lacZ fusions in the cloned gene system revealed that the promoter P25 was responsible for the expression of both genes 25 and 24 even in the absence of the regulatory genes which were absolutely required for late gene expression in the normal phage-infected cells. These results indicate that the two genes constitute an operon under the control of P25 and that the regulatory gene products of BF23 do not participate directly in specifying the late gene promoter. PMID:7961500

  13. A cis-regulatory sequence from a short intergenic region gives rise to a strong microbe-associated molecular pattern-responsive synthetic promoter.

    PubMed

    Lehmeyer, Mona; Hanko, Erik K R; Roling, Lena; Gonzalez, Lilian; Wehrs, Maren; Hehl, Reinhard

    2016-06-01

    The high gene density in Arabidopsis thaliana leaves only relatively short intergenic regions for potential cis-regulatory sequences. To learn more about the regulation of genes harbouring only very short upstream intergenic regions, this study investigates a recently identified novel microbe-associated molecular pattern (MAMP)-responsive cis-sequence located within the 101 bp long intergenic region upstream of the At1g13990 gene. It is shown that the cis-regulatory sequence is sufficient for MAMP-responsive reporter gene activity in the context of its native promoter. The 3' UTR of the upstream gene has a quantitative effect on gene expression. In the context of a synthetic promoter, the cis-sequence is shown to achieve a strong increase in reporter gene activity as a monomer, dimer and tetramer. Mutation analysis of the cis-sequence determined the specific nucleotides required for gene expression activation. In transgenic A. thaliana the synthetic promoter harbouring a tetramer of the cis-sequence not only drives strong pathogen-responsive reporter gene expression but also shows a high background activity. The results of this study contribute to our understanding of how genes with very short upstream intergenic regions are regulated and how these regions can serve as a source for MAMP-responsive cis-sequences for synthetic promoter design. PMID:26833485

  14. The structure and function of the regulatory elements of the Escherichia coli uvrB gene.

    PubMed Central

    van den Berg, E; Zwetsloot, J; Noordermeer, I; Pannekoek, H; Dekker, B; Dijkema, R; van Ormondt, H

    1981-01-01

    The construction and properties of recombinant plasmids carrying the Escherichia coli uvrB gene, including its transcriptional and translational regulatory elements, are reported. The DNA sequence of the region, which governs the expression of the uvrB gene, has been determined. Within this sequence two non-overlapping DNA segments match the model sequence for Escherichia coli promoters (1). The '-10 regions' and the '-35 regions' of the proposed uvrB promoters are, respectively, 5'TAAAAT (P1), 5'TATAAT (P2) and 5'TTGGCA (P1), 5'GTGATG (P2). The existence and the position of these promoters have been established by elimination of one promoter (P2), using molecular cloning procedures, by length measurements of in vitro synthesized 'run-off' transcripts and by protection of the uvrB regulatory region against S1 nuclease digestion using in vivo made RNA. Potential sites of interaction within the uvrB regulatory region with regulatory proteins, such as the LexA protein (2) and the UvrC protein (3) are discussed. PMID:6273801
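
    The hexamers quoted in the abstract above can be compared position by position with a promoter consensus. The sketch below assumes that the "model sequence" is the widely cited sigma-70 consensus, TTGACA at -35 and TATAAT at -10. The abstract does not spell the model out, so treat the consensus strings and all names in the code as assumptions made for illustration.

```python
# Assumed consensus hexamers (not stated in the abstract).
CONSENSUS = {"-35": "TTGACA", "-10": "TATAAT"}

# Hexamers for the two proposed uvrB promoters, as quoted in the abstract.
PROPOSED = {
    "P1": {"-35": "TTGGCA", "-10": "TAAAAT"},
    "P2": {"-35": "GTGATG", "-10": "TATAAT"},
}


def count_matches(hexamer: str, consensus: str) -> int:
    """Number of positions at which the hexamer agrees with the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus.upper()))


def score_promoters() -> dict:
    """Matches to the assumed consensus for each promoter and region."""
    return {
        name: {region: count_matches(seq, CONSENSUS[region])
               for region, seq in regions.items()}
        for name, regions in PROPOSED.items()
    }


if __name__ == "__main__":
    for name, scores in score_promoters().items():
        print(name, scores)
```

    A plain match count ignores the spacing between the two hexamers and the unequal weight of individual positions, so it gives only a rough first look at promoter similarity.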

  15. Identification and characterization of the afsR homologue regulatory gene from Streptomyces peucetius ATCC 27952.

    PubMed

    Parajuli, Niranjan; Viet, Hung Trinh; Ishida, Kenji; Tong, Hang Thi; Lee, Hei Chan; Liou, Kwangkyoung; Sohng, Jae Kyung

    2005-01-01

    We have isolated an afsR homologue, called afsR-p, through genome analysis of Streptomyces peucetius ATCC 27952. AfsR-p shares 60% sequence identity with AfsR from Streptomyces coelicolor A3(2). afsR-p was expressed under the control of the ermE* promoter in its hosts S. peucetius, Streptomyces lividans TK 24, Streptomyces clavuligerus and Streptomyces griseus. We observed overproduction of doxorubicin (4-fold) in S. peucetius, gamma-actinorhodin (2.6-fold) in S. lividans, clavulanic acid (1.5-fold) in S. clavuligerus and streptomycin (slight) in S. griseus. Overproduction was due to expression of the gene in these strains as compared to the wild-type strains harboring the vector only. Comparative study of the expression of afsR-p revealed that regulatory networking in Streptomyces is not uniform. We speculate that phosphorylated AfsR-p becomes bound to the promoter region of afsS. The latter activates other regulatory genes, including pathway regulatory genes, and induces the production of secondary metabolites including antibiotics. We identified specific conserved amino acids and exploited them for the isolation of the partial sequence of the afsR homologue from S. clavuligerus and Streptomyces achromogenes (rubradirin producer). Such findings provide additional evidence for the presence of a serine/threonine and tyrosine kinase-dependent global regulatory network in Streptomyces. PMID:15921897

  16. A Genome-Wide Regulatory Framework Identifies Maize Pericarp Color1 Controlled Genes

    PubMed Central

    Morohashi, Kengo; Casas, María Isabel; Ferreyra, Lorena Falcone; Mejía-Guerra, María Katherine; Pourcel, Lucille; Yilmaz, Alper; Feller, Antje; Carvalho, Bruna; Emiliani, Julia; Rodriguez, Eduardo; Pellegrinet, Silvina; McMullen, Michael; Casati, Paula; Grotewold, Erich

    2012-01-01

    Pericarp Color1 (P1) encodes an R2R3-MYB transcription factor responsible for the accumulation of insecticidal flavones in maize (Zea mays) silks and red phlobaphene pigments in pericarps and other floral tissues, which makes P1 an important visual marker. Using genome-wide expression analyses (RNA sequencing) in pericarps and silks of plants with contrasting P1 alleles combined with chromatin immunoprecipitation coupled with high-throughput sequencing, we show here that the regulatory functions of P1 are much broader than the activation of genes corresponding to enzymes in a branch of flavonoid biosynthesis. P1 modulates the expression of several thousand genes, and ∼1500 of them were identified as putative direct targets of P1. Among them, we identified F2H1, corresponding to a P450 enzyme that converts naringenin into 2-hydroxynaringenin, a key branch point in the P1-controlled pathway and the first step in the formation of insecticidal C-glycosyl flavones. Unexpectedly, the binding of P1 to gene regulatory regions can result in both gene activation and repression. Our results indicate that P1 is the major regulator for a set of genes involved in flavonoid biosynthesis and a minor modulator of the expression of a much larger gene set that includes genes involved in primary metabolism and production of other specialized compounds. PMID:22822204

  17. Asymmetric Regulation of Peripheral Genes by Two Transcriptional Regulatory Networks

    PubMed Central

    Li, Jing-Ru; Suzuki, Takahiro; Nishimura, Hajime; Kishima, Mami; Maeda, Shiori; Suzuki, Harukazu

    2016-01-01

    Transcriptional regulatory network (TRN) reconstitution and deconstruction occur simultaneously during reprogramming; however, it remains unclear how the starting and targeting TRNs regulate the induction and suppression of peripheral genes. Here we analyzed the regulation using direct cell reprogramming from human dermal fibroblasts to monocytes as the platform. We simultaneously deconstructed fibroblastic TRN and reconstituted monocytic TRN; monocytic and fibroblastic gene expression were analyzed in comparison with that of fibroblastic TRN deconstruction only or monocytic TRN reconstitution only. Global gene expression analysis showed cross-regulation of TRNs. Detailed analysis revealed that knocking down fibroblastic TRN positively affected half of the upregulated monocytic genes, indicating that intrinsic fibroblastic TRN interfered with the expression of induced genes. In contrast, reconstitution of monocytic TRN showed neutral effects on the majority of fibroblastic gene downregulation. This study provides an explicit example that demonstrates how two networks together regulate gene expression during cell reprogramming processes and contributes to the elaborate exploration of TRNs. PMID:27483142

  18. Timing of flagellar gene expression in the Caulobacter cell cycle is determined by a transcriptional cascade of positive regulatory genes.

    PubMed Central

    Ohta, N; Chen, L S; Mullin, D A; Newton, A

    1991-01-01

    The Caulobacter crescentus flagellar (fla) genes are organized in a regulatory hierarchy in which genes at each level are required for expression of those at the next lower level. To determine the role of this hierarchy in the timing of fla gene expression, we have examined the organization and cell cycle regulation of genes located in the hook gene cluster. As shown here, this cluster is organized into four multicistronic transcription units flaN, flbG, flaO, and flbF that contain fla genes plus a fifth transcription unit II.1 of unknown function. Transcription unit II.1 is regulated independently of the fla gene hierarchy, and it is expressed with a unique pattern of periodicity very late in the cell cycle. The flaN, flbG, and flaO operons are all transcribed periodically, and flaO, which is near the top of the hierarchy and required in trans for the activation of flaN and flbG operons, is expressed earlier in the cell cycle than the other two transcription units. We have shown that delaying flaO transcription by fusing it to the II.1 promoter also delayed the subsequent expression of the flbG operon and the 27- and 25-kDa flagellin genes that are at the bottom of the regulatory hierarchy. Thus, the sequence and timing of fla gene expression in the cell cycle are determined in large measure by the positions of these genes in the regulatory hierarchy. These results also suggest that periodic transcription is a general feature of fla gene expression in C. crescentus. PMID:1847367

  19. The Transcriptional and Gene Regulatory Network of Lactococcus lactis MG1363 during Growth in Milk

    PubMed Central

    de Jong, Anne; Hansen, Morten E.; Kuipers, Oscar P.; Kilstrup, Mogens; Kok, Jan

    2013-01-01

    In the present study we examine the changes in the expression of genes of Lactococcus lactis subspecies cremoris MG1363 during growth in milk. To reveal which specific classes of genes (pathways, operons, regulons, COGs) are important, we performed a transcriptome time series experiment. Global analysis of gene expression over time showed that L. lactis adapted quickly to the environmental changes. Using upstream sequences of genes with correlated gene expression profiles, we uncovered a substantial number of putative DNA binding motifs that may be relevant for L. lactis fermentative growth in milk. All available novel and literature-derived data were integrated into network reconstruction building blocks, which were used to reconstruct and visualize the L. lactis gene regulatory network. This network enables easy mining in the chrono-transcriptomics data. A freely available website at http://milkts.molgenrug.nl gives full access to all transcriptome data, to the reconstructed network and to the individual network building blocks. PMID:23349698

  20. Multicolor labeling in developmental gene regulatory network analysis.

    PubMed

    Sethi, Aditya J; Angerer, Robert C; Angerer, Lynne M

    2014-01-01

    The sea urchin embryo is an important model system for developmental gene regulatory network (GRN) analysis. This chapter describes the use of multicolor fluorescent in situ hybridization (FISH) as well as a combination of FISH and immunohistochemistry in sea urchin embryonic GRN studies. The methods presented here can be applied to a variety of experimental settings where accurate spatial resolution of multiple gene products is required for constructing a developmental GRN. PMID:24567220

  1. A gene regulatory network armature for T-lymphocyte specification

    SciTech Connect

    Fung, Elizabeth-sharon

    2008-01-01

    Choice of a T-lymphoid fate by hematopoietic progenitor cells depends on sustained Notch-Delta signaling combined with tightly-regulated activities of multiple transcription factors. To dissect the regulatory network connections that mediate this process, we have used high-resolution analysis of regulatory gene expression trajectories from the beginning to the end of specification; tests of the short-term Notch-dependence of these gene expression changes; and perturbation analyses of the effects of overexpression of two essential transcription factors, namely PU.1 and GATA-3. Quantitative expression measurements of >50 transcription factor and marker genes have been used to derive the principal components of regulatory change through which T-cell precursors progress from primitive multipotency to T-lineage commitment. Distinct parts of the path reveal separate contributions of Notch signaling, GATA-3 activity, and downregulation of PU.1. Using BioTapestry, the results have been assembled into a draft gene regulatory network for the specification of T-cell precursors and the choice of T as opposed to myeloid dendritic or mast-cell fates. This network also accommodates effects of E proteins and mutual repression circuits of Gfi1 against Egr-2 and of TCF-1 against PU.1 as proposed elsewhere, but requires additional functions that remain unidentified. Distinctive features of this network structure include the intense dose-dependence of GATA-3 effects; the gene-specific modulation of PU.1 activity based on Notch activity; the lack of direct opposition between PU.1 and GATA-3; and the need for a distinct, late-acting repressive function or functions to extinguish stem and progenitor-derived regulatory gene expression.

  2. Efficient experimental design for uncertainty reduction in gene regulatory networks

    PubMed Central

    2015-01-01

    Background: An accurate understanding of interactions among genes plays a major role in developing therapeutic intervention methods. Gene regulatory networks often contain a significant amount of uncertainty. The process of prioritizing biological experiments to reduce the uncertainty of gene regulatory networks is called experimental design. Under such a strategy, the experiments with high priority are suggested to be conducted first. Results: The authors have already proposed an optimal experimental design method based upon the objective for modeling gene regulatory networks, such as deriving therapeutic interventions. The experimental design method utilizes the concept of mean objective cost of uncertainty (MOCU). MOCU quantifies the expected increase of cost resulting from uncertainty. The optimal experiment to be conducted first is the one which leads to the minimum expected remaining MOCU subsequent to the experiment. In the process, one must find the optimal intervention for every gene regulatory network compatible with the prior knowledge, which can be prohibitively expensive when the size of the network is large. In this paper, we propose a computationally efficient experimental design method. This method incorporates a network reduction scheme by introducing a novel cost function that takes into account the disruption in the ranking of potential experiments. We then estimate the approximate expected remaining MOCU at a lower computational cost using the reduced networks. Conclusions: Simulation results based on synthetic and real gene regulatory networks show that the proposed approximate method has close performance to that of the optimal method but at lower computational cost. The proposed approximate method also outperforms the random selection policy significantly. MATLAB software implementing the proposed experimental design method is available at http://gsp.tamu.edu/Publications/supplementary/roozbeh15a/. PMID:26423515
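
    The selection rule in the abstract above (choose the experiment with the smallest expected remaining MOCU) can be made concrete on a toy problem. The sketch below is not the authors' MATLAB implementation and includes no network reduction. It assumes a finite set of candidate networks with a prior, a cost table `cost[k][j]` for applying intervention `j` when network `k` is the true one, and experiments that each reveal one discrete outcome per network. All names and numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def mocu(prior: Sequence[float], cost: Sequence[Sequence[float]]) -> float:
    """Expected extra cost of the best robust intervention over the
    per-network optimal interventions, under the (renormalized) prior."""
    total = sum(prior)
    weights = [p / total for p in prior]
    n_interventions = len(cost[0])
    # Best single intervention when the true network is unknown.
    robust = min(
        sum(w * cost[k][j] for k, w in enumerate(weights))
        for j in range(n_interventions)
    )
    # Expected cost if the true network were known in advance.
    ideal = sum(w * min(cost[k]) for k, w in enumerate(weights))
    return robust - ideal


def expected_remaining_mocu(prior: Sequence[float],
                            cost: Sequence[Sequence[float]],
                            outcome_of: Sequence[int]) -> float:
    """Average MOCU after the experiment; outcome_of[k] is the outcome
    the experiment would show if network k were the true one."""
    groups: Dict[int, List[int]] = {}
    for k, outcome in enumerate(outcome_of):
        groups.setdefault(outcome, []).append(k)
    total = sum(prior)
    remaining = 0.0
    for members in groups.values():
        p_outcome = sum(prior[k] for k in members) / total
        if p_outcome == 0.0:
            continue  # outcome cannot occur under the prior
        remaining += p_outcome * mocu([prior[k] for k in members],
                                      [cost[k] for k in members])
    return remaining


def best_experiment(prior: Sequence[float],
                    cost: Sequence[Sequence[float]],
                    experiments: Dict[str, List[int]]) -> str:
    """Name of the experiment with minimum expected remaining MOCU."""
    return min(experiments,
               key=lambda name: expected_remaining_mocu(prior, cost,
                                                        experiments[name]))


# Toy setup: four candidate networks, two interventions.
toy_prior = [0.25, 0.25, 0.25, 0.25]
toy_cost = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
toy_experiments = {"A": [0, 0, 1, 1], "B": [0, 1, 0, 1]}
```

    In this toy setup experiment A separates the networks that prefer different interventions, while experiment B leaves that question open, so A is chosen. The abstract notes that the expensive step is finding the optimal intervention for every compatible network. In this sketch that step is the `min(cost[k])` lookup, which is trivial here only because the cost table is given.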

  3. Function does not follow form in gene regulatory circuits

    PubMed Central

    Payne, Joshua L.; Wagner, Andreas

    2015-01-01

    Gene regulatory circuits are to the cell what arithmetic logic units are to the chip: fundamental components of information processing that map an input onto an output. Gene regulatory circuits come in many different forms, distinct structural configurations that determine who regulates whom. Studies that have focused on the gene expression patterns (functions) of circuits with a given structure (form) have examined just a few structures or gene expression patterns. Here, we use a computational model to exhaustively characterize the gene expression patterns of nearly 17 million three-gene circuits in order to systematically explore the relationship between circuit form and function. Three main conclusions emerge. First, function does not follow form. A circuit of any one structure can have between twelve and nearly thirty thousand distinct gene expression patterns. Second, and conversely, form does not follow function. Most gene expression patterns can be realized by more than one circuit structure. And third, multifunctionality severely constrains circuit form. The number of circuit structures able to drive multiple gene expression patterns decreases rapidly with the number of these patterns. These results indicate that it is generally not possible to infer circuit function from circuit form, or vice versa. PMID:26290154

  4. Charting gene regulatory networks: strategies, challenges and perspectives

    PubMed Central

    2004-01-01

    One of the foremost challenges in the post-genomic era will be to chart the gene regulatory networks of cells, including aspects such as genome annotation, identification of cis-regulatory elements and transcription factors, information on protein–DNA and protein–protein interactions, and data mining and integration. Some of these broad sets of data have already been assembled for building networks of gene regulation. Even though these datasets are still far from comprehensive, and the approach faces many important and difficult challenges, some strategies have begun to make connections between disparate regulatory events and to foster new hypotheses. In this article we review several different genomics and proteomics technologies, and present bioinformatics methods for exploring these data in order to make novel discoveries. PMID:15080794

  5. Molecular characterization of a maize regulatory gene

    SciTech Connect

    Wessler, S.R.

    1991-12-01

    Based on initial bombardment studies we have previously concluded that promoter diversity was responsible for the diversity of naturally occurring R alleles. During this period we have found that R is controlled at the level of translation initiation and intron 1 is alternatively spliced. The experiments described in Sections 1 and 2 sought to quantify these effects and to determine whether they contribute to the tissue specific expression of select R alleles. This study was done because very little is understood about the post-transcriptional regulation of plant genes. Section 3 and 4 describe experiments designed to identify important structural components of the R protein.

  6. Compartmentalized gene regulatory network of the pathogenic fungus Fusarium graminearum

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Head blight caused by Fusarium graminearum (Fg) is a major limiting factor of wheat production with both yield loss and mycotoxin contamination. Here we report a model for global Fg gene regulatory networks (GRNs) inferred from a large collection of transcriptomic data using a machine-learning appro...

  7. Data- and knowledge-based modeling of gene regulatory networks: an update

    PubMed Central

    Linde, Jörg; Schulze, Sylvie; Henkel, Sebastian G.; Guthke, Reinhard

    2015-01-01

    Gene regulatory network inference is a systems biology approach which predicts interactions between genes with the help of high-throughput data. In this review, we present current and updated network inference methods focusing on novel techniques for data acquisition, network inference assessment, network inference for interacting species and the integration of prior knowledge. Following the advent of next-generation sequencing of cDNAs derived from RNA samples (RNA-Seq), we discuss its application to network inference in detail. Furthermore, we present progress for large-scale or even full-genomic network inference as well as for small-scale condensed network inference and review advances in the evaluation of network inference methods by crowdsourcing. Finally, we reflect on the current availability of data and prior knowledge sources and give an outlook for the inference of gene regulatory networks that reflect interacting species, in particular pathogen-host interactions. PMID:27047314

  8. Regulatory hotspots are associated with plant gene expression under varying soil phosphorus supply in Brassica rapa.

    PubMed

    Hammond, John P; Mayes, Sean; Bowen, Helen C; Graham, Neil S; Hayden, Rory M; Love, Christopher G; Spracklen, William P; Wang, Jun; Welham, Sue J; White, Philip J; King, Graham J; Broadley, Martin R

    2011-07-01

    Gene expression is a quantitative trait that can be mapped genetically in structured populations to identify expression quantitative trait loci (eQTL). Genes and regulatory networks underlying complex traits can subsequently be inferred. Using a recently released genome sequence, we have defined cis- and trans-eQTL and their environmental response to low phosphorus (P) availability within a complex plant genome and found hotspots of trans-eQTL within the genome. Interval mapping, using P supply as a covariate, revealed 18,876 eQTL. trans-eQTL hotspots occurred on chromosomes A06 and A01 within Brassica rapa; these were enriched with P metabolism-related Gene Ontology terms (A06) as well as chloroplast- and photosynthesis-related terms (A01). We have also attributed heritability components to measures of gene expression across environments, allowing the identification of novel gene expression markers and gene expression changes associated with low P availability. Informative gene expression markers were used to map eQTL and P use efficiency-related QTL. Genes responsive to P supply had large environmental and heritable variance components. Regulatory loci and genes associated with P use efficiency identified through eQTL analysis are potential targets for further characterization and may have potential for crop improvement. PMID:21527424

  9. Inferring slowly-changing dynamic gene-regulatory networks

    PubMed Central

    2015-01-01

    Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in isolation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a class of models that connect the network with conditional independence relationships between random variables. By interpreting these random variables as gene activities and the conditional independence relationships as functional non-relatedness, graphical models have been used to describe gene-regulatory networks. Whereas the literature has been focused on static networks, most time-course experiments are designed in order to tease out temporal changes in the underlying network. It is typically reasonable to assume that changes in genomic networks are few, because biological systems tend to be stable. We introduce a new model for estimating slow changes in dynamic gene-regulatory networks, which is suitable for high-dimensional data, e.g. time-course microarray data. Our aim is to estimate a dynamically changing genomic network based on temporal activity measurements of the genes in the network. Our method is based on a penalized likelihood with an ℓ1-norm penalty that penalizes conditional dependencies between genes as well as differences between conditional independence elements across time points. We also present a heuristic search strategy to find optimal tuning parameters. We rewrite the penalized maximum likelihood problem as a standard convex optimization problem subject to linear equality constraints. We show that our method performs well in simulation studies. Finally, we apply the proposed model to a time-course T-cell dataset. PMID:25917062
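
    The penalized likelihood described in this abstract can be written out in a generic form. The expression below is only a sketch of a fused, ℓ1-penalized Gaussian graphical model and is not taken from the paper: the symbols Θ_t (precision matrix at time point t), S_t (sample covariance at time point t), T (number of time points), and the tuning parameters λ_1 and λ_2 are all assumed notation.

```latex
\hat{\Theta}_1,\dots,\hat{\Theta}_T
  = \arg\max_{\Theta_1,\dots,\Theta_T \succ 0}
    \sum_{t=1}^{T} \Bigl[ \log\det\Theta_t - \operatorname{tr}(S_t\Theta_t) \Bigr]
    - \lambda_1 \sum_{t=1}^{T} \lVert \Theta_t \rVert_1
    - \lambda_2 \sum_{t=2}^{T} \lVert \Theta_t - \Theta_{t-1} \rVert_1
```

    In this form the λ_1 term shrinks conditional dependencies between genes toward zero, and the λ_2 term discourages changes in the network between consecutive time points, which matches the assumption that changes are few.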

  10. Epidermal differentiation gene regulatory networks controlled by MAF and MAFB.

    PubMed

    Labott, Andrew T; Lopez-Pajares, Vanessa

    2016-06-01

    Numerous regulatory factors in epidermal differentiation and their role in regulating different cell states have been identified in recent years. However, the genetic interactions between these regulators over the dynamic course of differentiation have not been studied. In this Extra-View article, we review recent work by Lopez-Pajares et al. that explores a new regulatory network in epidermal differentiation. They analyze the changing transcriptome throughout epidermal regeneration to identify three separate gene sets enriched in the progenitor, early and late differentiation states. Using expression module mapping, MAF and MAFB are identified as transcription factors essential for epidermal differentiation. Through double knock-down of MAF:MAFB using siRNA and CRISPR/Cas9-mediated knockout, epidermal differentiation was shown to be impaired both in vitro and in vivo, confirming the role of MAF:MAFB in activating genes that drive differentiation. Lopez-Pajares and collaborators integrated 42 published regulator gene sets and the MAF:MAFB gene set into the dynamic differentiation gene expression landscape and found that lncRNAs TINCR and ANCR act as upstream regulators of MAF:MAFB. Furthermore, ChIP-seq analysis of MAF:MAFB identified key transcription factor genes linked to epidermal differentiation as downstream effectors. Combined, these findings illustrate a dynamically regulated network with MAF:MAFB as a crucial link for progenitor gene repression and differentiation gene activation. PMID:27097296

  11. Implicit methods for qualitative modeling of gene regulatory networks.

    PubMed

    Garg, Abhishek; Mohanram, Kartik; De Micheli, Giovanni; Xenarios, Ioannis

    2012-01-01

    Advancements in high-throughput technologies to measure increasingly complex biological phenomena at the genomic level are rapidly changing the face of biological research from the single-gene single-protein experimental approach to studying the behavior of a gene in the context of the entire genome (and proteome). This shift in research methodologies has resulted in a new field of network biology that deals with modeling cellular behavior in terms of network structures such as signaling pathways and gene regulatory networks. In these networks, different biological entities such as genes, proteins, and metabolites interact with each other, giving rise to a dynamical system. Even though there exists a mature field of dynamical systems theory to model such network structures, some technical challenges are unique to biology such as the inability to measure precise kinetic information on gene-gene or gene-protein interactions and the need to model increasingly large networks comprising thousands of nodes. These challenges have renewed interest in developing new computational techniques for modeling complex biological systems. This chapter presents a modeling framework based on Boolean algebra and finite-state machines that are reminiscent of the approach used for digital circuit synthesis and simulation in the field of very-large-scale integration (VLSI). The proposed formalism enables a common mathematical framework to develop computational techniques for modeling different aspects of the regulatory networks such as steady-state behavior, stochasticity, and gene perturbation experiments. PMID:21938638
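
    To make the Boolean view of a regulatory network concrete, the sketch below enumerates the steady states of a small synchronous Boolean network and repeats the search under a gene knockout. The three-gene network, its rules, and the function names are hypothetical, and the explicit enumeration used here is a simplification: it is not the implicit (circuit-style) method the chapter presents and would not scale to thousands of nodes.

```python
from itertools import product

# Hypothetical three-gene network (not taken from the chapter):
# A is repressed by C, B is activated by A, C needs both A and B.
RULES = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["A"] and s["B"],
}
GENES = sorted(RULES)


def step(state, knocked_out=frozenset()):
    """Synchronously update every gene; knocked-out genes are forced OFF."""
    return {
        g: False if g in knocked_out else bool(RULES[g](state))
        for g in GENES
    }


def steady_states(knocked_out=frozenset()):
    """Enumerate all 2^n states and keep those that map onto themselves."""
    fixed = []
    for bits in product([False, True], repeat=len(GENES)):
        state = dict(zip(GENES, bits))
        if any(state[g] for g in knocked_out):
            continue  # state is inconsistent with the knockout
        if step(state, knocked_out) == state:
            fixed.append(tuple(int(state[g]) for g in GENES))
    return fixed


if __name__ == "__main__":
    print("wild type:", steady_states())
    print("C knockout:", steady_states(frozenset({"C"})))
```

    In this toy network the negative feedback through C leaves the wild type without any steady state, while knocking out C yields a single steady state with A and B ON.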

  12. Establishing the Architecture of Plant Gene Regulatory Networks.

    PubMed

    Yang, F; Ouma, W Z; Li, W; Doseff, A I; Grotewold, E

    2016-01-01

    Gene regulatory grids (GRGs) encompass the space of all the possible transcription factor (TF)-target gene interactions that regulate gene expression, with gene regulatory networks (GRNs) representing a temporal and spatial manifestation of a portion of the GRG, essential for the specification of gene expression. Thus, understanding GRG architecture provides a valuable tool to explain how genes are expressed in an organism, an important aspect of synthetic biology and essential toward the development of the "in silico" cell. Progress has been made in some unicellular model systems (e.g., yeast), but significant challenges remain in more complex multicellular organisms such as plants. Key to understanding the organization of GRGs is therefore identifying the genes that TFs bind to and control. The application of sensitive and high-throughput methods to investigate genome-wide TF-target gene interactions is providing a wealth of information that can be linked to important agronomic traits. We describe here the methods and resources that have been developed to investigate the architecture of plant GRGs and GRNs. We also provide information regarding where to obtain clones or other resources necessary for synthetic biology or metabolic engineering. PMID:27480690

  13. Short DNA sequences inserted for gene targeting can accidentally interfere with off-target gene expression.

    PubMed

    Meier, Ingo D; Bernreuther, Christian; Tilling, Thomas; Neidhardt, John; Wong, Yong Wee; Schulze, Christian; Streichert, Thomas; Schachner, Melitta

    2010-06-01

    Targeting of genes in mice, a key approach to study development and disease, often leaves a neo cassette, loxP, or FRT sites inserted in the mouse genome. Insertion of neo can influence the expression of neighboring genes, but similar effects have not been reported for loxP sites. We therefore performed microarray analyses of mice in which the Ncam or the Tnr gene were targeted either by insertion of neo or loxP/FRT sites. In the case of Ncam, neo, but not loxP/FRT insertion, led to a 2-fold reduction in mRNA levels of 3 genes located at distances between 0.2 and 3.1 Mb from the target. In contrast, after introduction of loxP/FRT sites into introns of Tnr, we observed a 2.5- to 4-fold reduction in the transcript level of the Gas5 gene, 1.1 Mb away from Tnr, most probably due to disruption of a conserved regulatory element in Tnr. Insertion of short DNA sequences such as loxP/FRT can thus influence off-target mRNA levels if these sites are accidentally placed into regulatory elements. Our results imply that conditional knockout mice should be analyzed for genomic positional side effects that may influence the animals' phenotypes. PMID:20110269

  14. Boosting heterologous protein production in transgenic dicotyledonous seeds using Phaseolus vulgaris regulatory sequences.

    PubMed

    De Jaeger, Geert; Scheffer, Stanley; Jacobs, Anni; Zambre, Mukund; Zobell, Oliver; Goossens, Alain; Depicker, Ann; Angenon, Geert

    2002-12-01

    Over the past decade, several high value proteins have been produced in different transgenic plant tissues such as leaves, tubers, and seeds. Despite recent advances, many heterologous proteins accumulate to low concentrations, and the optimization of expression cassettes to make in planta production and purification economically feasible remains critical. Here, the regulatory sequences of the seed storage protein gene arcelin 5-I (arc5-I) of common bean (Phaseolus vulgaris) were evaluated for producing heterologous proteins in dicotyledonous seeds. The murine single chain variable fragment (scFv) G4 (ref. 4) was chosen as model protein because of the current industrial interest in producing antibodies and derived fragments in crops. In transgenic Arabidopsis thaliana seed stocks, the scFv under control of the 35S promoter of the cauliflower mosaic virus (CaMV) accumulated to approximately 1% of total soluble protein (TSP). However, a set of seed storage promoter constructs boosted the scFv accumulation to exceptionally high concentrations, reaching no less than 36.5% of TSP in homozygous seeds. Even at these high concentrations, the scFv proteins had antigen-binding activity and affinity similar to those produced in Escherichia coli. The feasibility of heterologous protein production under control of arc5-I regulatory sequences was also demonstrated in Phaseolus acutifolius, a promising crop for large scale production. PMID:12415287

  15. How difficult is inference of mammalian causal gene regulatory networks?

    PubMed

    Djordjevic, Djordje; Yang, Andrian; Zadoorian, Armella; Rungrugeecharoen, Kevin; Ho, Joshua W K

    2014-01-01

    Gene regulatory networks (GRNs) play a central role in systems biology, especially in the study of mammalian organ development. One key question remains largely unanswered: Is it possible to infer mammalian causal GRNs using observable gene co-expression patterns alone? We assembled two mouse GRN datasets (embryonic tooth and heart) and matching microarray gene expression profiles to systematically investigate the difficulties of mammalian causal GRN inference. The GRNs were assembled based on > 2,000 pieces of experimental genetic perturbation evidence from manually reading > 150 primary research articles. Each piece of perturbation evidence records the qualitative change of the expression of one gene following knock-down or over-expression of another gene. Our data have thorough annotation of tissue types and embryonic stages, as well as the type of regulation (activation, inhibition and no effect), which uniquely allows us to estimate both sensitivity and specificity of the inference of tissue specific causal GRN edges. Using these unprecedented datasets, we found that gene co-expression does not reliably distinguish true positive from false positive interactions, making inference of GRNs in mammalian development very difficult. Nonetheless, if we have expression profiling data from genetic or molecular perturbation experiments, such as gene knock-out or signalling stimulation, it is possible to use the set of differentially expressed genes to recover causal regulatory relationships with good sensitivity and specificity. Our result supports the importance of using perturbation experimental data in causal network reconstruction. Furthermore, we showed that causal gene regulatory relationships can be highly cell type or developmental stage specific, suggesting the importance of employing expression profiles from homogeneous cell populations. This study provides essential datasets and empirical evidence to guide the development of new GRN inference methods for
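
    The abstract notes that annotated "no effect" evidence is what makes specificity measurable alongside sensitivity. A minimal sketch of that bookkeeping follows; the function name, the edge representation as (regulator, target) pairs, and the placeholder gene labels are assumptions for illustration and do not come from the study.

```python
def sensitivity_specificity(predicted, positives, negatives):
    """Score predicted edges against annotated perturbation evidence.

    predicted: set of (regulator, target) edges called by an inference method
    positives: edges with activation or inhibition evidence
    negatives: edges annotated as "no effect"
    Both reference sets are assumed to be non-empty.
    """
    tp = len(predicted & positives)   # true edges that were recovered
    fn = len(positives - predicted)   # true edges that were missed
    fp = len(predicted & negatives)   # "no effect" pairs wrongly called
    tn = len(negatives - predicted)   # "no effect" pairs correctly left out
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Placeholder gene labels, not real annotations.
    predicted = {("g1", "g2"), ("g1", "g3")}
    positives = {("g1", "g2"), ("g2", "g3")}
    negatives = {("g1", "g3"), ("g3", "g1")}
    print(sensitivity_specificity(predicted, positives, negatives))
```

    Predicted edges with no annotation in either reference set are ignored here, since the evidence cannot confirm or refute them.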

  16. Modularity and evolutionary constraints in a baculovirus gene regulatory network

    PubMed Central

    2013-01-01

    Background The structure of regulatory networks remains an open question in our understanding of complex biological systems. Interactions during complete viral life cycles present unique opportunities to understand how host-parasite networks take shape and behave. The Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) is a large double-stranded DNA virus, whose genome may encode 152 open reading frames (ORFs). Here we present the analysis of the ordered cascade of the AgMNPV gene expression. Results We observed an earlier onset of the expression than previously reported for other baculoviruses, especially for genes involved in DNA replication. Most ORFs were expressed at higher levels in a more permissive host cell line. Genes with more than one copy in the genome had distinct expression profiles, which could indicate the acquisition of new functionalities. The transcription gene regulatory network (GRN) for 149 ORFs had a modular topology comprising five communities of highly interconnected nodes that separated key genes that are functionally related into different communities, possibly maximizing redundancy and GRN robustness by compartmentalization of important functions. Core conserved functions showed expression synchronicity, distinct GRN features and significantly less genetic diversity, consistent with evolutionary constraints imposed on key elements of biological systems. This reduced genetic diversity also had a positive correlation with the importance of the gene in our estimated GRN, supporting a relationship between phylogenetic data of baculovirus genes and network features inferred from expression data. We also observed that gene arrangement in overlapping transcripts was conserved among related baculoviruses, suggesting a principle of genome organization. Conclusions Albeit with a reduced number of nodes (149), the AgMNPV GRN had a topology and key characteristics similar to those observed in complex cellular organisms, which indicates

  17. Rhodobase, a meta-analytical tool for reconstructing gene regulatory networks in a model photosynthetic bacterium.

    PubMed

    Moskvin, Oleg V; Bolotin, Dmitry; Wang, Andrew; Ivanov, Pavel S; Gomelsky, Mark

    2011-02-01

    We present Rhodobase, a web-based meta-analytical tool for analysis of transcriptional regulation in a model anoxygenic photosynthetic bacterium, Rhodobacter sphaeroides. The gene association meta-analysis is based on the pooled data from 100 R. sphaeroides whole-genome DNA microarrays. Gene-centric regulatory networks were visualized using the StarNet approach (Jupiter, D.C., VanBuren, V., 2008. A visual data mining tool that facilitates reconstruction of transcription regulatory networks. PLoS ONE 3, e1717) with several modifications. We developed a means to identify and visualize operons and superoperons. We designed a framework for the cross-genome search for transcription factor binding sites that takes into account the high GC-content and oligonucleotide usage profile characteristic of the R. sphaeroides genome. To facilitate reconstruction of directional relationships between co-regulated genes, we screened upstream sequences (-400 to +20 bp from start codons) of all genes for putative binding sites of bacterial transcription factors using a self-optimizing search method developed here. To test performance of the meta-analysis tools and transcription factor site predictions, we reconstructed selected nodes of the R. sphaeroides transcription factor-centric regulatory matrix. The test revealed regulatory relationships that correlate well with the experimentally derived data. The database of transcriptional profile correlations, the network visualization engine and the optimized search engine for transcription factor binding sites analysis are available at http://rhodobase.org. PMID:21070832
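
    The screening window quoted in this abstract (-400 to +20 bp from start codons) can be illustrated with a short extraction routine. This is only a sketch under stated assumptions: the function name `upstream_window`, its parameters, the 0-based coordinate convention, and the treatment of the genome as a linear string are all hypothetical, and it does not reproduce the self-optimizing binding-site search that Rhodobase uses.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_window(genome, start, strand="+", up=400, down=20):
    """Return the region from -up to +down bp around a start codon.

    `start` is the 0-based forward-strand index of the first base of the
    start codon as read in gene orientation (for "-" genes this is the
    codon's highest forward coordinate). The window is clipped at the
    sequence ends; circular genomes are not handled.
    """
    if strand == "+":
        return genome[max(0, start - up): start + down]
    window = genome[max(0, start - down + 1): start + up + 1]
    return reverse_complement(window)


if __name__ == "__main__":
    toy = "GGGGATGCCC"  # toy sequence with ATG at index 4
    print(upstream_window(toy, 4, "+", up=3, down=3))
```

    The returned string is always in gene orientation, so the same motif scanner can be applied to genes on either strand.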

  18. RNA Sequencing of Mouse Sinoatrial Node Reveals an Upstream Regulatory Role for Islet-1 in Cardiac Pacemaker Cells

    PubMed Central

    Vedantham, Vasanth; Galang, Giselle; Evangelista, Melissa; Deo, Rahul C.; Srivastava, Deepak

    2015-01-01

    Rationale Treatment of sinus node disease with regenerative or cell-based therapies will require a detailed understanding of gene regulatory networks in cardiac pacemaker cells (PCs). Objective To characterize the transcriptome of PCs using RNA sequencing, and to identify transcriptional networks responsible for PC gene expression. Methods and Results We used laser capture micro-dissection (LCM) on a sinus node reporter mouse line to isolate RNA from PCs for RNA sequencing (RNA-Seq). Differential expression and network analysis identified novel SAN-enriched genes, and predicted that the transcription factor Islet-1 (Isl1) is active in developing pacemaker cells. RNA-Seq on SAN tissue lacking Isl1 established that Isl1 is an important transcriptional regulator within the developing SAN. Conclusions (1) The PC transcriptome diverges sharply from other cardiomyocytes; (2) Isl1 is a positive transcriptional regulator of the PC gene expression program. PMID:25623957

  19. Maize anthocyanin regulatory gene pl is a duplicate of c1 that functions in the plant.

    PubMed

    Cone, K C; Cocciolone, S M; Burr, F A; Burr, B

    1993-12-01

    Genetic studies in maize have identified several regulatory genes that control the tissue-specific synthesis of purple anthocyanin pigments in the plant. c1 regulates pigmentation in the aleurone layer of the kernel, whereas pigmentation in the vegetative and floral tissues of the plant body depends on pl. c1 encodes a protein with the structural features of eukaryotic transcription factors and functions to control the accumulation of transcripts for the anthocyanin biosynthetic genes. Previous genetic and molecular observations have prompted the hypothesis that c1 and pl are functionally duplicate, in that they control the same set of anthocyanin structural genes but in distinct parts of the plant. Here, we show that this proposed functional similarity is reflected by DNA sequence homology between c1 and pl. Using a c1 DNA fragment as a hybridization probe, genomic and cDNA clones for pl were isolated. Comparison of pl and c1 cDNA sequences revealed that the genes encode proteins with 90% or more amino acid identity in the amino- and carboxyl-terminal domains that are known to be important for the regulatory function of the C1 protein. Consistent with the idea that the pl gene product also acts as a transcriptional activator is our finding that a functional pl allele is required for the transcription of at least three structural genes in the anthocyanin biosynthetic pathway. PMID:8305872

  20. Gap Gene Regulatory Dynamics Evolve along a Genotype Network

    PubMed Central

    Crombach, Anton; Wotton, Karl R.; Jiménez-Guri, Eva; Jaeger, Johannes

    2016-01-01

    Developmental gene networks implement the dynamic regulatory mechanisms that pattern and shape the organism. Over evolutionary time, the wiring of these networks changes, yet the patterning outcome is often preserved, a phenomenon known as “system drift.” System drift is illustrated by the gap gene network—involved in segmental patterning—in dipteran insects. In the classic model organism Drosophila melanogaster and the nonmodel scuttle fly Megaselia abdita, early activation and placement of gap gene expression domains show significant quantitative differences, yet the final patterning output of the system is essentially identical in both species. In this detailed modeling analysis of system drift, we use gene circuits which are fit to quantitative gap gene expression data in M. abdita and compare them with an equivalent set of models from D. melanogaster. The results of this comparative analysis show precisely how compensatory regulatory mechanisms achieve equivalent final patterns in both species. We discuss the larger implications of the work in terms of “genotype networks” and the ways in which the structure of regulatory networks can influence patterns of evolutionary change (evolvability). PMID:26796549

  1. Dynamic Gene Regulatory Networks Drive Hematopoietic Specification and Differentiation.

    PubMed

    Goode, Debbie K; Obier, Nadine; Vijayabaskar, M S; Lie-A-Ling, Michael; Lilly, Andrew J; Hannah, Rebecca; Lichtinger, Monika; Batta, Kiran; Florkowska, Magdalena; Patel, Rahima; Challinor, Mairi; Wallace, Kirstie; Gilmour, Jane; Assi, Salam A; Cauchy, Pierre; Hoogenkamp, Maarten; Westhead, David R; Lacaud, Georges; Kouskoff, Valerie; Göttgens, Berthold; Bonifer, Constanze

    2016-03-01

    Metazoan development involves the successive activation and silencing of specific gene expression programs and is driven by tissue-specific transcription factors programming the chromatin landscape. To understand how this process executes an entire developmental pathway, we generated global gene expression, chromatin accessibility, histone modification, and transcription factor binding data from purified embryonic stem cell-derived cells representing six sequential stages of hematopoietic specification and differentiation. Our data reveal the nature of regulatory elements driving differential gene expression and inform how transcription factor binding impacts on promoter activity. We present a dynamic core regulatory network model for hematopoietic specification and demonstrate its utility for the design of reprogramming experiments. Functional studies motivated by our genome-wide data uncovered a stage-specific role for TEAD/YAP factors in mammalian hematopoietic specification. Our study presents a powerful resource for studying hematopoiesis and demonstrates how such data advance our understanding of mammalian development. PMID:26923725

  2. Dynamic Gene Regulatory Networks Drive Hematopoietic Specification and Differentiation

    PubMed Central

    Goode, Debbie K.; Obier, Nadine; Vijayabaskar, M.S.; Lie-A-Ling, Michael; Lilly, Andrew J.; Hannah, Rebecca; Lichtinger, Monika; Batta, Kiran; Florkowska, Magdalena; Patel, Rahima; Challinor, Mairi; Wallace, Kirstie; Gilmour, Jane; Assi, Salam A.; Cauchy, Pierre; Hoogenkamp, Maarten; Westhead, David R.; Lacaud, Georges; Kouskoff, Valerie; Göttgens, Berthold; Bonifer, Constanze

    2016-01-01

    Summary Metazoan development involves the successive activation and silencing of specific gene expression programs and is driven by tissue-specific transcription factors programming the chromatin landscape. To understand how this process executes an entire developmental pathway, we generated global gene expression, chromatin accessibility, histone modification, and transcription factor binding data from purified embryonic stem cell-derived cells representing six sequential stages of hematopoietic specification and differentiation. Our data reveal the nature of regulatory elements driving differential gene expression and inform how transcription factor binding impacts on promoter activity. We present a dynamic core regulatory network model for hematopoietic specification and demonstrate its utility for the design of reprogramming experiments. Functional studies motivated by our genome-wide data uncovered a stage-specific role for TEAD/YAP factors in mammalian hematopoietic specification. Our study presents a powerful resource for studying hematopoiesis and demonstrates how such data advance our understanding of mammalian development. PMID:26923725

  3. Transcriptomic sequencing reveals a set of unique genes activated by butyrate-induced histone modification

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Butyrate is a nutritional element with strong epigenetic regulatory activity as an inhibitor of histone deacetylases (HDACs). Based on the analysis of differentially expressed genes induced by butyrate in the bovine epithelial cell using deep RNA-sequencing technology (RNA-seq), a set of unique gen...

  4. Motif for controllable toggle switch in gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Zhao, Chen; Bin, Ao; Ye, Weiming; Fan, Ying; Di, Zengru

    2015-02-01

    The toggle switch, a common phenomenon in gene regulatory networks, has been recognized as important for biological functions. Despite much effort dedicated to understanding the toggle switch and designing synthetic biology circuits to achieve this biological function, we still lack a comprehensive understanding of the intrinsic dynamics behind the phenomenon and of the minimum structure that is imperative for producing a toggle switch. In this paper, we discover a minimum structure, a motif that enables a controllable toggle switch. In particular, the motif consists of a transformative double negative feedback loop (DNFL) that is regulated by an additional driver node. By enumerating all possible regulatory configurations from the driver node, we identify two types of motifs associated with the toggle switch, which is captured by the existence of bistable states. The toggle switch is controllable in the sense that the gap between the bistable states is adjustable as determined by the regulatory strength from the driver nodes. We test the effect of the motifs in a self-oscillating gene regulatory network (SON) with respect to the interplay between the motifs and the other genes, and find that the switching dynamics of the whole network can be successfully controlled insofar as the network contains a single motif. Our findings are important for uncovering the underlying nonlinear dynamics of the controllable toggle switch and can have implications for devising biological circuits in the field of synthetic biology.
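
    The idea of bistable states whose gap depends on a regulatory strength can be illustrated with a textbook mutual-repression model. The sketch below is not the motif analysed in the paper: it assumes two genes that repress each other through Hill functions, and it folds the driver input into a single hypothetical parameter `strength`; the function names, initial conditions, and step sizes are likewise assumptions.

```python
def simulate(x, y, strength, hill=2, dt=0.01, steps=5000):
    """Forward-Euler integration of two mutually repressing genes.

    dx/dt = strength / (1 + y**hill) - x
    dy/dt = strength / (1 + x**hill) - y
    """
    for _ in range(steps):
        dx = strength / (1.0 + y ** hill) - x
        dy = strength / (1.0 + x ** hill) - y
        x, y = x + dt * dx, y + dt * dy
    return x, y


def state_gap(strength):
    """Difference in x between the end states reached from opposite starts."""
    high_x, _ = simulate(2.0, 0.1, strength)
    low_x, _ = simulate(0.1, 2.0, strength)
    return high_x - low_x


if __name__ == "__main__":
    for strength in (1.0, 3.0, 4.0):
        print(strength, round(state_gap(strength), 3))
```

    In this toy model the two starting points converge to the same state when the strength is low (gap near zero) and to two distinct states when it is high, with the gap widening as the strength increases.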

  5. Mapping gene regulatory circuitry of Pax6 during neurogenesis

    PubMed Central

    Thakurela, Sudhir; Tiwari, Neha; Schick, Sandra; Garding, Angela; Ivanek, Robert; Berninger, Benedikt; Tiwari, Vijay K

    2016-01-01

    Pax6 is a highly conserved transcription factor among vertebrates and is important in various aspects of central nervous system development. However, the gene regulatory circuitry of Pax6 underlying these functions remains elusive. We find that Pax6 targets a large number of promoters in neural progenitor cells. Intriguingly, many of these sites are also bound by another progenitor factor, Sox2, which cooperates with Pax6 in gene regulation. A combinatorial analysis of the Pax6-binding data set with transcriptome changes in Pax6-deficient neural progenitors reveals a dual role for Pax6, in which it activates the neuronal (ectodermal) genes while concurrently repressing the mesodermal and endodermal genes, thereby ensuring the unidirectionality of lineage commitment towards neuronal differentiation. Furthermore, Pax6 is critical for inducing activity of transcription factors that elicit neurogenesis and for repressing others that promote non-neuronal lineages. In addition to many established downstream effectors, Pax6 directly binds and activates a number of genes that are specifically expressed in neural progenitors but have not been previously implicated in neurogenesis. The in utero knockdown of one such gene, Ift74, during brain development impairs polarity and migration of newborn neurons. These findings demonstrate new aspects of the gene regulatory circuitry of Pax6, revealing how it functions to control neuronal development at multiple levels to ensure unidirectionality and proper execution of the neurogenic program. PMID:27462442

  6. Topological origin of global attractors in gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Zhang, YunJun; Ouyang, Qi; Geng, Zhi

    2015-02-01

    Fixed-point attractors with global stability manifest themselves in a number of gene regulatory networks. This property indicates the stability of regulatory networks against small state perturbations and is closely related to other complex dynamics. In this paper, we aim to reveal the core modules in regulatory networks that determine their global attractors and the relationship between these core modules and other motifs. This work has been done in three steps. Firstly, inspired by the signal transmission in the regulation process, we extract a chain-like network model from regulation networks. We propose a module, the "ideal transmission chain (ITC)", which is proved to be sufficient and necessary (under certain conditions) to form a global fixed point in the context of chain-like networks. Secondly, by examining two well-studied regulatory networks (i.e., the cell-cycle regulatory networks of budding yeast and fission yeast), we identify the ideal modules in true regulation networks and demonstrate that the modules have a superior contribution to network stability (quantified by the relative size of the biggest attraction basin). Thirdly, in these two regulation networks, we find that the double negative feedback loops, which are the key motifs for forming bistability in regulation, are connected to these core modules with high network stability. These results shed new light on the connection between the topological features and the dynamic properties of regulatory networks.
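
    The stability measure named in this abstract, the relative size of the biggest attraction basin, can be computed by brute force for very small Boolean networks. The sketch below assumes synchronous updates and uses two hypothetical toy rules (a three-node chain and a two-node mutual repression); neither is the paper's ITC module or a yeast cell-cycle network, and the function names are illustrative.

```python
from collections import Counter
from itertools import product


def attractor_of(state, update):
    """Iterate the update rule until a state repeats; return the cycle reached."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = update(state)
    return frozenset(seen[seen.index(state):])


def largest_basin_fraction(update, n):
    """Relative size of the biggest attraction basin over all 2^n states."""
    basins = Counter(
        attractor_of(s, update) for s in product((0, 1), repeat=n)
    )
    return max(basins.values()) / 2 ** n


def chain(state):
    """Toy chain: the input switches off and the signal passes a -> b -> c."""
    a, b, c = state
    return (0, a, b)


def mutual_repression(state):
    """Toy double negative feedback loop between two nodes."""
    a, b = state
    return (1 - b, 1 - a)


if __name__ == "__main__":
    print(largest_basin_fraction(chain, 3))
    print(largest_basin_fraction(mutual_repression, 2))
```

    Every state of the toy chain drains into the all-OFF fixed point, so its largest basin covers the whole state space. The mutual-repression rule splits the state space between two fixed points and a two-state cycle, the latter being an artifact of synchronous updating.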

  7. Repressive BMP2 gene regulatory elements near the BMP2 promoter

    SciTech Connect

    Jiang, Shan; Chandler, Ronald L.; Fritz, David T.; Mortlock, Douglas P.; Rogers, Melissa B.

    2010-02-05

    The level of bone morphogenetic protein 2 (BMP2) profoundly influences essential cell behaviors such as proliferation, differentiation, apoptosis, and migration. The spatial and temporal pattern of BMP2 synthesis, particularly in diverse embryonic cells, is highly varied and dynamic. We have identified GC-rich sequences within the BMP2 promoter region that strongly repress gene expression. These elements block the activity of a highly conserved osteoblast enhancer in response to FGF2 treatment. Both positive and negative gene regulatory elements control BMP2 synthesis. Detecting and mapping the repressive motifs is essential because they impede the identification of developmentally regulated enhancers necessary for normal BMP2 patterns and concentration.

  8. Gene structure, regulatory control, and evolution of black widow venom latrotoxins

    PubMed Central

    Bhere, Kanaka Varun; Haney, Robert A.; Ayoub, Nadia A.; Garb, Jessica E.

    2014-01-01

    Black widow venom contains α-latrotoxin, infamous for causing intense pain. Combining 33 kb of Latrodectus hesperus genomic DNA with RNA-Seq, we characterized the α-latrotoxin gene and discovered a paralog, 4.5 kb downstream. Both paralogs exhibit venom gland specific transcription, and may be regulated post-transcriptionally via musashi-like proteins. A 4 kb intron interrupts the α-latrotoxin coding sequence, while a 10 kb intron in the 3′ UTR of the paralog may cause nonsense-mediated decay. Phylogenetic analysis confirms these divergent latrotoxins diversified through recent tandem gene duplications. Thus, latrotoxin genes have more complex structures, regulatory controls, and sequence diversity than previously proposed. PMID:25217831

  9. EXAMINE: a computational approach to reconstructing gene regulatory networks.

    PubMed

    Deng, Xutao; Geng, Huimin; Ali, Hesham

    2005-08-01

    Reverse-engineering of gene networks using linear models often results in an underdetermined system because of excessive unknown parameters. In addition, the practical utility of linear models has remained unclear. We address these problems by developing an improved method, EXpression Array MINing Engine (EXAMINE), to infer gene regulatory networks from time-series gene expression data sets. EXAMINE takes advantage of sparse graph theory to overcome the excessive-parameter problem with an adaptive-connectivity model and fitting algorithm. EXAMINE also guarantees that the most parsimonious network structure will be found with its incremental adaptive fitting process. Compared to previous linear models, where a fully connected model is used, EXAMINE reduces the number of parameters by O(N), thereby increasing the chance of recovering the underlying regulatory network. The fitting algorithm increments the connectivity during the fitting process until a satisfactory fit is obtained. We performed a systematic study to explore the data mining ability of linear models. A guideline for using linear models is provided: If the system is small (3-20 elements), more than 90% of the regulation pathways can be determined correctly. For a large-scale system, either clustering is needed or it is necessary to integrate information in addition to expression profiles. Coupled with the clustering method, we applied EXAMINE to rat central nervous system (CNS) development data with 112 genes. We were able to efficiently generate regulatory networks with statistically significant pathways that have been predicted previously. PMID:15951103

  10. Genome-Wide Identification of Regulatory Elements and Reconstruction of Gene Regulatory Networks of the Green Alga Chlamydomonas reinhardtii under Carbon Deprivation

    PubMed Central

    Vischi Winck, Flavia; Arvidsson, Samuel; Riaño-Pachón, Diego Mauricio; Hempel, Sabrina; Koseska, Aneta; Nikoloski, Zoran; Urbina Gomez, David Alejandro; Rupprecht, Jens; Mueller-Roeber, Bernd

    2013-01-01

    The unicellular green alga Chlamydomonas reinhardtii is a long-established model organism for studies on photosynthesis and carbon metabolism-related physiology. Under conditions of air-level carbon dioxide concentration [CO2], a carbon concentrating mechanism (CCM) is induced to facilitate cellular carbon uptake. CCM increases the availability of carbon dioxide at the site of cellular carbon fixation. To improve our understanding of the transcriptional control of the CCM, we employed FAIRE-seq (formaldehyde-assisted Isolation of Regulatory Elements, followed by deep sequencing) to determine nucleosome-depleted chromatin regions of algal cells subjected to carbon deprivation. Our FAIRE data recapitulated the positions of known regulatory elements in the promoter of the periplasmic carbonic anhydrase (Cah1) gene, which is upregulated during CCM induction, and revealed new candidate regulatory elements at a genome-wide scale. In addition, time series expression patterns of 130 transcription factor (TF) and transcription regulator (TR) genes were obtained for cells cultured under photoautotrophic condition and subjected to a shift from high to low [CO2]. Groups of co-expressed genes were identified and a putative directed gene-regulatory network underlying the CCM was reconstructed from the gene expression data using the recently developed IOTA (inner composition alignment) method. Among the candidate regulatory genes, two members of the MYB-related TF family, Lcr1 (Low-CO2 response regulator 1) and Lcr2 (Low-CO2 response regulator 2), may play an important role in down-regulating the expression of a particular set of TF and TR genes in response to low [CO2]. The results obtained provide new insights into the transcriptional control of the CCM and revealed more than 60 new candidate regulatory genes. Deep sequencing of nucleosome-depleted genomic regions indicated the presence of new, previously unknown regulatory elements in the C. reinhardtii genome. Our work can

  11. Gene therapy for cancer: regulatory considerations for approval

    PubMed Central

    Husain, S R; Han, J; Au, P; Shannon, K; Puri, R K

    2015-01-01

    The rapidly changing field of gene therapy promises a number of innovative treatments for cancer patients. Advances in genetic modification of cancer and immune cells and the use of oncolytic viruses and bacteria have led to numerous clinical trials for cancer therapy, with several progressing to late-stage product development. At the time of this writing, no gene therapy product has been approved by the United States Food and Drug Administration (FDA). Some of the key scientific and regulatory issues include understanding of gene transfer vector biology, safety of vectors in vitro and in animal models, optimum gene transfer, long-term persistence or integration in the host, shedding of a virus and ability to maintain transgene expression in vivo for a desired period of time. Because of the biological complexity of these products, the FDA encourages a flexible, data-driven approach for preclinical safety testing programs. The clinical trial design should be based on the unique features of gene therapy products, and should ensure the safety of enrolled subjects. This article focuses on regulatory considerations for gene therapy product development and also discusses guidance documents that have been published by the FDA. PMID:26584531

  12. Gene expression in maturing neurons: regulatory mechanisms and related neurodevelopmental disorders.

    PubMed

    Ding, Baojin

    2015-04-25

    During central nervous system (CNS) development, the interactions between intrinsic genes and the extrinsic environment ensure that each neuronal developmental stage (e.g., neuronal proliferation, differentiation, migration, axon extension, dendritogenesis and formation of functional synapses) occurs with the proper timing and sequence. This successful coordination requires that numerous groups of genes are exquisitely regulated in a spatiotemporal manner by various regulatory mechanisms, including sequence-specific DNA-binding proteins, histone modifications, DNA methylation, chromatin remodeling, and microRNAs (miRNAs). By targeting chromatin structure, transcription and translation processes, these mechanisms form a regulatory network to accomplish the fine regulation of gene expression in response to environmental stimuli at different developmental stages. Dysregulation of gene expression during neuronal development has been implicated in a number of neurodevelopmental disorders, such as autism spectrum disorders (ASD), Rett syndrome (RTT), Fragile-X syndrome (FXS) and other genetic diseases. Further understanding of the regulation of gene expression during neuronal development may provide new approaches for the diagnosis and treatment of these disorders. PMID:25896042

  13. Conserved cis-regulatory modules in promoters of genes encoding wheat high-molecular-weight glutenin subunits

    PubMed Central

    Ravel, Catherine; Fiquet, Samuel; Boudet, Julie; Dardevet, Mireille; Vincent, Jonathan; Merlino, Marielle; Michard, Robin; Martre, Pierre

    2014-01-01

    The concentration and composition of the gliadin and glutenin seed storage proteins (SSPs) in wheat flour are the most important determinants of its end-use value. In cereals, the synthesis of SSPs is predominantly regulated at the transcriptional level by a complex network involving at least five cis-elements in gene promoters. The high-molecular-weight glutenin subunits (HMW-GS) are encoded by two tightly linked genes located on the long arms of group 1 chromosomes. Here, we sequenced and annotated the HMW-GS gene promoters of 22 electrophoretic wheat alleles to identify putative cis-regulatory motifs. We focused on 24 motifs known to be involved in SSP gene regulation. Most of them were identified in at least one HMW-GS gene promoter sequence. A common regulatory framework was observed in all the HMW-GS gene promoters, as they shared conserved cis-regulatory modules (CCRMs) including all the five motifs known to regulate the transcription of SSP genes. This common regulatory framework comprises a composite box made of the GATA motifs and GCN4-like Motifs (GLMs) and was shown to be functional as the GLMs are able to bind a bZIP transcriptional factor SPA (Storage Protein Activator). In addition to this regulatory framework, each HMW-GS gene promoter had additional motifs organized differently. The promoters of most highly expressed x-type HMW-GS genes contain an additional box predicted to bind R2R3-MYB transcriptional factors. However, the differences in annotation between promoter alleles could not be related to their level of expression. In summary, we identified a common modular organization of HMW-GS gene promoters but the lack of correlation between the cis-motifs of each HMW-GS gene promoter and their level of expression suggests that other cis-elements or other mechanisms regulate HMW-GS gene expression. PMID:25429295

  14. Using gene expression programming to infer gene regulatory networks from time-series data.

    PubMed

    Zhang, Yongqing; Pu, Yifei; Zhang, Haisen; Su, Yabo; Zhang, Lifang; Zhou, Jiliu

    2013-12-01

    Gene regulatory network inference is currently a topic under heavy research in the systems biology field. In this paper, gene regulatory networks are inferred via an evolutionary model based on time-series microarray data. A non-linear differential equation model is adopted. Gene expression programming (GEP) is applied to identify the structure of the model and least mean square (LMS) is used to optimize the parameters in the ordinary differential equations (ODEs). The proposed method was first verified on synthetic noise-free and noisy time-series data, and its effectiveness was then confirmed on three real time-series expression datasets. Finally, a gene regulatory network was constructed with 12 yeast genes. Experimental results demonstrate that our model can effectively improve the prediction accuracy for microarray time-series data. PMID:24140883
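
    The parameter-fitting half of this approach can be sketched as follows. The sketch assumes the model structure has already been chosen (here a hypothetical one-gene ODE, dx/dt = a*u - b*x, standing in for whatever structure GEP would return) and applies the standard LMS (Widrow-Hoff) update to estimate a and b; the function name `lms_fit`, the step size, and the synthetic samples are illustrative assumptions, not the authors' implementation.

```python
def lms_fit(samples, mu=0.1, epochs=2000):
    """Estimate a and b in dx/dt = a*u - b*x with the LMS (Widrow-Hoff) rule.

    samples: iterable of (u, x, dxdt) triples, where u is the regulator level,
    x is the target gene level and dxdt is the observed rate of change.
    """
    a, b = 0.0, 0.0
    for _ in range(epochs):
        for u, x, dxdt in samples:
            err = dxdt - (a * u - b * x)   # prediction error for this sample
            a += mu * err * u              # gradient of the prediction w.r.t. a is u
            b -= mu * err * x              # gradient of the prediction w.r.t. b is -x
    return a, b


# Noise-free synthetic samples (assumed data) generated with a = 2.0, b = 0.5.
true_a, true_b = 2.0, 0.5
points = [(1.0, 0.0), (0.5, 1.0), (0.2, 2.0), (1.0, 1.5), (0.0, 1.0), (0.8, 0.3)]
samples = [(u, x, true_a * u - true_b * x) for u, x in points]

a_hat, b_hat = lms_fit(samples)
print(a_hat, b_hat)
```

    With real expression data the rate of change would have to be approximated, for example by finite differences between time points, and the step size would need tuning to the scale of the measurements.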

  15. Sequence and gene expression evolution of paralogous genes in willows.

    PubMed

    Harikrishnan, Srilakshmy L; Pucholt, Pascal; Berlin, Sofia

    2015-01-01

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive, hence, functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows. PMID:26689951
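
    The step from a Ka/Ks value to a selection regime follows the usual convention: a ratio well below 1 indicates purifying selection, a ratio near 1 is consistent with neutral evolution, and a ratio above 1 suggests positive selection. The sketch below encodes that rule; the function name `selection_regime` and the width of the "near 1" band are assumptions made for illustration only.

```python
def selection_regime(ka, ks, band=0.1):
    """Classify a Ka/Ks ratio; `band` is an assumed tolerance around 1."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    omega = ka / ks
    if omega < 1 - band:
        return "purifying"
    if omega > 1 + band:
        return "positive"
    return "neutral"


# The mean ratio reported in the abstract, expressed as Ka = 0.23 with Ks = 1.0.
regime = selection_regime(0.23, 1.0)
print(regime)
```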

  16. Sequence and gene expression evolution of paralogous genes in willows

    PubMed Central

    Harikrishnan, Srilakshmy L.; Pucholt, Pascal; Berlin, Sofia

    2015-01-01

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive, hence, functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows. PMID:26689951

  17. From System-Wide Differential Gene Expression to Perturbed Regulatory Factors: A Combinatorial Approach

    PubMed Central

    Mahajan, Gaurang; Mande, Shekhar C.

    2015-01-01

    High-throughput experiments such as microarrays and deep sequencing provide large scale information on the pattern of gene expression, which undergoes extensive remodeling as the cell dynamically responds to varying environmental cues or has its function disrupted under pathological conditions. An important initial step in the systematic analysis and interpretation of genome-scale expression alteration involves identification of a set of perturbed transcriptional regulators whose differential activity can provide a proximate hypothesis to account for these transcriptomic changes. In the present work, we propose an unbiased and logically natural approach to transcription factor enrichment. It involves overlaying a list of experimentally determined differentially expressed genes on a background regulatory network coming from e.g. literature curation or computational motif scanning, and identifying that subset of regulators whose aggregated target set best discriminates between the altered and the unaffected genes. In other words, our methodology entails testing of all possible regulatory subnetworks, rather than just the target sets of individual regulators as is followed in most standard approaches. We have proposed an iterative search method to efficiently find such a combination, and benchmarked it on E. coli microarray and regulatory network data available in the public domain. Comparative analysis carried out on artificially generated differential expression profiles, as well as empirical factor overexpression data for M. tuberculosis, shows that our methodology provides marked improvement in accuracy of regulatory inference relative to the standard method that involves evaluating factor enrichment in an individual manner. PMID:26562430
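
    The idea of scoring combinations of regulators, rather than one regulator at a time, can be illustrated with a small sketch. It assumes a greedy forward search and an F1 score between the pooled target set and the altered genes; both are stand-ins chosen for brevity, since the abstract does not specify the scoring function or the details of the iterative search. The names `f1_score`, `greedy_regulator_set`, and the toy network are hypothetical.

```python
def f1_score(predicted, altered):
    """F1 overlap between a pooled target set and the altered gene set."""
    true_pos = len(predicted & altered)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(altered)
    return 2 * precision * recall / (precision + recall)


def greedy_regulator_set(network, altered):
    """Add the regulator that most improves the pooled score until no gain remains."""
    chosen, covered, best = [], set(), 0.0
    while True:
        candidates = [(f1_score(covered | network[reg], altered), reg)
                      for reg in sorted(network) if reg not in chosen]
        if not candidates:
            break
        score, reg = max(candidates, key=lambda item: item[0])
        if score <= best:
            break                      # no remaining regulator improves the fit
        chosen.append(reg)
        covered |= network[reg]
        best = score
    return chosen, best


# Toy background network (assumed data): regulator -> set of target genes.
network = {
    "A": {"g1", "g2"},
    "B": {"g3", "g4"},
    "C": {"g1", "g5", "g6", "g7"},
}
altered = {"g1", "g2", "g3", "g4"}

regulators, score = greedy_regulator_set(network, altered)
print(regulators, score)
```

    In this toy case neither A nor B alone explains more than half of the altered genes, while their combined target set matches the altered set exactly, which is the situation where per-regulator enrichment is least informative.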

  18. Effects of Four Different Regulatory Mechanisms on the Dynamics of Gene Regulatory Cascades

    PubMed Central

    Hansen, Sabine; Krishna, Sandeep; Semsey, Szabolcs; Lo Svenningsen, Sine

    2015-01-01

    Gene regulatory cascades (GRCs) are common motifs in cellular molecular networks. A given logical function in these cascades, such as the repression of the activity of a transcription factor, can be implemented by a number of different regulatory mechanisms. The potential consequences for the dynamic performance of the GRC of choosing one mechanism over another have not been analysed systematically. Here, we report the construction of a synthetic GRC in Escherichia coli, which allows us for the first time to directly compare and contrast the dynamics of four different regulatory mechanisms, affecting the transcription, translation, stability, or activity of a transcriptional repressor. We developed a biologically motivated mathematical model which is sufficient to reproduce the response dynamics determined by experimental measurements. Using the model, we explored the potential response dynamics that the constructed GRC can perform. We conclude that dynamic differences between regulatory mechanisms at an individual step in a GRC are often concealed in the overall performance of the GRC, and suggest that the presence of a given regulatory mechanism in a certain network environment does not necessarily mean that it represents a single optimal evolutionary solution. PMID:26184971

  19. Effects of Four Different Regulatory Mechanisms on the Dynamics of Gene Regulatory Cascades

    NASA Astrophysics Data System (ADS)

    Hansen, Sabine; Krishna, Sandeep; Semsey, Szabolcs; Lo Svenningsen, Sine

    2015-07-01

    Gene regulatory cascades (GRCs) are common motifs in cellular molecular networks. A given logical function in these cascades, such as the repression of the activity of a transcription factor, can be implemented by a number of different regulatory mechanisms. The potential consequences for the dynamic performance of the GRC of choosing one mechanism over another have not been analysed systematically. Here, we report the construction of a synthetic GRC in Escherichia coli, which allows us for the first time to directly compare and contrast the dynamics of four different regulatory mechanisms, affecting the transcription, translation, stability, or activity of a transcriptional repressor. We developed a biologically motivated mathematical model which is sufficient to reproduce the response dynamics determined by experimental measurements. Using the model, we explored the potential response dynamics that the constructed GRC can perform. We conclude that dynamic differences between regulatory mechanisms at an individual step in a GRC are often concealed in the overall performance of the GRC, and suggest that the presence of a given regulatory mechanism in a certain network environment does not necessarily mean that it represents a single optimal evolutionary solution.

  20. Identification of C4 photosynthesis metabolism and regulatory-associated genes in Eleocharis vivipara by SSH.

    PubMed

    Chen, Taiyu; Ye, Rongjian; Fan, Xiaolei; Li, Xianghua; Lin, Yongjun

    2011-09-01

    This is the first effort to investigate the candidate genes involved in Kranz developmental regulation and C4 metabolic fluxes in Eleocharis vivipara, which is a leafless freshwater amphibious plant and possesses a distinct culms anatomy structure and photosynthetic pattern in contrasting environments. A terrestrial specific SSH library was constructed to investigate the genes involved in Kranz anatomy developmental regulation and C4 metabolic fluxes. A total of 73 ESTs and 56 unigenes in 384 clones were identified by array hybridization and sequencing. In total, 50 unigenes had homologous genes in the databases of rice and Arabidopsis. The real-time quantitative PCR results showed that most of the genes were accumulated in terrestrial culms and ABA-induced culms. The C4 marker genes were stably accumulated during the culms development process in terrestrial culms. With respect to C3 culms, C4 photosynthesis metabolism consumed much more transporters and translocators related to ion metabolism, organic acids and carbohydrate metabolism, phosphate metabolism, amino acids metabolism, and lipids metabolism. Additionally, ten regulatory genes including five transcription factors, four receptor-like proteins, and one BURP protein were identified. These regulatory genes, which co-accumulated with the culms developmental stages, may play important roles in culms structure developmental regulation, bundle sheath chloroplast maturation, and environmental response. These results shed new light on the C4 metabolic fluxes, environmental response, and anatomy structure developmental regulation in E. vivipara. PMID:21739352

  1. Demystifying the secret mission of enhancers: linking distal regulatory elements to target genes

    PubMed Central

    Yao, Lijing; Berman, Benjamin P.; Farnham, Peggy J.

    2015-01-01

    Enhancers are short regulatory sequences bound by sequence-specific transcription factors and play a major role in the spatiotemporal specificity of gene expression patterns in development and disease. While it is now possible to identify enhancer regions genomewide in both cultured cells and primary tissues using epigenomic approaches, it has been more challenging to develop methods to understand the function of individual enhancers because enhancers are located far from the gene(s) that they regulate. However, it is essential to identify target genes of enhancers not only so that we can understand the role of enhancers in disease but also because this information will assist in the development of future therapeutic options. After reviewing models of enhancer function, we discuss recent methods for identifying target genes of enhancers. First, we describe chromatin structure-based approaches for directly mapping interactions between enhancers and promoters. Second, we describe the use of correlation-based approaches to link enhancer state with the activity of nearby promoters and/or gene expression. Third, we describe how to test the function of specific enhancers experimentally by perturbing enhancer–target relationships using high-throughput reporter assays and genome editing. Finally, we conclude by discussing as yet unanswered questions concerning how enhancers function, how target genes can be identified, and how to distinguish direct from indirect changes in gene expression mediated by individual enhancers. PMID:26446758

  2. Roles of lignin biosynthesis and regulatory genes in plant development.

    PubMed

    Yoon, Jinmi; Choi, Heebak; An, Gynheung

    2015-11-01

    Lignin is an important factor affecting agricultural traits, biofuel production, and the pulping industry. Most lignin biosynthesis genes and their regulatory genes are expressed mainly in the vascular bundles of stems and leaves, preferentially in tissues undergoing lignification. Other genes are poorly expressed during normal stages of development, but are strongly induced by abiotic or biotic stresses. Some are expressed in non-lignifying tissues such as the shoot apical meristem. Alterations in lignin levels affect plant development. Suppression of lignin biosynthesis genes causes abnormal phenotypes such as collapsed xylem, bending stems, and growth retardation. The loss of expression by genes that function early in the lignin biosynthesis pathway results in more severe developmental phenotypes when compared with plants that have mutations in later genes. Defective lignin deposition is also associated with phenotypes of seed shattering or brittle culm. MYB and NAC transcriptional factors function as switches, and some homeobox proteins negatively control lignin biosynthesis genes. Ectopic deposition caused by overexpression of lignin biosynthesis genes or master switch genes induces curly leaf formation and dwarfism. PMID:26297385

  3. C DNA SEQUENCE OF CHANNEL CATFISH PEROXIREDOXIN 6 GENE

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Peroxiredoxin 6 gene (Prdx6) of channel catfish, Ictalurus punctatus, was cloned and sequenced. Total RNA from channel catfish tissues was isolated, reverse transcribed and amplified. The sequence of the channel catfish Prdx6 gene consists of 1003 nucleotides. Analysis of the nucleotide sequence ...

  4. Functional Studies of Regulatory Genes in the Sea Urchin Embryo

    NASA Astrophysics Data System (ADS)

    Cavalieri, Vincenzo; Bernardo, Maria Di; Spinelli, Giovanni

    Sea urchin embryos are characterized by an extremely simple mode of development, rapid cleavage, high transparency, and well-defined cell lineage. Although they are not suitable for genetic studies, other approaches are successfully used to unravel mechanisms and molecules involved in cell fate specification and morphogenesis. Microinjection is the elective method to study gene function in sea urchin embryos. It is used to deliver precise amounts of DNA, RNA, oligonucleotides, peptides, or antibodies into the eggs or even into blastomeres. Here we describe microinjection as it is currently applied in our laboratory and show how it has been used in gene perturbation analyses and dissection of cis-regulatory DNA elements.

  5. PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation

    PubMed Central

    Portales-Casamar, Elodie; Kirov, Stefan; Lim, Jonathan; Lithwick, Stuart; Swanson, Magdalena I; Ticoll, Amy; Snoddy, Jay; Wasserman, Wyeth W

    2007-01-01

    PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation with associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR, located at , is open for business. PMID:17916232

  6. An Arabidopsis Gene Regulatory Network for Secondary Cell Wall Synthesis

    PubMed Central

    Taylor-Teeples, M; Lin, L; de Lucas, M; Turco, G; Toal, TW; Gaudinier, A; Young, NF; Trabucco, GM; Veling, MT; Lamothe, R; Handakumbura, PP; Xiong, G; Wang, C; Corwin, J; Tsoukalas, A; Zhang, L; Ware, D; Pauly, M; Kliebenstein, DJ; Dehesh, K; Tagkopoulos, I; Breton, G; Pruneda-Paz, JL; Ahnert, SE; Kay, SA; Hazen, SP; Brady, SM

    2014-01-01

    The plant cell wall is an important factor for determining cell shape, function and response to the environment. Secondary cell walls, such as those found in xylem, are composed of cellulose, hemicelluloses and lignin and account for the bulk of plant biomass. The coordination between transcriptional regulation of synthesis for each polymer is complex and vital to cell function. A regulatory hierarchy of developmental switches has been proposed, although the full complement of regulators remains unknown. Here, we present a protein-DNA network between Arabidopsis transcription factors and secondary cell wall metabolic genes with gene expression regulated by a series of feed-forward loops. This model allowed us to develop and validate new hypotheses about secondary wall gene regulation under abiotic stress. Distinct stresses are able to perturb targeted genes to potentially promote functional adaptation. These interactions will serve as a foundation for understanding the regulation of a complex, integral plant component. PMID:25533953

  7. Duplication of floral regulatory genes in the Lamiales.

    PubMed

    Aagaard, Jan E; Olmstead, Richard G; Willis, John H; Phillips, Patrick C

    2005-08-01

    Duplication of some floral regulatory genes has occurred repeatedly in angiosperms, whereas others are thought to be single-copy in most lineages. We selected three genes that interact in a pathway regulating floral development conserved among higher tricolpates (LFY/FLO, UFO/FIM, and AP3/DEF) and screened for copy number among families of Lamiales that are closely related to the model species Antirrhinum majus. We show that two of three genes have duplicated at least twice in the Lamiales. Phylogenetic analyses of paralogs suggest that an ancient whole genome duplication shared among many families of Lamiales occurred after the ancestor of these families diverged from the lineage leading to Veronicaceae (including the single-copy species A. majus). Duplication is consistent with previous patterns among angiosperm lineages for AP3/DEF, but this is the first report of functional duplicate copies of LFY/FLO outside of tetraploid species. We propose Lamiales taxa will be good models for understanding mechanisms of duplicate gene preservation and how floral regulatory genes may contribute to morphological diversity. PMID:21646149

  8. Transcriptional Targeting in the Airway Using Novel Gene Regulatory Elements

    PubMed Central

    Burnight, Erin R.; Wang, Guoshun; McCray, Paul B.

    2012-01-01

    The delivery of cystic fibrosis transmembrane conductance regulator (CFTR) to airway epithelia is a goal of many gene therapy strategies to treat cystic fibrosis. Because the native regulatory elements of the CFTR are not well characterized, the development of vectors with heterologous promoters of varying strengths and specificity would aid in our selection of optimal reagents for the appropriate expression of the vector-delivered CFTR gene. Here we contrasted the performance of several novel gene-regulatory elements. Based on airway expression analysis, we selected putative regulatory elements from BPIFA1 and WDR65 to investigate. In addition, we selected a human CFTR promoter region (~2 kb upstream of the human CFTR transcription start site) to study. Using feline immunodeficiency virus vectors containing the candidate elements driving firefly luciferase, we transduced murine nasal epithelia in vivo. Luciferase expression persisted for 30 weeks, which was the duration of the experiment. Furthermore, when the nasal epithelium was ablated using the detergent polidocanol, the mice showed a transient loss of luciferase expression that returned 2 weeks after administration, suggesting that our vectors transduced a progenitor cell population. Importantly, the hWDR65 element drove sufficient CFTR expression to correct the anion transport defect in CFTR-null epithelia. These results will guide the development of optimal vectors for sufficient, sustained CFTR expression in airway epithelia. PMID:22447971

  9. Phase transitions in the evolution of gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Skanata, Antun; Kussell, Edo

    The role of gene regulatory networks is to respond to environmental conditions and optimize growth of the cell. A typical example is found in bacteria, where metabolic genes are activated in response to nutrient availability, and are subsequently turned off to conserve energy when their specific substrates are depleted. However, in fluctuating environmental conditions, regulatory networks could experience strong evolutionary pressures not only to turn the right genes on and off, but also to respond optimally under a wide spectrum of fluctuation timescales. The outcome of evolution is predicted by the long-term growth rate, which differentiates between optimal strategies. Here we present an analytic computation of the long-term growth rate in randomly fluctuating environments, by using mean-field and higher order expansion in the environmental history. We find that optimal strategies correspond to distinct regions in the phase space of fluctuations, separated by first and second order phase transitions. The statistics of environmental randomness are shown to dictate the possible evolutionary modes, which either change the structure of the regulatory network abruptly, or gradually modify and tune the interactions between its components.

  10. Exceptionally high heterologous protein levels in transgenic dicotyledonous seeds using Phaseolus vulgaris regulatory sequences.

    PubMed

    De Jaeger, Geert; Angenon, Geert; Depicker, Ann

    2003-01-01

    Seeds are concentrated sources of protein and thus may be ideal 'bioreactors' for the production of heterologous proteins. For this application, strong seed-specific expression signals are required. A set of expression cassettes were designed using 5' and 3' regulatory sequences of the seed storage protein gene arcelin 5-I (arc5-I) from Phaseolus vulgaris, and evaluated for the production of heterologous proteins in dicotyledonous plant species. A murine single-chain variable fragment (scFv) was chosen as model protein because of the current industrial interest to produce antibodies and derived fragments in crops. Because the highest scFv accumulation in seed had previously been achieved in the endoplasmic reticulum (ER), the scFv-encoding sequence was provided with signal sequences for accumulation in the ER. Transgenic Arabidopsis seed stocks, expressing the scFv under control of the 35S promoter, contained scFv accumulation levels in the range of 1% of total soluble protein (TSP). However, the seed storage promoter constructs boosted the scFv to exceptionally high levels. Maximum scFv levels were obtained in homozygous seed stocks, being 12.5% of TSP under control of the arc5-I regulatory sequences and even up to 36.5% of TSP upon replacing the arc5-I promoter by the beta-phaseolin promoter of Phaseolus vulgaris. Even at such very high levels, the scFv proteins retain their full antigen-binding activity. Moreover, the presence of very high scFv levels has only minor effects on seed germination and no effect on seed production. These results demonstrate that the expression levels of arcelin 5-I and beta-phaseolin seed storage protein genes can be transferred to heterologous proteins, giving exceptionally high levels of heterologous proteins, which can be of great value for the molecular farming industry by raising production yield and lowering bio-mass production and purification costs. Finally, the feasibility of heterologous protein production using the

  11. iRegulon: From a Gene List to a Gene Regulatory Network Using Large Motif and Track Collections

    PubMed Central

    Imrichová, Hana; Van de Sande, Bram; Standaert, Laura; Christiaens, Valerie; Hulselmans, Gert; Herten, Koen; Naval Sanchez, Marina; Potier, Delphine; Svetlichnyy, Dmitry; Kalender Atak, Zeynep; Fiers, Mark; Marine, Jean-Christophe; Aerts, Stein

    2014-01-01

    Identifying master regulators of biological processes and mapping their downstream gene networks are key challenges in systems biology. We developed a computational method, called iRegulon, to reverse-engineer the transcriptional regulatory network underlying a co-expressed gene set using cis-regulatory sequence analysis. iRegulon implements a genome-wide ranking-and-recovery approach to detect enriched transcription factor motifs and their optimal sets of direct targets. We increase the accuracy of network inference by using very large motif collections of up to ten thousand position weight matrices collected from various species, and linking these to candidate human TFs via a motif2TF procedure. We validate iRegulon on gene sets derived from ENCODE ChIP-seq data with increasing levels of noise, and we compare iRegulon with existing motif discovery methods. Next, we use iRegulon on more challenging types of gene lists, including microRNA target sets, protein-protein interaction networks, and genetic perturbation data. In particular, we over-activate p53 in breast cancer cells, followed by RNA-seq and ChIP-seq, and could identify an extensive up-regulated network controlled directly by p53. Similarly we map a repressive network with no indication of direct p53 regulation but rather an indirect effect via E2F and NFY. Finally, we generalize our computational framework to include regulatory tracks such as ChIP-seq data and show how motif and track discovery can be combined to map functional regulatory interactions among co-expressed genes. iRegulon is available as a Cytoscape plugin from http://iregulon.aertslab.org. PMID:25058159
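
    The ranking-and-recovery idea mentioned in this abstract can be illustrated with a small sketch: genes are ranked genome-wide by a motif score, and a motif counts as enriched when the input gene set is recovered early in that ranking. The function name `recovery_auc`, the 3% cutoff, and the normalisation are assumptions made for illustration; the abstract does not give iRegulon's actual parameters or implementation.

```python
def recovery_auc(ranked_genes, gene_set, top_fraction=0.03):
    """Normalised area under the recovery curve over the top of a ranking.

    ranked_genes: gene IDs ordered from best to worst motif score.
    gene_set:     the co-expressed input genes to recover.
    top_fraction: share of the ranking to inspect (assumed value).
    Returns 1.0 when all recoverable genes sit at the very top, 0.0 when
    none appear within the cutoff.
    """
    cutoff = max(1, int(len(ranked_genes) * top_fraction))
    targets = set(gene_set)
    max_hits = min(len(targets), cutoff)
    if max_hits == 0:
        return 0.0

    recovered = 0
    area = 0
    for gene in ranked_genes[:cutoff]:
        if gene in targets:
            recovered += 1
        area += recovered  # height of the recovery curve at this rank

    # Best case: every inspected rank is a hit until the set is exhausted.
    best_area = sum(min(i, max_hits) for i in range(1, cutoff + 1))
    return area / best_area
```

    Comparing this value across many motifs (for example as a z-score against the distribution over the whole motif collection) is one way to pick out enriched motifs; that step is omitted here.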

  12. Genome-wide analysis reveals regulatory role of G4 DNA in gene transcription

    PubMed Central

    Du, Zhuo; Zhao, Yiqiang; Li, Ning

    2008-01-01

    G-quadruplex or G4 DNA, a four-stranded DNA structure formed in G-rich sequences, has been hypothesized to be a structural motif involved in gene regulation. In this study, we examined the regulatory role of potential G4 DNA motifs (PG4Ms) located in the putative transcriptional regulatory region (TRR, –500 to +500) of genes across the human genome. We found that PG4Ms in the 500-bp region downstream of the annotated transcription start site (TSS; PG4MD500) are associated with gene expression. Generally, PG4MD500-positive genes are expressed at higher levels than PG4MD500-negative genes, and an increased number of PG4MD500 provides a cumulative effect. This observation was validated by controlling for attributes, including gene family, function, and promoter similarity. We also observed an asymmetric pattern of PG4MD500 distribution between strands, whereby the frequency of PG4MD500 in the coding strand is generally higher than that in the template strand. Further analysis showed that the presence of PG4MD500 and its strand asymmetry are associated with significant enrichment of RNAP II at the putative TRR. On the basis of these results, we propose a model of G4 DNA-mediated stimulation of transcription with the hypothesis that PG4MD500 contributes to gene transcription by maintaining the DNA in an open conformation, while the asymmetric distribution of PG4MD500 considerably reduces the probability of blocking the progression of the RNA polymerase complex on the template strand. Our findings provide a comprehensive view of the regulatory function of G4 DNA in gene transcription. PMID:18096746
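
    A scan for potential G4 motifs on both strands can be sketched as below. The pattern used (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) is a widely used heuristic and is an assumption here; the abstract does not state the exact motif definition used in the study. The names `G4_RE` and `find_pg4` are hypothetical, and strands are reported relative to the input sequence.

```python
import re

# Assumed motif definition: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+
G4_RE = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pg4(seq):
    """Find non-overlapping potential G4 motifs on both strands.

    Returns a sorted list of (start, end, strand) tuples with 0-based,
    half-open coordinates on the input sequence.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), m.end(), "+") for m in G4_RE.finditer(seq)]
    # A G-rich motif on the opposite strand appears as a C-rich run here,
    # so scan the reverse complement and map coordinates back.
    for m in G4_RE.finditer(reverse_complement(seq)):
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)
```

    Counting "+" and "-" hits in the 500 bp downstream of each annotated TSS would give per-gene motif counts and a strand asymmetry of the kind the abstract describes, given gene coordinates and orientation.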

  13. GSEL version 2, an online genome-wide query system of operon organization and regulatory sequence elements of Geobacter sulfurreducens.

    PubMed

    Qu, Yanhua; Brown, Peter; Barbe, Jose F; Puljic, Marko; Merino, Enrique; Adkins, Ronald M; Lovley, Derek R; Krushkal, Julia

    2009-10-01

    Geobacter sulfurreducens is a model organism within the delta-Proteobacterial family Geobacteraceae, members of which can participate in environmental bioremediation of metal and organic waste contaminants and in production of bioenergy. In this report, we describe a new, significantly expanded and updated, version 2 of the GSEL (Geobacter Sequence Elements) database ( http://geobacter.org/research/gsel2/ and http://geobacter.org/refs/gsel2/ ) and its accompanying online query system, which compiles information on operon organization and regulatory sequence elements in the genome of G. sulfurreducens. It incorporates a new online graphical browser, provides novel search capabilities, and includes updated operon predictions along with new information on predicted and experimentally validated genome regulatory sites. The GSEL database and online search system provides a unique and comprehensive tool cataloging information about gene regulation in G. sulfurreducens, aiding in investigation of mechanisms that regulate its ability to generate electric power, bioremediate environmental waste, and adapt to environmental changes. PMID:19792871

  14. Regulatory elements responsible for inducible expression of the granulocyte colony-stimulating factor gene in macrophages.

    PubMed Central

    Nishizawa, M; Nagata, S

    1990-01-01

    Granulocyte colony-stimulating factor (G-CSF) plays an essential role in granulopoiesis during bacterial infection. Macrophages produce G-CSF in response to bacterial endotoxins such as lipopolysaccharide (LPS). To elucidate the mechanism of the induction of the G-CSF gene in macrophages or macrophage-monocytes, we have examined regulatory cis elements in the promoter of the mouse G-CSF gene. Analyses of linker-scanning and internal deletion mutants of the G-CSF promoter by the chloramphenicol acetyltransferase assay have indicated that at least three regulatory elements are indispensable for the LPS-induced expression of the G-CSF gene in macrophages. When one of the three elements was reiterated and placed upstream of the TATA box of the G-CSF promoter, it mediated inducibility as a tissue-specific and orientation-independent enhancer. Although this element contains a conserved NF-kappa B-like binding site, the gel retardation assay and DNA footprint analysis with nuclear extracts from macrophage cell lines demonstrated that nuclear proteins bind to the DNA sequence downstream of the NF-kappa B-like element, but not to the conserved element itself. The DNA sequence of the binding site was found to have some similarities to the LPS-responsive element which was recently identified in the promoter of the mouse class II major histocompatibility gene. PMID:1691438

  15. Organisation of regulatory elements in two closely spaced Drosophila genes with common expression characteristics.

    PubMed

    Gigliotti, S; Balz, V; Malva, C; Schäfer, M A

    1997-11-01

    Sperm tail proteins that are components of a specific structure formed late during spermatid elongation have been found to be encoded by the Mst(3)CGP gene family. These genes have been demonstrated to be regulated both at the transcriptional as well as at the translational level. We report here on the dissection of the regulatory regions for two members of the gene family, Mst84Da and Mst84Db. While high level transcription and negative translational control of Mst84Da is mediated by a short gene segment of 205 nt (-152/+53), Mst84Db expression is controlled by a number of distinct regulatory elements with different effects that all reside within the gene itself. We identify a transcriptional control element between +154 and +216, a translational repression element around +216 to +275 and an RNA stability element within the 3'UTR. Irrespective of the final common expression characteristics, correct regulation for any individual member of the gene family seems to be achieved by very different means. This confirms earlier observations that did not detect any other sequence elements in common apart from the TCE (translational control element). PMID:9431808

  16. The complete nucleotide sequence and structure of the gene encoding bovine phenylethanolamine N-methyltransferase.

    PubMed

    Batter, D K; D'Mello, S R; Turzai, L M; Hughes, H B; Gioio, A E; Kaplan, B B

    1988-03-01

    A cDNA clone for bovine adrenal phenylethanolamine N-methyltransferase (PNMT) was used to screen a Charon 28 genomic library. One phage was identified, designated lambda P1, which included the entire PNMT gene. Construction of a restriction map, with subsequent Southern blot analysis, allowed the identification of exon-containing fragments. Dideoxy sequence analysis of these fragments, and several more further upstream, indicates that the bovine PNMT gene is 1,594 base pairs in length, consisting of three exons and two introns. The transcription initiation site was identified by two independent methods and is located approximately 12 base pairs upstream from the ATG translation start site. The 3' untranslated region is 88 base pairs in length and contains the expected polyadenylation signal (AATAAA). A putative promoter sequence (TATA box) is located about 25 base pairs upstream from the transcription initiation site. Computer comparison of the nucleotide sequence data with the consensus sequences of known regulatory elements revealed potential binding sites for glucocorticoid receptors and the Sp1 regulatory protein in the 5' flanking region of the gene. Additionally, comparison of the sequence of the exons of the PNMT gene with cDNA sequences for other enzymes involved in biogenic amine synthesis revealed no significant homology, indicating that PNMT is not a member of a multigene family of catecholamine biosynthetic enzymes. PMID:3379652

  17. Gene and translation initiation site prediction in metagenomic sequences

    SciTech Connect

    Hyatt, Philip Douglas; LoCascio, Philip F; Hauser, Loren John; Uberbacher, Edward C

    2012-01-01

    Gene prediction in metagenomic sequences remains a difficult problem. Current sequencing technologies do not achieve sufficient coverage to assemble the individual genomes in a typical sample; consequently, sequencing runs produce a large number of short sequences whose exact origin is unknown. Since these sequences are usually smaller than the average length of a gene, algorithms must make predictions based on very little data. We present MetaProdigal, a metagenomic version of the gene prediction program Prodigal, that can identify genes in short, anonymous coding sequences with a high degree of accuracy. The novel value of the method consists of enhanced translation initiation site identification, ability to identify sequences that use alternate genetic codes and confidence values for each gene call. We compare the results of MetaProdigal with other methods and conclude with a discussion of future improvements.

  18. Identification of genes in genomic and EST sequences

    SciTech Connect

    Fields, C.; Adams, M.D.; Kerlavage, A.R.; Dubnick, M.; McCombie, W.R.; Martin-Gallardo, A.; Venter, J.C.; White, O.

    1993-12-31

    Currently available software tools are capable of predicting the locations of most protein-coding genes in anonymous genomic DNA sequences. The use of predicted exons to select primers for PCR amplification from cDNA libraries allows the complete structures of novel genes to be determined efficiently. As the number of expressed sequence tag (EST) sequences increases, the fraction of genes that can be localized in genomic sequences by searching EST databases will rapidly approach unity. The challenge for automated DNA sequence analysis is now to develop methods for accurately predicting gene structure and alternative splicing patterns. Substantially improving current accuracies in gene structure prediction will require retrospective comparative analysis of sequences from different organisms and gene families.

  19. Partitioning of genetic variation between regulatory and coding gene segments: the predominance of software variation in genes encoding introvert proteins.

    PubMed

    Mitchison, A

    1997-01-01

    In considering genetic variation in eukaryotes, a fundamental distinction can be made between variation in regulatory (software) and coding (hardware) gene segments. For quantitative traits the bulk of variation, particularly that near the population mean, appears to reside in regulatory segments. The main exceptions to this rule concern proteins which handle extrinsic substances, here termed extrovert proteins. The immune system includes an unusually large proportion of this exceptional category, but even so its chief source of variation may well be polymorphism in regulatory gene segments. The main evidence for this view emerges from genome scanning for quantitative trait loci (QTL), which in the case of the immune system points to a major contribution of pro-inflammatory cytokine genes. Further support comes from sequencing of major histocompatibility complex (Mhc) class II promoters, where a high level of polymorphism has been detected. These Mhc promoters appear to act, in part at least, by gating the back-signal from T cells into antigen-presenting cells. Both these forms of polymorphism are likely to be sustained by the need for flexibility in the immune response. Future work on promoter polymorphism is likely to benefit from the input from genome informatics. PMID:9148788

  20. Brain-specific genes have identifier sequences in their introns.

    PubMed Central

    Milner, R J; Bloom, F E; Lai, C; Lerner, R A; Sutcliffe, J G

    1984-01-01

    The 82-nucleotide identifier (ID) sequence is present in the rat genome in 1-1.5 × 10^5 copies and in cDNA clones of precursors of brain-specific mRNAs. One brain-specific gene contains more than one ID sequence in its introns. There is an excess of ID sequences to brain genes, and some ID sequences appear to have been inserted as mobile elements into other genetic locations. Therefore, brain genes contain ID sequences in their introns, but not all ID sequences are located in brain gene introns. A brain ID consensus sequence has been obtained by comparing 8 ID nucleotide sequences. PMID:6583673

  1. Strong early seed-specific gene regulatory region

    DOEpatents

    Broun, Pierre; Somerville, Chris

    2002-01-01

    Nucleic acid sequences and methods for their use are described which provide for early seed-specific transcription, in order to modulate or modify expression of foreign or endogenous genes in seeds, particularly embryo cells. The method finds particular use in conjunction with modifying fatty acid production in seed tissue.

  2. Strong early seed-specific gene regulatory region

    DOEpatents

    Broun, Pierre; Somerville, Chris

    1999-01-01

    Nucleic acid sequences and methods for their use are described which provide for early seed-specific transcription, in order to modulate or modify expression of foreign or endogenous genes in seeds, particularly embryo cells. The method finds particular use in conjunction with modifying fatty acid production in seed tissue.

  3. Predictive modelling of gene expression from transcriptional regulatory elements.

    PubMed

    Budden, David M; Hurley, Daniel G; Crampin, Edmund J

    2015-07-01

    Predictive modelling of gene expression provides a powerful framework for exploring the regulatory logic underpinning transcriptional regulation. Recent studies have demonstrated the utility of such models in identifying dysregulation of gene and miRNA expression associated with abnormal patterns of transcription factor (TF) binding or nucleosomal histone modifications (HMs). Despite the growing popularity of such approaches, a comparative review of the various modelling algorithms and feature extraction methods is lacking. We define and compare three methods of quantifying pairwise gene-TF/HM interactions and discuss their suitability for integrating the heterogeneous chromatin immunoprecipitation (ChIP)-seq binding patterns exhibited by TFs and HMs. We then construct log-linear and ϵ-support vector regression models from various mouse embryonic stem cell (mESC) and human lymphoblastoid (GM12878) data sets, considering both ChIP-seq- and position weight matrix- (PWM)-derived in silico TF-binding. The two algorithms are evaluated both in terms of their modelling prediction accuracy and ability to identify the established regulatory roles of individual TFs and HMs. Our results demonstrate that TF-binding and HMs are highly predictive of gene expression as measured by mRNA transcript abundance, irrespective of algorithm or cell type selection and considering both ChIP-seq and PWM-derived TF-binding. As we encourage other researchers to explore and develop these results, our framework is implemented using open-source software and made available as a preconfigured bootable virtual environment. PMID:25231769

  4. Proximal and distal sequences control UV cone pigment gene expression in transgenic zebrafish.

    PubMed

    Luo, Wenqin; Williams, John; Smallwood, Philip M; Touchman, Jeffrey W; Roman, Laura M; Nathans, Jeremy

    2004-04-30

    The molecular basis of cone photoreceptor-specific gene expression is largely unknown. In this study, we define cis-acting DNA sequences that control the cell type-specific expression of the zebrafish UV cone pigment gene by transient expression of green fluorescent protein transgenes following their injection into zebrafish embryos. These experiments show that 4.8 kb of 5'-flanking sequences from the zebrafish UV pigment gene direct expression specifically to UV cones and that this activity requires both distal and proximal sequences. In addition, we demonstrate that a proximal region located between -215 and -110 bp (with respect to the initiator methionine codon) can function in the context of a zebrafish rhodopsin promoter to convert its specificity from rod-only expression to rod and UV cone expression. These experiments demonstrate the power of transient transgenesis in zebrafish to efficiently define cis-acting regulatory sequences in an intact vertebrate. PMID:14966125

  5. Imogene: identification of motifs and cis-regulatory modules underlying gene co-regulation

    PubMed Central

    Rouault, Hervé; Santolini, Marc; Schweisguth, François; Hakim, Vincent

    2014-01-01

    Cis-regulatory modules (CRMs) and motifs play a central role in tissue and condition-specific gene expression. Here we present Imogene, an ensemble of statistical tools that we have developed to facilitate their identification and implemented in publicly available software. Starting from a small training set of mammalian or fly CRMs that drive similar gene expression profiles, Imogene determines de novo cis-regulatory motifs that underlie this co-expression. It can then predict on a genome-wide scale other CRMs with a regulatory potential similar to the training set. Imogene bypasses the need for large datasets for statistical analyses by making central use of the information provided by the sequenced genomes of multiple species, based on the developed statistical tools and explicit models for transcription factor binding site evolution. We test Imogene on characterized tissue-specific mouse developmental CRMs. Its ability to identify CRMs with the same specificity based on its de novo created motifs is comparable to that of previously evaluated ‘motif-blind’ methods. We further show, both in flies and in mammals, that Imogene de novo generated motifs are sufficient to discriminate CRMs related to different developmental programs. Notably, purely relying on sequence data, Imogene performs as well in this discrimination task as a previously reported learning algorithm based on Chromatin Immunoprecipitation (ChIP) data for multiple transcription factors at multiple developmental stages. PMID:24682824

  6. An ant colony optimization based algorithm for identifying gene regulatory elements.

    PubMed

    Liu, Wei; Chen, Hanwu; Chen, Ling

    2013-08-01

    It is one of the most important tasks in bioinformatics to identify the regulatory elements in gene sequences. Most of the existing algorithms for identifying regulatory elements are inclined to converge into a local optimum, and have high time complexity. Ant Colony Optimization (ACO) is a meta-heuristic method based on swarm intelligence and is derived from a model inspired by the collective foraging behavior of real ants. Taking advantage of the ACO in traits such as self-organization and robustness, this paper designs and implements an ACO based algorithm named ACRI (ant-colony-regulatory-identification) for identifying all possible binding sites of transcription factor from the upstream of co-expressed genes. To accelerate the ants' searching process, a strategy of local optimization is presented to adjust the ants' start positions on the searched sequences. By exploiting the powerful optimization ability of ACO, the algorithm ACRI can not only improve precision of the results, but also achieve a very high speed. Experimental results on real world datasets show that ACRI can outperform other traditional algorithms in the respects of speed and quality of solutions. PMID:23746735

  7. Noise Control in Gene Regulatory Networks with Negative Feedback.

    PubMed

    Hinczewski, Michael; Thirumalai, D

    2016-07-01

    Genes and proteins regulate cellular functions through complex circuits of biochemical reactions. Fluctuations in the components of these regulatory networks result in noise that invariably corrupts the signal, possibly compromising function. Here, we create a practical formalism based on ideas introduced by Wiener and Kolmogorov (WK) for filtering noise in engineered communications systems to quantitatively assess the extent to which noise can be controlled in biological processes involving negative feedback. Application of the theory, which reproduces the previously proven scaling of the lower bound for noise suppression in terms of the number of signaling events, shows that a tetracycline repressor-based negative-regulatory gene circuit behaves as a WK filter. For the class of Hill-like nonlinear regulatory functions, this type of filter provides the optimal reduction in noise. Our theoretical approach can be readily combined with experimental measurements of response functions in a wide variety of genetic circuits, to elucidate the general principles by which biological networks minimize noise. PMID:27095600

  8. Propagation of genetic variation in gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Plahte, Erik; Gjuvsland, Arne B.; Omholt, Stig W.

    2013-08-01

    A future quantitative genetics theory should link genetic variation to phenotypic variation in a causally cohesive way based on how genes actually work and interact. We provide a theoretical framework for predicting and understanding the manifestation of genetic variation in haploid and diploid regulatory networks with arbitrary feedback structures and intra-locus and inter-locus functional dependencies. Using results from network and graph theory, we define propagation functions describing how genetic variation in a locus is propagated through the network, and show how their derivatives are related to the network’s feedback structure. Similarly, feedback functions describe the effect of genotypic variation of a locus on itself, either directly or mediated by the network. A simple sign rule relates the sign of the derivative of the feedback function of any locus to the feedback loops involving that particular locus. We show that the sign of the phenotypically manifested interaction between alleles at a diploid locus is equal to the sign of the dominant feedback loop involving that particular locus, in accordance with recent results for a single locus system. Our results provide tools by which one can use observable equilibrium concentrations of gene products to disclose structural properties of the network architecture. Our work is a step towards a theory capable of explaining the pleiotropy and epistasis features of genetic variation in complex regulatory networks as functions of regulatory anatomy and functional location of the genetic variation.

  9. Detection of Weakly Conserved Ancestral Mammalian Regulatory Sequences by Primate Comparisons

    SciTech Connect

    Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng, Jan-Fang; Rubin, Edward M.; Boffelli, Dario

    2006-06-01

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect cryptic functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  10. From genes to shape: regulatory interactions in leaf development.

    PubMed

    Barkoulas, Michalis; Galinha, Carla; Grigg, Stephen P; Tsiantis, Miltos

    2007-12-01

    In the past two years novel connections were described between auxin function and transcription factor patterning systems involved in both leaf initiation and elaboration of leaf axial patterning. A cascade of small RNA-based regulatory steps was suggested to facilitate delimitation of cell types comprising the upper versus lower parts of the leaf. Developmental regulation of cellular growth emerged as a crucial component in regulation of leaf form with TCP and CUC2 transcription factors playing a key role in this process. Finally, cis-regulatory evolution of developmental genes emerged as a process that likely contributed to diversification of leaf form, while studies in seedless land plants have begun to elucidate the ancestral and derived aspects of leaf development pathways. PMID:17869569

  11. Stable intronic sequence RNAs have possible regulatory roles in Drosophila melanogaster

    PubMed Central

    Osman, Ismail; Tay, Mandy Li-Ian; Zheng, Ruther Teo

    2015-01-01

    Stable intronic sequence RNAs (sisRNAs) have been found in Xenopus tropicalis, human cell lines, and Epstein-Barr virus; however, the biological significance of sisRNAs remains poorly understood. We identify sisRNAs in Drosophila melanogaster by deep sequencing, reverse transcription polymerase chain reaction, and Northern blotting. We characterize a sisRNA (sisR-1) from the regena (rga) locus and show that it can be processed from the precursor messenger RNA (pre-mRNA). We also document a cis-natural antisense transcript (ASTR) from the rga locus, which is highly expressed in early embryos. During embryogenesis, ASTR promotes robust rga pre-mRNA expression. Interestingly, sisR-1 represses ASTR, with consequential effects on rga pre-mRNA expression. Our results suggest a model in which sisR-1 modulates its host gene expression by repressing ASTR during embryogenesis. We propose that sisR-1 belongs to a class of sisRNAs with probable regulatory activities in Drosophila. PMID:26504165

  12. Colorectal cancer risk genes are functionally enriched in regulatory pathways

    PubMed Central

    Lu, Xi; Cao, Mingming; Han, Su; Yang, Youlin; Zhou, Jin

    2016-01-01

    Colorectal cancer (CRC) is a common complex disease caused by the combination of genetic variants and environmental factors. Genome-wide association studies (GWAS) have been performed and reported some novel CRC susceptibility variants. However, the potential genetic mechanisms for newly identified CRC susceptibility variants are still unclear. Here, we selected 85 CRC susceptibility variants with suggestive association P < 1.00E-05 from the National Human Genome Research Institute GWAS catalog. To investigate the underlying genetic pathways where these newly identified CRC susceptibility genes are significantly enriched, we conducted a functional annotation. Using two kinds of SNP to gene mapping methods including the nearest upstream and downstream gene method and the ProxyGeneLD, we got 128 unique CRC susceptibility genes. We then conducted a pathway analysis in GO database using the corresponding 128 genes. We identified 44 GO categories, 17 of which are regulatory pathways. We believe that our results may provide further insight into the underlying genetic mechanisms for these newly identified CRC susceptibility variants. PMID:27146020
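
    Pathway enrichment of a gene list is commonly scored with a one-sided hypergeometric test, sketched below. This is an illustration of the general technique only: the abstract does not say which statistic the authors used, and the function name `hypergeom_enrichment_p` and all numbers in the usage note are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population: number of genes in the background (e.g. all annotated genes)
    annotated:  number of background genes in the GO category
    drawn:      size of the gene list being tested
    overlap:    number of list genes that fall in the category
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    # math.comb returns 0 when k > n, so impossible terms drop out.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(max(overlap, 0), upper + 1)
    )
    return tail / total
```

    For a list of 128 genes such as the one described above, a call might look like `hypergeom_enrichment_p(20000, 300, 128, 8)`, where the background size, category size, and overlap are made-up values. Testing many GO categories would also call for a multiple-testing correction.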

  13. Reverse Engineering of Genome-wide Gene Regulatory Networks from Gene Expression Data

    PubMed Central

    Liu, Zhi-Ping

    2015-01-01

    Transcriptional regulation plays vital roles in many fundamental biological processes. Reverse engineering of genome-wide regulatory networks from high-throughput transcriptomic data provides a promising way to characterize the global scenario of regulatory relationships between regulators and their targets. In this review, we summarize and categorize the main frameworks and methods currently available for inferring transcriptional regulatory networks from microarray gene expression profiling data. We give an overview of each strategy and introduce representative methods for each. We also clarify and comment on their assumptions, advantages, shortcomings, and possible improvements and extensions. PMID:25937810

  14. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    PubMed

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes as more and more gene expression and genomic sequence data become available. Knowledge of the specific cis-sequences, and of their enrichment and arrangement within promoters, facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter presents an example of the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied for the identification of cis-sequences in any set of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and to install and run a software package to identify overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771

  15. Evolutionary and Topological Properties of Genes and Community Structures in Human Gene Regulatory Networks.

    PubMed

    Szedlak, Anthony; Smith, Nicholas; Liu, Li; Paternostro, Giovanni; Piermarocchi, Carlo

    2016-06-01

    The diverse, specialized genes present in today's lifeforms evolved from a common core of ancient, elementary genes. However, these genes did not evolve individually: gene expression is controlled by a complex network of interactions, and alterations in one gene may drive reciprocal changes in its proteins' binding partners. Like many complex networks, these gene regulatory networks (GRNs) are composed of communities, or clusters of genes with relatively high connectivity. A deep understanding of the relationship between the evolutionary history of single genes and the topological properties of the underlying GRN is integral to evolutionary genetics. Here, we show that the topological properties of an acute myeloid leukemia GRN and a general human GRN are strongly coupled with its genes' evolutionary properties. Slowly evolving ("cold"), old genes tend to interact with each other, as do rapidly evolving ("hot"), young genes. This naturally causes genes to segregate into community structures with relatively homogeneous evolutionary histories. We argue that gene duplication placed old, cold genes and communities at the center of the networks, and young, hot genes and communities at the periphery. We demonstrate this with single-node centrality measures and two new measures of efficiency, the set efficiency and the interset efficiency. We conclude that these methods for studying the relationships between a GRN's community structures and its genes' evolutionary properties provide new perspectives for understanding evolutionary genetics. PMID:27359334

  16. Using shotgun sequence data to find active restriction enzyme genes.

    PubMed

    Zheng, Yu; Posfai, Janos; Morgan, Richard D; Vincze, Tamas; Roberts, Richard J

    2009-01-01

    Whole genome shotgun sequence analysis has become the standard method for beginning to determine a genome sequence. The preparation of the shotgun sequence clones is, in fact, a biological experiment. It determines which segments of the genome can be cloned into Escherichia coli and which cannot. By analyzing the complete set of sequences from such an experiment, it is possible to identify genes lethal to E. coli. Among this set are genes encoding restriction enzymes which, when active in E. coli, lead to cell death by cleaving the E. coli genome at the restriction enzyme recognition sites. By analyzing shotgun sequence data sets we show that this is a reliable method to detect active restriction enzyme genes in newly sequenced genomes, thereby facilitating functional annotation. Active restriction enzyme genes have been identified, and their activity demonstrated biochemically, in the sequenced genomes of Methanocaldococcus jannaschii, Bacillus cereus ATCC 10987 and Methylococcus capsulatus. PMID:18988632

  17. Inferring Gene Regulatory Networks Using Conditional Regulation Pattern to Guide Candidate Genes

    PubMed Central

    Xiao, Fei; Gao, Lin; Ye, Yusen; Hu, Yuxuan; He, Ruijie

    2016-01-01

    Path consistency (PC) algorithms combined with conditional mutual information (CMI) are widely used in the reconstruction of gene regulatory networks. CMI has many advantages over the Pearson correlation coefficient in measuring non-linear dependence to infer gene regulatory networks. It can also discriminate direct regulations from indirect ones. However, it is still a challenge to select the conditional genes in an optimal way, which affects the performance and computational complexity of the PC algorithm. In this study, we develop a novel conditional mutual information-based algorithm, namely RPNI (Regulation Pattern based Network Inference), to infer gene regulatory networks. For conditional gene selection, we define the co-regulation pattern, indirect-regulation pattern and mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. To demonstrate the potential of our algorithm, we apply it to gene expression data from the DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in both accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noisy interference analysis using different types of noise. PMID:27171286
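
    How CMI separates direct from indirect regulation can be shown with a minimal sketch. This is not the RPNI algorithm, whose details the abstract does not give; it is a plain plug-in estimate of I(X;Y|Z) from counts, assuming expression values have already been discretized into levels. The function name and the toy vectors are hypothetical.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in bits for equal-length discrete sequences.

    Uses I(X;Y|Z) = sum over (x,y,z) of p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) ),
    with every probability replaced by its empirical frequency.
    """
    n = len(x)
    if n == 0 or len(y) != n or len(z) != n:
        raise ValueError("x, y and z must be non-empty and of equal length")

    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)

    cmi = 0.0
    for (xi, yi, zi), count in n_xyz.items():
        # The sample size cancels inside the logarithm, leaving a ratio of counts.
        ratio = count * n_z[zi] / (n_xz[(xi, zi)] * n_yz[(yi, zi)])
        cmi += (count / n) * log2(ratio)
    return cmi


if __name__ == "__main__":
    # Toy case: gene Z drives both X and Y, so X and Y are only indirectly linked.
    z = [0, 1, 0, 1, 0, 1, 0, 1]
    x = list(z)
    y = list(z)
    constant = [0] * len(z)  # conditioning on a constant gives ordinary mutual information

    print("I(X;Y)   =", conditional_mutual_information(x, y, constant))
    print("I(X;Y|Z) =", conditional_mutual_information(x, y, z))
```

    In the toy case the dependence between X and Y disappears once Z is conditioned on, which is the signal a PC-style procedure uses to drop an indirect edge. Choosing which genes to condition on is the step the abstract identifies as the open problem.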

  18. Prediction and Validation of Gene Regulatory Elements Activated During Retinoic Acid Induced Embryonic Stem Cell Differentiation.

    PubMed

    Simandi, Zoltan; Horvath, Attila; Nagy, Peter; Nagy, Laszlo

    2016-01-01

    Embryonic development is a multistep process involving activation and repression of many genes. Enhancer elements in the genome are known to contribute to tissue and cell-type specific regulation of gene expression during cellular differentiation. Thus, their identification and further investigation are important in order to understand how cell fate is determined. Integration of gene expression data (e.g., microarray or RNA-seq) and results of chromatin immunoprecipitation (ChIP)-based genome-wide studies (ChIP-seq) allows large-scale identification of these regulatory regions. However, functional validation of cell-type specific enhancers requires further in vitro and in vivo experimental procedures. Here we describe how active enhancers can be identified and validated experimentally. This protocol provides a step-by-step workflow that includes: 1) identification of regulatory regions by ChIP-seq data analysis, 2) cloning and experimental validation of putative regulatory potential of the identified genomic sequences in a reporter assay, and 3) determination of enhancer activity in vivo by measuring enhancer RNA transcript level. The presented protocol is detailed enough to help anyone set up this workflow in the lab. Importantly, the protocol can be easily adapted to and used in any cellular model system. PMID:27403939

  19. Nucleotide sequence of the gene for human prothrombin

    SciTech Connect

    Degen, S.J.F.; Davie, E.W.

    1987-09-22

    A human genomic DNA library was screened for the gene coding for human prothrombin with a cDNA coding for the human protein. Eighty-one positive lambda phage were identified, and three were chosen for further characterization. These three phage hybridized with 5' and/or 3' probes prepared from the prothrombin cDNA. The complete DNA sequence of 21 kilobases of the human prothrombin gene was determined and included a 4.9-kilobase region that was previously sequenced. The gene for human prothrombin contains 14 exons separated by 13 intervening sequences. The exons range in size from 25 to 315 base pairs, while the introns range from 84 to 9447 base pairs. Ninety percent of the gene is composed of intervening sequence. All the intron splice junctions are consistent with sequences found in other eukaryotic genes, except for the presence of GC rather than GT on the 5' end of intervening sequence L. Thirty copies of Alu repetitive DNA and two copies of partial KpnI repeats were identified in clusters within several of the intervening sequences, and these repeats represent 40% of the DNA sequence of the gene. The size, distribution, and sequence homology of the introns within the gene were compared to those of the genes for the other vitamin K dependent proteins and several other serine proteases.

  20. Sequence Requirements for Myosin Gene Expression and Regulation in Caenorhabditis elegans

    PubMed Central

    Okkema, P. G.; Harrison, S. W.; Plunger, V.; Aryana, A.; Fire, A.

    1993-01-01

    Four Caenorhabditis elegans genes encode muscle-type specific myosin heavy chain isoforms: myo-1 and myo-2 are expressed in the pharyngeal muscles; unc-54 and myo-3 are expressed in body wall muscles. We have used transformation-rescue and lacZ fusion assays to determine sequence requirements for regulated myosin gene expression during development. Multiple tissue-specific activation elements are present for all four genes. For each of the four genes, sequences upstream of the coding region are tissue-specific promoters, as shown by their ability to drive expression of a reporter gene (lacZ) in the appropriate muscle type. Each gene contains at least one additional tissue-specific regulatory element, as defined by the ability to enhance expression of a heterologous promoter in the appropriate muscle type. In rescue experiments with unc-54, two further requirements apparently independent of tissue specificity were found: sequences within the 3' non-coding region are essential for activity while an intron near the 5' end augments expression levels. The general intron stimulation is apparently independent of intron sequence, indicating a mechanistic effect of splicing. To further characterize the myosin gene promoters and to examine the types of enhancer sequences in the genome, we have initiated a screen of C. elegans genomic DNA for fragments capable of enhancing the myo-2 promoter. The properties of enhancers recovered from this screen suggest that the promoter is limited to muscle cells in its ability to respond to enhancers. PMID:8244003

  1. Mining expressed sequence tags of rapeseed (Brassica napus L.) to predict the drought responsive regulatory network.

    PubMed

    Shamloo-Dashtpagerdi, Roohollah; Razi, Hooman; Ebrahimie, Esmaeil

    2015-07-01

    It is of great significance to understand the regulatory mechanisms by which plants deal with drought stress. Two EST libraries derived from rapeseed (Brassica napus) leaves in non-stressed and drought stress conditions were analyzed in order to obtain the transcriptomic landscape of drought-exposed B. napus plants, and also to identify and characterize significant drought responsive regulatory genes and microRNAs. The functional ontology analysis revealed a substantial shift in the B. napus transcriptome to govern cellular drought responsiveness via different stress-activated mechanisms. The activity of transcription factor and protein kinase modules generally increased in response to drought stress. Twenty-six regulatory genes, comprising 17 transcription factor genes, eight protein kinase genes and one protein phosphatase gene, were identified as showing significant alterations in their expression in response to drought stress. We also found six microRNAs that were differentially expressed during drought stress, supporting the involvement of a post-transcriptional level of regulation in the B. napus drought response. The drought responsive regulatory network shed light on the significance of some regulatory components involved in biosynthesis and signaling of various plant hormones (abscisic acid, auxin and brassinosteroids), the ubiquitin proteasome system, and signaling through Reactive Oxygen Species (ROS). Our findings suggested a complex and multi-level regulatory system modulating response to drought stress in B. napus. PMID:26261397

  2. Identification of a DNA methylation-dependent activator sequence in the pseudoxanthoma elasticum gene, ABCC6.

    PubMed

    Arányi, Tamás; Ratajewski, Marcin; Bardóczy, Viola; Pulaski, Lukasz; Bors, András; Tordai, Attila; Váradi, András

    2005-05-13

    ABCC6 encodes MRP6, a member of the ABC protein family with an unknown physiological role. The human ABCC6 and its two pseudogenes share 99% identical DNA sequence. Loss-of-function mutations of ABCC6 are associated with the development of pseudoxanthoma elasticum (PXE), a recessive hereditary disorder affecting the elastic tissues. Various disease-causing mutations were found in the coding region; however, the mutation detection rate in the ABCC6 coding region of bona fide PXE patients is only approximately 80%. This suggests that polymorphisms or mutations in the regulatory regions may contribute to the development of the disease. Here, we report the first characterization of the ABCC6 gene promoter. Phylogenetic in silico analysis of the 5' regulatory regions revealed the presence of two evolutionarily conserved sequence elements embedded in CpG islands. The study of DNA methylation of ABCC6 and the pseudogenes identified a correlation between the methylation of the CpG island in the proximal promoter and the ABCC6 expression level in cell lines. Both activator and repressor sequences were uncovered in the proximal promoter by reporter gene assays. The most potent activator sequence was one of the conserved elements protected by DNA methylation on the endogenous gene in non-expressing cells. Finally, in vitro methylation of this sequence inhibits the transcriptional activity of the luciferase promoter constructs. Altogether these results identify a DNA methylation-dependent activator sequence in the ABCC6 promoter. PMID:15760889

  3. The human actin-regulatory protein Cap G: Gene structure and chromosome location

    SciTech Connect

    Mishra, V.S.; Southwick, F.S.; Henske, E.P.; Kwiatkowski, D.J.

    1994-10-01

    Cap G (formerly called macrophage capping protein or gCap39) is a member of the gelsolin/villin family of actin-regulatory proteins. Unlike all other members of this family, Cap G caps the barbed ends of actin filaments, but does not sever them. This protein is half the molecular weight and contains half the number of repeat subunits (3 vs. 6) of gelsolin and villin, suggesting that these two proteins may have arisen by gene duplication of the Cap G gene. To investigate this possibility we have cloned and sequenced the human Cap G gene (gene symbol CAPG). The gene is 16.6 kb in size, contains 10 exons and 9 introns, and is located on the proximal short arm of chromosome 2. The open reading frame is 6.9 kb, having 9 exons and 8 introns. This region contains 3 splice sites that are nearly identical to the human gelsolin gene, but shares only one with villin, indicating that CAPG is more closely related to gelsolin. Further comparisons of these three genes, however, indicate that the evolutionary steps resulting in human gelsolin and villin are likely to have been more complex than a simple tandem duplication of the Cap G gene. 30 refs., 4 figs., 2 tabs.

  4. Redeployment of a conserved gene regulatory network during Aedes aegypti development.

    PubMed

    Suryamohan, Kushal; Hanson, Casey; Andrews, Emily; Sinha, Saurabh; Scheel, Molly Duman; Halfon, Marc S

    2016-08-15

    Changes in gene regulatory networks (GRNs) underlie the evolution of morphological novelty and developmental system drift. The fruitfly Drosophila melanogaster and the dengue and Zika vector mosquito Aedes aegypti have substantially similar nervous system morphology. Nevertheless, they show significant divergence in a set of genes co-expressed in the midline of the Drosophila central nervous system, including the master regulator single minded and downstream genes including short gastrulation, Star, and NetrinA. In contrast to Drosophila, we find that midline expression of these genes is either absent or severely diminished in A. aegypti. Instead, they are co-expressed in the lateral nervous system. This suggests that in A. aegypti this "midline GRN" has been redeployed to a new location while lost from its previous site of activity. In order to characterize the relevant GRNs, we employed the SCRMshaw method we previously developed to identify transcriptional cis-regulatory modules in both species. Analysis of these regulatory sequences in transgenic Drosophila suggests that the altered gene expression observed in A. aegypti is the result of trans-dependent redeployment of the GRN, potentially stemming from cis-mediated changes in the expression of sim and other as-yet unidentified regulators. Our results illustrate a novel "repeal, replace, and redeploy" mode of evolution in which a conserved GRN acquires a different function at a new site while its original function is co-opted by a different GRN. This represents a striking example of developmental system drift in which the dramatic shift in gene expression does not result in gross morphological changes, but in more subtle differences in development and function of the late embryonic nervous system. PMID:27341759

  5. Prediction of Regulatory Interactions from Genome Sequences Using a Biophysical Model for the Arabidopsis LEAFY Transcription Factor

    PubMed Central

    Moyroud, Edwige; Minguet, Eugenio Gómez; Ott, Felix; Yant, Levi; Posé, David; Monniaux, Marie; Blanchet, Sandrine; Bastien, Olivier; Thévenon, Emmanuel; Weigel, Detlef; Schmid, Markus; Parcy, François

    2011-01-01

    Despite great advances in sequencing technologies, generating functional information for nonmodel organisms remains a challenge. One solution lies in an improved ability to predict genetic circuits based on primary DNA sequence in combination with detailed knowledge of regulatory proteins that have been characterized in model species. Here, we focus on the LEAFY (LFY) transcription factor, a conserved master regulator of floral development. Starting with biochemical and structural information, we built a biophysical model describing LFY DNA binding specificity in vitro that accurately predicts in vivo LFY binding sites in the Arabidopsis thaliana genome. Applying the model to other plant species, we could follow the evolution of the regulatory relationship between LFY and the AGAMOUS (AG) subfamily of MADS box genes and show that this link predates the divergence between monocots and eudicots. Remarkably, our model succeeds in detecting the connection between LFY and AG homologs despite extensive variation in binding sites. This demonstrates that the cis-element fluidity recently observed in animals also exists in plants, but the challenges it poses can be overcome with predictions grounded in a biophysical model. Therefore, our work opens new avenues to deduce the structure of regulatory networks from mere inspection of genomic sequences. PMID:21515819

  6. Ensemble Inference and Inferability of Gene Regulatory Networks

    PubMed Central

    Ud-Dean, S. M. Minhaz; Gunawan, Rudiyanto

    2014-01-01

    The inference of gene regulatory networks (GRNs) from gene expression data is an unsolved problem of great importance. This inference has been stated, though not proven, to be underdetermined, implying that there could be many equivalent (indistinguishable) solutions. Motivated by this fundamental limitation, we have developed a new framework and algorithm, called TRaCE, for the ensemble inference of GRNs. The ensemble corresponds to the inherent uncertainty associated with discriminating direct and indirect gene regulations from steady-state data of gene knock-out (KO) experiments. We applied TRaCE to analyze the inferability of random GRNs and the GRNs of E. coli and yeast from single- and double-gene KO experiments. The results showed that, with the exception of networks with very few edges, GRNs are typically not inferable even when the data are ideal (unbiased and noise-free). Finally, we compared the performance of TRaCE with top-performing methods from the DREAM4 in silico network inference challenge. PMID:25093509

  7. Diverse Gene Expression in Human Regulatory T Cell Subsets Uncovers Connection between Regulatory T Cell Genes and Suppressive Function.

    PubMed

    Hua, Jing; Davis, Scott P; Hill, Jonathan A; Yamagata, Tetsuya

    2015-10-15

    Regulatory T (Treg) cells have a critical role in the control of immunity, and their diverse subpopulations may allow adaptation to different types of immune responses. In this study, we analyzed human Treg cell subpopulations in the peripheral blood by performing genome-wide expression profiling of 40 Treg cell subsets from healthy donors. We found that the human peripheral blood Treg cell population is composed of five major genomic subgroups, represented by 16 tractable subsets with a particular cell surface phenotype. These subsets possess a range of suppressive function and cytokine secretion and can exert a genomic footprint on target effector T (Teff) cells. Correlation analysis of variability in gene expression in the subsets identified several cell surface molecules associated with Treg suppressive function, and pharmacological interrogation revealed a set of genes having a causative effect. The five genomic subgroups of Treg cells imposed a preserved pattern of gene expression on Teff cells, with a varying degree of genes being suppressed or induced. Notably, there was a cluster of genes induced by Treg cells that bolstered an autoinhibitory effect in Teff cells, and this induction appears to be governed by a different set of genes than the ones involved in counteracting Teff activation. Our work shows an example of exploiting the diversity within human Treg cell subpopulations to dissect Treg cell biology. PMID:26371251

  8. Transcriptomic Sequencing Reveals a Set of Unique Genes Activated by Butyrate-Induced Histone Modification.

    PubMed

    Li, Cong-Jun; Li, Robert W; Baldwin, Ransom L; Blomberg, Le Ann; Wu, Sitao; Li, Weizhong

    2016-01-01

    Butyrate is a nutritional element with strong epigenetic regulatory activity as a histone deacetylase inhibitor. Based on the analysis of differentially expressed genes in bovine epithelial cells using RNA sequencing technology, a set of unique genes that are activated only after butyrate treatment was revealed. A complementary bioinformatics analysis of the functional category, pathway, and integrated network, using Ingenuity Pathways Analysis, indicated that these genes activated by butyrate treatment are related to major cellular functions, including cell morphological changes, cell cycle arrest, and apoptosis. Our results offered insight into the butyrate-induced transcriptomic changes and will accelerate our understanding of the molecular fundamentals of epigenomic regulation. PMID:26819550

  9. Transcriptomic Sequencing Reveals a Set of Unique Genes Activated by Butyrate-Induced Histone Modification

    PubMed Central

    Li, Cong-Jun; Li, Robert W.; Baldwin, Ransom L.; Blomberg, Le Ann; Wu, Sitao; Li, Weizhong

    2016-01-01

    Butyrate is a nutritional element with strong epigenetic regulatory activity as a histone deacetylase inhibitor. Based on the analysis of differentially expressed genes in bovine epithelial cells using RNA sequencing technology, a set of unique genes that are activated only after butyrate treatment was revealed. A complementary bioinformatics analysis of the functional category, pathway, and integrated network, using Ingenuity Pathways Analysis, indicated that these genes activated by butyrate treatment are related to major cellular functions, including cell morphological changes, cell cycle arrest, and apoptosis. Our results offered insight into the butyrate-induced transcriptomic changes and will accelerate our understanding of the molecular fundamentals of epigenomic regulation. PMID:26819550

  10. Recognition of Yeast Species from Gene Sequence Comparisons

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This review discusses recognition of yeast species from gene sequence comparisons, which have been responsible for doubling the number of known yeasts over the past decade. The resolution provided by various single gene sequences is examined for both ascomycetous and basidiomycetous species, and th...