Science.gov

Sample records for gene sequences regulatory

  1. Modeling DNA sequence-based cis-regulatory gene networks.

    PubMed

    Bolouri, Hamid; Davidson, Eric H

    2002-06-01

    Gene network analysis requires computationally based models which represent the functional architecture of regulatory interactions, and which provide directly testable predictions. The type of model that is useful is constrained by the particular features of developmentally active cis-regulatory systems. These systems function by processing diverse regulatory inputs, generating novel regulatory outputs. A computational model which explicitly accommodates this basic concept was developed earlier for the cis-regulatory system of the endo16 gene of the sea urchin. This model represents the genetically mandated logic functions that the system executes, but also shows how time-varying kinetic inputs are processed in different circumstances into particular kinetic outputs. The same basic design features can be utilized to construct models that connect the large number of cis-regulatory elements constituting developmental gene networks. The ultimate aim of the network models discussed here is to represent the regulatory relationships among the genomic control systems of the genes in the network, and to state their functional meaning. The target site sequences of the cis-regulatory elements of these genes constitute the physical basis of the network architecture. Useful models for developmental regulatory networks must represent the genetic logic by which the system operates, but must also be capable of explaining the real time dynamics of cis-regulatory response as kinetic input and output data become available. Most importantly, however, such models must display in a direct and transparent manner fundamental network design features such as intra- and intercellular feedback circuitry; the sources of parallel inputs into each cis-regulatory element; gene battery organization; and use of repressive spatial inputs in specification and boundary formation. Successful network models lead to direct tests of key architectural features by targeted cis-regulatory analysis. PMID

  2. Identification of potential regulatory motifs in odorant receptor genes by analysis of promoter sequences

    PubMed Central

    Michaloski, Jussara S.; Galante, Pedro A.F.

    2006-01-01

    Mouse odorant receptors (ORs) are encoded by >1000 genes dispersed throughout the genome. Each olfactory neuron expresses one single OR gene, while the rest of the genes remain silent. The mechanisms underlying OR gene expression are poorly understood. Here, we investigated if OR genes share common cis-regulatory sequences in their promoter regions. We carried out a comprehensive analysis in which the upstream regions of a large number of OR genes were compared. First, using RLM-RACE, we generated cDNAs containing the complete 5′-untranslated regions (5′-UTRs) for a total number of 198 mouse OR genes. Then, we aligned these cDNA sequences to the mouse genome so that the 5′ structure and transcription start sites (TSSs) of the OR genes could be precisely determined. Sequences upstream of the TSSs were retrieved and browsed for common elements. We found DNA sequence motifs that are overrepresented in the promoter regions of the OR genes. Most motifs resemble O/E-like sites and are preferentially localized within 200 bp upstream of the TSSs. Finally, we show that these motifs specifically interact with proteins extracted from nuclei prepared from the olfactory epithelium, but not from brain or liver. Our results show that the OR genes share common promoter elements. The present strategy should provide information on the role played by cis-regulatory sequences in OR gene regulation. PMID:16902085

  3. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity.

    PubMed

    Petrovski, Slavé; Gussow, Ayal B; Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H; Allen, Andrew S; Goldstein, David B

    2015-09-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene's proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene's regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen's Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance, nc
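
The abstract above describes ncRVIS only as a comparison of observed and predicted levels of standing variation. One way to picture that comparison is a residual-based score. The sketch below is a minimal illustration, not the authors' implementation: it assumes a one-covariate ordinary least-squares fit and standardized residuals, and the gene names, counts, and the function name `residual_intolerance_scores` are all hypothetical.

```python
from statistics import mean, pstdev

def residual_intolerance_scores(predictor, observed):
    """Fit observed ~ predictor by ordinary least squares (one covariate)
    and return each residual divided by the residuals' standard deviation.
    Assumes the fit is not perfect (non-zero residual spread)."""
    x_bar, y_bar = mean(predictor), mean(observed)
    sxx = sum((x - x_bar) ** 2 for x in predictor)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(predictor, observed))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(predictor, observed)]
    scale = pstdev(residuals)
    return [r / scale for r in residuals]

# Hypothetical per-gene counts in a proximal regulatory region (illustration only).
genes = ["geneA", "geneB", "geneC", "geneD"]
total_variants = [10, 20, 30, 40]   # covariate used to predict variation
common_variants = [2, 3, 9, 8]      # observed standing variation

scores = dict(zip(genes, residual_intolerance_scores(total_variants, common_variants)))
most_intolerant = min(scores, key=scores.get)
```

Under this convention a negative score means a region carries less variation than the fit predicts, so the lowest-scoring gene is read as the most intolerant.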

  4. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity

    PubMed Central

    Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H.; Allen, Andrew S.; Goldstein, David B.

    2015-01-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene’s proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene’s regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen’s Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance

  5. Phylogenetic Relationships and the Evolution of Regulatory Gene Sequences in the Parrotfishes

    PubMed Central

    Smith, Lydia L.; Fessler, Jennifer L.; Alfaro, Michael E.; Streelman, J. Todd; Westneat, Mark W.

    2008-01-01

    Regulatory genes control the expression of other genes and are key components of developmental processes such as segmentation and embryonic construction of the skull in vertebrates. Here we examine the variability and evolution of three vertebrate regulatory genes, addressing issues of their utility for phylogenetics and comparing the rates of genetic change seen in regulatory loci to the rates seen in other genes in the parrotfishes. The parrotfishes are a diverse group of colorful fishes from coral reefs and seagrasses worldwide and have been placed phylogenetically within the family Labridae. We tested phylogenetic hypotheses among the parrotfishes, with a focus on the genera Chlorurus and Scarus, by analyzing eight gene fragments for 42 parrotfishes and eight outgroup species. We sequenced mitochondrial 12S rRNA (967 bp), 16S rRNA (577 bp), and cytochrome b (477 bp). From the nuclear genome, we sequenced part of the protein-coding genes rag2 (715 bp), tmo4c4 (485 bp), and the developmental regulatory genes otx1 (672 bp), bmp4 (488 bp), and dlx2 (522 bp). Bayesian, likelihood, and parsimony analyses on the resulting 4903 bp of DNA sequence produced similar topologies that confirm the monophyly of the scarines and provide a phylogeny at the species level for portions of the genera Scarus and Chlorurus. Four major clades of Scarus were recovered, with three distributed in the Indo-Pacific and one containing Caribbean/Atlantic taxa. Molecular rates suggest a Miocene origin of the parrotfishes (22 mya) and a recent divergence of species within Scarus and Chlorurus, within the past 5 million years. Developmentally important genes made a significant contribution to phylogenetic structure, and rates of genetic evolution were high in bmp4, similar to other coding nuclear genes, but low in otx1 and the dlx2 exons. Synonymous and nonsynonymous substitution patterns in developmental regulatory genes support the hypothesis of stabilizing selection during the history of

  6. Two Lamprey Hedgehog Genes Share Non-Coding Regulatory Sequences and Expression Patterns with Gnathostome Hedgehogs

    PubMed Central

    Ekker, Marc; Hadzhiev, Yavor; Müller, Ferenc; Casane, Didier; Magdelenat, Ghislaine; Rétaux, Sylvie

    2010-01-01

    Hedgehog (Hh) genes play major roles in animal development and studies of their evolution, expression and function point to major differences among chordates. Here we focused on Hh genes in lampreys in order to characterize the evolution of Hh signalling at the emergence of vertebrates. Screening of a cosmid library of the river lamprey Lampetra fluviatilis and searching the preliminary genome assembly of the sea lamprey Petromyzon marinus indicate that lampreys have two Hh genes, named Hha and Hhb. Phylogenetic analyses suggest that Hha and Hhb are lamprey-specific paralogs closely related to Sonic/Indian Hh genes. Expression analysis indicates that Hha and Hhb are expressed in a Sonic Hh-like pattern. The two transcripts are expressed in largely overlapping but not identical domains in the lamprey embryonic brain, including a newly-described expression domain in the nasohypophyseal placode. Global alignments of genomic sequences and local alignment with known gnathostome regulatory motifs show that lamprey Hhs share conserved non-coding elements (CNE) with gnathostome Hhs albeit with sequences that have significantly diverged and dispersed. Functional assays using zebrafish embryos demonstrate gnathostome-like midline enhancer activity for CNEs contained in intron2. We conclude that lamprey Hh genes are gnathostome Shh-like in terms of expression and regulation. In addition, they show some lamprey-specific features, including duplication and structural (but not functional) changes in the intronic/regulatory sequences. PMID:20967201

  7. Cloning and nucleotide sequence of luxR, a regulatory gene controlling bioluminescence in Vibrio harveyi.

    PubMed Central

    Showalter, R E; Martin, M O; Silverman, M R

    1990-01-01

    Mutagenesis with transposon mini-Mulac was used previously to identify a regulatory locus necessary for expression of bioluminescence genes, lux, in Vibrio harveyi (M. Martin, R. Showalter, and M. Silverman, J. Bacteriol. 171:2406-2414, 1989). Mutants with transposon insertions in this regulatory locus were used to construct a hybridization probe which was used in this study to detect recombinants in a cosmid library containing the homologous DNA. Recombinant cosmids with this DNA stimulated expression of the genes encoding enzymes for luminescence, i.e., the luxCDABE operon, which were positioned in trans on a compatible replicon in Escherichia coli. Transposon mutagenesis and analysis of the DNA sequence of the cloned DNA indicated that regulatory function resided in a single gene of about 0.6 kilobases named luxR. Expression of bioluminescence in V. harveyi and in the fish light-organ symbiont Vibrio fischeri is controlled by density-sensing mechanisms involving the accumulation of small signal molecules called autoinducers, but similarity of the two luminescence systems at the molecular level was not apparent in this study. The amino acid sequence of the LuxR product of V. harveyi, which indicates a structural relationship to some DNA-binding proteins, is not similar to the sequence of the protein that regulates expression of luminescence in V. fischeri. In addition, reconstitution of autoinducer-controlled luminescence in recombinant E. coli, already achieved with lux genes cloned from V. fischeri, was not accomplished with the isolation of luxR from V. harveyi, suggesting a requirement for an additional regulatory component. PMID:2160932

  8. Detecting Functional Divergence after Gene Duplication through Evolutionary Changes in Posttranslational Regulatory Sequences

    PubMed Central

    Nguyen Ba, Alex N.; Strome, Bob; Hua, Jun Jie; Desmond, Jonathan; Gagnon-Arsenault, Isabelle; Weiss, Eric L.; Landry, Christian R.; Moses, Alan M.

    2014-01-01

    Gene duplication is an important evolutionary mechanism that can result in functional divergence in paralogs due to neo-functionalization or sub-functionalization. Consistent with functional divergence after gene duplication, recent studies have shown accelerated evolution in retained paralogs. However, little is known in general about the impact of this accelerated evolution on the molecular functions of retained paralogs. For example, do new functions typically involve changes in enzymatic activities, or changes in protein regulation? Here we study the evolution of posttranslational regulation by examining the evolution of important regulatory sequences (short linear motifs) in retained duplicates created by the whole-genome duplication in budding yeast. To do so, we identified short linear motifs whose evolutionary constraint has relaxed after gene duplication with a likelihood-ratio test that can account for heterogeneity in the evolutionary process by using a non-central chi-squared null distribution. We find that short linear motifs are more likely to show changes in evolutionary constraints in retained duplicates compared to single-copy genes. We examine changes in constraints on known regulatory sequences and show that for the Rck1/Rck2, Fkh1/Fkh2, Ace2/Swi5 paralogs, they are associated with previously characterized differences in posttranslational regulation. Finally, we experimentally confirm our prediction that for the Ace2/Swi5 paralogs, Cbk1 regulated localization was lost along the lineage leading to SWI5 after gene duplication. Our analysis suggests that changes in posttranslational regulation mediated by short regulatory motifs systematically contribute to functional divergence after gene duplication. PMID:25474245

  9. A transcriptional regulatory element in the coding sequence of the human Bcl-2 gene

    PubMed Central

    Lang, Georgina; Gombert, Wendy M; Gould, Hannah J

    2005-01-01

    We investigated the protein-binding sites in a DNase I hypersensitive site associated with bcl-2 gene expression in human B cells. We mapped this hypersensitive site to the coding sequence of exon 2 of the bcl-2 gene in the bcl-2-expressing REH B-cell line. Electrophoretic mobility shift assays (EMSAs) with extracts from REH cells revealed three previously unrecognized B-Myb-binding sites in this sequence. The protein was identified as B-Myb by using a specific antibody and EMSAs. Accordingly, the levels of B-Myb and bcl-2 proteins, and of Myb EMSA activity, were correlated over a wide range of cell lines, representing different stages of B-cell development. Transfection of REH cells with antisense B-myb down-regulated EMSA activity and the level of bcl-2, and led to the apoptosis of REH cells. Transfection of the bcl-2-non-expressing RPMI 8226 cell line with a B-Myb expression vector induced B-Myb EMSA activity and the expression of bcl-2. Reporter assays indicated that the HSS8 sequence containing the three B-Myb sites may act as an enhancer when it is linked to the bcl-2 gene promoter. Interaction of B-Myb with HSS8 may enhance bcl-2 gene expression by co-operating with positive regulatory elements (e.g. previously identified B-Myb response elements) or silencing negative response elements in the bcl-2 gene promoter. PMID:15606792

  10. Regulatory sequences of Arabidopsis drive reporter gene expression in nematode feeding structures.

    PubMed Central

    Barthels, N; van der Lee, F M; Klap, J; Goddijn, O J; Karimi, M; Puzio, P; Grundler, F M; Ohl, S A; Lindsey, K; Robertson, L; Robertson, W M; Van Montagu, M; Gheysen, G; Sijmons, P C

    1997-01-01

    In the quest for plant regulatory sequences capable of driving nematode-triggered effector gene expression in feeding structures, we show that promoter tagging is a valuable tool. A large collection of transgenic Arabidopsis plants was generated. They were transformed with a beta-glucuronidase gene functioning as a promoter tag. Three T-DNA constructs, pGV1047, p delta gusBin19, and pMOG553, were used. Early responses to nematode invasion were of primary interest. Six lines exhibiting beta-glucuronidase activity in syncytia induced by the beet cyst nematode were studied. Reporter gene activation was also identified in galls induced by root knot and ectoparasitic nematodes. Time-course studies revealed that all six tags were differentially activated during the development of the feeding structure. T-DNA-flanking regions responsible for the observed responses after nematode infection were isolated and characterized for promoter activity. PMID:9437858

  11. Origins of Transcriptional Transition: Balance between Upstream and Downstream Regulatory Gene Sequences

    PubMed Central

    Sala, Adrien; Shoaib, Muhammad; Anufrieva, Olga; Mutharasu, Gnanavel; Yli-Harja, Olli

    2015-01-01

    By measuring individual mRNA production at the single-cell level, we investigated the lac promoter’s transcriptional transition during cell growth phases. In exponential phase, variation in transition rates generates two mixed phenotypes, low and high numbers of mRNAs, by modulating their burst frequency and sizes. Independent activation of the regulatory-gene sequence does not produce bimodal populations at the mRNA level, but bimodal populations are produced when the regulatory gene is activated coordinately with the upstream and downstream region promoter sequence (URS and DRS, respectively). Time-lapse microscopy of mRNAs for lac and a variant lac promoter confirms this observation. Activation of the URS/DRS elements of the promoter reveals a counterplay behavior during cell phases. The promoter transition rate coupled with cell phases determines the mRNA and transcriptional noise. We further show that bias in partitioning of RNA does not lead to phenotypic switching. Our results demonstrate that the balance between the URS and the DRS in transcriptional regulation determines population diversity. PMID:25626902

  12. A genome-wide cis-regulatory element discovery method based on promoter sequences and gene co-expression networks

    PubMed Central

    2013-01-01

    Background: Deciphering cis-regulatory networks has become an attractive yet challenging task. This paper presents a simple method for cis-regulatory network discovery which aims to avoid some of the common problems of previous approaches. Results: Using promoter sequences and gene expression profiles as input, rather than clustering the genes by the expression data, our method utilizes co-expression neighborhood information for each individual gene, thereby overcoming the disadvantages of current clustering-based models, which may miss specific information for individual genes. In addition, rather than using a motif database as an input, it implements a simple motif count table for each enumerated k-mer for each gene promoter sequence. Thus, it can be used for species where no prior knowledge of cis-regulatory motifs is available and has the potential to discover new transcription factor binding sites. Applications to Saccharomyces cerevisiae and Arabidopsis have shown that our method has good prediction accuracy and outperforms a phylogenetic footprinting approach. Furthermore, the top-ranked gene-motif regulatory clusters are evidently functionally co-regulated, and the regulatory relationships between the motifs and the enriched biological functions can often be confirmed by literature. Conclusions: Since this method is simple and gene-specific, it can be readily utilized for insufficiently studied species or flexibly used as an additional step or data source for previous transcription regulatory network discovery models. PMID:23368633
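
The "simple motif count table for each enumerated k-mer for each gene promoter sequence" mentioned in this abstract can be pictured as a gene-by-k-mer matrix of counts. The following is a minimal sketch under stated assumptions rather than the published method: it counts overlapping occurrences on the given strand only, and the gene names, sequences, and the function name `kmer_count_table` are hypothetical.

```python
from collections import Counter
from itertools import product

def kmer_count_table(promoters, k):
    """Return {gene: {kmer: count}} covering all 4**k DNA k-mers.
    Overlapping occurrences are counted; windows containing characters
    other than A, C, G, T are ignored."""
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    table = {}
    for gene, seq in promoters.items():
        seq = seq.upper()
        # Slide a window of width k along the promoter sequence.
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        table[gene] = {kmer: counts.get(kmer, 0) for kmer in all_kmers}
    return table

# Hypothetical promoter sequences (illustration only).
promoters = {"gene1": "ACGTACGT", "gene2": "TTTTGCA"}
table = kmer_count_table(promoters, 2)
```

Enumerating every k-mer up front gives each gene a row of the same width, which is what lets such a table be compared across genes without a motif database.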

  13. Sequence analysis of the myosin regulatory light chain gene of the vestimentiferan Riftia pachyptila.

    PubMed

    Ravaux, J; Hassanin, A; Deutsch, J; Gaill, F; Markmann-Mulisch, U

    2001-01-24

    We have isolated and characterized a cDNA (DNA complementary to RNA) clone (Rf69) from the vestimentiferan Riftia pachyptila. The cDNA insert consists of 1169 base pairs. The amino acid sequence deduced from the longest reading frame is 193 residues in length, and clearly characterizes it as a myosin regulatory light chain (RLC). The RLC primary structure is described in relation to its function in muscle contraction. The comparison with other RLCs suggested that Riftia myosin is probably regulated through its RLC by phosphorylation like the vertebrate smooth muscle myosins, and/or by Ca2+-binding like the mollusk myosins. Riftia RLC possesses an N-terminal extension lacking in all other species besides the earthworm Lumbricus terrestris. Amino acid sequence comparisons with a number of RLCs from vertebrates and invertebrates revealed a relatively high identity score (64%) between Riftia RLC and the homologous gene from Lumbricus. The relationships between the members of the myosin RLCs were examined by two phylogenetic methods, i.e. distance matrix and maximum parsimony. The resulting trees depict the grouping of the RLCs according to their role in myosin activity regulation. In all trees, Riftia RLC groups with RLCs that depend on Ca2+-binding for myosin activity regulation. PMID:11223252

  14. Coordinate cytokine regulatory sequences

    DOEpatents

    Frazer, Kelly A.; Rubin, Edward M.; Loots, Gabriela G.

    2005-05-10

    The present invention provides CNS sequences that regulate cytokine gene expression, expression cassettes and vectors comprising or lacking the CNS sequences, and host cells and non-human transgenic animals comprising or lacking the CNS sequences. The present invention also provides methods for identifying compounds that modulate the functions of CNS sequences as well as methods for diagnosing defects in the CNS sequences of patients.

  15. Different regulatory sequences control creatine kinase-M gene expression in directly injected skeletal and cardiac muscle.

    PubMed Central

    Vincent, C K; Gualberto, A; Patel, C V; Walsh, K

    1993-01-01

    Regulatory sequences of the M isozyme of the creatine kinase (MCK) gene have been extensively mapped in skeletal muscle, but little is known about the sequences that control cardiac-specific expression. The promoter and enhancer sequences required for MCK gene expression were assayed by the direct injection of plasmid DNA constructs into adult rat cardiac and skeletal muscle. A 700-nucleotide fragment containing the enhancer and promoter of the rabbit MCK gene activated the expression of a downstream reporter gene in both muscle tissues. Deletion of the enhancer significantly decreased expression in skeletal muscle but had no detectable effect on expression in cardiac muscle. Further deletions revealed a CArG sequence motif at position -179 within the promoter that was essential for cardiac-specific expression. The CArG element of the MCK promoter bound to the recombinant serum response factor and YY1, transcription factors which control expression from structurally similar elements in the skeletal actin and c-fos promoters. MCK-CArG-binding activities that were similar or identical to serum response factor and YY1 were also detected in extracts from adult cardiac muscle. These data suggest that the MCK gene is controlled by different regulatory programs in adult cardiac and skeletal muscle. PMID:8423791

  16. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions.

    PubMed

    Besemer, J; Lomsadze, A; Borodovsky, M

    2001-06-15

    Improving the accuracy of prediction of gene starts is one of a few remaining open problems in computer prediction of prokaryotic genes. Its difficulty is caused by the absence of relatively strong sequence patterns identifying true translation initiation sites. In the current paper we show that the accuracy of gene start prediction can be improved by combining models of protein-coding and non-coding regions and models of regulatory sites near gene start within an iterative hidden Markov model-based algorithm. The new gene prediction method, called GeneMarkS, utilizes a non-supervised training procedure and can be used for a newly sequenced prokaryotic genome with no prior knowledge of any protein or rRNA genes. The GeneMarkS implementation uses an improved version of the gene finding program GeneMark.hmm, heuristic Markov models of coding and non-coding regions and the Gibbs sampling multiple alignment program. GeneMarkS predicted precisely 83.2% of the translation starts of GenBank annotated Bacillus subtilis genes and 94.4% of translation starts in an experimentally validated set of Escherichia coli genes. We have also observed that GeneMarkS detects prokaryotic genes, in terms of identifying open reading frames containing real genes, with an accuracy matching the level of the best currently used gene detection methods. Accurate translation start prediction, in addition to the refinement of protein sequence N-terminal data, provides the benefit of precise positioning of the sequence region situated upstream of a gene start. Therefore, sequence motifs related to transcription and translation regulatory sites can be revealed and analyzed with higher precision. These motifs were shown to possess a significant variability, the functional and evolutionary connections of which are discussed. PMID:11410670

  17. Nucleotide sequence of the regulatory locus controlling expression of bacterial genes for bioluminescence.

    PubMed Central

    Engebrecht, J; Silverman, M

    1987-01-01

    Production of light by the marine bacterium Vibrio fischeri and by recombinant hosts containing cloned lux genes is controlled by the density of the culture. Density-dependent regulation of lux gene expression has been shown to require a locus consisting of the luxR and luxI genes and two closely linked divergent promoters. As part of a genetic analysis to understand the regulation of bioluminescence, we have sequenced the region of DNA containing this control circuit. Open reading frames corresponding to luxR and luxI were identified; transcription start sites were defined by S1 nuclease mapping and sequences resembling promoter elements were located. PMID:3697093

  18. The Effects of Sequence Variation on Genome-wide NRF2 Binding—New Target Genes and Regulatory SNPs

    PubMed Central

    Kuosmanen, Suvi M.; Viitala, Sari; Laitinen, Tuomo; Peräkylä, Mikael; Pölönen, Petri; Kansanen, Emilia; Leinonen, Hanna; Raju, Suresh; Wienecke-Baldacchino, Anke; Närvänen, Ale; Poso, Antti; Heinäniemi, Merja; Heikkinen, Sami; Levonen, Anna-Liisa

    2016-01-01

    Transcription factor binding specificity is crucial for proper target gene regulation. Motif discovery algorithms identify the main features of the binding patterns, but the accuracy on the lower affinity sites is often poor. Nuclear factor E2-related factor 2 (NRF2) is a ubiquitous redox-activated transcription factor having a key protective role against endogenous and exogenous oxidant and electrophile stress. Herein, we decipher the effects of sequence variation on the DNA binding sequence of NRF2, in order to identify both genome-wide binding sites for NRF2 and disease-associated regulatory SNPs (rSNPs) with drastic effects on NRF2 binding. Interactions between NRF2 and DNA were studied using molecular modelling, and NRF2 chromatin immunoprecipitation-sequence datasets together with protein binding microarray measurements were utilized to study binding sequence variation in detail. The binding model thus generated was used to identify genome-wide binding sites for NRF2, and genomic binding sites with rSNPs that have strong effects on NRF2 binding and reside on active regulatory elements in human cells. As a proof of concept, miR-126–3p and -5p were identified as NRF2 target microRNAs, and an rSNP (rs113067944) residing on the promoter of the NRF2 target gene FTL (ferritin, light polypeptide) was experimentally verified to decrease NRF2 binding and result in decreased transcriptional activity. PMID:26826707
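
The abstract does not say what form the NRF2 binding model takes. As a generic illustration of how a binding model can rank the effect of a regulatory SNP, the sketch below swaps in a plain log-odds position weight matrix (PWM), which is a standard technique and not necessarily the authors' model. The 4-bp toy motif, its counts, the alleles, and the function names are all hypothetical.

```python
import math

def log_odds_pwm(count_columns, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2 odds against a uniform background."""
    pwm = []
    for col in count_columns:
        total = sum(col.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((col.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def score_site(pwm, site):
    """Sum the per-position log-odds for a site of the same width as the PWM."""
    if len(site) != len(pwm):
        raise ValueError("site length must match PWM width")
    return sum(col[base] for col, base in zip(pwm, site))

# Hypothetical base counts for a toy 4-bp motif (not measured data).
counts = [{"T": 18, "A": 2}, {"G": 20}, {"A": 19, "C": 1}, {"C": 17, "G": 3}]
pwm = log_odds_pwm(counts)

ref_allele_site = "TGAC"   # hypothetical reference sequence
alt_allele_site = "TGTC"   # same site carrying a hypothetical SNP at position 3
delta = score_site(pwm, alt_allele_site) - score_site(pwm, ref_allele_site)
```

A negative `delta` means the variant allele scores worse than the reference under this toy matrix, which is the kind of signal one would follow up experimentally.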

  19. Cloning and Characterization of 5′ Flanking Regulatory Sequences of AhLEC1B Gene from Arachis hypogaea L.

    PubMed Central

    Tang, Guiying; Xu, Pingli; Liu, Wei; Liu, Zhanji; Shan, Lei

    2015-01-01

    LEAFY COTYLEDON1 (LEC1) is a B subunit (NF-YB) of the Nuclear Factor Y transcription factor that mainly accumulates during embryo development. We cloned the 5′ flanking regulatory sequence of the AhLEC1B gene, a homolog of Arabidopsis LEC1, and analyzed its regulatory elements using online software. To identify the crucial regulatory region, we generated a series of GUS expression frameworks driven by promoters of different lengths with 5′ terminal and/or 3′ terminal deletions. We further characterized the GUS expression patterns in the transgenic Arabidopsis lines. Our results show that both the 65 bp proximal promoter region and the 52 bp 5′ UTR of AhLEC1B contain the key motifs required for the essential promoting activity. Moreover, AhLEC1B is preferentially expressed in the embryo and is co-regulated by binding of its upstream genes with both positive and negative corresponding cis-regulatory elements. PMID:26426444

  20. RSAT: regulatory sequence analysis tools.

    PubMed

    Thomas-Chollier, Morgane; Sand, Olivier; Turatsinze, Jean-Valéry; Janky, Rekin's; Defrance, Matthieu; Vervisch, Eric; Brohée, Sylvain; van Helden, Jacques

    2008-07-01

    The regulatory sequence analysis tools (RSAT, http://rsat.ulb.ac.be/rsat/) is a software suite that integrates a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. The suite includes programs for sequence retrieval, pattern discovery, phylogenetic footprint detection, pattern matching, genome scanning and feature map drawing. Random controls can be performed with random gene selections or by generating random sequences according to a variety of background models (Bernoulli, Markov). Beyond the original word-based pattern-discovery tools (oligo-analysis and dyad-analysis), we recently added a battery of tools for matrix-based detection of cis-acting elements, with some original features (adaptive background models, Markov-chain estimation of P-values) that do not exist in other matrix-based scanning tools. The web server offers an intuitive interface, where each program can be accessed either separately or connected to the other tools. In addition, the tools are now available as web services, enabling their integration in programmatic workflows. Genomes are regularly updated from various genome repositories (NCBI and EnsEMBL) and 682 organisms are currently supported. Since 1998, the tools have been used by several hundreds of researchers from all over the world. Several predictions made with RSAT were validated experimentally and published. PMID:18495751
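As a rough illustration of the two background models named above, the sketch below generates random control sequences under a Bernoulli model (independent bases) and a first-order Markov model (each base depends on the previous one). It is not RSAT code and does not use the RSAT web services; all function names are hypothetical, and input sequences are assumed to contain only uppercase A, C, G and T.

```python
import random

BASES = "ACGT"


def bernoulli_sequence(length, freqs, rng):
    """Draw each base independently from fixed frequencies (order-0 model)."""
    weights = [freqs[b] for b in BASES]
    return "".join(rng.choices(BASES, weights=weights, k=length))


def train_markov1(sequences, pseudocount=1.0):
    """Estimate first-order transition probabilities from training sequences."""
    counts = {a: {b: pseudocount for b in BASES} for a in BASES}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        a: {b: counts[a][b] / sum(counts[a].values()) for b in BASES}
        for a in BASES
    }


def markov1_sequence(length, transitions, start_freqs, rng):
    """Generate a sequence in which each base depends on the preceding base."""
    if length <= 0:
        return ""
    seq = [bernoulli_sequence(1, start_freqs, rng)]
    while len(seq) < length:
        row = transitions[seq[-1]]
        seq.append(rng.choices(BASES, weights=[row[b] for b in BASES], k=1)[0])
    return "".join(seq)


rng = random.Random(0)  # fixed seed so the controls are reproducible
uniform = {b: 0.25 for b in BASES}
transitions = train_markov1(["ACGTACGTTTGA", "TTGACGTACCAA"])
print(bernoulli_sequence(30, uniform, rng))
print(markov1_sequence(30, transitions, uniform, rng))
```

A Markov background preserves local composition biases such as dinucleotide frequencies, which is why it tends to give more conservative significance estimates than a Bernoulli background.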

  1. RefNetBuilder: a platform for construction of integrated reference gene regulatory networks from expressed sequence tags

    PubMed Central

    2011-01-01

    Background: Gene Regulatory Networks (GRNs) provide integrated views of gene interactions that control biological processes. Many public databases contain biological interactions extracted from experimentally validated literature reports, but most furnish only information for a few genetic model organisms. In order to provide a bioinformatic tool for researchers who work with non-model organisms, we developed RefNetBuilder, a new platform that allows construction of putative reference pathways or GRNs from expressed sequence tags (ESTs). Results: RefNetBuilder was designed to have the flexibility to extract and archive pathway or GRN information from public databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG). It features sequence alignment tools such as BLAST to allow mapping ESTs to pathways and GRNs in model organisms. A scoring algorithm was incorporated to rank and select the best match for each query EST. We validated RefNetBuilder using DNA sequences of Caenorhabditis elegans, a model organism having manually curated KEGG pathways. Using the earthworm Eisenia fetida as an example, we demonstrated the functionalities and features of RefNetBuilder. Conclusions: RefNetBuilder provides a standalone application for building reference GRNs for non-model organisms on a number of operating system platforms with standard desktop computer hardware. As a new bioinformatic tool aimed at constructing putative GRNs for non-model organisms that have only ESTs available, RefNetBuilder is especially useful for exploring pathway- or network-related information in these organisms. PMID:22166047
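The step of ranking alignments and keeping the best match per query EST can be sketched as follows. The abstract does not describe RefNetBuilder's scoring algorithm, so this example makes an assumption: hits are ranked by highest bit score, with lowest E-value as the tie-breaker. The input is assumed to be BLAST tabular output with the standard 12 columns, and the function name is hypothetical.

```python
def best_hits(tabular_lines):
    """Pick one subject per query from 12-column BLAST tabular lines.

    Assumed ranking (not RefNetBuilder's actual scoring): highest bit score
    first, then lowest E-value. Columns 11 and 12 hold E-value and bit score.
    """
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        rank = (-bitscore, evalue)  # smaller tuple means a better hit
        if query not in best or rank < best[query][0]:
            best[query] = (rank, subject)
    return {query: subject for query, (rank, subject) in best.items()}


# Made-up example hits for two hypothetical ESTs.
example = [
    "est1\tgeneA\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t150",
    "est1\tgeneB\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-40\t180",
    "est2\tgeneC\t80.0\t50\t10\t0\t1\t50\t1\t50\t1e-5\t60",
]
print(best_hits(example))
```

Each selected subject could then be looked up in a pathway table to place the EST in a reference network; that mapping step is omitted here.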

  2. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  3. Oxytocin receptor gene sequences in owl monkeys and other primates show remarkable interspecific regulatory and protein coding variation.

    PubMed

    Babb, Paul L; Fernandez-Duque, Eduardo; Schurr, Theodore G

    2015-10-01

    The oxytocin (OT) hormone pathway is involved in numerous physiological processes, and one of its receptor genes (OXTR) has been implicated in pair bonding behavior in mammalian lineages. This observation is important for understanding social monogamy in primates, which occurs in only a small subset of taxa, including Azara's owl monkey (Aotus azarae). To examine the potential relationship between social monogamy and OXTR variation, we sequenced its 5' regulatory (4936bp) and coding (1167bp) regions in 25 owl monkeys from the Argentinean Gran Chaco, and examined OXTR sequences from 1092 humans from the 1000 Genomes Project. We also assessed interspecific variation of OXTR in 25 primate and rodent species that represent a set of phylogenetically and behaviorally disparate taxa. Our analysis revealed substantial variation in the putative 5' regulatory region of OXTR, with marked structural differences across primate taxa, particularly for humans and chimpanzees, which exhibited unique patterns of large motifs of dinucleotide A+T repeats upstream of the OXTR 5' UTR. In addition, we observed a large number of amino acid substitutions in the OXTR CDS region among New World primate taxa that distinguish them from Old World primates. Furthermore, primate taxa traditionally defined as socially monogamous (e.g., gibbons, owl monkeys, titi monkeys, and saki monkeys) all exhibited different amino acid motifs for their respective OXTR protein coding sequences. These findings support the notion that monogamy has evolved independently in Old World and New World primates, and that it has done so through different molecular mechanisms, not exclusively through the oxytocin pathway. PMID:26025428

  4. Transcriptional activation of the fra-1 gene by AP-1 is mediated by regulatory sequences in the first intron.

    PubMed Central

    Bergers, G; Graninger, P; Braselmann, S; Wrighton, C; Busslinger, M

    1995-01-01

    Constitutive expression of c-Fos, FosB, Fra-1, or c-Jun in rat fibroblasts leads to up-regulation of the immediate-early gene fra-1. Using the posttranslational FosER induction system, we demonstrate that this AP-1-dependent stimulation of fra-1 expression is rapid, depends on a functional DNA-binding domain of FosER, and is a general phenomenon observed in different cell types. In vitro mutagenesis and functional analysis of the rat fra-1 gene in stably transfected Rat-1A-FosER fibroblasts indicated that basal and AP-1-regulated expression of the fra-1 gene depends on regulatory sequences in the first intron which comprise a consensus AP-1 site and two AP-1-like elements. We have also investigated the transactivating and transforming properties of the Fra-1 protein to address the significance of fra-1 up-regulation. The entire Fra-1 protein fused to the DNA-binding domain of Gal4 is shown to lack any transactivation function, and yet it possesses oncogenic potential, as overexpression of Fra-1 in established rat fibroblasts results in anchorage-independent growth in vitro and tumor development in athymic mice. fra-1 is therefore not only induced by members of the Fos family, but its gene product may also contribute to cellular transformation by these proteins. Together, these data identify fra-1 as a unique member of the fos gene family which is under positive control by AP-1 activity. PMID:7791782

  5. Identification of lignin genes and regulatory sequences involved in secondary cell wall formation in Acacia auriculiformis and Acacia mangium via de novo transcriptome sequencing

    PubMed Central

    2011-01-01

    useful markers for population genetics studies and marker-assisted selection. Conclusion: We have produced the first comprehensive transcriptome-wide analysis in A. auriculiformis and A. mangium using de novo assembly techniques. Our high-quality and comprehensive assemblies allowed the identification of many genes involved in lignin biosynthesis and secondary cell wall formation in Acacia hybrids. Our results demonstrated that Next Generation Sequencing is a cost-effective method for gene discovery and for the identification of regulatory sequences and informative markers in a non-model plant. PMID:21729267

  6. Plant nitrogen regulatory P-PII genes

    DOEpatents

    Coruzzi, Gloria M.; Lam, Hon-Ming; Hsieh, Ming-Hsiun

    2001-01-01

    The present invention generally relates to plant nitrogen regulatory PII gene (hereinafter P-PII gene), a gene involved in regulating plant nitrogen metabolism. The invention provides P-PII nucleotide sequences, expression constructs comprising said nucleotide sequences, and host cells and plants having said constructs and, optionally expressing the P-PII gene from said constructs. The invention also provides substantially pure P-PII proteins. The P-PII nucleotide sequences and constructs of the

  7. [Cloning and function identification of gene 'admA' and up-stream regulatory sequence related to antagonistic activity of Enterobacter cloacae B8].

    PubMed

    Zhu, Jun-Li; Li, De-Bao; Yu, Xu-Ping

    2012-04-01

    To reveal the antagonistic mechanism of strain B8 against Xanthomonas oryzae pv. oryzae, transposon tagging and chromosome walking were used to clone antagonism-related fragments around the Tn5 insertion site in the mutant strain B8B. The function of the upstream regulatory sequence of gene 'admA' involved in the antagonistic activity was further identified by gene knockout. An antagonism-related left fragment of the Tn5 insertion site, 2 608 bp in length, was obtained by tagging with the Kan resistance gene of Tn5. A 2 354 bp right fragment of the Tn5 insertion site was amplified with 2 rounds of chromosome walking. The length of the B contig around the Tn5 insertion site was 4 611 bp, containing 7 open reading frames (ORFs). Bioinformatic analysis revealed that these ORFs corresponded to the partial coding regions of glyceraldehyde-3-phosphate dehydrogenase, two LysR family transcriptional regulators, hypothetical protein VSWAT3-20465 of Vibrionales and admA, admB, and partial sequence of admC gene of the Pantoea agglomerans biosynthetic gene cluster, respectively. Tn5 was inserted 200 bp or 894 bp upstream of the sequence corresponding to the anrP ORF or the admA gene in B8B, respectively. The B-1 and B-2 mutants that lost antagonistic activity were selected by homologous recombination in association with the knockout plasmid pMB-BG. These results suggested that the transcription and expression of the anrP gene might be disrupted as a result of the knockout of the upstream regulatory sequence by Tn5 in strain B8B, further affecting the biosynthesis regulation of the antagonism-related gene cluster. Thus, the antagonism-related genes in strain B8 are a gene family similar to the andrimid biosynthetic gene cluster, and the upstream regulatory region appears to be critical for antibiotic biosynthesis. PMID:22522167

  8. The upstream regulatory sequence of the light harvesting complex Lhcf2 gene of the marine diatom Phaeodactylum tricornutum enhances transcription in an orientation- and distance-independent fashion.

    PubMed

    Russo, Monia Teresa; Annunziata, Rossella; Sanges, Remo; Ferrante, Maria Immacolata; Falciatore, Angela

    2015-12-01

    Diatoms are a key phytoplankton group in the contemporary ocean, showing extraordinary adaptation capacities to rapidly changing environments. The recent availability of whole genome sequences from representative species has revealed distinct features in their genomes, like novel combinations of genes encoding distinct metabolisms and a significant number of diatom-specific genes. However, the regulatory mechanisms driving diatom gene expression are still largely uncharacterized. Considering the wide variety of fields of study orbiting diatoms, ranging from ecology and evolutionary biology to biotechnology, it is thus essential to increase our understanding of fundamental gene regulatory processes such as transcriptional regulation. To this aim, we explored the functional properties of the 5'-flanking region of the Phaeodactylum tricornutum Lhcf2 gene, encoding a member of the Light Harvesting Complex superfamily, and we showed that this region enhances transcription of a GUS reporter gene in an orientation- and distance-independent fashion. This represents the first example of a cis-regulatory sequence with enhancer-like features discovered in diatoms and it is instrumental for the generation of novel genetic tools and diatom exploitation in different areas of study. PMID:26117181

  9. Epithelial and endothelial expression of the green fluorescent protein reporter gene under the control of bovine prion protein (PrP) gene regulatory sequences in transgenic mice

    NASA Astrophysics Data System (ADS)

    Lemaire-Vieille, Catherine; Schulze, Tobias; Podevin-Dimster, Valérie; Follet, Jérome; Bailly, Yannick; Blanquet-Grossard, Françoise; Decavel, Jean-Pierre; Heinen, Ernst; Cesbron, Jean-Yves

    2000-05-01

    The expression of the cellular form of the prion protein (PrPc) gene is required for prion replication and neuroinvasion in transmissible spongiform encephalopathies. The identification of the cell types expressing PrPc is necessary to understanding how the agent replicates and spreads from peripheral sites to the central nervous system. To determine the nature of the cell types expressing PrPc, a green fluorescent protein reporter gene was expressed in transgenic mice under the control of 6.9 kb of the bovine PrP gene regulatory sequences. It was shown that the bovine PrP gene is expressed as two populations of mRNA differing by alternative splicing of one 115-bp 5' untranslated exon in 17 different bovine tissues. The analysis of transgenic mice showed reporter gene expression in some cells that have been identified as expressing PrP, such as cerebellar Purkinje cells, lymphocytes, and keratinocytes. In addition, expression of green fluorescent protein was observed in the plexus of the enteric nervous system and in a restricted subset of cells not yet clearly identified as expressing PrP: the epithelial cells of the thymic medullary and the endothelial cells of both the mucosal capillaries of the intestine and the renal capillaries. These data provide valuable information on the distribution of PrPc at the cellular level and argue for roles of the epithelial and endothelial cells in the spread of infection from the periphery to the brain. Moreover, the transgenic mice described in this paper provide a model that will allow for the study of the transcriptional activity of the PrP gene promoter in response to scrapie infection.

  10. Cloning and sequencing of the blood meal-induced late trypsin gene from the mosquito Aedes aegypti and characterization of the upstream regulatory region.

    PubMed

    Barillas-Mury, C; Wells, M A

    1993-01-01

    A 4.1 kb genomic clone of the late trypsin gene from the mosquito Aedes aegypti was isolated, mapped and subcloned. A 1.6 kb subclone, corresponding to 1.1 kb of upstream regulatory region and 0.5 kb of coding region, was sequenced. The gene has no introns within the coding region. The 5' end of the mature mRNA was mapped using primer extension analysis. A TATA box consensus sequence (TATAAA) was found at position -31 from the 5' end of the mature mRNA. A cluster of five repeat sequences homologous to the yeast GCN4 DNA binding site was found within 200 nucleotides upstream of the cap site. GCN4 is required for derepression mediated control of general amino acid biosynthesis in response to amino acid starvation in yeast. It activates the transcription of at least twenty different genes coding for enzymes involved in amino acid biosynthesis. The presence of this cluster of consensus sequences suggests that a protein similar to GCN4 might regulate expression of the late trypsin gene in the mosquito. Southern blot analysis of genomic DNA indicates that late trypsin is a single copy gene. PMID:9087537
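Locating a consensus element such as TATAAA relative to the transcription start can be sketched as a plain string search. The sequence below is synthetic and the function name is hypothetical. The coordinate convention is also an assumption: the last base of the upstream string is taken as -1, and a match is reported by the position of its first base.

```python
def motif_positions(upstream, motif):
    """Return motif start coordinates relative to the transcription start.

    `upstream` must end at the base immediately 5' of the start site, so its
    last base has coordinate -1 and its first base -len(upstream).
    Overlapping matches are reported as well.
    """
    upstream, motif = upstream.upper(), motif.upper()
    positions = []
    index = upstream.find(motif)
    while index != -1:
        positions.append(index - len(upstream))
        index = upstream.find(motif, index + 1)
    return positions


# Synthetic 41-base upstream region with one TATAAA placed by construction.
toy_upstream = "G" * 10 + "TATAAA" + "C" * 25
print(motif_positions(toy_upstream, "TATAAA"))  # [-31]
```

The same function could be applied to a GCN4-like consensus to count repeats within the 200 nucleotides upstream of the cap site, although degenerate sites would need a pattern or matrix search rather than exact matching.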

  11. Comparisons of Ribosomal Protein Gene Promoters Indicate Superiority of Heterologous Regulatory Sequences for Expressing Transgenes in Phytophthora infestans

    PubMed Central

    Khachatoorian, Careen; Judelson, Howard S.

    2015-01-01

    Molecular genetics approaches in Phytophthora research can be hampered by the limited number of known constitutive promoters for expressing transgenes and the instability of transgene activity. We have therefore characterized genes encoding the cytoplasmic ribosomal proteins of Phytophthora and studied their suitability for expressing transgenes in P. infestans. Phytophthora spp. encode a standard complement of 79 cytoplasmic ribosomal proteins. Several genes are duplicated, and two appear to be pseudogenes. Half of the genes are expressed at similar levels during all stages of asexual development, and we discovered that the majority share a novel promoter motif named the PhRiboBox. This sequence is enriched in genes associated with transcription, translation, and DNA replication, including tRNA and rRNA biogenesis. Promoters from the three P. infestans genes encoding ribosomal proteins S9, L10, and L23 and their orthologs from P. capsici were tested for their ability to drive transgenes in stable transformants of P. infestans. Five of the six promoters yielded strong expression of a GUS reporter, but the stability of expression was higher using the P. capsici promoters. With the RPS9 and RPL10 promoters of P. infestans, about half of transformants stopped making GUS over two years of culture, while their P. capsici orthologs conferred stable expression. Since cross-talk between native and transgene loci may trigger gene silencing, we encourage the use of heterologous promoters in transformation studies. PMID:26716454

  12. Inactivation, sequence, and lacZ fusion analysis of a regulatory locus required for repression of nitrogen fixation genes in Rhodobacter capsulatus.

    PubMed Central

    Kranz, R G; Pace, V M; Caldicott, I M

    1990-01-01

    Transcription of the genes that code for proteins involved in nitrogen fixation in free-living diazotrophs is typically repressed by high internal oxygen concentrations or exogenous fixed nitrogen. The DNA sequence of a regulatory locus required for repression of Rhodobacter capsulatus nitrogen fixation genes was determined. It was shown that this locus, defined by Tn5 insertions and by ethyl methanesulfonate-derived mutations, is homologous to the glnB gene of other organisms. The R. capsulatus glnB gene was upstream of glnA, the gene for glutamine synthetase, in a glnBA operon. beta-Galactosidase expression from an R. capsulatus glnBA-lacZ translational fusion was increased twofold in cells induced by nitrogen limitation relative to that in cells under nitrogen-sufficient conditions. R. capsulatus nifR1, a gene that was previously shown to be homologous to ntrC and that is required for transcription of nitrogen fixation genes, was responsible for approximately 50% of the transcriptional activation of this glnBA fusion in cells induced under nitrogen-limiting conditions. R. capsulatus GLNB, NIFR1, and NIFR2 (a protein homologous to NTRB) were proposed to transduce the nitrogen status in the cell into repression or activation of other R. capsulatus nif genes. Repression of nif genes in response to oxygen was still present in R. capsulatus glnB mutants and must have occurred at a different level of control in the regulatory circuit. PMID:2152916

  13. Transgenic LacZ under control of Hec-6st regulatory sequences recapitulates endogenous gene expression on high endothelial venules

    PubMed Central

    Liao, Shan; Bentley, Kevin; Lebrun, Marielle; Lesslauer, Werner; Ruddle, Frank H.; Ruddle, Nancy H.

    2007-01-01

    Hec-6st is a highly specific high endothelial venule (HEV) gene that is crucial for regulating lymphocyte homing to lymph nodes (LN). The enzyme is also expressed in HEV-like vessels in tertiary lymphoid organs that form in chronic inflammation in autoimmunity, graft rejection, and microbial infection. Understanding the molecular nature of Hec-6st regulation is crucial for elucidating its function in development and disease. However, studies of HEV are limited because of the difficulties in isolating and maintaining the unique characteristics of these vessels in vitro. The novel pClasper yeast homologous recombination technique was used to isolate from a BAC clone a 60-kb DNA fragment that included the Hec-6st (Chst4) gene with flanking sequences. Transgenic mice were generated with the β-galactosidase (LacZ) reporter gene inserted in-frame in exon II of Hec-6st within the isolated BAC DNA fragment. LacZ was expressed specifically on HEV in LN, as indicated by its colocalization with peripheral node vascular addressin. LacZ was increased in nasal-associated lymphoid tissue during development and was reduced in LN and nasal-associated lymphoid tissue by LTβR-Ig (lymphotoxin-β receptor human Ig fusion protein) treatment in a manner identical to the endogenous gene. The transgene was expressed at high levels in lymphoid accumulations with characteristics of tertiary lymphoid organs in the salivary glands of aged mice. Thus, the Hec-6st-LacZ construct faithfully reproduces Hec-6st tissue-specific expression and can be used in further studies to drive expression of reporter or effector genes, which could visualize or inhibit HEV in autoimmunity. PMID:17360566

  14. Building Developmental Gene Regulatory Networks

    PubMed Central

    Li, Enhu; Davidson, Eric H.

    2009-01-01

    Animal development is an elaborate process programmed by genomic regulatory instructions. Regulatory genes encode transcription factors and signal molecules, and their expression is under the control of cis-regulatory modules that define the logic of transcriptional responses to the inputs of other regulatory genes. The functional linkages amongst regulatory genes constitute the gene regulatory networks (GRNs) that govern cell specification and patterning in development. Constructing such networks requires identification of the regulatory genes involved and characterization of their temporal and spatial expression patterns. Interactions (activation/repression) among transcription factors or signals can be investigated by large-scale perturbation analysis, in which the function of each gene is specifically blocked. Resultant expression changes are then integrated to identify direct linkages, and to reveal the structure of the GRN. Predicted GRN linkages can be tested and verified by cis-regulatory analysis. The explanatory power of the GRN was shown in the lineage specification of sea urchin endomesoderm. Acquiring such networks is essential for a systematic and mechanistic understanding of the developmental process. PMID:19530131

  15. Gene regulatory mechanisms underpinning prostate cancer susceptibility.

    PubMed

    Whitington, Thomas; Gao, Ping; Song, Wei; Ross-Adams, Helen; Lamb, Alastair D; Yang, Yuehong; Svezia, Ilaria; Klevebring, Daniel; Mills, Ian G; Karlsson, Robert; Halim, Silvia; Dunning, Mark J; Egevad, Lars; Warren, Anne Y; Neal, David E; Grönberg, Henrik; Lindberg, Johan; Wei, Gong-Hong; Wiklund, Fredrik

    2016-04-01

    Molecular characterization of genome-wide association study (GWAS) loci can uncover key genes and biological mechanisms underpinning complex traits and diseases. Here we present deep, high-throughput characterization of gene regulatory mechanisms underlying prostate cancer risk loci. Our methodology integrates data from 295 prostate cancer chromatin immunoprecipitation and sequencing experiments with genotype and gene expression data from 602 prostate tumor samples. The analysis identifies new gene regulatory mechanisms affected by risk locus SNPs, including widespread disruption of ternary androgen receptor (AR)-FOXA1 and AR-HOXB13 complexes and competitive binding mechanisms. We identify 57 expression quantitative trait loci at 35 risk loci, which we validate through analysis of allele-specific expression. We further validate predicted regulatory SNPs and target genes in prostate cancer cell line models. Finally, our integrated analysis can be accessed through an interactive visualization tool. This analysis elucidates how genome sequence variation affects disease predisposition via gene regulatory mechanisms and identifies relevant genes for downstream biomarker and drug development. PMID:26950096

  16. Variations in the coding and regulatory sequences of the angiogenin (ANG) gene are not associated to ALS (amyotrophic lateral sclerosis) in the Italian population.

    PubMed

    Corrado, Lucia; Battistini, Stefania; Penco, Silvana; Bergamaschi, Laura; Testa, Lucia; Ricci, Claudia; Giannini, Fabio; Greco, Giuseppe; Patrosso, Maria Cristina; Pileggi, Simona; Causarano, Renzo; Mazzini, Letizia; Momigliano-Richiardi, Patricia; D'Alfonso, Sandra

    2007-07-15

    Potentially causative missense variations in the ANG gene and a positive association with the synonymous rs11701-G substitution were detected mainly in Irish and Scottish ALS patients. We screened 262 Italian SOD1-negative ALS patients (250 sporadic) and 415 matched controls for sequence variations in the coding, 3'/5' UTR and 5' flanking (642 bp) regions of the ANG gene. We identified 53 sequence variations, of which 46 were new, 20 had a minor allele frequency (MAF) ≥0.01, and only three were localised in the coding sequence, namely the missense I46V, identified in one patient and two controls, and the synonymous G86G and T97T corresponding to rs11701 and rs2228653. None of the detected SNPs or of their haplotypic combinations was significantly associated with ALS susceptibility or clinical features. In conclusion, we did not detect the association with rs11701-G or with any other newly detected variation in the ANG regulatory region. Furthermore, we did not identify potentially causal mutations in the coding region. PMID:17462671

  17. Vision from next generation sequencing: multi-dimensional genome-wide analysis for producing gene regulatory networks underlying retinal development, aging and disease.

    PubMed

    Yang, Hyun-Jin; Ratnapriya, Rinki; Cogliati, Tiziana; Kim, Jung-Woong; Swaroop, Anand

    2015-05-01

    Genomics and genetics have invaded all aspects of biology and medicine, opening uncharted territory for scientific exploration. The definition of "gene" itself has become ambiguous, and the central dogma is continuously being revised and expanded. Computational biology and computational medicine are no longer intellectual domains of the chosen few. Next generation sequencing (NGS) technology, together with novel methods of pattern recognition and network analyses, has revolutionized the way we think about fundamental biological mechanisms and cellular pathways. In this review, we discuss NGS-based genome-wide approaches that can provide deeper insights into retinal development, aging and disease pathogenesis. We first focus on gene regulatory networks (GRNs) that govern the differentiation of retinal photoreceptors and modulate adaptive response during aging. Then, we discuss NGS technology in the context of retinal disease and develop a vision for therapies based on network biology. We should emphasize that basic strategies for network construction and analyses can be transported to any tissue or cell type. We believe that specific and uniform guidelines are required for generation of genome, transcriptome and epigenome data to facilitate comparative analysis and integration of multi-dimensional data sets, and for constructing networks underlying complex biological processes. As cellular homeostasis and organismal survival are dependent on gene-gene and gene-environment interactions, we believe that network-based biology will provide the foundation for deciphering disease mechanisms and discovering novel drug targets for retinal neurodegenerative diseases. PMID:25668385

  18. Organization, regulatory sequences, and alternatively spliced transcripts of the mucosal addressin cell adhesion molecule-1 (MAdCAM-1) gene

    SciTech Connect

    Sampaio, S.O.; Mei, C.; Butcher, E.C.

    1995-09-01

    The mucosal addressin cell adhesion molecule-1 (MAdCAM-1) is expressed selectively at venular sites of lymphocyte extravasation into mucosal lymphoid tissues and lamina propria, where it directs local lymphocyte trafficking. MAdCAM-1 is a multifunctional type I transmembrane adhesion molecule comprising two distal Ig domains involved in α4β7 integrin binding, a mucin-like region able to display L-selectin-binding carbohydrates, and a membrane-proximal Ig domain homologous to IgA. We show in this work that the MAdCAM-1 gene is located on chromosome 10 and contains five exons. The signal peptide and each one of the three Ig domains are encoded by a distinct exon, whereas the transmembrane, cytoplasmic tail, and 3′-untranslated region of MAdCAM-1 are combined on a single exon. The mucin-like region and the third Ig domain are encoded together on exon 4. An alternatively spliced MAdCAM-1 mRNA is identified that lacks the mucin/IgA-homologous exon 4-encoded sequences. This short variant of MAdCAM-1 may be specialized to support α4β7-dependent adhesion strengthening, independent of carbohydrate-presenting function. Sequences 5′ of the transcription start site include tandem nuclear factor-κB sites; AP-1, AP-2, and signal peptide-1 binding sites; and an estrogen response element. Our findings reinforce the correspondence between the multidomain structure and versatile functions of this vascular addressin, and suggest an additional level of regulation of carbohydrate-presenting capability, and thus of its importance in lectin-mediated vs. α4β7-dependent adhesive events in lymphocyte trafficking. 46 refs., 6 figs., 1 tab.

  19. Chicken interferon consensus sequence-binding protein (ICSBP) and interferon regulatory factor (IRF) 1 genes reveal evolutionary conservation in the IRF gene family.

    PubMed Central

    Jungwirth, C; Rebbert, M; Ozato, K; Degen, H J; Schultz, U; Dawid, I B

    1995-01-01

    Members of the IRF family mediate transcriptional responses to interferons (IFNs) and to virus infection. So far, proteins of this family have been studied only among mammalian species. Here we report the isolation of cDNA clones encoding two members of this family from chicken, interferon consensus sequence-binding protein (ICSBP) and IRF-1. The predicted chicken ICSBP and IRF-1 proteins show high levels of sequence similarity to their corresponding human and mouse counterparts. Sequence identities in the putative DNA-binding domains of chicken and human ICSBP and IRF-1 were 97% and 89%, respectively, whereas the C-terminal regions showed identities of 64% and 51%; sequence relationships with mouse ICSBP and IRF-1 are very similar. Chicken ICSBP was found to be expressed in several embryonic tissues, and both chicken IRF-1 and ICSBP were strongly induced in chicken fibroblasts by IFN treatment, supporting the involvement of these factors in IFN-regulated gene expression. The presence of proteins homologous to mammalian IRF family members, together with earlier observations on the occurrence of functionally homologous IFN-responsive elements in chicken and mammalian genes, highlights the conservation of transcriptional mechanisms in the IFN system, a finding that contrasts with the extensive sequence and functional divergence of the IFNs. PMID:7536924
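Percent identity figures like those quoted above are computed from a pairwise alignment. A minimal sketch follows, assuming the two sequences are already aligned to equal length and that identity is counted only over columns where neither sequence has a gap. Other conventions exist (for example dividing by the full alignment length), and the abstract does not say which one the authors used. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment: 5 ungapped columns, 4 of them identical.
print(percent_identity("ACGT-A", "ACCTTA"))  # 80.0
```

Computing the value separately for the DNA-binding domain and the C-terminal region, as the authors did, only requires slicing the alignment before calling the function.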

  20. Vision from next generation sequencing: Multi-dimensional genome-wide analysis for producing gene regulatory networks underlying retinal development, aging and disease

    PubMed Central

    Yang, Hyun-Jin; Ratnapriya, Rinki; Cogliati, Tiziana; Kim, Jung-Woong; Swaroop, Anand

    2015-01-01

    Genomics and genetics have invaded all aspects of biology and medicine, opening uncharted territory for scientific exploration. The definition of “gene” itself has become ambiguous, and the central dogma is continuously being revised and expanded. Computational biology and computational medicine are no longer intellectual domains of the chosen few. Next generation sequencing (NGS) technology, together with novel methods of pattern recognition and network analyses, has revolutionized the way we think about fundamental biological mechanisms and cellular pathways. In this review, we discuss NGS-based genome-wide approaches that can provide deeper insights into retinal development, aging and disease pathogenesis. We first focus on gene regulatory networks (GRNs) that govern the differentiation of retinal photoreceptors and modulate adaptive response during aging. Then, we discuss NGS technology in the context of retinal disease and develop a vision for therapies based on network biology. We should emphasize that basic strategies for network construction and analyses can be transported to any tissue or cell type. We believe that specific and uniform guidelines are required for generation of genome, transcriptome and epigenome data to facilitate comparative analysis and integration of multi-dimensional data sets, and for constructing networks underlying complex biological processes. As cellular homeostasis and organismal survival are dependent on gene-gene and gene-environment interactions, we believe that network-based biology will provide the foundation for deciphering disease mechanisms and discovering novel drug targets for retinal neurodegenerative diseases. PMID:25668385

  1. Identification of DVA Interneuron Regulatory Sequences in Caenorhabditis elegans

    PubMed Central

    Puckett Robinson, Carmie; Schwarz, Erich M.; Sternberg, Paul W.

    2013-01-01

    Background: The identity of each neuron is determined by the expression of a distinct group of genes comprising its terminal gene battery. The regulatory sequences that control the expression of such terminal gene batteries in individual neurons are largely unknown. The existence of a complete genome sequence for C. elegans and draft genomes of other nematodes let us use comparative genomics to identify regulatory sequences directing expression in the DVA interneuron. Methodology/Principal Findings: Using phylogenetic comparisons of multiple Caenorhabditis species, we identified conserved non-coding sequences in 3 of 10 genes (fax-1, nmr-1, and twk-16) that direct expression of reporter transgenes in DVA and other neurons. The conserved region and flanking sequences in an 85-bp intronic region of the twk-16 gene direct highly restricted expression in DVA. Mutagenesis of this 85-bp region shows that it has at least four regions. The central 53-bp region contains a 29-bp region that represses expression and a 24-bp region that drives broad neuronal expression. Two short flanking regions restrict expression of the twk-16 gene to DVA. A shared GA-rich motif was identified in three of these genes but had opposite effects on expression when mutated in the nmr-1 and twk-16 DVA regulatory elements. Conclusions/Significance: Using multi-species conservation, we identified regulatory regions within three genes that direct expression in the DVA neuron. We identified four contiguous regions of sequence of the twk-16 gene enhancer with positive and negative effects on expression, which combined to restrict expression to the DVA neuron. For this neuron a single binding site may thus not achieve sufficient specificity for cell-specific expression. One of the positive elements, an 8-bp sequence required for expression, was identified in silico by sequence comparisons of seven nematode species, demonstrating the potential resolution of expanded multi-species phylogenetic comparisons. PMID

  2. The kil-kor regulon of broad-host-range plasmid RK2: nucleotide sequence, polypeptide product, and expression of regulatory gene korC.

    PubMed Central

    Kornacki, J A; Burlage, R S; Figurski, D H

    1990-01-01

    Broad-host-range plasmid RK2 encodes several kil operons (kilA, kilB, kilC, kilE) whose expression is potentially lethal to Escherichia coli host cells. The kil operons and the RK2 replication initiator gene (trfA) are coregulated by various combinations of kor genes (korA, korB, korC, korE). This regulatory network is called the kil-kor regulon. Presented here are studies on the structure, product, and expression of korC. Genetic mapping revealed the precise location of korC in a region near transposon Tn1. We determined the nucleotide sequence of this region and identified the korC structural gene by analysis of korC mutants. Sequence analysis predicts the korC product to be a polypeptide of 85 amino acids with a molecular mass of 9,150 daltons. The KorC polypeptide was identified in vivo by expressing wild-type and mutant korC alleles from a bacteriophage T7 RNA polymerase-dependent promoter. The predicted structure of KorC polypeptide has a net positive charge and a helix-turn-helix region similar to those of known DNA-binding proteins. These properties are consistent with the repressorlike function of KorC protein, and we discuss the evidence that KorA and KorC proteins act as corepressors in the control of the kilC and kilE operons. Finally, we show that korC is expressed from the bla promoters within the upstream transposon Tn1, suggesting that insertion of Tn1 interrupted a plasmid operon that may have originally included korC and kilC. PMID:2160936

  3. Comparative inter-strain sequence analysis of the putative regulatory region of murine psychostimulant-regulated gene GNB1 (G protein beta 1 subunit gene).

    PubMed

    Kitanaka, Nobue; Kitanaka, Junichi; Walther, Donna; Wang, Xiao-Bing; Uhl, George R

    2003-08-01

    We isolated a cDNA clone from a murine genomic library of C57BL/6 strain, carrying 13.8 kb of nucleotides including exon 1 of heterotrimeric GTP-binding protein beta 1 subunit gene (genetic symbol, GNB1) and 10.6 kb of the 5' flanking region. Sequence comparison with GNB1 gene locus from 129Sv strain revealed a 0.2% divergence in a 13.2 kb common region between these two strains. The divergence consisted of eight single nucleotide polymorphisms, three insertions and one deletion, with 129Sv used as the reference. The exon 1 and the putative regulation elements, such as cyclic AMP response element, AP1, AP2, Sp1 and nuclear factor-kappa B recognition sites, were perfectly conserved. The expression of GNB1 mRNA was significantly increased in mouse striatum 2 h after single methamphetamine administration with an approximately 150% expression level compared with the basal level. In contrast, no change in the expression level was observed in the cerebral cortex. After the chronic methamphetamine treatment regimen, the expression level of GNB1 mRNA did not change in any brain regions examined. These results suggest (1) that the 5' flanking nucleotide sequence of GNB1 gene was strictly conserved for its possible contribution to the same change in the expression level between the mouse strains in response to psychostimulants and (2) that the initial process of development of behavioral sensitization appeared to occur parallel to the significant increase in the expression level of GNB1 gene in the mouse striatum. PMID:14631649

  4. Plant Evolution: Evolving Antagonistic Gene Regulatory Networks.

    PubMed

    Cooper, Endymion D

    2016-06-20

    Developing a structurally complex phenotype requires a complex regulatory network. A new study shows how gene duplication provides a potential source of antagonistic interactions, an important component of gene regulatory networks. PMID:27326708

  5. Distinct gene expression patterns in skeletal and cardiac muscle are dependent on common regulatory sequences in the MLC1/3 locus.

    PubMed Central

    McGrew, M J; Bogdanova, N; Hasegawa, K; Hughes, S H; Kitsis, R N; Rosenthal, N

    1996-01-01

    The myosin light-chain 1/3 locus (MLC1/3) is regulated by two promoters and a downstream enhancer element which produce two protein isoforms in fast skeletal muscle at distinct stages of mouse embryogenesis. We have analyzed the expression of transcripts from the internal MLC3 promoter and determined that it is also expressed in the atria of the heart. Expression from the MLC3 promoter in these striated muscle lineages is differentially regulated during development. In transgenic mice, the MLC3 promoter is responsible for cardiac-specific reporter gene expression while the downstream enhancer augments expression in skeletal muscle. Examination of the methylation status of endogenous and transgenic promoter and enhancer elements indicates that the internal promoter is not regulated in a manner similar to that of the MLC1 promoter or the downstream enhancer. A GATA protein consensus sequence in the proximal MLC3 promoter but not the MLC1 promoter binds with high affinity to GATA-4, a cardiac muscle- and gut-specific transcription factor. Mutation of either the MEF2 or GATA motifs in the MLC3 promoter attenuates its activity in both heart and skeletal muscles, demonstrating that MLC3 expression in these two diverse muscle types is dependent on common regulatory elements. PMID:8754853

  6. RSAT 2015: Regulatory Sequence Analysis Tools.

    PubMed

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-07-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632

  7. RSAT 2015: Regulatory Sequence Analysis Tools

    PubMed Central

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A.; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M.; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-01-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632

  8. Deep transcriptome sequencing reveals the expression of key functional and regulatory genes involved in the abiotic stress signaling pathways in rice

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Drought, salt and cold are the major abiotic stresses that limit the rice production and cause serious threat to food security. The identification of the key functional and regulatory genes in the abiotic stress signaling pathways is important for understanding the molecular basis of abiotic stress ...

  9. Complementation of nitrogen-regulatory (ntr-like) mutations in Rhodobacter capsulatus by an Escherichia coli gene: cloning and sequencing of the gene and characterization of the gene product.

    PubMed Central

    Allibert, P; Willison, J C; Vignais, P M

    1987-01-01

    In vivo genetic engineering by R' plasmid formation was used to isolate an Escherichia coli gene that restored the Ntr+ phenotype to Ntr- mutants of the photosynthetic bacterium Rhodobacter capsulatus (formerly Rhodopseudomonas capsulata; J. F. Imhoff, H. G. Trüper, and N. Pfenning, Int. J. Syst. Bacteriol. 34:340-343, 1984). Nucleotide sequencing of the gene revealed no homology to the ntr genes of Klebsiella pneumoniae. Furthermore, hybridization experiments between the cloned gene and different F' plasmids indicated that the gene is located between 34 and 39 min on the E. coli genetic map and is therefore unlinked to the known ntr genes. The molecular weight of the gene product, deduced from the nucleotide sequence, was 30,563. After the gene was cloned in an expression vector, the gene product was purified. It was shown to have a pI of 5.8 and to behave as a dimer during gel filtration and on sucrose density gradients. Antibodies raised against the purified protein revealed the presence of this protein in R. capsulatus strains containing the E. coli gene, but not in other strains. Moreover, elimination of the plasmid carrying the E. coli gene from complemented strains resulted in the loss of the Ntr+ phenotype. Complementation of the R. capsulatus mutations by the E. coli gene therefore occurs in trans and results from the synthesis of a functional gene product. PMID:3025172

  10. The complete sequence of the human CD79b (Igβ/B29) gene: Identification of a conserved exon/intron organization, immunoglobulin-like regulatory regions, and allelic polymorphism

    SciTech Connect

    Hashimoto, S.; Chiorazzi, N.; Gregersen, P.K.

    1994-12-31

    We determined the complete genomic sequence of the human CD79b (Igβ/B29) gene. The CD79b gene product is associated with the membrane immunoglobulin signaling complex which is composed of immunoglobulin (Ig) itself, associated in a noncovalent fashion with CD79b and a second polypeptide chain, CD79a (Igα/mb1). The sequence and exon/intron organization of the human and mouse CD79b genes are highly similar. The gene organization suggests that some variant forms of CD79b may arise by virtue of alternative splicing of mRNA. In addition, a number of conserved regulatory sequences commonly found in Ig genes are present in sequences which flank the human CD79b gene. Some of these sequences are distinct from those found in the CD79a promoter. These differences may explain why transcription of CD79b, but not CD79a, is observed in plasma cells. A new Taq 1 restriction fragment length polymorphism is described that is not associated with any structural polymorphisms of the expressed CD79b polypeptide. 13 refs., 3 figs., 1 tab.

  11. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    NASA Astrophysics Data System (ADS)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  12. Evolution of Cis-Regulatory Elements and Regulatory Networks in Duplicated Genes of Arabidopsis

    PubMed Central

    Guo, Xu Qiu; Adams, Keith L.

    2015-01-01

    Plant genomes contain large numbers of duplicated genes that contribute to the evolution of new functions. Following duplication, genes can exhibit divergence in their coding sequence and their expression patterns. Changes in the cis-regulatory element landscape can result in changes in gene expression patterns. High-throughput methods developed recently can identify potential cis-regulatory elements on a genome-wide scale. Here, we use a recent comprehensive data set of DNase I sequencing-identified cis-regulatory binding sites (footprints) at single-base-pair resolution to compare binding sites and network connectivity in duplicated gene pairs in Arabidopsis (Arabidopsis thaliana). We found that duplicated gene pairs vary greatly in their cis-regulatory element architecture, resulting in changes in regulatory network connectivity. Whole-genome duplicates (WGDs) have approximately twice as many footprints in their promoters left by potential regulatory proteins than do tandem duplicates (TDs). The WGDs have a greater average number of footprint differences between paralogs than TDs. The footprints, in turn, result in more regulatory network connections between WGDs and other genes, forming denser, more complex regulatory networks than shown by TDs. When comparing regulatory connections between duplicates, WGDs had more pairs in which the two genes are either partially or fully diverged in their network connections, but fewer genes with no network connections than the TDs. There is evidence of younger TDs and WGDs having fewer unique connections compared with older duplicates. This study provides insights into cis-regulatory element evolution and network divergence in duplicated genes. PMID:26474639

  13. Transcription factor trapping by RNA in gene regulatory elements

    PubMed Central

    Sigova, Alla A.; Abraham, Brian J.; Ji, Xiong; Molinie, Benoit; Hannett, Nancy M.; Eric Guo, Yang; Jangi, Mohini; Giallourakis, Cosmas C.; Sharp, Phillip A.; Young, Richard A.

    2016-01-01

    Transcription factors (TFs) bind specific sequences in promoter-proximal and distal DNA elements in order to regulate gene transcription. RNA is transcribed from both of these DNA elements, and some DNA-binding TFs bind RNA. Hence, RNA transcribed from regulatory elements may contribute to stable TF occupancy at these sites. We show that the ubiquitously expressed TF YY1 binds to both gene regulatory elements and also to their associated RNA species genome-wide. Reduced transcription of regulatory elements diminishes YY1 occupancy whereas artificial tethering of RNA enhances YY1 occupancy at these elements. We propose that RNA makes a modest but important contribution to the maintenance of certain TFs at gene regulatory elements and suggest that transcription of regulatory elements produces a positive feedback loop that contributes to the stability of gene expression programs. PMID:26516199

  14. Transcription factor trapping by RNA in gene regulatory elements.

    PubMed

    Sigova, Alla A; Abraham, Brian J; Ji, Xiong; Molinie, Benoit; Hannett, Nancy M; Guo, Yang Eric; Jangi, Mohini; Giallourakis, Cosmas C; Sharp, Phillip A; Young, Richard A

    2015-11-20

    Transcription factors (TFs) bind specific sequences in promoter-proximal and -distal DNA elements to regulate gene transcription. RNA is transcribed from both of these DNA elements, and some DNA binding TFs bind RNA. Hence, RNA transcribed from regulatory elements may contribute to stable TF occupancy at these sites. We show that the ubiquitously expressed TF Yin-Yang 1 (YY1) binds to both gene regulatory elements and their associated RNA species across the entire genome. Reduced transcription of regulatory elements diminishes YY1 occupancy, whereas artificial tethering of RNA enhances YY1 occupancy at these elements. We propose that RNA makes a modest but important contribution to the maintenance of certain TFs at gene regulatory elements and suggest that transcription of regulatory elements produces a positive-feedback loop that contributes to the stability of gene expression programs. PMID:26516199

  15. Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

    SciTech Connect

    Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

    2003-12-31

    Because functional regions are highly conserved, comparisons of DNA sequences among evolutionarily distantly related genomes make it possible to identify functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.

  16. Interrogating Transcriptional Regulatory Sequences in Tol2-Mediated Xenopus Transgenics

    PubMed Central

    Loots, Gabriela G.; Bergmann, Anne; Hum, Nicholas R.; Oldenburg, Catherine E.; Wills, Andrea E.; Hu, Na; Ovcharenko, Ivan; Harland, Richard M.

    2013-01-01

    Identifying gene regulatory elements and their target genes in vertebrates remains a significant challenge. It is now recognized that transcriptional regulatory sequences are critical in orchestrating dynamic controls of tissue-specific gene expression during vertebrate development and in adult tissues, and that these elements can be positioned at great distances in relation to the promoters of the genes they control. While significant progress has been made in mapping DNA binding regions by combining chromatin immunoprecipitation and next generation sequencing, functional validation remains a limiting step in improving our ability to correlate in silico predictions with biological function. We recently developed a computational method that synergistically combines genome-wide gene-expression profiling, vertebrate genome comparisons, and transcription factor binding-site analysis to predict tissue-specific enhancers in the human genome. We applied this method to 270 genes highly expressed in skeletal muscle and predicted 190 putative cis-regulatory modules. Furthermore, we optimized Tol2 transgenic constructs in Xenopus laevis to interrogate 20 of these elements for their ability to function as skeletal muscle-specific transcriptional enhancers during embryonic development. We found 45% of these elements expressed only in the fast muscle fibers that are oriented in highly organized chevrons in the Xenopus laevis tadpole. Transcription factor binding site analysis identified >2 Mef2/MyoD sites within ∼200 bp regions in 6 of the validated enhancers, and systematic mutagenesis of these sites revealed that they are critical for the enhancer function. The data described herein introduces a new reporter system suitable for interrogating tissue-specific cis-regulatory elements which allows monitoring of enhancer activity in real time, throughout early stages of embryonic development, in Xenopus. PMID:23874664

  17. Characterization of DNA sequences that mediate nuclear protein binding to the regulatory region of the Pisum sativum (pea) chlorophyll a/b binding protein gene AB80: identification of a repeated heptamer motif.

    PubMed

    Argüello, G; García-Hernández, E; Sánchez, M; Gariglio, P; Herrera-Estrella, L; Simpson, J

    1992-05-01

    Two protein factors binding to the regulatory region of the pea chlorophyll a/b binding protein gene AB80 have been identified. One of these factors is found only in green tissue and not in etiolated or root tissue. The second factor (designated ABF-2) binds to a DNA sequence element that contains a direct heptamer repeat, TCTCAAA. The presence of both repeats was found to be essential for binding. ABF-2 is present in green tissue, etiolated tissue and roots, and factors analogous to ABF-2 are present in several plant species. Computer analysis showed that the TCTCAAA motif is present in the regulatory region of several plant genes. PMID:1303797
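
The kind of computer analysis mentioned in this abstract, scanning regulatory regions for the TCTCAAA heptamer and checking whether two copies occur as a direct repeat, can be sketched as follows. This is a minimal illustration, not the authors' method: the function names, the `max_gap` spacing threshold and the example sequence are all assumptions made here, since the abstract does not state the spacing between the two repeats or the software used.

```python
import re


def find_motif_sites(sequence, motif="TCTCAAA"):
    """Return 0-based start positions of every motif occurrence on the given strand."""
    seq = sequence.upper()
    # A zero-width lookahead lets finditer report overlapping matches as well.
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]


def has_direct_repeat(sequence, motif="TCTCAAA", max_gap=10):
    """True if two same-strand copies of the motif lie within max_gap bp of each other.

    max_gap is a hypothetical threshold chosen for illustration only.
    """
    sites = find_motif_sites(sequence, motif)
    k = len(motif)
    # Compare each site with the next one; the gap is measured end-to-start.
    return any(0 <= b - (a + k) <= max_gap for a, b in zip(sites, sites[1:]))


# Invented example sequence (not the AB80 promoter).
example = "ggatTCTCAAAcgTCTCAAAtt"
print(find_motif_sites(example))
print(has_direct_repeat(example))
```

Only the strand as given is scanned; a real survey of promoter regions would probably also scan the reverse complement and use curated sequence records.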

  18. Two regulatory proteins that bind to the basic transcription element (BTE), a GC box sequence in the promoter region of the rat P-4501A1 gene.

    PubMed Central

    Imataka, H; Sogawa, K; Yasumoto, K; Kikuchi, Y; Sasano, K; Kobayashi, A; Hayami, M; Fujii-Kuriyama, Y

    1992-01-01

    The cDNAs for two DNA binding proteins of BTE, a GC box sequence in the promoter region of the P-450IA1(CYP1A1) gene, have been isolated from a rat liver cDNA library by using the BTE sequence as a binding probe. While one is for the rat equivalent to human Sp1, the other encodes a primary structure of 244 amino acids, a novel DNA binding protein designated BTEB. Both proteins contain a zinc finger domain of Cys-Cys/His-His motif that is repeated three times with sequence similarity of 72% to each other, otherwise they share little or no similarity. The function of BTEB was analysed by transfection of plasmids expressing BTEB and/or Sp1 with appropriate reporter plasmids into a monkey cell line CV-1 and compared with Sp1. BTEB and Sp1 activated the expression of genes with repeated GC box sequences in promoters such as the simian virus 40 early promoter and the human immunodeficiency virus-1 long terminal repeat promoter. In contrast, BTEB repressed the activity of a promoter containing BTE, a single GC box of the CYP1A1 gene that is stimulated by Sp1. When the BTE sequence was repeated five times, however, BTEB turned out to be an activator of the promoter. RNA blot analysis showed that mRNAs for BTEB and Sp1 were expressed in all tissues tested, but their concentrations varied independently in tissues. The former mRNA was rich in the brain, kidney, lung and testis, while the latter was relatively abundant in the thymus and spleen.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:1356762

  19. Modeling of hysteresis in gene regulatory networks.

    PubMed

    Hu, J; Qin, K R; Xiang, C; Lee, T H

    2012-08-01

    Hysteresis, observed in many gene regulatory networks, has a pivotal impact on biological systems because it enhances the robustness of cell functions. In this paper, a general model is proposed to describe the hysteretic gene regulatory network by combining the hysteresis component and the transient dynamics. The Bouc-Wen hysteresis model is modified to describe the hysteresis component in mammalian gene regulatory networks. Rigorous mathematical analysis of the dynamical properties of the model is presented to ensure bounded-input-bounded-output (BIBO) stability, and it demonstrates that the original Bouc-Wen model can only generate a clockwise hysteresis loop while the modified model can describe both clockwise and counterclockwise hysteresis loops. Simulation studies have shown that the hysteresis loops from our model are consistent with the experimental observations in three mammalian gene regulatory networks and two E. coli gene regulatory networks, which demonstrates the ability and accuracy of the mathematical model to emulate natural gene expression behavior with hysteresis. A comparison study has also been conducted to show that this model fits the experimental data significantly better than previous ones in the literature. The successful modeling of the hysteresis in all five hysteretic gene regulatory networks suggests that the new model has the potential to be a unified framework for modeling hysteresis in gene regulatory networks and to provide a better understanding of the general mechanism that drives the hysteretic function. PMID:22588784
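
For orientation, the classic (unmodified) Bouc-Wen hysteretic variable z can be simulated in a few lines. The sketch below assumes the standard textbook form dz/dt = A·dx/dt − β·|dx/dt|·|z|^(n−1)·z − γ·dx/dt·|z|^n and integrates it with forward Euler. It is not the modified model proposed in the paper, whose equations are not given in this abstract, and the parameter values and sinusoidal input are illustrative assumptions only.

```python
import math


def bouc_wen_response(x, dt, A=1.0, beta=0.5, gamma=0.5, n=1.0):
    """Integrate the classic Bouc-Wen variable z for a sampled input x (forward Euler).

    Parameter values are illustrative assumptions; n is expected to be >= 1.
    """
    z = [0.0]
    for k in range(1, len(x)):
        dx = (x[k] - x[k - 1]) / dt  # finite-difference input rate
        zk = z[-1]
        dz = (A * dx
              - beta * abs(dx) * abs(zk) ** (n - 1) * zk
              - gamma * dx * abs(zk) ** n)
        z.append(zk + dt * dz)
    return z


# Hypothetical input: two periods of a sinusoid with amplitude 2.
dt = 0.001
steps = round(4 * math.pi / dt)
x = [2.0 * math.sin(k * dt) for k in range(steps + 1)]
z = bouc_wen_response(x, dt)

# The same input value (x close to 0) gives different outputs on the
# rising and falling branches, which is the signature of hysteresis.
i_rise = round(2 * math.pi / dt)
i_fall = round(3 * math.pi / dt)
print(z[i_rise], z[i_fall])
```

With these parameters z stays within (A/(β+γ))^(1/n) = 1 in magnitude, which is the bounded-output behavior the abstract refers to for the general model.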

  20. Evolving Robust Gene Regulatory Networks

    PubMed Central

    Noman, Nasimul; Monjo, Taku; Moscato, Pablo; Iba, Hitoshi

    2015-01-01

    Design and implementation of robust network modules is essential for construction of complex biological systems through hierarchical assembly of ‘parts’ and ‘devices’. The robustness of gene regulatory networks (GRNs) is ascribed chiefly to the underlying topology. The automatic designing capability of GRN topology that can exhibit robust behavior can dramatically change the current practice in synthetic biology. A recent study shows that Darwinian evolution can gradually develop higher topological robustness. Subsequently, this work presents an evolutionary algorithm that simulates natural evolution in silico, for identifying network topologies that are robust to perturbations. We present a Monte Carlo based method for quantifying topological robustness and designed a fitness approximation approach for efficient calculation of topological robustness which is computationally very intensive. The proposed framework was verified using two classic GRN behaviors: oscillation and bistability, although the framework is generalized for evolving other types of responses. The algorithm identified robust GRN architectures which were verified using different analysis and comparison. Analysis of the results also shed light on the relationship among robustness, cooperativity and complexity. This study also shows that nature has already evolved very robust architectures for its crucial systems; hence simulation of this natural process can be very valuable for designing robust biological systems. PMID:25616055

  1. Comparative studies of gene regulatory mechanisms.

    PubMed

    Pai, Athma A; Gilad, Yoav

    2014-12-01

    It has become increasingly clear that changes in gene regulation have played an important role in adaptive evolution both between and within species. Over the past five years, comparative studies have moved beyond simple characterizations of differences in gene expression levels within and between species to studying variation in regulatory mechanisms. We still know relatively little about the precise chain of events that lead to most regulatory adaptations, but we have taken significant steps towards understanding the relative importance of changes in different mechanisms of gene regulatory evolution. In this review, we first discuss insights from comparative studies in model organisms, where the available experimental toolkit is extensive. We then focus on a few recent comparative studies in primates, where the limited feasibility of experimental manipulation dictates the approaches that can be used to study gene regulatory evolution. PMID:25215415

  2. Genetic relatedness of Clostridium difficile isolates from various origins determined by triple-locus sequence analysis based on toxin regulatory genes tcdC, tcdR, and cdtR.

    PubMed

    Bouvet, Philippe J M; Popoff, Michel R

    2008-11-01

    A triple-locus nucleotide sequence analysis based on toxin regulatory genes tcdC, tcdR and cdtR was initiated to assess the sequence variability of these genes among Clostridium difficile isolates and to study the genetic relatedness between isolates. A preliminary investigation of the variability of the tcdC gene was done with 57 clinical and veterinary isolates. Twenty-three isolates representing nine main clusters were selected for tcdC, tcdR, and cdtR analysis. The numbers of alleles found for tcdC, tcdR and cdtR were nine, six, and five, respectively. All strains possessed the cdtR gene except toxin A-negative toxin B-positive variants. All but one binary toxin CDT-positive isolate harbored a deletion (>1 bp) in the tcdC gene. The combined analyses of the three genes allowed us to distinguish five lineages correlated with the different types of deletion in tcdC, i.e., 18 bp (associated or not with a deletion at position 117), 36 bp, 39 bp, and 54 bp, and with the wild-type tcdC (no deletion). The tcdR and tcdC genes, though located within the same pathogenicity locus, were found to have evolved separately. Coevolution of the three genes was noted only with strains harboring a 39-bp or a 54-bp deletion in tcdC that formed two homogeneous, separate divergent clusters. Our study supported the existence of the known clones (PCR ribotype 027 isolates and toxin A-negative toxin B-positive C. difficile variants) and evidence for clonality of isolates with a 39-bp deletion (toxinotype V, PCR ribotype 078) that are frequently isolated worldwide from human infections and from food animals. PMID:18832125

  3. On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

    NASA Astrophysics Data System (ADS)

    Tarpine, Ryan; Istrail, Sorin

    The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.

  4. Dynamic chromatin: the regulatory domain organization of eukaryotic gene loci.

    PubMed

    Bonifer, C; Hecht, A; Saueressig, H; Winter, D M; Sippel, A E

    1991-10-01

    It is hypothesized that nuclear DNA is organized in topologically constrained loop domains defining basic units of higher order chromatin structure. Our studies are performed in order to investigate the functional relevance of this structural subdivision of eukaryotic chromatin for the control of gene expression. We used the chicken lysozyme gene locus as a model to examine the relation between chromatin structure and gene function. Several structural features of the lysozyme locus are known: the extension of the region of general DNAase I sensitivity of the active gene, the location of DNA-sequences with high affinity for the nuclear matrix in vitro, and the position of DNAase I hypersensitive chromatin sites (DHSs). The pattern of DHSs changes depending on the transcriptional status of the gene. Functional studies demonstrated that DHSs mark the position of cis-acting regulatory elements. Additionally, we discovered a novel cis-activity of the border regions of the DNAase I sensitive domain (A-elements). By eliminating the position effect on gene expression usually observed when genes are randomly integrated into the genome after transfection, A-elements possibly serve as punctuation marks for a regulatory chromatin domain. Experiments using transgenic mice confirmed that the complete structurally defined lysozyme gene domain behaves as an independent regulatory unit, expressing the gene in a tissue specific and position independent manner. These expression features were lost in transgenic mice carrying a construct, in which the A-elements as well as an upstream enhancer region were deleted, indicating the lack of a locus activation function on this construct. Experiments are designed in order to uncover possible hierarchical relationships between the different cis-acting regulatory elements for stepwise gene activation during cell differentiation. We are aiming at the definition of the basic structural and functional requirements for position independent and high

  5. Combinatorial Gene Regulatory Functions Underlie Ultraconserved Elements in Drosophila

    PubMed Central

    Warnefors, Maria; Hartmann, Britta; Thomsen, Stefan; Alonso, Claudio R.

    2016-01-01

    Ultraconserved elements (UCEs) are discrete genomic elements conserved across large evolutionary distances. Although UCEs have been linked to multiple facets of mammalian gene regulation, their extreme evolutionary conservation remains largely unexplained. Here, we apply a computational approach to investigate this question in Drosophila, exploring the molecular functions of more than 1,500 UCEs shared across the genomes of 12 Drosophila species. Our data indicate that Drosophila UCEs are hubs for gene regulatory functions and suggest that UCE sequence invariance originates from their combinatorial roles in gene control. We also note that the gene regulatory roles of intronic and intergenic UCEs (iUCEs) are distinct from those found in exonic UCEs (eUCEs). In iUCEs, transcription factor (TF) and epigenetic factor binding data strongly support iUCE roles in transcriptional and epigenetic regulation. In contrast, analyses of eUCEs indicate that they are two orders of magnitude more likely than expected to simultaneously include protein-coding sequence, TF-binding sites, splice sites, and RNA editing sites but have reduced roles in transcriptional or epigenetic regulation. Furthermore, we use a Drosophila cell culture system and transgenic Drosophila embryos to validate the notion of UCE combinatorial regulatory roles using an eUCE within the Hox gene Ultrabithorax and show that its protein-coding region also contains alternative splicing regulatory information. Taken together, our experiments indicate that UCEs emerge as a result of combinatorial gene regulatory roles and highlight common features in mammalian and insect UCEs, implying that similar processes might underlie ultraconservation in diverse animal taxa. PMID:27247329

  6. Combinatorial Gene Regulatory Functions Underlie Ultraconserved Elements in Drosophila.

    PubMed

    Warnefors, Maria; Hartmann, Britta; Thomsen, Stefan; Alonso, Claudio R

    2016-09-01

    Ultraconserved elements (UCEs) are discrete genomic elements conserved across large evolutionary distances. Although UCEs have been linked to multiple facets of mammalian gene regulation, their extreme evolutionary conservation remains largely unexplained. Here, we apply a computational approach to investigate this question in Drosophila, exploring the molecular functions of more than 1,500 UCEs shared across the genomes of 12 Drosophila species. Our data indicate that Drosophila UCEs are hubs for gene regulatory functions and suggest that UCE sequence invariance originates from their combinatorial roles in gene control. We also note that the gene regulatory roles of intronic and intergenic UCEs (iUCEs) are distinct from those found in exonic UCEs (eUCEs). In iUCEs, transcription factor (TF) and epigenetic factor binding data strongly support iUCE roles in transcriptional and epigenetic regulation. In contrast, analyses of eUCEs indicate that they are two orders of magnitude more likely than expected to simultaneously include protein-coding sequence, TF-binding sites, splice sites, and RNA editing sites but have reduced roles in transcriptional or epigenetic regulation. Furthermore, we use a Drosophila cell culture system and transgenic Drosophila embryos to validate the notion of UCE combinatorial regulatory roles using an eUCE within the Hox gene Ultrabithorax and show that its protein-coding region also contains alternative splicing regulatory information. Taken together, our experiments indicate that UCEs emerge as a result of combinatorial gene regulatory roles and highlight common features in mammalian and insect UCEs, implying that similar processes might underlie ultraconservation in diverse animal taxa. PMID:27247329

  7. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    PubMed Central

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  8. The distribution of SNPs in human gene regulatory regions

    PubMed Central

    Guo, Yongjian; Jamison, D Curtis

    2005-01-01

    Background: As a result of high-throughput genotyping methods, millions of human genetic variants have been reported in recent years. To efficiently identify those with significant biological functions, a practical strategy is to concentrate on variants located in important sequence regions such as gene regulatory regions. Results: Analysis of the most common type of variant, single nucleotide polymorphisms (SNPs), shows that in gene promoter regions more SNPs occur in close proximity to transcriptional start sites than in regions further upstream, and a disproportionate number of those SNPs represent nucleotide transversions. Additionally, the number of SNPs found in the predicted transcription factor binding sites is higher than in non-binding site sequences. Conclusion: Current information about transcription factor binding site sequence patterns may not be exhaustive, and SNPs may be actively involved in influencing gene expression by affecting the transcription factor binding sites. PMID:16209714
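
    The transition/transversion distinction used in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names `classify_snp` and `transversion_fraction` are hypothetical, and the input is assumed to be a list of (reference, alternate) base pairs.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_snp(ref: str, alt: str) -> str:
    """Classify a single-base substitution (hypothetical helper).

    A transition stays within one chemical class (purine<->purine or
    pyrimidine<->pyrimidine); a transversion switches class.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def transversion_fraction(snps) -> float:
    """Fraction of (ref, alt) pairs that are transversions."""
    kinds = [classify_snp(ref, alt) for ref, alt in snps]
    return kinds.count("transversion") / len(kinds)
```

    Comparing `transversion_fraction` for SNPs near transcription start sites with the value for SNPs further upstream is one way to express the enrichment described above.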

  9. Bioinformatic identification of novel regulatory DNA sequence motifs in Streptomyces coelicolor

    PubMed Central

    Studholme, David J; Bentley, Stephen D; Kormanec, Jan

    2004-01-01

    Background: Streptomyces coelicolor is a bacterium with a vast repertoire of metabolic functions and complex systems of cellular development. Its genome sequence is rich in genes that encode regulatory proteins to control these processes in response to its changing environment. We wished to apply a recently published bioinformatic method for identifying novel regulatory sequence signals to gain new insights into regulation in S. coelicolor. Results: The method involved production of position-specific weight matrices from alignments of over-represented words of DNA sequence. We generated 2497 weight matrices, each representing a candidate regulatory DNA sequence motif. We scanned the genome sequence of S. coelicolor against each of these matrices. A DNA sequence motif represented by one of the matrices was found preferentially in non-coding sequences immediately upstream of genes involved in polysaccharide degradation, including several that encode chitinases. This motif (TGGTCTAGACCA) was also found upstream of genes encoding components of the phosphoenolpyruvate phosphotransfer system (PTS). We hypothesise that this DNA sequence motif represents a regulatory element that is responsive to availability of carbon-sources. Other motifs of potential biological significance were found upstream of genes implicated in secondary metabolism (TTAGGTtAGgCTaACCTAA), sigma factors (TGACN19TGAC), DNA replication and repair (ttgtCAGTGN13TGGA), nucleotide conversions (CTACgcNCGTAG), and ArsR (TCAGN12TCAG). A motif found upstream of genes involved in chromosome replication (TGTCagtgcN7Tagg) was similar to a previously described motif found in UV-responsive promoters. Conclusions: We successfully applied a recently published in silico method to identify conserved sequence motifs in S. coelicolor that may be biologically significant as regulatory elements. Our data are broadly consistent with and further extend data from previously published studies. We invite experimental testing of

  10. Consensus gene regulatory networks: combining multiple microarray gene expression datasets

    NASA Astrophysics Data System (ADS)

    Peeling, Emma; Tucker, Allan

    2007-09-01

    In this paper we present a method for modelling gene regulatory networks by forming a consensus Bayesian network model from multiple microarray gene expression datasets. Our method is based on combining Bayesian network graph topologies and does not require any special pre-processing of the datasets, such as re-normalisation. We evaluate our method on a synthetic regulatory network and part of the yeast heat-shock response regulatory network using publicly available yeast microarray datasets. Results are promising; the consensus networks formed provide a broader view of the potential underlying network, obtaining an increased true positive rate over networks constructed from a single data source.
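
    One simple way to combine graph topologies is edge voting, sketched below. This is an assumption made for illustration only: the abstract does not say how the topologies are combined, and `consensus_edges` and its `min_support` threshold are hypothetical names. Each learned network is represented as a set of directed (regulator, target) edges.

```python
from collections import Counter


def consensus_edges(networks, min_support=0.5):
    """Keep edges present in at least `min_support` of the input networks.

    networks: iterable of edge collections, one per dataset, where each
    edge is a (regulator, target) tuple. Hypothetical helper; threshold
    voting is an illustrative stand-in for the paper's combination rule.
    """
    networks = [set(net) for net in networks]
    if not networks:
        raise ValueError("at least one network is required")
    counts = Counter(edge for net in networks for edge in net)
    return {edge for edge, n in counts.items()
            if n / len(networks) >= min_support}
```

    Because only the learned edges are compared, the expression values of the individual datasets never need to be placed on a common scale, which matches the abstract's point that no re-normalisation is required.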

  11. Massive contribution of transposable elements to mammalian regulatory sequences.

    PubMed

    Rayan, Nirmala Arul; Del Rosario, Ricardo C H; Prabhakar, Shyam

    2016-09-01

    Barbara McClintock discovered the existence of transposable elements (TEs) in the late 1940s and initially proposed that they contributed to the gene regulatory program of higher organisms. This controversial idea gained acceptance only much later in the 1990s, when the first examples of TE-derived promoter sequences were uncovered. It is now known that half of the human genome is recognizably derived from TEs. It is thus important to understand the scope and nature of their contribution to gene regulation. Here, we provide a timeline of major discoveries in this area and discuss how transposons have revolutionized our understanding of mammalian genomes, with a special emphasis on the massive contribution of TEs to primate evolution. Our analysis of primate-specific functional elements supports a simple model for the rate at which new functional elements arise in unique and TE-derived DNA. Finally, we discuss some of the challenges and unresolved questions in the field, which need to be addressed in order to fully characterize the impact of TEs on gene regulation, evolution and disease processes. PMID:27174439

  12. Latent phenotypes pervade gene regulatory circuits

    PubMed Central

    2014-01-01

    Background: Latent phenotypes are non-adaptive byproducts of adaptive phenotypes. They exist in biological systems as different as promiscuous enzymes and genome-scale metabolic reaction networks, and can give rise to evolutionary adaptations and innovations. We know little about their prevalence in the gene expression phenotypes of regulatory circuits, important sources of evolutionary innovations. Results: Here, we study a space of more than sixteen million three-gene model regulatory circuits, where each circuit is represented by a genotype, and has one or more functions embodied in one or more gene expression phenotypes. We find that the majority of circuits with single functions have latent expression phenotypes. Moreover, the set of circuits with a given spectrum of functions has a repertoire of latent phenotypes that is much larger than that of any one circuit. Most of this latent repertoire can be easily accessed through a series of small genetic changes that preserve a circuit’s main functions. Both circuits and gene expression phenotypes that are robust to genetic change are associated with a greater number of latent phenotypes. Conclusions: Our observations suggest that latent phenotypes are pervasive in regulatory circuits, and may thus be an important source of evolutionary adaptations and innovations involving gene regulation. PMID:24884746

  13. Developmental cis-regulatory analysis of the cyclin D gene in the sea urchin Strongylocentrotus purpuratus

    PubMed Central

    McCarty, Christopher M.

    2013-01-01

    Cyclin D genes regulate the cell cycle, growth and differentiation in response to intercellular signaling. While the promoters of vertebrate cyclin D genes have been analyzed, the cis-regulatory sequences across an entire cyclin D locus have not. Doing so would increase understanding of how cyclin D genes respond to the regulatory states established by developmental gene regulatory networks, linking cell cycle and growth control to the ontogenetic program. Therefore, we conducted a cis-regulatory analysis on the cyclin D gene, SpcycD, of the sea urchin, Strongylocentrotus purpuratus, during embryogenesis, identifying upstream and intronic sequences, located within six defined regions bearing one or more cis-regulatory modules each. PMID:24090975

  14. RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

    PubMed

    Brule, C E; Dean, K M; Grayhack, E J

    2016-01-01

    The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. PMID:27241757

  15. Gene regulatory networks and the underlying biology of developmental toxicity

    EPA Science Inventory

    Embryonic cells are specified by large-scale networks of functionally linked regulatory genes. Knowledge of the relevant gene regulatory networks is essential for understanding phenotypic heterogeneity that emerges from disruption of molecular functions, cellular processes or sig...

  16. Organization and sequence of the human alpha-lactalbumin gene.

    PubMed Central

    Hall, L; Emery, D C; Davies, M S; Parker, D; Craig, R K

    1987-01-01

    A recombinant bacteriophage containing the entire alpha-lactalbumin gene was isolated from a human genomic library constructed in bacteriophage lambda L47. Within this recombinant the 2.5 kb alpha-lactalbumin gene is flanked by about 5 kb of sequence on either side. The complete nucleotide sequence of the gene and its immediate flanking sequences were determined and compared with those of the rat alpha-lactalbumin gene. These studies showed that the size, organization and sequence of the exons have been highly conserved, whereas the introns have diverged considerably. In particular, the first intron of the human gene was found to contain an Alu repetitive sequence not present in the rat. A high degree of homology (67%) was also observed in the 5' flanking regions, extending as far as 655 nucleotide residues upstream of the transcriptional initiation site. Comparison of the 5' flanking sequences of these two alpha-lactalbumin genes with those of five casein genes has revealed the presence of a highly conserved region [consensus sequence: RGAAGRAAA(N)TGGACAGAAATCAA(CG)TTTCTA], extending from position -140 to -110 in all seven sequences examined, suggesting a possible regulatory role in the hormonal control or tissue-specific expression of milk protein genes in the mammary gland. PMID:2954544

  17. Marine organism cell biology and regulatory sequence discovery in comparative functional genomics.

    PubMed

    Barnes, David W; Mattingly, Carolyn J; Parton, Angela; Dowell, Lori M; Bayne, Christopher J; Forrest, John N

    2004-10-01

    The use of bioinformatics to integrate phenotypic and genomic data from mammalian models is well established as a means of understanding human biology and disease. Beyond direct biomedical applications of these approaches in predicting structure-function relationships between coding sequences and protein activities, comparative studies also promote understanding of molecular evolution and the relationship between genomic sequence and morphological and physiological specialization. Recently recognized is the potential of comparative studies to identify functionally significant regulatory regions and to generate experimentally testable hypotheses that contribute to understanding mechanisms that regulate gene expression, including transcriptional activity, alternative splicing and transcript stability. Functional tests of hypotheses generated by computational approaches require experimentally tractable in vitro systems, including cell cultures. Comparative sequence analysis strategies that use genomic sequences from a variety of evolutionarily diverse organisms are critical for identifying conserved regulatory motifs in the 5'-upstream, 3'-downstream and introns of genes. Genomic sequences and gene orthologues in the first aquatic vertebrate and protovertebrate organisms to be fully sequenced (Fugu rubripes, Ciona intestinalis, Tetraodon nigroviridis, Danio rerio) as well as in the elasmobranchs, spiny dogfish shark (Squalus acanthias) and little skate (Raja erinacea), and marine invertebrate models such as the sea urchin (Strongylocentrotus purpuratus) are valuable in the prediction of putative genomic regulatory regions. Cell cultures have been derived for these and other model species. Data and tools resulting from these kinds of studies will contribute to understanding transcriptional regulation of biomedically important genes and provide new avenues for medical therapeutics and disease prevention. PMID:19003267

  18. Mutational Robustness of Gene Regulatory Networks

    PubMed Central

    van Dijk, Aalt D. J.; van Mourik, Simon; van Ham, Roeland C. H. J.

    2012-01-01

    Mutational robustness of gene regulatory networks refers to their ability to generate constant biological output upon mutations that change network structure. Such networks contain regulatory interactions (transcription factor – target gene interactions) but often also protein-protein interactions between transcription factors. Using computational modeling, we study factors that influence robustness and we infer several network properties governing it. These include the type of mutation, i.e. whether a regulatory interaction or a protein-protein interaction is mutated, and in the case of mutation of a regulatory interaction, the sign of the interaction (activating vs. repressive). In addition, we analyze the effect of combinations of mutations and we compare networks containing monomeric with those containing dimeric transcription factors. Our results are consistent with available data on biological networks, for example based on evolutionary conservation of network features. As a novel and remarkable property, we predict that networks are more robust against mutations in monomer than in dimer transcription factors, a prediction for which analysis of conservation of DNA binding residues in monomeric vs. dimeric transcription factors provides indirect evidence. PMID:22295094

  19. Autonomous Boolean modeling of gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Socolar, Joshua; Sun, Mengyang; Cheng, Xianrui

    2014-03-01

    In cases where the dynamical properties of gene regulatory networks are important, a faithful model must include three key features: a network topology; a functional response of each element to its inputs; and timing information about the transmission of signals across network links. Autonomous Boolean network (ABN) models are efficient representations of these elements and are amenable to analysis. We present an ABN model of the gene regulatory network governing cell fate specification in the early sea urchin embryo, which must generate three bands of distinct tissue types after several cell divisions, beginning from an initial condition with only two distinct cell types. Analysis of the spatial patterning problem and the dynamics of a network constructed from available experimental results reveals that a simple mechanism is at work in this case. Supported by NSF Grant DMS-10-68602

  20. Stabilizing gene regulatory networks through feedforward loops

    NASA Astrophysics Data System (ADS)

    Kadelka, C.; Murrugarra, D.; Laubenbacher, R.

    2013-06-01

    The global dynamics of gene regulatory networks are known to show robustness to perturbations in the form of intrinsic and extrinsic noise, as well as mutations of individual genes. One molecular mechanism underlying this robustness has been identified as the action of so-called microRNAs that operate via feedforward loops. We present results of a computational study, using the modeling framework of stochastic Boolean networks, which explores the role that such network motifs play in stabilizing global dynamics. The paper introduces a new measure for the stability of stochastic networks. The results show that certain types of feedforward loops do indeed buffer the network against stochastic effects.
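
    A deterministic toy model shows how a feedforward loop can buffer a network against brief fluctuations. This is a minimal sketch and not the stochastic framework or stability measure of the paper: it assumes a three-node coherent loop (X activates Y, and Z requires both X and Y) with synchronous updates, and the names `step` and `simulate` are hypothetical.

```python
def step(state, x_input):
    """One synchronous update of the loop X -> Y, (X AND Y) -> Z."""
    x, y, _z = state
    # Each node reads the values from the previous time step.
    return (x_input, x, x and y)


def simulate(inputs, state=(False, False, False)):
    """Return the value of Z after each time step for a sequence of X inputs."""
    z_trace = []
    for x_input in inputs:
        state = step(state, x_input)
        z_trace.append(state[2])
    return z_trace
```

    With these rules a one-step pulse of X never switches Z on, while a sustained signal does after a short delay, so the loop filters out transient input noise.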

  1. Inference of Splicing Regulatory Activities by Sequence Neighborhood Analysis

    PubMed Central

    Stadler, Michael B; Shomron, Noam; Yeo, Gene W; Schneider, Aniket; Xiao, Xinshu; Burge, Christopher B

    2006-01-01

    Sequence-specific recognition of nucleic-acid motifs is critical to many cellular processes. We have developed a new and general method called Neighborhood Inference (NI) that predicts sequences with activity in regulating a biochemical process based on the local density of known sites in sequence space. Applied to the problem of RNA splicing regulation, NI was used to predict hundreds of new exonic splicing enhancer (ESE) and silencer (ESS) hexanucleotides from known human ESEs and ESSs. These predictions were supported by cross-validation analysis, by analysis of published splicing regulatory activity data, by sequence-conservation analysis, and by measurement of the splicing regulatory activity of 24 novel predicted ESEs, ESSs, and neutral sequences using an in vivo splicing reporter assay. These results demonstrate the ability of NI to accurately predict splicing regulatory activity and show that the scope of exonic splicing regulatory elements is substantially larger than previously anticipated. Analysis of orthologous exons in four mammals showed that the NI score of ESEs, a measure of function, is much more highly conserved above background than ESE primary sequence. This observation indicates a high degree of selection for ESE activity in mammalian exons, with surprisingly frequent interchangeability between ESE sequences. PMID:17121466
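
    The idea of scoring a sequence by the local density of known sites in sequence space can be sketched as follows. This is a simplified stand-in, not the published NI score: it assumes that "neighborhood" means Hamming distance at most `radius`, it simply subtracts the silencer count from the enhancer count, and the function names are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def neighborhood_score(candidate, enhancers, silencers, radius=1):
    """Known enhancers minus known silencers within `radius` mismatches.

    The candidate itself is excluded, so a known site can be scored from
    its neighbors alone (as in leave-one-out cross-validation).
    """
    near_enh = sum(hamming(candidate, s) <= radius
                   for s in enhancers if s != candidate)
    near_sil = sum(hamming(candidate, s) <= radius
                   for s in silencers if s != candidate)
    return near_enh - near_sil
```

    A positive score suggests enhancer-like activity and a negative score silencer-like activity. Any hexamers passed in for experimentation should be treated as placeholders unless they come from a curated ESE/ESS set.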

  2. Automated Identification of Core Regulatory Genes in Human Gene Regulatory Networks

    PubMed Central

    Singhal, Amit; Kumar, Pavanish; de Libero, Gennaro; Poidinger, Michael; Monterola, Christopher

    2015-01-01

    Human gene regulatory networks (GRN) can be difficult to interpret due to a tangle of edges interconnecting thousands of genes. We constructed a general human GRN from extensive transcription factor and microRNA target data obtained from public databases. In a subnetwork of this GRN that is active during estrogen stimulation of MCF-7 breast cancer cells, we benchmarked automated algorithms for identifying core regulatory genes (transcription factors and microRNAs). Among these algorithms, we identified K-core decomposition, pagerank and betweenness centrality algorithms as the most effective for discovering core regulatory genes in the network evaluated based on previously known roles of these genes in MCF-7 biology as well as in their ability to explain the up or down expression status of up to 70% of the remaining genes. Finally, we validated the use of K-core algorithm for organizing the GRN in an easier to interpret layered hierarchy where more influential regulatory genes percolate towards the inner layers. The integrated human gene and miRNA network and software used in this study are provided as supplementary materials (S1 Data) accompanying this manuscript. PMID:26393364
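
    K-core decomposition, one of the algorithms named above, can be sketched by repeatedly peeling off the lowest-degree node. This is a generic textbook version, not the software distributed with the paper: it assumes an undirected network given as a symmetric adjacency dictionary, and `core_numbers` is a hypothetical name.

```python
def core_numbers(adj):
    """Return the core number of every node in an undirected graph.

    adj: dict mapping each node to the set of its neighbors (symmetric).
    Nodes are removed in order of current degree; a node's core number
    is the largest minimum degree seen up to the moment it is removed.
    """
    degree = {node: len(neighbors) for node, neighbors in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.__getitem__)
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbor in adj[node]:
            if neighbor in remaining:
                degree[neighbor] -= 1
    return core
```

    Nodes with the highest core number form the innermost layer, which corresponds to the layered hierarchy the abstract describes. This simple version rescans for the minimum at every step, so it is quadratic in the number of nodes; bucket-based implementations are preferable for networks with thousands of genes.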

  3. Beyond antioxidant genes in the ancient NRF2 regulatory network

    PubMed Central

    Lacher, Sarah E.; Lee, Joslynn S.; Wang, Xuting; Campbell, Michelle R.; Bell, Douglas A.; Slattery, Matthew

    2016-01-01

    NRF2, a basic leucine zipper transcription factor encoded by the gene NFE2L2, is a master regulator of the transcriptional response to oxidative stress. NRF2 is structurally and functionally conserved from insects to humans, and it heterodimerizes with the small MAF transcription factors to bind a consensus DNA sequence (the antioxidant response element, or ARE) and regulate gene expression. We have used genome-wide chromatin immunoprecipitation (ChIP-seq) and gene expression data to identify direct NRF2 target genes in Drosophila and humans. These data have allowed us to construct the deeply conserved ancient NRF2 regulatory network – target genes that are conserved from Drosophila to human. The ancient network consists of canonical antioxidant genes, as well as genes related to proteasomal pathways, metabolism, and a number of less expected genes. We have also used enhancer reporter assays and electrophoretic mobility shift assays to confirm NRF2-mediated regulation of ARE (antioxidant response element) activity at a number of these novel target genes. Interestingly, the ancient network also highlights a prominent negative feedback loop; this, combined with the finding that NRF2-mediated regulatory output is tightly linked to the quality of the ARE it is targeting, suggests that precise regulation of nuclear NRF2 concentration is necessary to achieve proper quantitative regulation of distinct gene sets. Together, these findings highlight the importance of balance in the NRF2-ARE pathway, and indicate that NRF2-mediated regulation of xenobiotic metabolism, glucose metabolism, and proteostasis have been central to this pathway since its inception. PMID:26163000

  4. Repetitive sequence environment distinguishes housekeeping genes

    PubMed Central

    Eller, C. Daniel; Regelson, Moira; Merriman, Barry; Nelson, Stan; Horvath, Steve; Marahrens, York

    2007-01-01

    Housekeeping genes are expressed across a wide variety of tissues. Since repetitive sequences have been reported to influence the expression of individual genes, we employed a novel approach to determine whether housekeeping genes can be distinguished from tissue-specific genes by their repetitive sequence context. We show that Alu elements are more highly concentrated around housekeeping genes while various longer (>400-bp) repetitive sequences ("repeats"), including Long Interspersed Nuclear Element 1 (LINE-1) elements, are excluded from these regions. We further show that isochore membership does not distinguish housekeeping genes from tissue-specific genes and that repetitive sequence environment distinguishes housekeeping genes from tissue-specific genes in every isochore. The distinct repetitive sequence environment, in combination with other previously published sequence properties of housekeeping genes, was used to develop a method of predicting housekeeping genes on the basis of DNA sequence alone. Using expression across tissue types as a measure of success, we demonstrate that repetitive sequence environment is by far the most important sequence feature identified to date for distinguishing housekeeping genes. PMID:17141428

  5. Integrating heterogeneous gene expression data for gene regulatory network modelling.

    PubMed

    Sîrbu, Alina; Ruskin, Heather J; Crane, Martin

    2012-06-01

    Gene regulatory networks (GRNs) are complex biological systems that have a large impact on protein levels, so that discovering network interactions is a major objective of systems biology. Quantitative GRN models have been inferred, to date, from time series measurements of gene expression, but at small scale, and with limited application to real data. Time series experiments are typically short (number of time points of the order of ten), whereas regulatory networks can be very large (containing hundreds of genes). This creates an under-determination problem, which negatively influences the results of any inferential algorithm. Presented here is an integrative approach to model inference, which has not been previously discussed to the authors' knowledge. Multiple heterogeneous expression time series are used to infer the same model, and results are shown to be more robust to noise and parameter perturbation. Additionally, a wavelet analysis shows that these models display limited noise over-fitting within the individual datasets. PMID:21948152

  6. Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes

    SciTech Connect

    Wang, Xuting; Tomso, Daniel J.; Liu, Xuemei; Bell, Douglas A. (E-mail: BELL1@niehs.nih.gov)

    2005-09-01

    Single nucleotide polymorphisms (SNPs) in the human genome are DNA sequence variations that can alter an individual's response to environmental exposure. SNPs in gene coding regions can lead to changes in the biological properties of the encoded protein. In contrast, SNPs in non-coding gene regulatory regions may affect gene expression levels in an allele-specific manner, and these functional polymorphisms represent an important but relatively unexplored class of genetic variation. The main challenge in analyzing these SNPs is a lack of robust computational and experimental methods. Here, we first outline mechanisms by which genetic variation can impact gene regulation, and review recent findings in this area; then, we describe a methodology for bioinformatic discovery and functional analysis of regulatory SNPs in cis-regulatory regions using the assembled human genome sequence and databases on sequence polymorphism and gene expression. Our method integrates SNP and gene databases and uses a set of computer programs that allow us to: (1) select SNPs, from among the >9 million human SNPs in the NCBI dbSNP database, that are similar to cis-regulatory element (RE) consensus sequences; (2) map the selected dbSNP entries to the human genome assembly in order to identify polymorphic REs near gene start sites; and (3) prioritize the candidate polymorphic RE-containing genes by searching the existing genotype and gene expression data sets. The applicability of this system has been demonstrated through studies on p53 responsive elements and is being extended to additional pathways and environmentally responsive genes.
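
    Step (1) above, selecting SNPs whose flanking sequence resembles an RE consensus, can be illustrated with a minimal sketch. This is not the authors' software: the function names, the mismatch-count scoring, and the example SNP are hypothetical, and the only external fact assumed is the commonly cited p53 half-site consensus RRRCWWGYYY written in IUPAC codes.

```python
# Hypothetical sketch: compare the two alleles of a SNP against an
# IUPAC-coded regulatory-element consensus. Not the published pipeline.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC", "N": "ACGT",
}

def mismatches(site, consensus):
    """Count positions where `site` is not allowed by the IUPAC `consensus`."""
    return sum(base not in IUPAC[code] for base, code in zip(site, consensus))

def best_window(seq, consensus):
    """Return the lowest mismatch count over all consensus-length windows of `seq`."""
    k = len(consensus)
    return min(mismatches(seq[i:i + k], consensus) for i in range(len(seq) - k + 1))

def score_snp(left_flank, alleles, right_flank, consensus):
    """Map each allele to its best (lowest) mismatch count against the consensus.

    Both flanks are assumed to be shorter than the consensus, so every
    scanned window overlaps the polymorphic position.
    """
    return {a: best_window(left_flank + a + right_flank, consensus) for a in alleles}

# Made-up example: a C/T SNP at the conserved C of a p53-like half-site.
P53_HALF_SITE = "RRRCWWGYYY"
scores = score_snp("TTGGA", ("C", "T"), "ATGCCCAA", P53_HALF_SITE)
print(scores)  # the C allele matches perfectly; the T allele has one mismatch
```

    An allele-specific difference in mismatch count, as in this toy case, is the kind of signal that would flag a SNP as a candidate polymorphic RE for the mapping and prioritization steps.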

  7. A Provisional Gene Regulatory Atlas for Mouse Heart Development

    PubMed Central

    Chen, Hailin; VanBuren, Vincent

    2014-01-01

    Congenital Heart Disease (CHD) is one of the most common birth defects. Elucidating the molecular mechanisms underlying normal cardiac development is an important step towards early identification of abnormalities during the developmental program and towards the creation of early intervention strategies. We developed a novel computational strategy for leveraging high-content data sets, including a large selection of microarray data associated with mouse cardiac development, mouse genome sequence, ChIP-seq data of selected mouse transcription factors and Y2H data of mouse protein-protein interactions, to infer the active transcriptional regulatory network of mouse cardiac development. We identified phase-specific expression activity for 765 overlapping gene co-expression modules that were defined for the cardiac lineage microarray data obtained. For each co-expression module, we identified the phase of cardiac development where gene expression for that module was higher than in other phases. Co-expression modules were found to be consistent with biological pathway knowledge in Wikipathways, and met expectations for enrichment of pathways involved in heart lineage development. Over 359,000 transcription factor-target relationships were inferred by analyzing the promoter sequences within each gene module for overrepresentation against the JASPAR database of Transcription Factor Binding Site (TFBS) motifs. The provisional regulatory network will provide a framework for studying the genetic basis of CHD. PMID:24421884
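
    The abstract does not say which statistic was used to call a motif overrepresented in a module's promoters. One common choice is a one-sided hypergeometric test, sketched below as an assumption rather than as the authors' method; the function name and the example counts are invented for illustration.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N promoters are drawn from M, of which n carry the motif.

    A hypothetical helper for illustration; the source does not name its test.
    """
    upper = min(n, N)
    favourable = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return favourable / comb(M, N)

# Invented numbers: 20,000 background promoters, 2,000 with the motif,
# and a 50-gene co-expression module in which 15 promoters carry it.
p_value = hypergeom_sf(k=15, M=20000, n=2000, N=50)
print(p_value)
```

    With hundreds of modules and many motifs tested, such p-values would need a multiple-testing correction before being read as transcription factor-target relationships.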

  8. Classification of Arabidopsis thaliana gene sequences: clustering of coding sequences into two groups according to codon usage improves gene prediction.

    PubMed

    Mathé, C; Peresetsky, A; Déhais, P; Van Montagu, M; Rouzé, P

    1999-02-01

    While genomic sequences are accumulating, finding the location of the genes remains a major issue that can be solved only for about a half of them by homology searches. Prediction methods are thus required, but unfortunately are not fully satisfying. Most prediction methods implicitly assume a unique model for genes. This is an oversimplification as demonstrated by the possibility to group coding sequences into several classes in Escherichia coli and other genomes. As no classification existed for Arabidopsis thaliana, we classified genes according to the statistical features of their coding sequences. A clustering algorithm using a codon usage model was developed and applied to coding sequences from A. thaliana, E. coli, and a mixture of both. By using it, Arabidopsis sequences were clustered into two classes. The CU1 and CU2 classes differed essentially by the choice of pyrimidine bases at the codon silent sites: CU2 genes often use C whereas CU1 genes prefer T. This classification discriminated the Arabidopsis genes according to their expressiveness, highly expressed genes being clustered in CU2 and genes expected to have a lower expression, such as the regulatory genes, in CU1. The algorithm separated the sequences of the Escherichia-Arabidopsis mixed data set into five classes according to the species, except for one class. This mixed class contained 89 % Arabidopsis genes from CU1 and 11 % E. coli genes, mostly horizontally transferred. Interestingly, most genes encoding organelle-targeted proteins, except the photosynthetic and photoassimilatory ones, were clustered in CU1. By tailoring the GeneMark CDS prediction algorithm to the observed coding sequence classes, its quality of prediction was greatly improved. Similar improvement can be expected with other prediction systems. PMID:9925779
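
    The CU1/CU2 distinction rests on which pyrimidine is used at silent codon sites. The sketch below computes only that summary statistic for a single coding sequence; it is not the clustering algorithm described in the abstract, and the function name and example sequences are hypothetical. It relies on the fact that in the standard genetic code a C/T exchange at the third codon position never changes the amino acid.

```python
def silent_site_c_fraction(cds):
    """Fraction of C among third-codon-position pyrimidines (C or T).

    `cds` is an in-frame coding sequence; returns NaN if no codon ends in C or T.
    """
    cds = cds.upper()
    thirds = [cds[i + 2] for i in range(0, len(cds) - 2, 3)]
    c = thirds.count("C")
    t = thirds.count("T")
    return c / (c + t) if (c + t) else float("nan")

# Made-up sequences: the first prefers C at silent sites (CU2-like),
# the second prefers T (CU1-like).
print(silent_site_c_fraction("ATGGCCTTCGACTAA"))
print(silent_site_c_fraction("ATGGCTTTTGATTAA"))
```

    A value near 1 would be consistent with the C-preferring CU2 pattern and a value near 0 with the T-preferring CU1 pattern, although assigning real genes to classes requires the full codon usage model.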

  9. Generation of oscillating gene regulatory network motifs

    NASA Astrophysics Data System (ADS)

    van Dorp, M.; Lannoo, B.; Carlon, E.

    2013-07-01

    Using an improved version of an evolutionary algorithm originally proposed by François and Hakim [Proc. Natl. Acad. Sci. USA 101, 580 (2004), doi:10.1073/pnas.0304532101], we generated small gene regulatory networks in which the concentration of a target protein oscillates in time. These networks may serve as candidates for oscillatory modules to be found in larger regulatory networks and protein interaction networks. The algorithm was run 10^5 times to produce a large set of oscillating modules, which were systematically classified and analyzed. The robustness of the oscillations against variations of the kinetic rates was also determined, to filter out the least robust cases. Furthermore, we show that the set of evolved networks can serve as a database of models whose behavior can be compared to experimentally observed oscillations. The algorithm found three smallest (core) oscillators in which nonlinearities and number of components are minimal. Two of those are two-gene modules: the mixed feedback loop, already discussed in the literature, and an autorepressed gene coupled with a heterodimer. The third one is a single gene module which is competitively regulated by a monomer and a dimer. The evolutionary algorithm also generated larger oscillating networks, which are in part extensions of the three core modules and in part genuinely new modules. The latter includes oscillators which do not rely on feedback induced by transcription factors, but are purely of post-transcriptional type. Analysis of post-transcriptional mechanisms of oscillation may provide useful information for circadian clock research, as recent experiments showed that circadian rhythms are maintained even in the absence of transcription.

  10. Reverse engineering of gene regulatory networks.

    PubMed

    Cho, K H; Choo, S M; Jung, S H; Kim, J R; Choi, H S; Kim, J

    2007-05-01

    Systems biology is a multi-disciplinary approach to the study of the interactions of various cellular mechanisms and cellular components. Owing to the development of new technologies that simultaneously measure the expression of genetic information, systems biological studies involving gene interactions are increasingly prominent. In this regard, reconstructing gene regulatory networks (GRNs) forms the basis for the dynamical analysis of gene interactions and related effects on cellular control pathways. Various approaches of inferring GRNs from gene expression profiles and biological information, including machine learning approaches, have been reviewed, with a brief introduction of DNA microarray experiments as typical tools for measuring levels of messenger ribonucleic acid (mRNA) expression. In particular, the inference methods are classified according to the required input information, and the main idea of each method is elucidated by comparing its advantages and disadvantages with respect to the other methods. In addition, recent developments in this field are introduced and discussions on the challenges and opportunities for future research are provided. PMID:17591174

  11. The 5' regulatory sequence of the PMP22 in the patients with Charcot-Marie-Tooth disease.

    PubMed

    Sinkiewicz-Darol, Elena; Kabzińska, Dagmara; Moszyńska, Izabela; Kochański, Andrzej

    2010-01-01

    Little is known about the molecular background of clinical variability of Charcot-Marie-Tooth type 1A (CMT1A) disease and hereditary neuropathy with liability to pressure palsies (HNPP). The CMT1A and HNPP disorders result from duplication and deletion of the PMP22 gene, respectively. In a series of studies performed on transgenic animal models of CMT1A disease, expression of the PMP22 gene (gene dosage) was shown to correlate with severity of CMT course (gene dosage effect). In this study we hypothesized that single nucleotide polymorphisms (SNPs) located within the 5' regulatory sequence of the PMP22 gene may be responsible for the CMT1A/HNPP clinical variability. We have sequenced the PMP22 5' upstream regulatory sequence in a group of 45 CMT1A/HNPP patients harboring the PMP22 duplication (37) or deletion (8). We have identified five SNPs in the regulatory sequence of the PMP22 gene. Three of them, i.e. -819C>T, -4785G>T, and -4800C>T, were detected both in the patients and in the control group. Thus, their pathogenic role in the regulation of the expression of the PMP22 gene seems not to be significant. Two SNPs, i.e. -4210T>C and -4759T>A, were found only in the CMT patients. Their role in the regulation of PMP22 gene expression cannot be excluded. Additionally, in one patient we detected the Thr118Met variant in exon 4 of the PMP22 gene, which was previously reported by other authors. We conclude that the 5' regulatory sequence of the PMP22 gene is conserved at the nucleotide level; however, rarely occurring SNP variants in the PMP22 regulatory sequence may be associated with the gene dosage effect. PMID:20842290

  12. [Identification and mapping of cis-regulatory elements within long genomic sequences].

    PubMed

    Akopov, S B; Chernov, I P; Vetchinova, A S; Bulanenkova, S S; Nikolaev, L G

    2007-01-01

    The publication of the human and other metazoan genome sequences opened up the possibility for mapping and analysis of genomic regulatory elements. Unfortunately, experimental data on genomic positions of such sequences as enhancers, silencers, insulators, transcription terminators, and replication origins are very limited, especially at the whole genome level. As most genomic regulatory elements (e.g., enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements in silico is often ambiguous. Therefore, the development of high-throughput experimental approaches for identification and mapping of genomic functional elements is highly desirable. In this review we discuss novel approaches to high-throughput experimental identification of cis-regulatory elements in mammalian genomes, which is a necessary step toward complete genome annotation. PMID:18240562

  13. Synthetic muscle promoters: activities exceeding naturally occurring regulatory sequences

    NASA Technical Reports Server (NTRS)

    Li, X.; Eastman, E. M.; Schwartz, R. J.; Draghia-Akli, R.

    1999-01-01

    Relatively low levels of expression from naturally occurring promoters have limited the use of muscle as a gene therapy target. Myogenic restricted gene promoters display complex organization usually involving combinations of several myogenic regulatory elements. By random assembly of E-box, MEF-2, TEF-1, and SRE sites into synthetic promoter recombinant libraries, and screening of hundreds of individual clones for transcriptional activity in vitro and in vivo, several artificial promoters were isolated whose transcriptional potencies greatly exceed those of natural myogenic and viral gene promoters.

  14. Gene Regulatory Networks Elucidating Huanglongbing Disease Mechanisms

    PubMed Central

    Martinelli, Federico; Reagan, Russell L.; Uratsu, Sandra L.; Phu, My L.; Albrecht, Ute; Zhao, Weixiang; Davis, Cristina E.; Bowman, Kim D.; Dandekar, Abhaya M.

    2013-01-01

    Next-generation sequencing was exploited to gain deeper insight into the response to infection by Candidatus Liberibacter asiaticus (CaLas), especially the immune dysregulation and metabolic dysfunction caused by source-sink disruption. Previous fruit transcriptome data were compared with additional RNA-Seq data in three tissues: immature fruit, and young and mature leaves. Four categories of orchard trees were studied: symptomatic, asymptomatic, apparently healthy, and healthy. Principal component analysis found distinct expression patterns between immature and mature fruits and leaf samples for all four categories of trees. A predicted protein-protein interaction network identified HLB-regulated genes for sugar transporters playing key roles in the overall plant responses. Gene set and pathway enrichment analyses highlight the role of sucrose and starch metabolism in disease symptom development in all tissues. HLB-regulated genes (glucose-phosphate-transporter, invertase, starch-related genes) would likely determine the source-sink relationship disruption. In infected leaves, transcriptomic changes were observed for light reactions genes (downregulation), sucrose metabolism (upregulation), and starch biosynthesis (upregulation). In parallel, symptomatic fruits over-expressed genes involved in photosynthesis, sucrose and raffinose metabolism, and downregulated starch biosynthesis. We visualized gene networks between tissues inducing a source-sink shift. CaLas alters the hormone crosstalk, resulting in weak and ineffective tissue-specific plant immune responses necessary for bacterial clearance. Accordingly, expression of WRKYs (including WRKY70) was higher in fruits than in leaves. Systemic acquired responses were inadequately activated in young leaves, generally considered the sites where most new infections occur. PMID:24086326

  15. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence

    PubMed Central

    Kinney, Justin B.; Murugan, Anand; Callan, Curtis G.; Cox, Edward C.

    2010-01-01

    Cells use protein-DNA and protein-protein interactions to regulate transcription. A biophysical understanding of this process has, however, been limited by the lack of methods for quantitatively characterizing the interactions that occur at specific promoters and enhancers in living cells. Here we show how such biophysical information can be revealed by a simple experiment in which a library of partially mutated regulatory sequences is partitioned according to in vivo transcriptional activity and then sequenced en masse. Computational analysis of the sequence data produced by this experiment can provide precise quantitative information about how the regulatory proteins at a specific arrangement of binding sites work together to regulate transcription. This ability to reliably extract precise information about regulatory biophysics in the face of experimental noise is made possible by a recently identified relationship between likelihood and mutual information. Applying our experimental and computational techniques to the Escherichia coli lac promoter, we demonstrate the ability to identify regulatory protein binding sites de novo, determine the sequence-dependent binding energy of the proteins that bind these sites, and, importantly, measure the in vivo interaction energy between RNA polymerase and a DNA-bound transcription factor. Our approach provides a generally applicable method for characterizing the biophysical basis of transcriptional regulation by a specified regulatory sequence. The principles of our method can also be applied to a wide range of other problems in molecular biology. PMID:20439748
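
    Mutual information is the quantity that ties sequence to activity in this kind of sorted-library experiment. The sketch below is a simplified illustration and not the authors' inference procedure: it computes a plug-in estimate of the mutual information between the base at one position and the activity bin a sequence was sorted into. The function name and the toy data are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Plug-in estimate of I(X;Y) in bits from a list of (x, y) observations.

    Here x might be the base at one promoter position and y the activity
    bin of the sequence. The estimate is biased upward for small samples.
    """
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

# Toy data: at an informative position the base fully determines the bin;
# at an uninformative position base and bin are independent.
informative = [("A", 0)] * 50 + [("G", 1)] * 50
uninformative = [("A", 0), ("A", 1), ("G", 0), ("G", 1)] * 25
print(mutual_information(informative), mutual_information(uninformative))
```

    Scanning such a statistic along the promoter would highlight positions where mutations shift sequences between bins, which is one simple way to see where binding sites might lie.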

  16. A provisional regulatory gene network for specification of endomesoderm in the sea urchin embryo

    NASA Technical Reports Server (NTRS)

    Davidson, Eric H.; Rast, Jonathan P.; Oliveri, Paola; Ransick, Andrew; Calestani, Cristina; Yuh, Chiou-Hwa; Minokawa, Takuya; Amore, Gabriele; Hinman, Veronica; Arenas-Mena, Cesar; Otim, Ochan; Brown, C. Titus; Livi, Carolina B.; Lee, Pei Yun; Revilla, Roger; Schilstra, Maria J.; Clarke, Peter J C.; Rust, Alistair G.; Pan, Zhengjun; Arnone, Maria I.; Rowen, Lee; Cameron, R. Andrew; McClay, David R.; Hood, Leroy; Bolouri, Hamid

    2002-01-01

    We present the current form of a provisional DNA sequence-based regulatory gene network that explains in outline how endomesodermal specification in the sea urchin embryo is controlled. The model of the network is in a continuous process of revision and growth as new genes are added and new experimental results become available; see http://www.its.caltech.edu/mirsky/endomeso.htm (End-mes Gene Network Update) for the latest version. The network contains over 40 genes at present, many newly uncovered in the course of this work, and most encoding DNA-binding transcriptional regulatory factors. The architecture of the network was approached initially by construction of a logic model that integrated the extensive experimental evidence now available on endomesoderm specification. The internal linkages between genes in the network have been determined functionally, by measurement of the effects of regulatory perturbations on the expression of all relevant genes in the network. Five kinds of perturbation have been applied: (1) use of morpholino antisense oligonucleotides targeted to many of the key regulatory genes in the network; (2) transformation of other regulatory factors into dominant repressors by construction of Engrailed repressor domain fusions; (3) ectopic expression of given regulatory factors, from genetic expression constructs and from injected mRNAs; (4) blockade of the beta-catenin/Tcf pathway by introduction of mRNA encoding the intracellular domain of cadherin; and (5) blockade of the Notch signaling pathway by introduction of mRNA encoding the extracellular domain of the Notch receptor. The network model predicts the cis-regulatory inputs that link each gene into the network. Therefore, its architecture is testable by cis-regulatory analysis. Strongylocentrotus purpuratus and Lytechinus variegatus genomic BAC recombinants that include a large number of the genes in the network have been sequenced and annotated. 
Tests of the cis-regulatory predictions of

  17. Genome-wide identification of conserved regulatory function in diverged sequences

    PubMed Central

    Taher, Leila; McGaughey, David M.; Maragh, Samantha; Aneas, Ivy; Bessling, Seneca L.; Miller, Webb; Nobrega, Marcelo A.; McCallion, Andrew S.; Ovcharenko, Ivan

    2011-01-01

    Plasticity of gene regulatory encryption can permit DNA sequence divergence without loss of function. Functional information is preserved through conservation of the composition of transcription factor binding sites (TFBS) in a regulatory element. We have developed a method that can accurately identify pairs of functional noncoding orthologs at evolutionarily diverged loci by searching for conserved TFBS arrangements. With an estimated 5% false-positive rate (FPR) in approximately 3000 human and zebrafish syntenic loci, we detected approximately 300 pairs of diverged elements that are likely to share common ancestry and have similar regulatory activity. By analyzing a pool of experimentally validated human enhancers, we demonstrated that 7/8 (88%) of their predicted functional orthologs retained in vivo regulatory control. Moreover, in 5/7 (71%) of assayed enhancer pairs, we observed concordant expression patterns. We argue that TFBS composition is often necessary to retain and sufficient to predict regulatory function in the absence of overt sequence conservation, revealing an entire class of functionally conserved, evolutionarily diverged regulatory elements that we term “covert.” PMID:21628450

  18. Regulatory gene networks and the properties of the developmental process

    NASA Technical Reports Server (NTRS)

    Davidson, Eric H.; McClay, David R.; Hood, Leroy

    2003-01-01

    Genomic instructions for development are encoded in arrays of regulatory DNA. These specify large networks of interactions among genes producing transcription factors and signaling components. The architecture of such networks both explains and predicts developmental phenomenology. Although network analysis is yet in its early stages, some fundamental commonalities are already emerging. Two such are the use of multigenic feedback loops to ensure the progressivity of developmental regulatory states and the prevalence of repressive regulatory interactions in spatial control processes. Gene regulatory networks make it possible to explain the process of development in causal terms and eventually will enable the redesign of developmental regulatory circuitry to achieve different outcomes.

  19. Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA

    PubMed Central

    Turner, Tychele N.; Hormozdiari, Fereydoun; Duyzend, Michael H.; McClymont, Sarah A.; Hook, Paul W.; Iossifov, Ivan; Raja, Archana; Baker, Carl; Hoekzema, Kendra; Stessman, Holly A.; Zody, Michael C.; Nelson, Bradley J.; Huddleston, John; Sandstrom, Richard; Smith, Joshua D.; Hanna, David; Swanson, James M.; Faustman, Elaine M.; Bamshad, Michael J.; Stamatoyannopoulos, John; Nickerson, Deborah A.; McCallion, Andrew S.; Darnell, Robert; Eichler, Evan E.

    2016-01-01

    We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism. PMID:26749308

  20. Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA.

    PubMed

    Turner, Tychele N; Hormozdiari, Fereydoun; Duyzend, Michael H; McClymont, Sarah A; Hook, Paul W; Iossifov, Ivan; Raja, Archana; Baker, Carl; Hoekzema, Kendra; Stessman, Holly A; Zody, Michael C; Nelson, Bradley J; Huddleston, John; Sandstrom, Richard; Smith, Joshua D; Hanna, David; Swanson, James M; Faustman, Elaine M; Bamshad, Michael J; Stamatoyannopoulos, John; Nickerson, Deborah A; McCallion, Andrew S; Darnell, Robert; Eichler, Evan E

    2016-01-01

    We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism. PMID:26749308

  1. Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks

    PubMed Central

    Sîrbu, Alina; Crane, Martin; Ruskin, Heather J.

    2015-01-01

    Microarray technologies have been the basis of numerous important findings regarding gene expression in the last few decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.

  2. Conserved Noncoding Sequences Highlight Shared Components of Regulatory Networks in Dicotyledonous Plants

    PubMed Central

    Baxter, Laura; Jironkin, Aleksey; Hickman, Richard; Moore, Jay; Barrington, Christopher; Krusche, Peter; Dyer, Nigel P.; Buchanan-Wollaston, Vicky; Tiskin, Alexander; Beynon, Jim; Denby, Katherine; Ott, Sascha

    2012-01-01

    Conserved noncoding sequences (CNSs) in DNA are reliable pointers to regulatory elements controlling gene expression. Using a comparative genomics approach with four dicotyledonous plant species (Arabidopsis thaliana, papaya [Carica papaya], poplar [Populus trichocarpa], and grape [Vitis vinifera]), we detected hundreds of CNSs upstream of Arabidopsis genes. Distinct positioning, length, and enrichment for transcription factor binding sites suggest these CNSs play a functional role in transcriptional regulation. The enrichment of transcription factors within the set of genes associated with CNS is consistent with the hypothesis that together they form part of a conserved transcriptional network whose function is to regulate other transcription factors and control development. We identified a set of promoters where regulatory mechanisms are likely to be shared between the model organism Arabidopsis and other dicots, providing areas of focus for further research. PMID:23110901

  3. Regulatory Elements of the Floral Homeotic Gene AGAMOUS Identified by Phylogenetic Footprinting and Shadowing

    PubMed Central

    Hong, Ray L.; Hamaguchi, Lynn; Busch, Maximilian A.; Weigel, Detlef

    2003-01-01

    In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3-kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae species, several other motifs, but not the LFY and WUS binding sites identified previously, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for the activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection but also demonstrate that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites. PMID:12782724

  4. Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

    SciTech Connect

    Hong, R. L., Hamaguchi, L., Busch, M. A., and Weigel, D.

    2003-06-01

    OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.

  5. Preservation of Gene Duplication Increases the Regulatory Spectrum of Ribosomal Protein Genes and Enhances Growth under Stress.

    PubMed

    Parenteau, Julie; Lavoie, Mathieu; Catala, Mathieu; Malik-Ghulam, Mustafa; Gagnon, Jules; Abou Elela, Sherif

    2015-12-22

    In baker's yeast, the majority of ribosomal protein genes (RPGs) are duplicated, and it was recently proposed that such duplications are preserved via the functional specialization of the duplicated genes. However, the origin and nature of duplicated RPGs' (dRPGs) functional specificity remain unclear. In this study, we show that differences in dRPG functions are generated by variations in the modality of gene expression and, to a lesser extent, by protein sequence. Analysis of the sequence and expression patterns of non-intron-containing RPGs indicates that each dRPG is controlled by specific regulatory sequences modulating its expression levels in response to changing growth conditions. Homogenization of dRPG sequences reduces cell tolerance to growth under stress without changing the number of expressed genes. Together, the data reveal a model where duplicated genes provide a means for modulating the expression of ribosomal proteins in response to stress. PMID:26686636

  6. Close Sequence Comparisons are Sufficient to Identify Human cis-Regulatory Elements

    SciTech Connect

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.

  7. Sequence and regulation of the porcine FSHR gene promoter.

    PubMed

    Wu, Wangjun; Han, Jing; Cao, Rui; Zhang, Jinbi; Li, Bojiang; Liu, Zequn; Liu, Kaiqing; Li, Qifa; Pan, Zengxiang; Chen, Jie; Liu, Honglin

    2015-03-01

    Follicle-stimulating hormone (FSH) plays a crucial role in animal reproduction and exerts its physiological functions by interacting with the FSH receptor (FSHR). The FSHR is exclusively expressed in granulosa cells in the ovary, and its expression level is closely related to granulosa cell differentiation and follicle maturation. In mammals, most follicles undergo atresia, which is mainly caused by granulosa cell apoptosis. However, knowledge of the transcriptional regulatory mechanisms of the porcine FSHR gene in granulosa cells is still limited. In this study, approximately 2.1 kb of the proximal promoter sequence of the porcine FSHR gene was obtained by genome walking, and the regulatory elements and transcription factors in the porcine FSHR promoter sequence were predicted. Furthermore, the core promoter region (-1195/-598) of the porcine FSHR gene was identified using a luciferase assay. Subsequently, the relationship between expression levels of the porcine FSHR gene and histone H3K9 acetylation levels around the core promoter region (-787/-572) in vivo and in vitro was analyzed. Our results showed that an increased FSHR gene expression level was accompanied by an increase in histone H3K9 acetylation levels, suggesting that histone H3K9 acetylation could regulate the expression of the porcine FSHR gene. PMID:25599592

  8. Deduced products of C4-dicarboxylate transport regulatory genes of Rhizobium leguminosarum are homologous to nitrogen regulatory gene products.

    PubMed Central

    Ronson, C W; Astwood, P M; Nixon, B T; Ausubel, F M

    1987-01-01

    We have sequenced two genes, dctB and dctD, required for the activation of the C4-dicarboxylate transport structural gene dctA in free-living Rhizobium leguminosarum. The hydropathic profile of the dctB gene product (DctB) suggested that its N-terminal region may be located in the periplasm and its C-terminal region in the cytoplasm. The C-terminal region of DctB was strongly conserved with similar regions of the products of several regulatory genes that may act as environmental sensors, including ntrB, envZ, virA, phoR, cpxA, and phoM. The N-terminal region of DctD was strongly conserved with the N-terminal domains of the products of several regulatory genes thought to be transcriptional activators, including ntrC, ompR, virG, phoB and sfrA. In addition, the central and C-terminal regions of DctD were strongly conserved with the products of ntrC and nifA, transcriptional activators that require the alternate sigma factor rpoN (ntrA) as co-activator. The central region of DctD also contained a potential ATP-binding domain. These results are consistent with recent results showing that the rpoN product is required for dctA activation, and suggest that DctB plus DctD-mediated transcriptional activation of dctA may be mechanistically similar to NtrB plus NtrC-mediated activation of glnA in E. coli. PMID:3671068

  9. C. elegans Metabolic Gene Regulatory Networks Govern the Cellular Economy

    PubMed Central

    Watson, Emma; Walhout, Albertha J.M.

    2014-01-01

    Diet greatly impacts metabolism in health and disease. In response to the presence or absence of specific nutrients, metabolic gene regulatory networks sense the metabolic state of the cell and regulate metabolic flux accordingly, for instance by the transcriptional control of metabolic enzymes. Here we discuss recent insights regarding metazoan metabolic regulatory networks using the nematode Caenorhabditis elegans as a model, including the modular organization of metabolic gene regulatory networks, the prominent impact of diet on the transcriptome and metabolome, specialized roles of nuclear hormone receptors in responding to dietary conditions, regulation of metabolic genes and metabolic regulators by microRNAs, and feedback between metabolic genes and their regulators. PMID:24731597

  10. Rapid evolution of cis-regulatory sequences via local point mutations

    NASA Technical Reports Server (NTRS)

    Stone, J. R.; Wray, G. A.

    2001-01-01

    Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
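
    The simulation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it follows a single random sequence rather than a population, ignores drift and fixation, and the function name, motif, sequence length and mutation rate are all hypothetical choices made only for illustration.

```python
import random

BASES = "ACGT"

def generations_until_site(seq_len, motif, mu, rng, max_generations=1_000_000):
    """Mutate one random sequence until `motif` appears.

    Returns the number of generations elapsed, or None if the motif
    never appeared within `max_generations`.
    """
    seq = [rng.choice(BASES) for _ in range(seq_len)]
    for generation in range(max_generations + 1):
        if motif in "".join(seq):
            return generation
        # Each position independently mutates to a different base with probability mu.
        for i in range(seq_len):
            if rng.random() < mu:
                seq[i] = rng.choice([b for b in BASES if b != seq[i]])
    return None

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs agree
    waits = [generations_until_site(200, "TGACTCA", 1e-3, rng) for _ in range(5)]
    print("generations until a hypothetical 7-bp site appears:", waits)
```

    Shorter motifs and longer sequences shorten the waiting time in this toy setting; a population-level model would be needed to say anything about fixation.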

  11. Genomic aberrations frequently alter chromatin regulatory genes in chordoma.

    PubMed

    Wang, Lu; Zehir, Ahmet; Nafa, Khedoudja; Zhou, Nengyi; Berger, Michael F; Casanova, Jacklyn; Sadowska, Justyna; Lu, Chao; Allis, C David; Gounder, Mrinal; Chandhanayingyong, Chandhanarat; Ladanyi, Marc; Boland, Patrick J; Hameed, Meera

    2016-07-01

    Chordoma is a rare primary bone neoplasm that is resistant to standard chemotherapies. Despite aggressive surgical management, local recurrence and metastasis are not uncommon. To identify the specific genetic aberrations that play key roles in chordoma pathogenesis, we utilized a genome-wide high-resolution SNP-array and next generation sequencing (NGS)-based molecular profiling platform to study 24 patient samples with typical histopathologic features of chordoma. Matching normal tissues were available for 16 samples. SNP-array analysis revealed nonrandom copy number losses across the genome, frequently involving 3, 9p, 1p, 14, 10, and 13. In contrast, copy number gain is uncommon in chordomas. Two minimum deleted regions were observed on 3p within a ∼8 Mb segment at 3p21.1-p21.31, which overlaps SETD2, BAP1 and PBRM1. The minimum deleted region on 9p was mapped to the CDKN2A locus at 9p21.3, and homozygous deletion of CDKN2A was detected in 5/22 chordomas (∼23%). NGS-based molecular profiling demonstrated an extremely low mutation rate in chordomas, with an average of 0.5 mutations per sample for the 16 cases with matched normal. When the mutated genes were grouped based on molecular functions, many of the mutation events (∼40%) were found in chromatin regulatory genes. The combined copy number and mutation profiling revealed that SETD2 is the single gene affected most frequently in chordomas, either by deletion or by mutations. Our study demonstrated that chordoma belongs to the C-class (copy number changes) tumors, whose oncogenic signature is non-random multiple copy number losses across the genome, and that genomic aberrations frequently alter chromatin regulatory genes. © 2016 Wiley Periodicals, Inc. PMID:27072194

  12. Intersecting transcription networks constrain gene regulatory evolution.

    PubMed

    Sorrells, Trevor R; Booth, Lauren N; Tuch, Brian B; Johnson, Alexander D

    2015-07-16

    Epistasis-the non-additive interactions between different genetic loci-constrains evolutionary pathways, blocking some and permitting others. For biological networks such as transcription circuits, the nature of these constraints and their consequences are largely unknown. Here we describe the evolutionary pathways of a transcription network that controls the response to mating pheromone in yeast. A component of this network, the transcription regulator Ste12, has evolved two different modes of binding to a set of its target genes. In one group of species, Ste12 binds to specific DNA binding sites, while in another lineage it occupies DNA indirectly, relying on a second transcription regulator to recognize DNA. We show, through the construction of various possible evolutionary intermediates, that evolution of the direct mode of DNA binding was not directly accessible to the ancestor. Instead, it was contingent on a lineage-specific change to an overlapping transcription network with a different function, the specification of cell type. These results show that analysing and predicting the evolution of cis-regulatory regions requires an understanding of their positions in overlapping networks, as this placement constrains the available evolutionary pathways. PMID:26153861

  13. Intersecting transcription networks constrain gene regulatory evolution

    PubMed Central

    Sorrells, Trevor R; Booth, Lauren N; Tuch, Brian B; Johnson, Alexander D

    2015-01-01

    Epistasis, the non-additive interactions between different genetic loci, constrains evolutionary pathways, blocking some and permitting others. For biological networks such as transcription circuits, the nature of these constraints and their consequences are largely unknown. Here we describe the evolutionary pathways of a transcription network that controls the response to mating pheromone in yeasts. A component of this network, the transcription regulator Ste12, has evolved two different modes of binding to a set of its target genes. In one group of species, Ste12 binds to specific DNA binding sites, while in another lineage it occupies DNA indirectly, relying on a second transcription regulator to recognize DNA. We show, through the construction of various possible evolutionary intermediates, that evolution of the direct mode of DNA binding was not directly accessible to the ancestor. Instead, it was contingent on a lineage-specific change to an overlapping transcription network with a different function, the specification of cell type. These results show that analyzing and predicting the evolution of cis-regulatory regions requires an understanding of their positions in overlapping networks, as this placement constrains the available evolutionary pathways. PMID:26153861

  14. The structure of the human peripherin gene (PRPH) and identification of potential regulatory elements

    SciTech Connect

    Foley, J.; Ley, C.A.; Parysek, L.M.

    1994-07-15

    The authors determined the complete nucleotide sequence of the coding region of the human peripherin gene (PRPH), as well as 742 bp 5′ to the cap site and 584 bp 3′ to the stop codon, and compared its structure and sequence to the rat and mouse genes. The overall structure of 9 exons separated by 8 introns is conserved among these three mammalian species. The nucleotide sequences of the human peripherin gene exons were 90% identical to the rat gene sequences, and the predicted human peripherin protein differed from rat peripherin at only 18 of 475 amino acid residues. Comparison of the 5′ flanking regions of the human peripherin gene and rodent genes revealed extensive areas of high homology. Additional conserved segments were found in introns 1 and 2. Within the 5′ region, potential regulatory sequences, including a nerve growth factor negative regulatory element, a Hox protein binding site, and a heat shock element, were identified in all peripherin genes. The positional conservation of each element suggests that they may be important in the tissue-specific, developmental-specific, and injury-specific expression of the peripherin gene. 24 refs., 2 figs., 1 tab.

  15. Gene Sequence Homology of Chemokines Across Species

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The abundance of expressed gene and protein sequences available in the biological information databases facilitates comparison of protein homologies. A high degree of sequence similarity typically implies homology regarding structure and function and may provide clues to antibody cross-reactivities...

  16. GENE SEQUENCE HOMOLOGY OF CHEMOKINES ACROSS SPECIES

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The abundance of expressed gene and protein sequences available in the biological information databases facilitates comparison of protein homologies. A high degree of sequence similarity typically implies homology regarding structure and function and may provide clues to antibody cross-react...

  17. Gene Discovery through Expressed Sequence Tag Sequencing in Trypanosoma cruzi

    PubMed Central

    Verdun, Ramiro E.; Di Paolo, Nelson; Urmenyi, Turan P.; Rondinelli, Edson; Frasch, Alberto C. C.; Sanchez, Daniel O.

    1998-01-01

    Analysis of expressed sequence tags (ESTs) constitutes a useful approach for gene identification that, in the case of human pathogens, might result in the identification of new targets for chemotherapy and vaccine development. As part of the Trypanosoma cruzi genome project, we have partially sequenced the 5′ ends of 1,949 clones to generate ESTs. The clones were randomly selected from a normalized CL Brener epimastigote cDNA library. A total of 14.6% of the clones were homologous to previously identified T. cruzi genes, while 18.4% had significant matches to genes from other organisms in the database. A total of 67% of the ESTs had no matches in the database, and thus, some of them might be T. cruzi-specific genes. Functional groups of those sequences with matches in the database were constructed according to their putative biological functions. The two largest categories were protein synthesis (23.3%) and cell surface molecules (10.8%). The information reported in this paper should be useful for researchers in the field to analyze genes and proteins of their own interest. PMID:9784549

  18. Stress-induced endogenous siRNAs targeting regulatory intron sequences in Brachypodium

    PubMed Central

    Wang, Hsiao-Lin V.; Dinwiddie, Brandon L.; Lee, Herman

    2015-01-01

    Exposure to abiotic stresses triggers global changes in the expression of thousands of eukaryotic genes at the transcriptional and post-transcriptional levels. Small RNA (smRNA) pathways and splicing both function as crucial mechanisms regulating stress-responsive gene expression. However, examples of smRNAs regulating gene expression remain largely limited to effects on mRNA stability, translation, and epigenetic regulation. Also, our understanding of the networks controlling plant gene expression in response to environmental changes, and examples of these regulatory pathways intersecting, remains limited. Here, to investigate the role of smRNAs in stress responses we examined smRNA transcriptomes of Brachypodium distachyon plants subjected to various abiotic stresses. We found that exposure to different abiotic stresses specifically induced a group of novel, endogenous small interfering RNAs (stress-induced, UTR-derived siRNAs, or sutr-siRNAs) that originate from the 3′ UTRs of a subset of coding genes. Our bioinformatics analyses predicted that sutr-siRNAs have potential regulatory functions and that over 90% of sutr-siRNAs target intronic regions of many mRNAs in trans. Importantly, a subgroup of these sutr-siRNAs target the important intron regulatory regions, such as branch point sequences, that could affect splicing. Our study indicates that in Brachypodium, sutr-siRNAs may affect splicing by masking or changing accessibility of specific cis-elements through base-pairing interactions to mediate gene expression in response to stresses. We hypothesize that this mode of regulation of gene expression may also serve as a general mechanism for regulation of gene expression in plants and potentially in other eukaryotes. PMID:25480817

  19. Phenotype accessibility and noise in random threshold gene regulatory networks.

    PubMed

    Pinho, Ricardo; Garcia, Victor; Feldman, Marcus W

    2014-01-01

    Evolution requires phenotypic variation in a population of organisms for selection to function. Gene regulatory processes involved in organismal development affect the phenotypic diversity of organisms. Since only a fraction of all possible phenotypes are predicted to be accessed by the end of development, organisms may evolve strategies to use environmental cues and noise-like fluctuations to produce additional phenotypic diversity, and hence to enhance the speed of adaptation. We used a generic model of organismal development --gene regulatory networks-- to investigate how different levels of noise on gene expression states (i.e. phenotypes) may affect access to new, unique phenotypes, thereby affecting phenotypic diversity. We studied additional strategies that organisms might adopt to attain larger phenotypic diversity: either by augmenting their genome or the number of gene expression states. This was done for different types of gene regulatory networks that allow for distinct levels of regulatory influence on gene expression or are more likely to give rise to stable phenotypes. We found that if gene expression is binary, increasing noise levels generally decreases phenotype accessibility for all network types studied. If more gene expression states are considered, noise can moderately enhance the speed of discovery if three or four gene expression states are allowed, and if there are enough distinct regulatory networks in the population. These results were independent of the network types analyzed, and were robust to different implementations of noise. Hence, for noise to increase the number of accessible phenotypes in gene regulatory networks, very specific conditions need to be satisfied. If the number of distinct regulatory networks involved in organismal development is large enough, and the acquisition of more genes or fine tuning of their expression states proves costly to the organism, noise can be useful in allowing access to more unique phenotypes
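
    The kind of model this abstract describes can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the binary 0/1 states, the +1/-1 weights, the tie rule (keep the current state), the per-gene flip noise and all parameter values are hypothetical.

```python
import random

def random_network(n_genes, k_inputs, rng):
    """Give each gene k_inputs random regulators with weight +1 or -1."""
    weights = [[0] * n_genes for _ in range(n_genes)]
    for target in range(n_genes):
        for source in rng.sample(range(n_genes), k_inputs):
            weights[target][source] = rng.choice((-1, 1))
    return weights

def step(state, weights, noise, rng):
    """One synchronous update: threshold the summed input, then flip with probability `noise`."""
    new_state = []
    for i, row in enumerate(weights):
        total = sum(w * s for w, s in zip(row, state))
        if total > 0:
            value = 1
        elif total < 0:
            value = 0
        else:
            value = state[i]  # assumed tie rule: keep the current state
        if rng.random() < noise:
            value = 1 - value  # noise on the gene expression state
        new_state.append(value)
    return new_state

def count_phenotypes(weights, noise, rng, n_starts=50, n_steps=30):
    """Count distinct final expression states reached from random initial states."""
    n_genes = len(weights)
    finals = set()
    for _ in range(n_starts):
        state = [rng.randint(0, 1) for _ in range(n_genes)]
        for _ in range(n_steps):
            state = step(state, weights, noise, rng)
        finals.add(tuple(state))
    return len(finals)

if __name__ == "__main__":
    rng = random.Random(0)
    net = random_network(n_genes=10, k_inputs=3, rng=rng)
    for noise in (0.0, 0.01, 0.1):
        print(noise, count_phenotypes(net, noise, rng))
```

    Counting distinct final states is only a crude proxy for phenotype accessibility; with noise, a final state need not be a stable attractor of the deterministic dynamics.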

  20. Phenotype Accessibility and Noise in Random Threshold Gene Regulatory Networks

    PubMed Central

    Feldman, Marcus W.

    2015-01-01

    Evolution requires phenotypic variation in a population of organisms for selection to function. Gene regulatory processes involved in organismal development affect the phenotypic diversity of organisms. Since only a fraction of all possible phenotypes are predicted to be accessed by the end of development, organisms may evolve strategies to use environmental cues and noise-like fluctuations to produce additional phenotypic diversity, and hence to enhance the speed of adaptation. We used a generic model of organismal development --gene regulatory networks-- to investigate how different levels of noise on gene expression states (i.e. phenotypes) may affect access to new, unique phenotypes, thereby affecting phenotypic diversity. We studied additional strategies that organisms might adopt to attain larger phenotypic diversity: either by augmenting their genome or the number of gene expression states. This was done for different types of gene regulatory networks that allow for distinct levels of regulatory influence on gene expression or are more likely to give rise to stable phenotypes. We found that if gene expression is binary, increasing noise levels generally decreases phenotype accessibility for all network types studied. If more gene expression states are considered, noise can moderately enhance the speed of discovery if three or four gene expression states are allowed, and if there are enough distinct regulatory networks in the population. These results were independent of the network types analyzed, and were robust to different implementations of noise. Hence, for noise to increase the number of accessible phenotypes in gene regulatory networks, very specific conditions need to be satisfied. If the number of distinct regulatory networks involved in organismal development is large enough, and the acquisition of more genes or fine tuning of their expression states proves costly to the organism, noise can be useful in allowing access to more unique phenotypes

  1. BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations.

    PubMed

    Wang, Junbai; Batmanov, Kirill

    2015-12-01

    Sequence variations in regulatory DNA regions are known to cause functionally important consequences for gene expression. DNA sequence variations may have an essential role in determining phenotypes and may be linked to disease; however, their identification through analysis of massive genome-wide sequencing data is a great challenge. In this work, a new computational pipeline, a Bayesian method for protein-DNA interaction with binding affinity ranking (BayesPI-BAR), is proposed for quantifying the effect of sequence variations on protein binding. BayesPI-BAR uses biophysical modeling of protein-DNA interactions to predict single nucleotide polymorphisms (SNPs) that cause significant changes in the binding affinity of a regulatory region for transcription factors (TFs). The method includes two new parameters (TF chemical potentials or protein concentrations and direct TF binding targets) that are neglected by previous methods. The new method is verified on 67 known human regulatory SNPs, of which 47 (70%) have predicted true TFs ranked in the top 10. Importantly, the performance of BayesPI-BAR, which uses principal component analysis to integrate multiple predictions from various TF chemical potentials, is found to be better than that of existing programs, such as sTRAP and is-rSNP, when evaluated on the same SNPs. BayesPI-BAR is a publicly available tool and is able to carry out parallelized computation, which helps to investigate a large number of TFs or SNPs and to detect disease-associated regulatory sequence variations in the sea of genome-wide noncoding regions. PMID:26202972
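
    As a rough illustration of how a biophysical model can score the effect of a sequence variant on TF binding, the sketch below uses a generic occupancy form, P = 1 / (1 + exp(E - mu)), where E is a site's binding energy and mu the TF chemical potential. It is not the BayesPI-BAR code or API: the energy matrix, sequences and function names are invented for illustration, and the PCA-based integration across chemical potentials mentioned in the abstract is omitted.

```python
import math

# Hypothetical position-specific energy matrix (arbitrary units; lower = tighter binding).
ENERGY = [
    {"A": 0.0, "C": 2.0, "G": 2.0, "T": 1.5},
    {"A": 2.0, "C": 0.0, "G": 1.5, "T": 2.0},
    {"A": 2.0, "C": 1.5, "G": 0.0, "T": 2.0},
    {"A": 1.5, "C": 2.0, "G": 2.0, "T": 0.0},
]

def site_energy(site):
    """Sum the per-position energies for one window of the sequence."""
    return sum(column[base] for column, base in zip(ENERGY, site))

def occupancy(seq, mu):
    """Probability that at least one window is bound, treating windows as independent."""
    width = len(ENERGY)
    p_unbound = 1.0
    for i in range(len(seq) - width + 1):
        p_site = 1.0 / (1.0 + math.exp(site_energy(seq[i:i + width]) - mu))
        p_unbound *= 1.0 - p_site
    return 1.0 - p_unbound

def binding_shift(ref, alt, mu):
    """Change in predicted occupancy when the reference allele is replaced by the variant."""
    return occupancy(alt, mu) - occupancy(ref, mu)

if __name__ == "__main__":
    ref, alt = "TTACGTTT", "TTACCTTT"  # hypothetical single-base substitution
    for mu in (-2.0, 0.0, 2.0):
        print(f"mu={mu:+.1f}  shift={binding_shift(ref, alt, mu):+.3f}")
```

    Evaluating the shift at several values of mu shows why the chemical potential matters: the same variant can look more or less disruptive depending on the assumed TF concentration.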

  2. Distinct Functional Constraints Partition Sequence Conservation in a cis-Regulatory Element

    PubMed Central

    Ruvinsky, Ilya

    2011-01-01

    Different functional constraints contribute to different evolutionary rates across genomes. To understand why some sequences evolve faster than others in a single cis-regulatory locus, we investigated function and evolutionary dynamics of the promoter of the Caenorhabditis elegans unc-47 gene. We found that this promoter consists of two distinct domains. The proximal promoter is conserved and is largely sufficient to direct appropriate spatial expression. The distal promoter displays little if any conservation between several closely related nematodes. Despite this divergence, sequences from all species confer robustness of expression, arguing that this function does not require substantial sequence conservation. We showed that even unrelated sequences have the ability to promote robust expression. A prominent feature shared by all of these robustness-promoting sequences is an AT-enriched nucleotide composition consistent with nucleosome depletion. Because general sequence composition can be maintained despite sequence turnover, our results explain how different functional constraints can lead to vastly disparate rates of sequence divergence within a promoter. PMID:21655084

  3. Robustness and Accuracy in Sea Urchin Developmental Gene Regulatory Networks

    PubMed Central

    Ben-Tabou de-Leon, Smadar

    2016-01-01

    Developmental gene regulatory networks robustly control the timely activation of regulatory and differentiation genes. The structure of these networks underlies their capacity to buffer intrinsic and extrinsic noise and maintain embryonic morphology. Here I illustrate how the use of specific architectures by the sea urchin developmental regulatory networks enables the robust control of cell fate decisions. The Wnt-β-catenin signaling pathway patterns the primary embryonic axis, while the BMP signaling pathway patterns the secondary embryonic axis, in the sea urchin embryo and across Bilateria. Interestingly, in the sea urchin in both cases, the signaling pathway that defines the axis directly controls the expression of a set of downstream regulatory genes. I propose that this direct activation of a set of regulatory genes enables a uniform regulatory response and a clear-cut cell fate decision in the endoderm and in the dorsal ectoderm. The specification of the mesodermal pigment cell lineage is activated by Delta signaling that initiates a triple positive feedback loop that locks down the pigment specification state. I propose that the use of compound positive feedback circuitry provides the endodermal cells enough time to turn off mesodermal genes and ensures a correct mesoderm vs. endoderm fate decision. Thus, I argue that understanding the control properties of repeatedly used regulatory architectures illuminates their role in embryogenesis and provides possible explanations for their resistance to evolutionary change. PMID:26913048

  4. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

    PubMed Central

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2005-01-01

    We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085

  5. Fungal Genes in Context: Genome Architecture Reflects Regulatory Complexity and Function

    PubMed Central

    Noble, Luke M.; Andrianopoulos, Alex

    2013-01-01

    Gene context determines gene expression, with local chromosomal environment most influential. Comparative genomic analysis is often limited in scope to conserved or divergent gene and protein families, and fungi are well suited to this approach with low functional redundancy and relatively streamlined genomes. We show here that one aspect of gene context, the amount of potential upstream regulatory sequence maintained through evolution, is highly predictive of both molecular function and biological process in diverse fungi. Orthologs with large upstream intergenic regions (UIRs) are strongly enriched in information processing functions, such as signal transduction and sequence-specific DNA binding, and, in the genus Aspergillus, include the majority of experimentally studied, high-level developmental and metabolic transcriptional regulators. Many uncharacterized genes are also present in this class and, by implication, may be of similar importance. Large intergenic regions also share two novel sequence characteristics, currently of unknown significance: they are enriched for plus-strand polypyrimidine tracts and an information-rich, putative regulatory motif that was present in the last common ancestor of the Pezizomycotina. Systematic consideration of gene UIR in comparative genomics, particularly for poorly characterized species, could help reveal organisms’ regulatory priorities. PMID:23699226

  6. DNA sequence of the yeast transketolase gene.

    PubMed

    Fletcher, T S; Kwee, I L; Nakada, T; Largman, C; Martin, B M

    1992-02-18

    Transketolase (EC 2.2.1.1) is the enzyme that, together with aldolase, forms a reversible link between the glycolytic and pentose phosphate pathways. We have cloned and sequenced the transketolase gene from yeast (Saccharomyces cerevisiae). This is the first transketolase gene of the pentose phosphate shunt to be sequenced from any source. The molecular mass of the proposed translated protein is 73,976 daltons, in good agreement with the observed molecular mass of about 75,000 daltons. The 5'-nontranslated region of the gene is similar to other yeast genes. There is no evidence of 5'-splice junctions or branch points in the sequence. The 3'-nontranslated region contains the polyadenylation signal (AATAAA), 80 base pairs downstream from the termination codon. A high degree of homology is found between yeast transketolase and dihydroxyacetone synthase (formaldehyde transketolase) from the yeast Hansenula polymorpha. The overall sequence identity between these two proteins is 37%, with four regions of much greater similarity. The regions from amino acid residues 98-131, 157-182, 410-433, and 474-489 have sequence identities of 74%, 66%, 83%, and 82%, respectively. One of these regions (157-182) includes a possible thiamin pyrophosphate (TPP) binding domain, and another (410-433) may contain the catalytic domain. PMID:1737042

  7. The nucleotide sequence of the mouse immunoglobulin epsilon gene: comparison with the human epsilon gene sequence.

    PubMed Central

    Ishida, N; Ueda, S; Hayashida, H; Miyata, T; Honjo, T

    1982-01-01

    We have determined the nucleotide sequence of the immunoglobulin epsilon gene cloned from newborn mouse DNA. The epsilon gene sequence allows prediction of the amino acid sequence of the constant region of the epsilon chain and comparison of it with sequences of the human epsilon and other mouse immunoglobulin genes. The epsilon gene was shown to be under the weakest selection pressure at the protein level among the immunoglobulin genes although the divergence at the synonymous position is similar. Our results suggest that the epsilon gene may be dispensable, which is in accord with the fact that IgE has only obscure roles in the immune defense system but has an undesirable role as a mediator of hypersensitivity. The sequence data suggest that the human and murine epsilon genes were derived from different ancestors duplicated a long time ago. The amino acid sequence of the epsilon chain is more homologous to those of the gamma chains than the other mouse heavy chains. Two membrane exons, separated by an 80-base intron, were identified 1.7 kb 3' to the CH4 domain of the epsilon gene and shown to conserve a hydrophobic portion similar to those of other heavy chain genes. RNA blot hybridization showed that the epsilon membrane exons are transcribed into two species of mRNA in an IgE hybridoma. PMID:6329728

  8. The molecular and gene regulatory signature of a neuron

    PubMed Central

    Hobert, Oliver; Carrera, Inés; Stefanakis, Nikolaos

    2010-01-01

    Neuron-type specific gene batteries define the morphological and functional diversity of cell types in the nervous system. Here, we discuss the composition of neuron-type specific gene batteries and illustrate gene regulatory strategies employed by distinct organisms from C. elegans to higher vertebrates, which are instrumental in determining the unique gene expression profile and molecular composition of individual neuronal cell types. Based on principles learned from prokaryotic gene regulation, we argue that neuronal, terminal gene batteries are functionally grouped into parallel acting “regulons”. The theoretical concepts discussed here provide testable hypotheses for future experimental analysis into the exact gene regulatory mechanisms that are employed in the generation of neuronal diversity and identity. PMID:20663572

  9. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    SciTech Connect

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  10. Gene regulatory networks modelling using a dynamic evolutionary hybrid

    PubMed Central

    2010-01-01

    Background Inference of gene regulatory networks is a key goal in the quest for understanding fundamental cellular processes and revealing underlying relations among genes. With the availability of gene expression data, computational methods aiming at regulatory networks reconstruction are facing challenges posed by the data's high dimensionality, temporal dynamics or measurement noise. We propose an approach based on a novel multi-layer evolutionary trained neuro-fuzzy recurrent network (ENFRN) that is able to select potential regulators of target genes and describe their regulation type. Results The recurrent, self-organizing structure and evolutionary training of our network yield an optimized pool of regulatory relations, while its fuzzy nature avoids noise-related problems. Furthermore, we are able to assign scores for each regulation, highlighting the confidence in the retrieved relations. The approach was tested by applying it to several benchmark datasets of yeast, managing to acquire biologically validated relations among genes. Conclusions The results demonstrate the effectiveness of the ENFRN in retrieving biologically valid regulatory relations and providing meaningful insights for better understanding the dynamics of gene regulatory networks. The algorithms and methods described in this paper have been implemented in a Matlab toolbox and are available from: http://bioserver-1.bioacademy.gr/DataRepository/Project_ENFRN_GRN/. PMID:20298548

  11. Time-Delayed Models of Gene Regulatory Networks

    PubMed Central

    Parmar, K.; Blyuss, K. B.; Kyrychko, Y. N.; Hogan, S. J.

    2015-01-01

    We discuss different mathematical models of gene regulatory networks as relevant to the onset and development of cancer. After discussion of alternative modelling approaches, we use a paradigmatic two-gene network to focus on the role played by time delays in the dynamics of gene regulatory networks. We contrast the dynamics of the reduced model arising in the limit of fast mRNA dynamics with that of the full model. The review concludes with the discussion of some open problems. PMID:26576197
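
    The role of a time delay in a two-gene network can be illustrated with a small numerical sketch. The model below is a hypothetical mutual-repression pair with Hill-type production and first-order decay, not the equations of the review; the parameter names (alpha, n, tau) and the constant pre-history are assumptions made for illustration. It integrates the delay differential equations with a fixed-step Euler scheme and a history buffer.

```python
def simulate_delayed_pair(alpha=2.0, n=2, tau=1.0, dt=0.01, t_end=20.0,
                          x0=0.1, y0=1.0):
    """Euler integration of a hypothetical delayed mutual-repression pair:

        dx/dt = alpha / (1 + y(t - tau)**n) - x(t)
        dy/dt = alpha / (1 + x(t - tau)**n) - y(t)

    For t <= 0 the history is assumed constant at (x0, y0).
    Returns two lists with the sampled trajectories of x and y.
    """
    lag = int(round(tau / dt))      # delay expressed in time steps
    steps = int(round(t_end / dt))  # number of Euler steps
    xs, ys = [x0], [y0]
    for k in range(steps):
        j = k - lag                 # index of the delayed state
        x_del = xs[j] if j >= 0 else x0
        y_del = ys[j] if j >= 0 else y0
        dx = alpha / (1.0 + y_del ** n) - xs[k]
        dy = alpha / (1.0 + x_del ** n) - ys[k]
        xs.append(xs[k] + dt * dx)
        ys.append(ys[k] + dt * dy)
    return xs, ys


if __name__ == "__main__":
    xs, ys = simulate_delayed_pair()
    print("final state:", xs[-1], ys[-1])
```

    Setting tau to 0 recovers the undelayed system, so the two runs can be compared directly to see what the delay changes.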

  12. Understanding the Role of Housekeeping and Stress-Related Genes in Transcription-Regulatory Networks

    NASA Astrophysics Data System (ADS)

    Heath, Allison; Kavraki, Lydia; Balázsi, Gábor

    2008-03-01

    Despite the increasing number of completely sequenced genomes, much remains to be learned about how living cells process environmental information and respond to changes in their surroundings. Accumulating evidence indicates that eukaryotic and prokaryotic genes can be classified in two distinct categories that we will call class I and class II. Class I genes are housekeeping genes, often characterized by stable, noise-resistant expression levels. In contrast, class II genes are stress-related genes and often have noisy, unstable expression levels. In this work we analyze the large-scale transcription-regulatory networks (TRN) of E. coli and S. cerevisiae and preliminary data on H. sapiens. We find that stable, housekeeping genes (class I) are preferentially utilized as transcriptional inputs while stress-related, unstable genes (class II) are utilized as transcriptional integrators. This might be the result of convergent evolution that placed the appropriate genes in the appropriate locations within transcriptional networks according to some fundamental principles that govern cellular information processing.

  13. Bayesian Nonlinear Model Selection for Gene Regulatory Networks

    PubMed Central

    Ni, Yang; Stingo, Francesco C.; Baladandayuthapani, Veerabhadran

    2015-01-01

    Summary Gene regulatory networks represent the regulatory relationships between genes and their products and are important for exploring and defining the underlying biological processes of cellular systems. We develop a novel framework to recover the structure of nonlinear gene regulatory networks using semiparametric spline-based directed acyclic graphical models. Our use of splines allows the model to have both flexibility in capturing nonlinear dependencies as well as control of overfitting via shrinkage, using mixed model representations of penalized splines. We propose a novel discrete mixture prior on the smoothing parameter of the splines that allows for simultaneous selection of both linear and nonlinear functional relationships as well as inducing sparsity in the edge selection. Using simulation studies, we demonstrate the superior performance of our methods in comparison with several existing approaches in terms of network reconstruction and functional selection. We apply our methods to a gene expression dataset in glioblastoma multiforme, which reveals several interesting and biologically relevant nonlinear relationships. PMID:25854759

  14. Functional effects of a natural polymorphism in the transcriptional regulatory sequence of HLA-DQB1.

    PubMed Central

    Beaty, J S; West, K A; Nepom, G T

    1995-01-01

    DNA sequence polymorphism in the genes encoding HLA class II proteins accounts for allelic diversity in antigen recognition and presentation and, thus, in the role of these cell surface glycoproteins as determinants of the scope of the T-cell repertoire. In addition, sequence polymorphism in the promoter-proximal transcriptional regulatory regions of these genes has been described, particularly for the HLA-DQB1 locus, where these differences may contribute to variation in locus- and allele-specific expression. In this study, we measured the effect of such regulatory sequence polymorphism on the expression of endogenous alleles of DQB1 in heterozygous cells. Quantitative reverse transcriptase-mediated PCR analysis showed that expression of the DQB1*0301 allele responded more rapidly to gamma interferon induction than that of DQB1*0302. We have analyzed functional effects of a prominent allelic polymorphism that consists of a TG dinucleotide present between the W and X1 consensus elements in the DQB1*0302 allele but missing in the DQB1*0301 allele. The dominant effect of this polymorphism was to introduce a variation in the spacing between the W and X1 elements of these two alleles. A secondary compensatory effect was specific for the TG dinucleotide itself, which was essential for the binding of a nuclear protein complex to the *0302 regulatory region immediately 5' of the X1 element. Derivatives of the DQB1 5' regulatory region were used to drive expression of the chloramphenicol acetyltransferase gene in transient transfections of human B-lymphoblastoid and gamma interferon-treated melanoma cell lines, demonstrating that the additional spacing between the W and X1 elements caused by the presence of the TG dinucleotide in the *0302 allele resulted in reduced expression compared with that driven by the *0301 fragment; this difference overshadowed an up-regulating effect on expression which corresponded to the binding of the TG-dependent nuclear protein complex. 

  15. Nemertean toxin genes revealed through transcriptome sequencing.

    PubMed

    Whelan, Nathan V; Kocot, Kevin M; Santos, Scott R; Halanych, Kenneth M

    2014-12-01

    Nemerteans are one of few animal groups that have evolved the ability to utilize toxins for both defense and subduing prey, but little is known about specific nemertean toxins. In particular, no study has identified specific toxin genes even though peptide toxins are known from some nemertean species. Information about toxin genes is needed to better understand evolution of toxins across animals and possibly provide novel targets for pharmaceutical and industrial applications. We sequenced and annotated transcriptomes of two free-living and one commensal nemertean and annotated an additional six publicly available nemertean transcriptomes to identify putative toxin genes. Approximately 63-74% of predicted open reading frames in each transcriptome were annotated with gene names, and all species had similar percentages of transcripts annotated with each higher-level GO term. Every nemertean analyzed possessed genes with high sequence similarities to known animal toxins including those from stonefish, cephalopods, and sea anemones. One toxin-like gene found in all nemerteans analyzed had high sequence similarity to Plancitoxin-1, a DNase II hepatotoxin that may function well at low pH, which suggests that the acidic body walls of some nemerteans could work to enhance the efficacy of protein toxins. The highest number of toxin-like genes found in any one species was seven and the lowest was three. The diversity of toxin-like nemertean genes found here is greater than previously documented, and these animals are likely an ideal system for exploring toxin evolution and industrial applications of toxins. PMID:25432940

  16. Nemertean Toxin Genes Revealed through Transcriptome Sequencing

    PubMed Central

    Whelan, Nathan V.; Kocot, Kevin M.; Santos, Scott R.; Halanych, Kenneth M.

    2014-01-01

    Nemerteans are one of few animal groups that have evolved the ability to utilize toxins for both defense and subduing prey, but little is known about specific nemertean toxins. In particular, no study has identified specific toxin genes even though peptide toxins are known from some nemertean species. Information about toxin genes is needed to better understand evolution of toxins across animals and possibly provide novel targets for pharmaceutical and industrial applications. We sequenced and annotated transcriptomes of two free-living and one commensal nemertean and annotated an additional six publicly available nemertean transcriptomes to identify putative toxin genes. Approximately 63–74% of predicted open reading frames in each transcriptome were annotated with gene names, and all species had similar percentages of transcripts annotated with each higher-level GO term. Every nemertean analyzed possessed genes with high sequence similarities to known animal toxins including those from stonefish, cephalopods, and sea anemones. One toxin-like gene found in all nemerteans analyzed had high sequence similarity to Plancitoxin-1, a DNase II hepatotoxin that may function well at low pH, which suggests that the acidic body walls of some nemerteans could work to enhance the efficacy of protein toxins. The highest number of toxin-like genes found in any one species was seven and the lowest was three. The diversity of toxin-like nemertean genes found here is greater than previously documented, and these animals are likely an ideal system for exploring toxin evolution and industrial applications of toxins. PMID:25432940

  17. A multistep bioinformatic approach detects putative regulatory elements in gene promoters

    PubMed Central

    Bortoluzzi, Stefania; Coppe, Alessandro; Bisognin, Andrea; Pizzi, Cinzia; Danieli, Gian Antonio

    2005-01-01

    Background Searching for approximate patterns in large promoter sequences frequently produces an exceedingly high number of results. Our aim was to exploit biological knowledge for definition of a sheltered search space and of appropriate search parameters, in order to develop a method for identification of a tractable number of sequence motifs. Results Novel software (COOP) was developed for extraction of sequence motifs, based on clustering of exact or approximate patterns according to the frequency of their overlapping occurrences. Genomic sequences of 1 kb upstream of 91 genes differentially expressed and/or encoding proteins with relevant function in adult human retina were analyzed. Methodology and results were tested by analysing 1,000 groups of putatively unrelated sequences, randomly selected among 17,156 human gene promoters. When applied to a sample of human promoters, the method identified 279 putative motifs frequently occurring in retina promoter sequences. Most of them are localized in the proximal portion of promoters, less variable in central region than in lateral regions and similar to known regulatory sequences. COOP software and reference manual are freely available upon request to the Authors. Conclusion The approach described in this paper seems effective for identifying a tractable number of sequence motifs with putative regulatory role. PMID:15904489

  18. Cotyledon nuclear proteins bind to DNA fragments harboring regulatory elements of phytohemagglutinin genes.

    PubMed Central

    Riggs, C D; Voelker, T A; Chrispeels, M J

    1989-01-01

    The effects of deleting DNA sequences upstream from the phytohemagglutinin-L gene of Phaseolus vulgaris have been examined with respect to the level of gene product produced in the seeds of transgenic tobacco. Our studies indicate that several upstream regions quantitatively modulate expression. Between -1000 and -675, a negative regulatory element reduces expression approximately threefold relative to shorter deletion mutants that do not contain this region. Positive regulatory elements lie between -550 and -125 and, compared with constructs containing only 125 base pairs of upstream sequences (-125), the presence of these two regions can be correlated with a 25-fold and a 200-fold enhancement of phytohemagglutinin-L levels. These experiments were complemented by gel retardation assays, which demonstrated that two of the three regions bind cotyledon nuclear proteins from mid-mature seeds. One of the binding sites maps near a DNA sequence that is highly homologous to protein binding domains located upstream from the soybean seed lectin and Kunitz trypsin inhibitor genes. Competition experiments demonstrated that the upstream regions of a bean beta-phaseolin gene, the soybean seed lectin gene, and an oligonucleotide from the upstream region of the trypsin inhibitor gene can compete differentially for factor binding. We suggest that these legume genes may be regulated in part by evolutionarily conserved protein/DNA interactions. PMID:2535513

  19. Highly recurring sequence elements identified in eukaryotic DNAs by computer analysis are often homologous to regulatory sequences or protein binding sites.

    PubMed Central

    Bodnar, J W; Ward, D C

    1987-01-01

    We have used computer assisted dot matrix and oligonucleotide frequency analyses to identify highly recurring sequence elements of 7-11 base pairs in eukaryotic genes and viral DNAs. Such elements are found much more frequently than expected, often with an average spacing of a few hundred base pairs. Furthermore, the most abundant repetitive elements observed in the ovalbumin locus, the beta-globin gene cluster, the metallothionein gene and the viral genomes of SV40, polyoma, Herpes simplex-1 and Mouse Mammary Tumor Virus were sequences shown previously to be protein binding sites or sequences important for regulating gene expression. These sequences were present in both exons and introns as well as promoter regions. These observations suggest that such sequences are often highly overrepresented within the specific gene segments with which they are associated. Computer analysis of other genetic units, including viral genomes and oncogenes, has identified a number of highly recurring sequence elements that could serve similar regulatory or protein-binding functions. A model for the role of such reiterated sequence elements in DNA organization and function is presented. PMID:3822840
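
    The idea of oligonucleotide frequency analysis can be sketched in a few lines. The example below is a minimal illustration, not the authors' program: it counts overlapping k-mers and compares each count with the number expected if bases were drawn independently at the sequence's own base frequencies. The function names and the thresholds min_ratio and min_count are assumptions chosen for illustration.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def expected_count(kmer, seq):
    """Expected occurrences of kmer under a simple null model in which
    bases are independent and follow the base frequencies seen in seq."""
    seq = seq.upper()
    n = len(seq)
    freq = {base: seq.count(base) / n for base in "ACGT"}
    prob = 1.0
    for base in kmer.upper():
        prob *= freq.get(base, 0.0)  # non-ACGT symbols get probability 0
    return prob * (n - len(kmer) + 1)


def overrepresented(seq, k, min_ratio=5.0, min_count=2):
    """Return (kmer, observed, expected) for k-mers seen at least
    min_count times and at least min_ratio times more often than expected."""
    hits = []
    for kmer, observed in kmer_counts(seq, k).items():
        if observed < min_count:
            continue
        expected = expected_count(kmer, seq)
        if expected > 0 and observed / expected >= min_ratio:
            hits.append((kmer, observed, expected))
    # Most frequent first; ties broken alphabetically for stable output.
    return sorted(hits, key=lambda h: (-h[1], h[0]))
```

    Running overrepresented(sequence, k) for each k from 7 to 11 would cover the element lengths mentioned in the abstract. A real analysis would likely use a richer background model than independent bases.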

  20. 'In silico expression analysis', a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences.

    PubMed

    Bolívar, Julio C; Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated 'in silico expression analysis' was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the 'in silico expression analysis' resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the 'in silico expression analysis' predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. DATABASE URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366
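
    The first step the tool performs, finding genes that harbour a submitted sequence in their promoter, can be sketched as a plain string search on both strands. This is an illustrative sketch only and does not reproduce the PathoPlant implementation; the gene names and promoter strings in the usage example are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(promoter, motif):
    """Return sorted (position, strand) hits of motif in promoter.

    Positions are 0-based on the given strand; '-' marks a match of the
    motif's reverse complement, i.e. the motif on the opposite strand.
    """
    promoter = promoter.upper()
    motif = motif.upper()
    targets = {"+": motif}
    rc = reverse_complement(motif)
    if rc != motif:  # palindromic motifs are reported once
        targets["-"] = rc
    hits = []
    for strand, pattern in targets.items():
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)  # allow overlaps
    return sorted(hits)


def genes_with_motif(promoters, motif):
    """Names of genes whose promoter contains motif on either strand."""
    return sorted(g for g, seq in promoters.items() if find_motif(seq, motif))


if __name__ == "__main__":
    # Hypothetical promoter fragments, for illustration only.
    promoters = {
        "geneA": "TTTCGACTTTTGG",
        "geneB": "GGAAAAGTCGCC",
        "geneC": "ACGTACGTACGT",
    }
    print(genes_with_motif(promoters, "CGACTTTT"))
```

    The resulting gene list would then be compared against expression data, a step this sketch leaves out.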

  1. ‘In silico expression analysis’, a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences

    PubMed Central

    Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated ‘in silico expression analysis’ was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the ‘in silico expression analysis’ resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the ‘in silico expression analysis’ predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. Database URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366

  2. Regulatory links between imprinted genes: evolutionary predictions and consequences

    PubMed Central

    Patten, Manus M.; Cowley, Michael; Oakey, Rebecca J.; Feil, Robert

    2016-01-01

    Genomic imprinting is essential for development and growth and plays diverse roles in physiology and behaviour. Imprinted genes have traditionally been studied in isolation or in clusters with respect to cis-acting modes of gene regulation, both from a mechanistic and evolutionary point of view. Recent studies in mammals, however, reveal that imprinted genes are often co-regulated and are part of a gene network involved in the control of cellular proliferation and differentiation. Moreover, a subset of imprinted genes acts in trans on the expression of other imprinted genes. Numerous studies have modulated levels of imprinted gene expression to explore phenotypic and gene regulatory consequences. Increasingly, the applied genome-wide approaches highlight how perturbation of one imprinted gene may affect other maternally or paternally expressed genes. Here, we discuss these novel findings and consider evolutionary theories that offer a rationale for such intricate interactions among imprinted genes. An evolutionary view of these trans-regulatory effects provides a novel interpretation of the logic of gene networks within species and has implications for the origin of reproductive isolation between species. PMID:26842569

  3. Full-length minor ampullate spidroin gene sequence.

    PubMed

    Chen, Gefei; Liu, Xiangqin; Zhang, Yunlong; Lin, Senzhu; Yang, Zijiang; Johansson, Jan; Rising, Anna; Meng, Qing

    2012-01-01

    Spider silk includes seven protein based fibers and glue-like substances produced by glands in the spider's abdomen. Minor ampullate silk is used to make the auxiliary spiral of the orb-web and also for wrapping prey, has a high tensile strength and does not supercontract in water. So far, only partial cDNA sequences have been obtained for minor ampullate spidroins (MiSps). Here we describe the first MiSp full-length gene sequence from the spider species Araneus ventricosus, using a multidimensional PCR approach. Comparative analysis of the sequence reveals regulatory elements, as well as unique spidroin gene and protein architecture including the presence of an unusually large intron. The spliced full-length transcript of MiSp gene is 5440 bp in size and encodes 1766 amino acid residues organized into conserved nonrepetitive N- and C-terminal domains and a central predominantly repetitive region composed of four units that are iterated in a non regular manner. The repeats are more conserved within A. ventricosus MiSp than compared to repeats from homologous proteins, and are interrupted by two nonrepetitive spacer regions, which have 100% identity even at the nucleotide level. PMID:23251707

  4. Full-Length Minor Ampullate Spidroin Gene Sequence

    PubMed Central

    Chen, Gefei; Liu, Xiangqin; Zhang, Yunlong; Lin, Senzhu; Yang, Zijiang; Johansson, Jan; Rising, Anna; Meng, Qing

    2012-01-01

    Spider silk includes seven protein based fibers and glue-like substances produced by glands in the spider's abdomen. Minor ampullate silk is used to make the auxiliary spiral of the orb-web and also for wrapping prey, has a high tensile strength and does not supercontract in water. So far, only partial cDNA sequences have been obtained for minor ampullate spidroins (MiSps). Here we describe the first MiSp full-length gene sequence from the spider species Araneus ventricosus, using a multidimensional PCR approach. Comparative analysis of the sequence reveals regulatory elements, as well as unique spidroin gene and protein architecture including the presence of an unusually large intron. The spliced full-length transcript of MiSp gene is 5440 bp in size and encodes 1766 amino acid residues organized into conserved nonrepetitive N- and C-terminal domains and a central predominantly repetitive region composed of four units that are iterated in a non regular manner. The repeats are more conserved within A. ventricosus MiSp than compared to repeats from homologous proteins, and are interrupted by two nonrepetitive spacer regions, which have 100% identity even at the nucleotide level. PMID:23251707

  5. The Inferred Cardiogenic Gene Regulatory Network in the Mammalian Heart

    PubMed Central

    Li, Xing; Thiagarajan, Raghuram; Nelson, Timothy J.; Tomita-Mitchell, Aoy; Beard, Daniel A.

    2014-01-01

    Cardiac development is a complex, multiscale process encompassing cell fate adoption, differentiation and morphogenesis. To elucidate pathways underlying this process, a recently developed algorithm to reverse engineer gene regulatory networks was applied to time-course microarray data obtained from the developing mouse heart. Approximately 200 genes of interest were input into the algorithm to generate putative network topologies that are capable of explaining the experimental data via model simulation. To cull specious network interactions, thousands of putative networks are merged and filtered to generate scale-free, hierarchical networks that are statistically significant and biologically relevant. The networks are validated with known gene interactions and used to predict regulatory pathways important for the developing mammalian heart. Area under the precision-recall curve and receiver operator characteristic curve are 9% and 58%, respectively. Of the top 10 ranked predicted interactions, 4 have already been validated. The algorithm is further tested using a network enriched with known interactions and another depleted of them. The inferred networks contained more interactions for the enriched network versus the depleted network. In all test cases, maximum performance of the algorithm was achieved when the purely data-driven method of network inference was combined with a data-independent, functional-based association method. Lastly, the network generated from the list of approximately 200 genes of interest was expanded using gene-profile uniqueness metrics to include approximately 900 additional known mouse genes and to form the most likely cardiogenic gene regulatory network. The resultant network supports known regulatory interactions and contains several novel cardiogenic regulatory interactions. The method outlined herein provides an informative approach to network inference and leads to clear testable hypotheses related to gene regulation. 
PMID:24971943
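
    The abstract above reports areas under the precision-recall and receiver operating characteristic curves for ranked predicted interactions. As a minimal sketch of how such scores can be computed from a ranked edge list, the example below uses a pairwise-comparison AUROC and average precision as an estimate of the precision-recall area. The function names and the toy scores are illustrative assumptions, not the authors' code or data.

```python
def auroc(scores, labels):
    """Area under the ROC curve as the fraction of (true, false) edge pairs
    in which the true edge gets the higher score (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Average precision: mean of the precision values at each true edge,
    a common estimate of the area under the precision-recall curve."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical example: four predicted edges, two of which are known interactions.
toy_scores = [0.9, 0.8, 0.7, 0.6]
toy_labels = [1, 0, 1, 0]
print(auroc(toy_scores, toy_labels), average_precision(toy_scores, toy_labels))
```

    Both functions assume at least one known and one unknown interaction in the list; otherwise they divide by zero.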

  6. Gene regulatory networks elucidating Huanglongbing disease mechanisms

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next-generation sequencing was exploited to gain deeper insight into the response to infection by Candidatus Liberibacter asiaticus (CaLas), especially the immune dysregulation and metabolic dysfunction caused by source-sink disruption. Previous fruit transcriptome data were compared with additional...

  7. Gene Regulatory Evolution During Speciation in a Songbird

    PubMed Central

    Davidson, John H.; Balakrishnan, Christopher N.

    2016-01-01

    Over the last decade, tremendous progress has been made toward a comparative understanding of gene regulatory evolution. However, we know little about how gene regulation evolves in birds, and how divergent genomes interact in their hybrids. Because of the unique features of birds – female heterogamety, a highly conserved karyotype, and the slow evolution of reproductive incompatibilities – an understanding of regulatory evolution in birds is critical to a comprehensive understanding of regulatory evolution and its implications for speciation. Using a novel complement of analyses of replicated RNA-seq libraries, we demonstrate abundant divergence in brain gene expression between zebra finch (Taeniopygia guttata) subspecies. By comparing parental populations and their F1 hybrids, we also show that gene misexpression is relatively rare among brain-expressed transcripts in male birds. If this pattern is consistent across tissues and sexes, it may partially explain the slow buildup of postzygotic reproductive isolation observed in birds relative to other taxa. Although we expected that the action of genetic drift on the island-dwelling zebra finch subspecies would be manifested in a higher rate of trans regulatory divergence, we found that most divergence was in cis regulation, following a pattern commonly observed in other taxa. Thus, our study highlights both unique and shared features of avian regulatory evolution. PMID:26976438

  8. Gene Regulatory Evolution During Speciation in a Songbird.

    PubMed

    Davidson, John H; Balakrishnan, Christopher N

    2016-01-01

    Over the last decade, tremendous progress has been made toward a comparative understanding of gene regulatory evolution. However, we know little about how gene regulation evolves in birds, and how divergent genomes interact in their hybrids. Because of the unique features of birds - female heterogamety, a highly conserved karyotype, and the slow evolution of reproductive incompatibilities - an understanding of regulatory evolution in birds is critical to a comprehensive understanding of regulatory evolution and its implications for speciation. Using a novel complement of analyses of replicated RNA-seq libraries, we demonstrate abundant divergence in brain gene expression between zebra finch (Taeniopygia guttata) subspecies. By comparing parental populations and their F1 hybrids, we also show that gene misexpression is relatively rare among brain-expressed transcripts in male birds. If this pattern is consistent across tissues and sexes, it may partially explain the slow buildup of postzygotic reproductive isolation observed in birds relative to other taxa. Although we expected that the action of genetic drift on the island-dwelling zebra finch subspecies would be manifested in a higher rate of trans regulatory divergence, we found that most divergence was in cis regulation, following a pattern commonly observed in other taxa. Thus, our study highlights both unique and shared features of avian regulatory evolution. PMID:26976438

  9. Pl-Bh, an Anthocyanin Regulatory Gene of Maize That Leads to Variegated Pigmentation

    PubMed Central

    Cocciolone, S. M.; Cone, K. C.

    1993-01-01

    Anthocyanins are purple pigments that can be produced in virtually all parts of the maize plant. The spatial distribution of anthocyanin synthesis is dictated by the organ-specific expression of a few regulatory genes that control the transcription of the structural genes. The regulatory genes are grouped into families based on functional identity and DNA sequence similarity. The C1/Pl gene family consists of C1, which controls pigmentation of the kernel, and Pl, which controls pigmentation of the vegetative and floral organs. We have determined the relationship of another gene, Blotched (Bh), to the C1 gene family. Bh was originally described as a gene that conditions blotches of pigmentation in kernels homozygous for recessive c1, suggesting that Bh could functionally replace C1 in the kernel. Our genetic and molecular analyses indicate that Bh is an allele of Pl, that we designate Pl-Bh. Pl-Bh differs from wild-type Pl alleles in two respects. In contrast to the uniform pigmentation observed in plants carrying Pl, the pattern of pigmentation in plants carrying Pl-Bh is variegated. Pl-Bh leads to variegated pigmentation in virtually all tissues of the plant, including the kernel, an organ not pigmented by other Pl alleles. To address the molecular basis for the unusual pattern of expression of Pl-Bh, we cloned and sequenced the gene. The nucleotide sequence of Pl-Bh showed only a single base-pair difference from that of Pl. However, genomic DNA sequences associated with Pl-Bh were found to be hypermethylated relative to the same sequences around the wild-type Pl allele. The methylation was inversely correlated with Pl mRNA levels in variegated plant tissues. Thus, we conclude that DNA methylation may play a role in regulating Pl-Bh expression. PMID:7694886

  10. Gene Regulatory Networks Activated during Chronic Tuberculosis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Chronic tuberculosis represents a burden for most of the world’s population. Several genes were found to be up-regulated at the late stage of chronic tuberculosis when a DNA microarray protocol was used to analyze murine tuberculosis. Rv0348 is a potential transcriptional regulator that is highly expresse...

  11. Functional Evolution of cis-Regulatory Modules at a Homeotic Gene in Drosophila

    PubMed Central

    Schiller, Benjamin J.; Bae, Esther; Tran, Diana A.; Shur, Andrey S.; Allen, John M.; Rau, Christoph; Bender, Welcome; Fisher, William W.; Celniker, Susan E.; Drewell, Robert A.

    2009-01-01

    It is a long-held belief in evolutionary biology that the rate of molecular evolution for a given DNA sequence is inversely related to the level of functional constraint. This belief holds true for the protein-coding homeotic (Hox) genes originally discovered in Drosophila melanogaster. Expression of the Hox genes in Drosophila embryos is essential for body patterning and is controlled by an extensive array of cis-regulatory modules (CRMs). How the regulatory modules functionally evolve in different species is not clear. A comparison of the CRMs for the Abdominal-B gene from different Drosophila species reveals relatively low levels of overall sequence conservation. However, embryonic enhancer CRMs from other Drosophila species direct transgenic reporter gene expression in the same spatial and temporal patterns during development as their D. melanogaster orthologs. Bioinformatic analysis reveals the presence of short conserved sequences within defined CRMs, representing gap and pair-rule transcription factor binding sites. One predicted binding site for the gap transcription factor KRUPPEL in the IAB5 CRM was found to be altered in Superabdominal (Sab) mutations. In Sab mutant flies, the third abdominal segment is transformed into a copy of the fifth abdominal segment. A model for KRUPPEL-mediated repression at this binding site is presented. These findings challenge our current understanding of the relationship between sequence evolution at the molecular level and functional activity of a CRM. While the overall sequence conservation at Drosophila CRMs is not distinctive from neighboring genomic regions, functionally critical transcription factor binding sites within embryonic enhancer CRMs are highly conserved. These results have implications for understanding mechanisms of gene expression during embryonic development, enhancer function, and the molecular evolution of eukaryotic regulatory modules. PMID:19893611

  12. The structure and function of the regulatory elements of the Escherichia coli uvrB gene.

    PubMed Central

    van den Berg, E; Zwetsloot, J; Noordermeer, I; Pannekoek, H; Dekker, B; Dijkema, R; van Ormondt, H

    1981-01-01

    The construction and properties of recombinant plasmids carrying the Escherichia coli uvrB gene, including its transcriptional and translational regulatory elements, are reported. The DNA sequence of the region, which governs the expression of the uvrB gene, has been determined. Within this sequence two non-overlapping DNA segments match the model sequence for Escherichia coli promoters (1). The '-10 regions' and the '-35 regions' of the proposed uvrB promoters are, respectively, 5'TAAAAT (P1), 5'TATAAT (P2) and 5'TTGGCA (P1), 5'GTGATG (P2). The existence and the position of these promoters have been established by elimination of one promoter (P2), using molecular cloning procedures, by length measurements of in vitro synthesized 'run-off' transcripts and by protection of the uvrB regulatory region from S1 nuclease digestion using in vivo made RNA. Potential sites of interaction within the uvrB regulatory region with regulatory proteins, such as the LexA protein (2) and the UvrC protein (3) are discussed. PMID:6273801
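
    To illustrate what matching "the model sequence for Escherichia coli promoters" can mean in practice, the sketch below counts position-by-position agreement between the hexamers quoted in the abstract and the textbook sigma-70 consensus hexamers (TATAAT at -10, TTGACA at -35). The consensus values and the simple match count are assumptions added here for illustration; the abstract does not state which scoring the authors used.

```python
# Textbook sigma-70 consensus hexamers (an assumption; not given in the abstract).
CONSENSUS = {"-10": "TATAAT", "-35": "TTGACA"}

# Hexamers quoted in the abstract for the two proposed uvrB promoters.
UVRB = {
    "P1": {"-10": "TAAAAT", "-35": "TTGGCA"},
    "P2": {"-10": "TATAAT", "-35": "GTGATG"},
}


def matches(hexamer, consensus):
    """Count positions at which a hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(hexamer, consensus))


def score_promoters(promoters=UVRB, consensus=CONSENSUS):
    """Return, per promoter and region, the number of consensus matches (0-6)."""
    return {
        name: {region: matches(seq, consensus[region]) for region, seq in regions.items()}
        for name, regions in promoters.items()
    }


for name, scores in score_promoters().items():
    print(name, scores)
```

    A plain match count ignores spacer length and position-specific weights, so it is only a rough first look at promoter similarity.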

  13. A cis-regulatory sequence from a short intergenic region gives rise to a strong microbe-associated molecular pattern-responsive synthetic promoter.

    PubMed

    Lehmeyer, Mona; Hanko, Erik K R; Roling, Lena; Gonzalez, Lilian; Wehrs, Maren; Hehl, Reinhard

    2016-06-01

    The high gene density in Arabidopsis thaliana leaves only relatively short intergenic regions for potential cis-regulatory sequences. To learn more about the regulation of genes harbouring only very short upstream intergenic regions, this study investigates a recently identified novel microbe-associated molecular pattern (MAMP)-responsive cis-sequence located within the 101 bp long intergenic region upstream of the At1g13990 gene. It is shown that the cis-regulatory sequence is sufficient for MAMP-responsive reporter gene activity in the context of its native promoter. The 3' UTR of the upstream gene has a quantitative effect on gene expression. In the context of a synthetic promoter, the cis-sequence is shown to achieve a strong increase in reporter gene activity as a monomer, dimer and tetramer. Mutation analysis of the cis-sequence determined the specific nucleotides required for gene expression activation. In transgenic A. thaliana the synthetic promoter harbouring a tetramer of the cis-sequence not only drives strong pathogen-responsive reporter gene expression but also shows a high background activity. The results of this study contribute to our understanding of how genes with very short upstream intergenic regions are regulated and how these regions can serve as a source for MAMP-responsive cis-sequences for synthetic promoter design. PMID:26833485

  14. Cloning, sequencing, and expression of bacteriophage BF23 late genes 24 and 25 encoding tail proteins.

    PubMed Central

    Nakayama, S; Kaneko, T; Ishimaru, H; Moriwaki, H; Mizobuchi, K

    1994-01-01

    Two bacteriophage BF23 late genes, genes 24 and 25, were isolated on a 7.4-kb PstI fragment from the phage DNA, and their nucleotide sequences were determined. Gene 24 encodes a minor tail protein with the expected M(r) of 34,309, and gene 25 located 4 bp upstream of gene 24 encodes a major tail protein with the expected M(r) of 50,329. When total cellular RNA isolated from either phage-infected cells or cells bearing the cloned genes was analyzed by the primer extension method using the primers specific to either gene 25 or gene 24, we identified a possible late gene promoter, designated P25, in the 5'-flanking region of gene 25. This promoter was similar in structure to Escherichia coli promoters for sigma 70. Studies of the translational gene 25- and gene 24-lacZ fusions in the cloned gene system revealed that the promoter P25 was responsible for the expression of both genes 25 and 24 even in the absence of the regulatory genes which were absolutely required for late gene expression in the normal phage-infected cells. These results indicate that the two genes constitute an operon under the control of P25 and that the regulatory gene products of BF23 do not participate directly in specifying the late gene promoter. PMID:7961500

  15. Identification and characterization of the afsR homologue regulatory gene from Streptomyces peucetius ATCC 27952.

    PubMed

    Parajuli, Niranjan; Viet, Hung Trinh; Ishida, Kenji; Tong, Hang Thi; Lee, Hei Chan; Liou, Kwangkyoung; Sohng, Jae Kyung

    2005-01-01

    We have isolated an afsR homologue, called afsR-p, through genome analysis of Streptomyces peucetius ATCC 27952. AfsR-p shares 60% sequence identity with AfsR from Streptomyces coelicolor A3(2). afsR-p was expressed under the control of the ermE* promoter in its hosts S. peucetius, Streptomyces lividans TK 24, Streptomyces clavuligerus and Streptomyces griseus. We observed overproduction of doxorubicin (4-fold) in S. peucetius, gamma-actinorhodin (2.6-fold) in S. lividans, clavulanic acid (1.5-fold) in S. clavuligerus and streptomycin (slight) in S. griseus. Overproduction was due to expression of the gene in these strains as compared to the wild-type strains harboring the vector only. Comparative study of the expression of afsR-p revealed that regulatory networking in Streptomyces is not uniform. We speculate that phosphorylated AfsR-p becomes bound to the promoter region of afsS. The latter activates other regulatory genes, including pathway regulatory genes, and induces the production of secondary metabolites including antibiotics. We identified specific conserved amino acids and exploited them for the isolation of the partial sequence of the afsR homologue from S. clavuligerus and Streptomyces achromogenes (rubradirin producer). Such findings provide additional evidence for the presence of a serine/threonine and tyrosine kinase-dependent global regulatory network in Streptomyces. PMID:15921897

  16. A Genome-Wide Regulatory Framework Identifies Maize Pericarp Color1 Controlled Genes

    PubMed Central

    Morohashi, Kengo; Casas, María Isabel; Ferreyra, Lorena Falcone; Mejía-Guerra, María Katherine; Pourcel, Lucille; Yilmaz, Alper; Feller, Antje; Carvalho, Bruna; Emiliani, Julia; Rodriguez, Eduardo; Pellegrinet, Silvina; McMullen, Michael; Casati, Paula; Grotewold, Erich

    2012-01-01

    Pericarp Color1 (P1) encodes an R2R3-MYB transcription factor responsible for the accumulation of insecticidal flavones in maize (Zea mays) silks and red phlobaphene pigments in pericarps and other floral tissues, which makes P1 an important visual marker. Using genome-wide expression analyses (RNA sequencing) in pericarps and silks of plants with contrasting P1 alleles combined with chromatin immunoprecipitation coupled with high-throughput sequencing, we show here that the regulatory functions of P1 are much broader than the activation of genes corresponding to enzymes in a branch of flavonoid biosynthesis. P1 modulates the expression of several thousand genes, and ∼1500 of them were identified as putative direct targets of P1. Among them, we identified F2H1, corresponding to a P450 enzyme that converts naringenin into 2-hydroxynaringenin, a key branch point in the P1-controlled pathway and the first step in the formation of insecticidal C-glycosyl flavones. Unexpectedly, the binding of P1 to gene regulatory regions can result in both gene activation and repression. Our results indicate that P1 is the major regulator for a set of genes involved in flavonoid biosynthesis and a minor modulator of the expression of a much larger gene set that includes genes involved in primary metabolism and production of other specialized compounds. PMID:22822204

  17. Asymmetric Regulation of Peripheral Genes by Two Transcriptional Regulatory Networks

    PubMed Central

    Li, Jing-Ru; Suzuki, Takahiro; Nishimura, Hajime; Kishima, Mami; Maeda, Shiori; Suzuki, Harukazu

    2016-01-01

    Transcriptional regulatory network (TRN) reconstitution and deconstruction occur simultaneously during reprogramming; however, it remains unclear how the starting and targeting TRNs regulate the induction and suppression of peripheral genes. Here we analyzed the regulation using direct cell reprogramming from human dermal fibroblasts to monocytes as the platform. We simultaneously deconstructed fibroblastic TRN and reconstituted monocytic TRN; monocytic and fibroblastic gene expression were analyzed in comparison with that of fibroblastic TRN deconstruction only or monocytic TRN reconstitution only. Global gene expression analysis showed cross-regulation of TRNs. Detailed analysis revealed that knocking down fibroblastic TRN positively affected half of the upregulated monocytic genes, indicating that intrinsic fibroblastic TRN interfered with the expression of induced genes. In contrast, reconstitution of monocytic TRN showed neutral effects on the majority of fibroblastic gene downregulation. This study provides an explicit example that demonstrates how two networks together regulate gene expression during cell reprogramming processes and contributes to the elaborate exploration of TRNs. PMID:27483142

  18. Timing of flagellar gene expression in the Caulobacter cell cycle is determined by a transcriptional cascade of positive regulatory genes.

    PubMed Central

    Ohta, N; Chen, L S; Mullin, D A; Newton, A

    1991-01-01

    The Caulobacter crescentus flagellar (fla) genes are organized in a regulatory hierarchy in which genes at each level are required for expression of those at the next lower level. To determine the role of this hierarchy in the timing of fla gene expression, we have examined the organization and cell cycle regulation of genes located in the hook gene cluster. As shown here, this cluster is organized into four multicistronic transcription units flaN, flbG, flaO, and flbF that contain fla genes plus a fifth transcription unit II.1 of unknown function. Transcription unit II.1 is regulated independently of the fla gene hierarchy, and it is expressed with a unique pattern of periodicity very late in the cell cycle. The flaN, flbG, and flaO operons are all transcribed periodically, and flaO, which is near the top of the hierarchy and required in trans for the activation of flaN and flbG operons, is expressed earlier in the cell cycle than the other two transcription units. We have shown that delaying flaO transcription by fusing it to the II.1 promoter also delayed the subsequent expression of the flbG operon and the 27- and 25-kDa flagellin genes that are at the bottom of the regulatory hierarchy. Thus, the sequence and timing of fla gene expression in the cell cycle are determined in large measure by the positions of these genes in the regulatory hierarchy. These results also suggest that periodic transcription is a general feature of fla gene expression in C. crescentus. PMID:1847367

  19. Multicolor labeling in developmental gene regulatory network analysis.

    PubMed

    Sethi, Aditya J; Angerer, Robert C; Angerer, Lynne M

    2014-01-01

    The sea urchin embryo is an important model system for developmental gene regulatory network (GRN) analysis. This chapter describes the use of multicolor fluorescent in situ hybridization (FISH) as well as a combination of FISH and immunohistochemistry in sea urchin embryonic GRN studies. The methods presented here can be applied to a variety of experimental settings where accurate spatial resolution of multiple gene products is required for constructing a developmental GRN. PMID:24567220

  20. The Transcriptional and Gene Regulatory Network of Lactococcus lactis MG1363 during Growth in Milk

    PubMed Central

    de Jong, Anne; Hansen, Morten E.; Kuipers, Oscar P.; Kilstrup, Mogens; Kok, Jan

    2013-01-01

    In the present study we examine the changes in the expression of genes of Lactococcus lactis subspecies cremoris MG1363 during growth in milk. To reveal which specific classes of genes (pathways, operons, regulons, COGs) are important, we performed a transcriptome time series experiment. Global analysis of gene expression over time showed that L. lactis adapted quickly to the environmental changes. Using upstream sequences of genes with correlated gene expression profiles, we uncovered a substantial number of putative DNA binding motifs that may be relevant for L. lactis fermentative growth in milk. All available novel and literature-derived data were integrated into network reconstruction building blocks, which were used to reconstruct and visualize the L. lactis gene regulatory network. This network enables easy mining in the chrono-transcriptomics data. A freely available website at http://milkts.molgenrug.nl gives full access to all transcriptome data, to the reconstructed network and to the individual network building blocks. PMID:23349698

  1. A gene regulatory network armature for T-lymphocyte specification

    SciTech Connect

    Fung, Elizabeth-sharon

    2008-01-01

    Choice of a T-lymphoid fate by hematopoietic progenitor cells depends on sustained Notch-Delta signaling combined with tightly-regulated activities of multiple transcription factors. To dissect the regulatory network connections that mediate this process, we have used high-resolution analysis of regulatory gene expression trajectories from the beginning to the end of specification; tests of the short-term Notch-dependence of these gene expression changes; and perturbation analyses of the effects of overexpression of two essential transcription factors, namely PU.1 and GATA-3. Quantitative expression measurements of >50 transcription factor and marker genes have been used to derive the principal components of regulatory change through which T-cell precursors progress from primitive multipotency to T-lineage commitment. Distinct parts of the path reveal separate contributions of Notch signaling, GATA-3 activity, and downregulation of PU.1. Using BioTapestry, the results have been assembled into a draft gene regulatory network for the specification of T-cell precursors and the choice of T as opposed to myeloid dendritic or mast-cell fates. This network also accommodates effects of E proteins and mutual repression circuits of Gfi1 against Egr-2 and of TCF-1 against PU.1 as proposed elsewhere, but requires additional functions that remain unidentified. Distinctive features of this network structure include the intense dose-dependence of GATA-3 effects; the gene-specific modulation of PU.1 activity based on Notch activity; the lack of direct opposition between PU.1 and GATA-3; and the need for a distinct, late-acting repressive function or functions to extinguish stem and progenitor-derived regulatory gene expression.

  2. Efficient experimental design for uncertainty reduction in gene regulatory networks

    PubMed Central

    2015-01-01

    Background: An accurate understanding of interactions among genes plays a major role in developing therapeutic intervention methods. Gene regulatory networks often contain a significant amount of uncertainty. The process of prioritizing biological experiments to reduce the uncertainty of gene regulatory networks is called experimental design. Under such a strategy, the experiments with high priority are suggested to be conducted first. Results: The authors have already proposed an optimal experimental design method based upon the objective for modeling gene regulatory networks, such as deriving therapeutic interventions. The experimental design method utilizes the concept of mean objective cost of uncertainty (MOCU). MOCU quantifies the expected increase of cost resulting from uncertainty. The optimal experiment to be conducted first is the one which leads to the minimum expected remaining MOCU subsequent to the experiment. In the process, one must find the optimal intervention for every gene regulatory network compatible with the prior knowledge, which can be prohibitively expensive when the size of the network is large. In this paper, we propose a computationally efficient experimental design method. This method incorporates a network reduction scheme by introducing a novel cost function that takes into account the disruption in the ranking of potential experiments. We then estimate the approximate expected remaining MOCU at a lower computational cost using the reduced networks. Conclusions: Simulation results based on synthetic and real gene regulatory networks show that the performance of the proposed approximate method is close to that of the optimal method but at lower computational cost. The proposed approximate method also outperforms the random selection policy significantly. MATLAB software implementing the proposed experimental design method is available at http://gsp.tamu.edu/Publications/supplementary/roozbeh15a/. PMID:26423515
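
    The two statements in the abstract above (MOCU as the expected increase in cost due to uncertainty, and the first experiment as the one minimizing the expected remaining MOCU) can be written compactly as follows. This is a hedged sketch: the symbols are assumptions introduced here, with \Theta the set of networks compatible with prior knowledge, C_\theta(\psi) the cost of intervention \psi on network \theta, \psi_\theta the optimal intervention for \theta, \psi^{*} the intervention that is best on average over \Theta, and x_i the outcome of candidate experiment i.

```latex
% MOCU: expected extra cost of using the robust intervention \psi^{*}
% instead of the network-specific optimum \psi_\theta.
M(\Theta) = \mathbb{E}_{\theta \in \Theta}\left[ C_\theta(\psi^{*}) - C_\theta(\psi_\theta) \right]

% Experiment selection: choose the experiment whose outcome is expected
% to leave the smallest remaining MOCU.
i^{*} = \arg\min_{i} \; \mathbb{E}_{x_i}\left[ M(\Theta \mid x_i) \right]
```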

  3. Function does not follow form in gene regulatory circuits

    PubMed Central

    Payne, Joshua L.; Wagner, Andreas

    2015-01-01

    Gene regulatory circuits are to the cell what arithmetic logic units are to the chip: fundamental components of information processing that map an input onto an output. Gene regulatory circuits come in many different forms, distinct structural configurations that determine who regulates whom. Studies that have focused on the gene expression patterns (functions) of circuits with a given structure (form) have examined just a few structures or gene expression patterns. Here, we use a computational model to exhaustively characterize the gene expression patterns of nearly 17 million three-gene circuits in order to systematically explore the relationship between circuit form and function. Three main conclusions emerge. First, function does not follow form. A circuit of any one structure can have between twelve and nearly thirty thousand distinct gene expression patterns. Second, and conversely, form does not follow function. Most gene expression patterns can be realized by more than one circuit structure. And third, multifunctionality severely constrains circuit form. The number of circuit structures able to drive multiple gene expression patterns decreases rapidly with the number of these patterns. These results indicate that it is generally not possible to infer circuit function from circuit form, or vice versa. PMID:26290154

  4. Charting gene regulatory networks: strategies, challenges and perspectives

    PubMed Central

    2004-01-01

    One of the foremost challenges in the post-genomic era will be to chart the gene regulatory networks of cells, including aspects such as genome annotation, identification of cis-regulatory elements and transcription factors, information on protein–DNA and protein–protein interactions, and data mining and integration. Some of these broad sets of data have already been assembled for building networks of gene regulation. Even though these datasets are still far from comprehensive, and the approach faces many important and difficult challenges, some strategies have begun to make connections between disparate regulatory events and to foster new hypotheses. In this article we review several different genomics and proteomics technologies, and present bioinformatics methods for exploring these data in order to make novel discoveries. PMID:15080794

  5. Molecular characterization of a maize regulatory gene

    SciTech Connect

    Wessler, S.R.

    1991-12-01

    Based on initial bombardment studies we have previously concluded that promoter diversity was responsible for the diversity of naturally occurring R alleles. During this period we have found that R is controlled at the level of translation initiation and intron 1 is alternatively spliced. The experiments described in Sections 1 and 2 sought to quantify these effects and to determine whether they contribute to the tissue specific expression of select R alleles. This study was done because very little is understood about the post-transcriptional regulation of plant genes. Section 3 and 4 describe experiments designed to identify important structural components of the R protein.

  6. Compartmentalized gene regulatory network of the pathogenic fungus Fusarium graminearum

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Head blight caused by Fusarium graminearum (Fg) is a major limiting factor of wheat production with both yield loss and mycotoxin contamination. Here we report a model for global Fg gene regulatory networks (GRNs) inferred from a large collection of transcriptomic data using a machine-learning appro...

  7. Data- and knowledge-based modeling of gene regulatory networks: an update

    PubMed Central

    Linde, Jörg; Schulze, Sylvie; Henkel, Sebastian G.; Guthke, Reinhard

    2015-01-01

    Gene regulatory network inference is a systems biology approach which predicts interactions between genes with the help of high-throughput data. In this review, we present current and updated network inference methods focusing on novel techniques for data acquisition, network inference assessment, network inference for interacting species and the integration of prior knowledge. Following the advent of Next-Generation-Sequencing of cDNAs derived from RNA samples (RNA-Seq), we discuss in detail its application to network inference. Furthermore, we present progress for large-scale or even full-genomic network inference as well as for small-scale condensed network inference and review advances in the evaluation of network inference methods by crowdsourcing. Finally, we reflect on the current availability of data and prior knowledge sources and give an outlook for the inference of gene regulatory networks that reflect interacting species, in particular pathogen-host interactions. PMID:27047314

  8. Regulatory hotspots are associated with plant gene expression under varying soil phosphorus supply in Brassica rapa.

    PubMed

    Hammond, John P; Mayes, Sean; Bowen, Helen C; Graham, Neil S; Hayden, Rory M; Love, Christopher G; Spracklen, William P; Wang, Jun; Welham, Sue J; White, Philip J; King, Graham J; Broadley, Martin R

    2011-07-01

    Gene expression is a quantitative trait that can be mapped genetically in structured populations to identify expression quantitative trait loci (eQTL). Genes and regulatory networks underlying complex traits can subsequently be inferred. Using a recently released genome sequence, we have defined cis- and trans-eQTL and their environmental response to low phosphorus (P) availability within a complex plant genome and found hotspots of trans-eQTL within the genome. Interval mapping, using P supply as a covariate, revealed 18,876 eQTL. trans-eQTL hotspots occurred on chromosomes A06 and A01 within Brassica rapa; these were enriched with P metabolism-related Gene Ontology terms (A06) as well as chloroplast- and photosynthesis-related terms (A01). We have also attributed heritability components to measures of gene expression across environments, allowing the identification of novel gene expression markers and gene expression changes associated with low P availability. Informative gene expression markers were used to map eQTL and P use efficiency-related QTL. Genes responsive to P supply had large environmental and heritable variance components. Regulatory loci and genes associated with P use efficiency identified through eQTL analysis are potential targets for further characterization and may have potential for crop improvement. PMID:21527424

  9. Inferring slowly-changing dynamic gene-regulatory networks

    PubMed Central

    2015-01-01

    Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in isolation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a class of models that connect the network with conditional independence relationships between random variables. By interpreting these random variables as gene activities and the conditional independence relationships as functional non-relatedness, graphical models have been used to describe gene-regulatory networks. Whereas the literature has been focused on static networks, most time-course experiments are designed in order to tease out temporal changes in the underlying network. It is typically reasonable to assume that changes in genomic networks are few, because biological systems tend to be stable. We introduce a new model for estimating slow changes in dynamic gene-regulatory networks, which is suitable for high-dimensional data, e.g. time-course microarray data. Our aim is to estimate a dynamically changing genomic network based on temporal activity measurements of the genes in the network. Our method is based on a penalized likelihood with an ℓ1-norm penalty that penalizes conditional dependencies between genes as well as differences between conditional independence elements across time points. We also present a heuristic search strategy to find optimal tuning parameters. We rewrite the penalized maximum likelihood problem as a standard convex optimization problem subject to linear equality constraints. We show that our method performs well in simulation studies. Finally, we apply the proposed model to a time-course T-cell dataset. PMID:25917062
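
    As an illustration of the kind of objective this abstract describes, one common way to write an ℓ1-penalized likelihood for a time-varying Gaussian graphical model is sketched below. The symbols are assumptions introduced here and do not come from the abstract: \Theta_t is the precision (inverse covariance) matrix at time point t, S_t is the corresponding sample covariance, and \lambda_1, \lambda_2 are tuning parameters.

```latex
\hat{\Theta}_1,\dots,\hat{\Theta}_T
  = \arg\max_{\Theta_1,\dots,\Theta_T \succ 0}
    \sum_{t=1}^{T} \bigl[ \log\det\Theta_t - \operatorname{tr}(S_t\Theta_t) \bigr]
    - \lambda_1 \sum_{t=1}^{T} \lVert \Theta_t \rVert_1
    - \lambda_2 \sum_{t=2}^{T} \lVert \Theta_t - \Theta_{t-1} \rVert_1
```

    The first penalty favors sparse networks at each time point; the second favors few changes between consecutive time points. The exact formulation used in the paper may differ.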

  10. Epidermal differentiation gene regulatory networks controlled by MAF and MAFB.

    PubMed

    Labott, Andrew T; Lopez-Pajares, Vanessa

    2016-06-01

    Numerous regulatory factors in epidermal differentiation and their role in regulating different cell states have been identified in recent years. However, the genetic interactions between these regulators over the dynamic course of differentiation have not been studied. In this Extra-View article, we review recent work by Lopez-Pajares et al. that explores a new regulatory network in epidermal differentiation. They analyze the changing transcriptome throughout epidermal regeneration to identify 3 separate gene sets enriched in the progenitor, early and late differentiation states. Using expression module mapping, MAF, along with MAFB, is identified as a transcription factor essential for epidermal differentiation. Through double knock-down of MAF:MAFB using siRNA and CRISPR/Cas9-mediated knockout, epidermal differentiation was shown to be impaired both in vitro and in vivo, confirming the role of MAF:MAFB in activating genes that drive differentiation. Lopez-Pajares and collaborators integrated 42 published regulator gene sets and the MAF:MAFB gene set into the dynamic differentiation gene expression landscape and found that lncRNAs TINCR and ANCR act as upstream regulators of MAF:MAFB. Furthermore, ChIP-seq analysis of MAF:MAFB identified key transcription factor genes linked to epidermal differentiation as downstream effectors. Combined, these findings illustrate a dynamically regulated network with MAF:MAFB as a crucial link for progenitor gene repression and differentiation gene activation. PMID:27097296

  11. Establishing the Architecture of Plant Gene Regulatory Networks.

    PubMed

    Yang, F; Ouma, W Z; Li, W; Doseff, A I; Grotewold, E

    2016-01-01

    Gene regulatory grids (GRGs) encompass the space of all the possible transcription factor (TF)-target gene interactions that regulate gene expression, with gene regulatory networks (GRNs) representing a temporal and spatial manifestation of a portion of the GRG, essential for the specification of gene expression. Thus, understanding GRG architecture provides a valuable tool to explain how genes are expressed in an organism, an important aspect of synthetic biology and essential toward the development of the "in silico" cell. Progress has been made in some unicellular model systems (eg, yeast), but significant challenges remain in more complex multicellular organisms such as plants. Key to understanding the organization of GRGs is therefore identifying the genes that TFs bind to, and control. The application of sensitive and high-throughput methods to investigate genome-wide TF-target gene interactions is providing a wealth of information that can be linked to important agronomic traits. We describe here the methods and resources that have been developed to investigate the architecture of plant GRGs and GRNs. We also provide information regarding where to obtain clones or other resources necessary for synthetic biology or metabolic engineering. PMID:27480690

  12. Implicit methods for qualitative modeling of gene regulatory networks.

    PubMed

    Garg, Abhishek; Mohanram, Kartik; De Micheli, Giovanni; Xenarios, Ioannis

    2012-01-01

    Advancements in high-throughput technologies to measure increasingly complex biological phenomena at the genomic level are rapidly changing the face of biological research from the single-gene single-protein experimental approach to studying the behavior of a gene in the context of the entire genome (and proteome). This shift in research methodologies has resulted in a new field of network biology that deals with modeling cellular behavior in terms of network structures such as signaling pathways and gene regulatory networks. In these networks, different biological entities such as genes, proteins, and metabolites interact with each other, giving rise to a dynamical system. Even though there exists a mature field of dynamical systems theory to model such network structures, some technical challenges are unique to biology such as the inability to measure precise kinetic information on gene-gene or gene-protein interactions and the need to model increasingly large networks comprising thousands of nodes. These challenges have renewed interest in developing new computational techniques for modeling complex biological systems. This chapter presents a modeling framework based on Boolean algebra and finite-state machines that are reminiscent of the approach used for digital circuit synthesis and simulation in the field of very-large-scale integration (VLSI). The proposed formalism enables a common mathematical framework to develop computational techniques for modeling different aspects of the regulatory networks such as steady-state behavior, stochasticity, and gene perturbation experiments. PMID:21938638
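
    The Boolean view of a regulatory network mentioned in this abstract can be illustrated with a toy example. The sketch below is not the chapter's method: it enumerates all states explicitly, which is only feasible for very small networks and is exactly what implicit representations are meant to avoid. The three genes, their update rules, and the function names are hypothetical and chosen only for illustration.

```python
from itertools import product

GENES = ("A", "B", "C")

# Hypothetical update rules: A and B repress each other, A activates C.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
    "C": lambda s: s["A"],
}


def step(state, knockouts=frozenset()):
    """Synchronously update every gene; knocked-out genes are forced off."""
    return {
        g: (False if g in knockouts else bool(RULES[g](state)))
        for g in GENES
    }


def steady_states(knockouts=frozenset()):
    """Return all fixed points (states unchanged by one update) as 0/1 tuples."""
    fixed = []
    for values in product([False, True], repeat=len(GENES)):
        state = dict(zip(GENES, values))
        # Skip states that contradict the perturbation (knocked-out gene on).
        if any(state[g] for g in knockouts):
            continue
        if step(state, knockouts) == state:
            fixed.append(tuple(int(state[g]) for g in GENES))
    return fixed


if __name__ == "__main__":
    print("wild type:", steady_states())
    print("A knocked out:", steady_states(frozenset({"A"})))
```

    In this toy network the mutual repression between A and B yields two steady states, and knocking out A leaves only the state in which B is on; a gene perturbation experiment is modeled simply by clamping a node.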

  13. Boosting heterologous protein production in transgenic dicotyledonous seeds using Phaseolus vulgaris regulatory sequences.

    PubMed

    De Jaeger, Geert; Scheffer, Stanley; Jacobs, Anni; Zambre, Mukund; Zobell, Oliver; Goossens, Alain; Depicker, Ann; Angenon, Geert

    2002-12-01

    Over the past decade, several high value proteins have been produced in different transgenic plant tissues such as leaves, tubers, and seeds. Despite recent advances, many heterologous proteins accumulate to low concentrations, and the optimization of expression cassettes to make in planta production and purification economically feasible remains critical. Here, the regulatory sequences of the seed storage protein gene arcelin 5-I (arc5-I) of common bean (Phaseolus vulgaris) were evaluated for producing heterologous proteins in dicotyledonous seeds. The murine single chain variable fragment (scFv) G4 (ref. 4) was chosen as model protein because of the current industrial interest in producing antibodies and derived fragments in crops. In transgenic Arabidopsis thaliana seed stocks, the scFv under control of the 35S promoter of the cauliflower mosaic virus (CaMV) accumulated to approximately 1% of total soluble protein (TSP). However, a set of seed storage promoter constructs boosted the scFv accumulation to exceptionally high concentrations, reaching no less than 36.5% of TSP in homozygous seeds. Even at these high concentrations, the scFv proteins had antigen-binding activity and affinity similar to those produced in Escherichia coli. The feasibility of heterologous protein production under control of arc5-I regulatory sequences was also demonstrated in Phaseolus acutifolius, a promising crop for large scale production. PMID:12415287

  14. How difficult is inference of mammalian causal gene regulatory networks?

    PubMed

    Djordjevic, Djordje; Yang, Andrian; Zadoorian, Armella; Rungrugeecharoen, Kevin; Ho, Joshua W K

    2014-01-01

    Gene regulatory networks (GRNs) play a central role in systems biology, especially in the study of mammalian organ development. One key question remains largely unanswered: Is it possible to infer mammalian causal GRNs using observable gene co-expression patterns alone? We assembled two mouse GRN datasets (embryonic tooth and heart) and matching microarray gene expression profiles to systematically investigate the difficulties of mammalian causal GRN inference. The GRNs were assembled based on > 2,000 pieces of experimental genetic perturbation evidence from manually reading > 150 primary research articles. Each piece of perturbation evidence records the qualitative change of the expression of one gene following knock-down or over-expression of another gene. Our data have thorough annotation of tissue types and embryonic stages, as well as the type of regulation (activation, inhibition and no effect), which uniquely allows us to estimate both sensitivity and specificity of the inference of tissue specific causal GRN edges. Using these unprecedented datasets, we found that gene co-expression does not reliably distinguish true positive from false positive interactions, making inference of GRN in mammalian development very difficult. Nonetheless, if we have expression profiling data from genetic or molecular perturbation experiments, such as gene knock-out or signalling stimulation, it is possible to use the set of differentially expressed genes to recover causal regulatory relationships with good sensitivity and specificity. Our result supports the importance of using perturbation experimental data in causal network reconstruction. Furthermore, we showed that causal gene regulatory relationship can be highly cell type or developmental stage specific, suggesting the importance of employing expression profiles from homogeneous cell populations. This study provides essential datasets and empirical evidence to guide the development of new GRN inference methods for

  15. Modularity and evolutionary constraints in a baculovirus gene regulatory network

    PubMed Central

    2013-01-01

    Background: The structure of regulatory networks remains an open question in our understanding of complex biological systems. Interactions during complete viral life cycles present unique opportunities to understand how host-parasite networks take shape and behave. The Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) is a large double-stranded DNA virus, whose genome may encode 152 open reading frames (ORFs). Here we present the analysis of the ordered cascade of AgMNPV gene expression. Results: We observed an earlier onset of expression than previously reported for other baculoviruses, especially for genes involved in DNA replication. Most ORFs were expressed at higher levels in a more permissive host cell line. Genes with more than one copy in the genome had distinct expression profiles, which could indicate the acquisition of new functionalities. The transcription gene regulatory network (GRN) for 149 ORFs had a modular topology comprising five communities of highly interconnected nodes that separated functionally related key genes into different communities, possibly maximizing redundancy and GRN robustness by compartmentalization of important functions. Core conserved functions showed expression synchronicity, distinct GRN features and significantly less genetic diversity, consistent with evolutionary constraints imposed on key elements of biological systems. This reduced genetic diversity also had a positive correlation with the importance of the gene in our estimated GRN, supporting a relationship between phylogenetic data of baculovirus genes and network features inferred from expression data. We also observed that gene arrangement in overlapping transcripts was conserved among related baculoviruses, suggesting a principle of genome organization. Conclusions: Albeit with a reduced number of nodes (149), the AgMNPV GRN had a topology and key characteristics similar to those observed in complex cellular organisms, which indicates

  16. Rhodobase, a meta-analytical tool for reconstructing gene regulatory networks in a model photosynthetic bacterium.

    PubMed

    Moskvin, Oleg V; Bolotin, Dmitry; Wang, Andrew; Ivanov, Pavel S; Gomelsky, Mark

    2011-02-01

    We present Rhodobase, a web-based meta-analytical tool for analysis of transcriptional regulation in a model anoxygenic photosynthetic bacterium, Rhodobacter sphaeroides. The gene association meta-analysis is based on the pooled data from 100 R. sphaeroides whole-genome DNA microarrays. Gene-centric regulatory networks were visualized using the StarNet approach (Jupiter, D.C., VanBuren, V., 2008. A visual data mining tool that facilitates reconstruction of transcription regulatory networks. PLoS ONE 3, e1717) with several modifications. We developed a means to identify and visualize operons and superoperons. We designed a framework for the cross-genome search for transcription factor binding sites that takes into account the high GC content and oligonucleotide usage profile characteristic of the R. sphaeroides genome. To facilitate reconstruction of directional relationships between co-regulated genes, we screened upstream sequences (-400 to +20 bp from start codons) of all genes for putative binding sites of bacterial transcription factors using a self-optimizing search method developed here. To test the performance of the meta-analysis tools and transcription factor site predictions, we reconstructed selected nodes of the R. sphaeroides transcription factor-centric regulatory matrix. The test revealed regulatory relationships that correlate well with the experimentally derived data. The database of transcriptional profile correlations, the network visualization engine and the optimized search engine for transcription factor binding site analysis are available at http://rhodobase.org. PMID:21070832
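
    The upstream window mentioned in this abstract (-400 to +20 bp relative to the start codon) can be extracted with a few lines of code. The sketch below is an illustration only, not Rhodobase's implementation; the function names, the 0-based coordinate convention, and the clipping behavior at sequence ends are all assumptions made here.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_window(genome, start, strand="+", up=400, down=20):
    """Return `up` bases before the start codon plus the first `down` bases of the gene.

    `start` is the 0-based forward-strand index of the first base of the
    start codon (for a minus-strand gene, the highest forward coordinate
    of the codon). The window is clipped at the sequence ends and is
    always returned in the gene's own reading direction.
    """
    if strand == "+":
        lo, hi = start - up, start + down
    else:
        lo, hi = start - down + 1, start + up + 1
    lo, hi = max(lo, 0), min(hi, len(genome))
    window = genome[lo:hi]
    return window if strand == "+" else reverse_complement(window)


if __name__ == "__main__":
    toy = "AAAAACCCATGGGTTT"  # toy sequence with ATG at index 8
    print(upstream_window(toy, 8, "+", up=3, down=3))
```

    With the defaults, each gene yields a window of up to 420 bases that could then be scanned for candidate binding sites by whatever motif search is in use.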

  17. Short DNA sequences inserted for gene targeting can accidentally interfere with off-target gene expression.

    PubMed

    Meier, Ingo D; Bernreuther, Christian; Tilling, Thomas; Neidhardt, John; Wong, Yong Wee; Schulze, Christian; Streichert, Thomas; Schachner, Melitta

    2010-06-01

    Targeting of genes in mice, a key approach to study development and disease, often leaves a neo cassette, loxP, or FRT sites inserted in the mouse genome. Insertion of neo can influence the expression of neighboring genes, but similar effects have not been reported for loxP sites. We therefore performed microarray analyses of mice in which the Ncam or the Tnr gene were targeted either by insertion of neo or loxP/FRT sites. In the case of Ncam, neo, but not loxP/FRT insertion, led to a 2-fold reduction in mRNA levels of 3 genes located at distances between 0.2 and 3.1 Mb from the target. In contrast, after introduction of loxP/FRT sites into introns of Tnr, we observed a 2.5- to 4-fold reduction in the transcript level of the Gas5 gene, 1.1 Mb away from Tnr, most probably due to disruption of a conserved regulatory element in Tnr. Insertion of short DNA sequences such as loxP/FRT can thus influence off-target mRNA levels if these sites are accidentally placed into regulatory elements. Our results imply that conditional knockout mice should be analyzed for genomic positional side effects that may influence the animals' phenotypes. PMID:20110269

  18. RNA Sequencing of Mouse Sinoatrial Node Reveals an Upstream Regulatory Role for Islet-1 in Cardiac Pacemaker Cells

    PubMed Central

    Vedantham, Vasanth; Galang, Giselle; Evangelista, Melissa; Deo, Rahul C.; Srivastava, Deepak

    2015-01-01

    Rationale: Treatment of sinus node disease with regenerative or cell-based therapies will require a detailed understanding of gene regulatory networks in cardiac pacemaker cells (PCs). Objective: To characterize the transcriptome of PCs using RNA sequencing, and to identify transcriptional networks responsible for PC gene expression. Methods and Results: We used laser capture micro-dissection (LCM) on a sinus node reporter mouse line to isolate RNA from PCs for RNA sequencing (RNA-Seq). Differential expression and network analysis identified novel SAN-enriched genes, and predicted that the transcription factor Islet-1 (Isl1) is active in developing pacemaker cells. RNA-Seq on SAN tissue lacking Isl1 established that Isl1 is an important transcriptional regulator within the developing SAN. Conclusions: (1) The PC transcriptome diverges sharply from other cardiomyocytes; (2) Isl1 is a positive transcriptional regulator of the PC gene expression program. PMID:25623957

  19. Maize anthocyanin regulatory gene pl is a duplicate of c1 that functions in the plant.

    PubMed

    Cone, K C; Cocciolone, S M; Burr, F A; Burr, B

    1993-12-01

    Genetic studies in maize have identified several regulatory genes that control the tissue-specific synthesis of purple anthocyanin pigments in the plant. c1 regulates pigmentation in the aleurone layer of the kernel, whereas pigmentation in the vegetative and floral tissues of the plant body depends on pl. c1 encodes a protein with the structural features of eukaryotic transcription factors and functions to control the accumulation of transcripts for the anthocyanin biosynthetic genes. Previous genetic and molecular observations have prompted the hypothesis that c1 and pl are functionally duplicate, in that they control the same set of anthocyanin structural genes but in distinct parts of the plant. Here, we show that this proposed functional similarity is reflected by DNA sequence homology between c1 and pl. Using a c1 DNA fragment as a hybridization probe, genomic and cDNA clones for pl were isolated. Comparison of pl and c1 cDNA sequences revealed that the genes encode proteins with 90% or more amino acid identity in the amino- and carboxyl-terminal domains that are known to be important for the regulatory function of the C1 protein. Consistent with the idea that the pl gene product also acts as a transcriptional activator is our finding that a functional pl allele is required for the transcription of at least three structural genes in the anthocyanin biosynthetic pathway. PMID:8305872

  20. Gap Gene Regulatory Dynamics Evolve along a Genotype Network

    PubMed Central

    Crombach, Anton; Wotton, Karl R.; Jiménez-Guri, Eva; Jaeger, Johannes

    2016-01-01

    Developmental gene networks implement the dynamic regulatory mechanisms that pattern and shape the organism. Over evolutionary time, the wiring of these networks changes, yet the patterning outcome is often preserved, a phenomenon known as “system drift.” System drift is illustrated by the gap gene network—involved in segmental patterning—in dipteran insects. In the classic model organism Drosophila melanogaster and the nonmodel scuttle fly Megaselia abdita, early activation and placement of gap gene expression domains show significant quantitative differences, yet the final patterning output of the system is essentially identical in both species. In this detailed modeling analysis of system drift, we use gene circuits which are fit to quantitative gap gene expression data in M. abdita and compare them with an equivalent set of models from D. melanogaster. The results of this comparative analysis show precisely how compensatory regulatory mechanisms achieve equivalent final patterns in both species. We discuss the larger implications of the work in terms of “genotype networks” and the ways in which the structure of regulatory networks can influence patterns of evolutionary change (evolvability). PMID:26796549

  1. Dynamic Gene Regulatory Networks Drive Hematopoietic Specification and Differentiation

    PubMed Central

    Goode, Debbie K.; Obier, Nadine; Vijayabaskar, M.S.; Lie-A-Ling, Michael; Lilly, Andrew J.; Hannah, Rebecca; Lichtinger, Monika; Batta, Kiran; Florkowska, Magdalena; Patel, Rahima; Challinor, Mairi; Wallace, Kirstie; Gilmour, Jane; Assi, Salam A.; Cauchy, Pierre; Hoogenkamp, Maarten; Westhead, David R.; Lacaud, Georges; Kouskoff, Valerie; Göttgens, Berthold; Bonifer, Constanze

    2016-01-01

    Summary: Metazoan development involves the successive activation and silencing of specific gene expression programs and is driven by tissue-specific transcription factors programming the chromatin landscape. To understand how this process executes an entire developmental pathway, we generated global gene expression, chromatin accessibility, histone modification, and transcription factor binding data from purified embryonic stem cell-derived cells representing six sequential stages of hematopoietic specification and differentiation. Our data reveal the nature of regulatory elements driving differential gene expression and inform how transcription factor binding impacts on promoter activity. We present a dynamic core regulatory network model for hematopoietic specification and demonstrate its utility for the design of reprogramming experiments. Functional studies motivated by our genome-wide data uncovered a stage-specific role for TEAD/YAP factors in mammalian hematopoietic specification. Our study presents a powerful resource for studying hematopoiesis and demonstrates how such data advance our understanding of mammalian development. PMID:26923725

  2. Dynamic Gene Regulatory Networks Drive Hematopoietic Specification and Differentiation.

    PubMed

    Goode, Debbie K; Obier, Nadine; Vijayabaskar, M S; Lie-A-Ling, Michael; Lilly, Andrew J; Hannah, Rebecca; Lichtinger, Monika; Batta, Kiran; Florkowska, Magdalena; Patel, Rahima; Challinor, Mairi; Wallace, Kirstie; Gilmour, Jane; Assi, Salam A; Cauchy, Pierre; Hoogenkamp, Maarten; Westhead, David R; Lacaud, Georges; Kouskoff, Valerie; Göttgens, Berthold; Bonifer, Constanze

    2016-03-01

    Metazoan development involves the successive activation and silencing of specific gene expression programs and is driven by tissue-specific transcription factors programming the chromatin landscape. To understand how this process executes an entire developmental pathway, we generated global gene expression, chromatin accessibility, histone modification, and transcription factor binding data from purified embryonic stem cell-derived cells representing six sequential stages of hematopoietic specification and differentiation. Our data reveal the nature of regulatory elements driving differential gene expression and inform how transcription factor binding impacts on promoter activity. We present a dynamic core regulatory network model for hematopoietic specification and demonstrate its utility for the design of reprogramming experiments. Functional studies motivated by our genome-wide data uncovered a stage-specific role for TEAD/YAP factors in mammalian hematopoietic specification. Our study presents a powerful resource for studying hematopoiesis and demonstrates how such data advance our understanding of mammalian development. PMID:26923725

  3. Transcriptomic sequencing reveals a set of unique genes activated by butyrate-induced histone modification

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Butyrate is a nutritional element with strong epigenetic regulatory activity as an inhibitor of histone deacetylases (HDACs). Based on the analysis of differentially expressed genes induced by butyrate in the bovine epithelial cell using deep RNA-sequencing technology (RNA-seq), a set of unique gen...

  4. Motif for controllable toggle switch in gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Zhao, Chen; Bin, Ao; Ye, Weiming; Fan, Ying; Di, Zengru

    2015-02-01

    Toggle switching, a common phenomenon in gene regulatory networks, has been recognized as important for biological functions. Despite much effort dedicated to understanding the toggle switch and designing synthetic biology circuits to achieve this biological function, we still lack a comprehensive understanding of the intrinsic dynamics behind the phenomenon and of the minimum structure that is imperative for producing a toggle switch. In this paper, we discover a minimum structure, a motif that enables a controllable toggle switch. In particular, the motif consists of a transformative double negative feedback loop (DNFL) that is regulated by an additional driver node. By enumerating all possible regulatory configurations from the driver node, we identify two types of motifs associated with the toggle switch, which is captured by the existence of bistable states. The toggle switch is controllable in the sense that the gap between the bistable states is adjustable as determined by the regulatory strength from the driver node. We test the effect of the motifs in a self-oscillating gene regulatory network (SON) with respect to the interplay between the motifs and the other genes, and find that the switching dynamics of the whole network can be successfully controlled insofar as the network contains a single motif. Our findings are important for uncovering the underlying nonlinear dynamics of the controllable toggle switch and can have implications for devising biological circuits in the field of synthetic biology.
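
    The bistability produced by a double negative feedback loop can be seen in a minimal simulation. The sketch below is not the model from the paper: it uses a textbook two-gene mutual-repression system with Hill-type repression, integrated with a simple Euler scheme, and it crudely stands in for the driver node by scaling the maximal production rate `alpha`. The equations, parameter values, and function names are all assumptions made for illustration.

```python
def simulate_toggle(alpha, u0, v0, n=2, dt=0.01, steps=5000):
    """Integrate du/dt = alpha/(1+v^n) - u and dv/dt = alpha/(1+u^n) - v.

    u and v are the levels of two mutually repressing genes. Returns the
    state reached after `steps` Euler steps of size `dt`.
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u
        dv = alpha / (1.0 + u ** n) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


def state_gap(alpha):
    """Gap between the high and low gene levels in the u-dominant state."""
    u, v = simulate_toggle(alpha, u0=2.5, v0=1.5)
    return u - v


if __name__ == "__main__":
    # Two initial conditions on either side of the diagonal end up in
    # two different stable states: the hallmark of a toggle switch.
    print(simulate_toggle(10.0, 2.5, 1.5))
    print(simulate_toggle(10.0, 1.5, 2.5))
    # A stronger (hypothetical) drive widens the gap between the states.
    print(state_gap(5.0), state_gap(10.0))
```

    Which gene ends up high depends only on which one starts ahead, and in this simplified setting the separation between the high and low levels grows with `alpha`, loosely mirroring the idea that regulatory strength tunes the gap between the bistable states.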

  5. Mapping gene regulatory circuitry of Pax6 during neurogenesis

    PubMed Central

    Thakurela, Sudhir; Tiwari, Neha; Schick, Sandra; Garding, Angela; Ivanek, Robert; Berninger, Benedikt; Tiwari, Vijay K

    2016-01-01

    Pax6 is a transcription factor that is highly conserved among vertebrates and is important in various aspects of central nervous system development. However, the gene regulatory circuitry of Pax6 underlying these functions remains elusive. We find that Pax6 targets a large number of promoters in neural progenitor cells. Intriguingly, many of these sites are also bound by another progenitor factor, Sox2, which cooperates with Pax6 in gene regulation. A combinatorial analysis of the Pax6-binding data set with transcriptome changes in Pax6-deficient neural progenitors reveals a dual role for Pax6, in which it activates the neuronal (ectodermal) genes while concurrently repressing the mesodermal and endodermal genes, thereby ensuring the unidirectionality of lineage commitment towards neuronal differentiation. Furthermore, Pax6 is critical for inducing the activity of transcription factors that elicit neurogenesis and for repressing others that promote non-neuronal lineages. In addition to many established downstream effectors, Pax6 directly binds and activates a number of genes that are specifically expressed in neural progenitors but have not been previously implicated in neurogenesis. The in utero knockdown of one such gene, Ift74, during brain development impairs polarity and migration of newborn neurons. These findings demonstrate new aspects of the gene regulatory circuitry of Pax6, revealing how it functions to control neuronal development at multiple levels to ensure unidirectionality and proper execution of the neurogenic program. PMID:27462442

  6. Topological origin of global attractors in gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Zhang, YunJun; Ouyang, Qi; Geng, Zhi

    2015-02-01

    Fixed-point attractors with global stability manifest themselves in a number of gene regulatory networks. This property indicates the stability of regulatory networks against small state perturbations and is closely related to other complex dynamics. In this paper, we aim to reveal the core modules in regulatory networks that determine their global attractors and the relationship between these core modules and other motifs. This work has been done in three steps. Firstly, inspired by signal transmission in the regulation process, we extract a chain-like network model from regulatory networks. We propose a module, the "ideal transmission chain (ITC)", which is proved to be sufficient and necessary (under certain conditions) to form a global fixed point in the context of chain-like networks. Secondly, by examining two well-studied regulatory networks (i.e., the cell-cycle regulatory networks of budding yeast and fission yeast), we identify the ideal modules in real regulatory networks and demonstrate that the modules make a superior contribution to network stability (quantified by the relative size of the biggest attraction basin). Thirdly, in these two regulatory networks, we find that the double negative feedback loops, which are the key motifs for forming bistability in regulation, are connected to these core modules with high network stability. These results shed new light on the connection between the topological features and the dynamic properties of regulatory networks.

  7. Repressive BMP2 gene regulatory elements near the BMP2 promoter

    SciTech Connect

    Jiang, Shan; Chandler, Ronald L.; Fritz, David T.; Mortlock, Douglas P.; Rogers, Melissa B.

    2010-02-05

    The level of bone morphogenetic protein 2 (BMP2) profoundly influences essential cell behaviors such as proliferation, differentiation, apoptosis, and migration. The spatial and temporal pattern of BMP2 synthesis, particularly in diverse embryonic cells, is highly varied and dynamic. We have identified GC-rich sequences within the BMP2 promoter region that strongly repress gene expression. These elements block the activity of a highly conserved osteoblast enhancer in response to FGF2 treatment. Both positive and negative gene regulatory elements control BMP2 synthesis. Detecting and mapping the repressive motifs is essential because they impede the identification of developmentally regulated enhancers necessary for normal BMP2 patterns and concentration.

  8. Gene structure, regulatory control, and evolution of black widow venom latrotoxins

    PubMed Central

    Bhere, Kanaka Varun; Haney, Robert A.; Ayoub, Nadia A.; Garb, Jessica E.

    2014-01-01

    Black widow venom contains α-latrotoxin, infamous for causing intense pain. Combining 33 kb of Latrodectus hesperus genomic DNA with RNA-Seq, we characterized the α-latrotoxin gene and discovered a paralog, 4.5 kb downstream. Both paralogs exhibit venom gland specific transcription, and may be regulated post-transcriptionally via musashi-like proteins. A 4 kb intron interrupts the α-latrotoxin coding sequence, while a 10 kb intron in the 3′ UTR of the paralog may cause nonsense-mediated decay. Phylogenetic analysis confirms these divergent latrotoxins diversified through recent tandem gene duplications. Thus, latrotoxin genes have more complex structures, regulatory controls, and sequence diversity than previously proposed. PMID:25217831

  9. EXAMINE: a computational approach to reconstructing gene regulatory networks.

    PubMed

    Deng, Xutao; Geng, Huimin; Ali, Hesham

    2005-08-01

    Reverse-engineering of gene networks using linear models often results in an underdetermined system because of excessive unknown parameters. In addition, the practical utility of linear models has remained unclear. We address these problems by developing an improved method, EXpression Array MINing Engine (EXAMINE), to infer gene regulatory networks from time-series gene expression data sets. EXAMINE takes advantage of sparse graph theory to overcome the excessive-parameter problem with an adaptive-connectivity model and fitting algorithm. EXAMINE also guarantees that the most parsimonious network structure will be found with its incremental adaptive fitting process. Compared to previous linear models, where a fully connected model is used, EXAMINE reduces the number of parameters by O(N), thereby increasing the chance of recovering the underlying regulatory network. The fitting algorithm increments the connectivity during the fitting process until a satisfactory fit is obtained. We performed a systematic study to explore the data mining ability of linear models. A guideline for using linear models is provided: If the system is small (3-20 elements), more than 90% of the regulation pathways can be determined correctly. For a large-scale system, either clustering is needed or it is necessary to integrate information in addition to expression profiles. Coupled with the clustering method, we applied EXAMINE to rat central nervous system (CNS) development data with 112 genes. We were able to efficiently generate regulatory networks with statistically significant pathways that have been predicted previously. PMID:15951103

  10. Genome-Wide Identification of Regulatory Elements and Reconstruction of Gene Regulatory Networks of the Green Alga Chlamydomonas reinhardtii under Carbon Deprivation

    PubMed Central

    Vischi Winck, Flavia; Arvidsson, Samuel; Riaño-Pachón, Diego Mauricio; Hempel, Sabrina; Koseska, Aneta; Nikoloski, Zoran; Urbina Gomez, David Alejandro; Rupprecht, Jens; Mueller-Roeber, Bernd

    2013-01-01

    The unicellular green alga Chlamydomonas reinhardtii is a long-established model organism for studies on photosynthesis and carbon metabolism-related physiology. Under conditions of air-level carbon dioxide concentration [CO2], a carbon concentrating mechanism (CCM) is induced to facilitate cellular carbon uptake. The CCM increases the availability of carbon dioxide at the site of cellular carbon fixation. To improve our understanding of the transcriptional control of the CCM, we employed FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements, followed by deep sequencing) to determine nucleosome-depleted chromatin regions of algal cells subjected to carbon deprivation. Our FAIRE data recapitulated the positions of known regulatory elements in the promoter of the periplasmic carbonic anhydrase (Cah1) gene, which is upregulated during CCM induction, and revealed new candidate regulatory elements at a genome-wide scale. In addition, time series expression patterns of 130 transcription factor (TF) and transcription regulator (TR) genes were obtained for cells cultured under photoautotrophic conditions and subjected to a shift from high to low [CO2]. Groups of co-expressed genes were identified and a putative directed gene-regulatory network underlying the CCM was reconstructed from the gene expression data using the recently developed IOTA (inner composition alignment) method. Among the candidate regulatory genes, two members of the MYB-related TF family, Lcr1 (Low-CO2 response regulator 1) and Lcr2 (Low-CO2 response regulator 2), may play an important role in down-regulating the expression of a particular set of TF and TR genes in response to low [CO2]. The results obtained provide new insights into the transcriptional control of the CCM and reveal more than 60 new candidate regulatory genes. Deep sequencing of nucleosome-depleted genomic regions indicated the presence of new, previously unknown regulatory elements in the C. reinhardtii genome. Our work can

  11. Gene therapy for cancer: regulatory considerations for approval

    PubMed Central

    Husain, S R; Han, J; Au, P; Shannon, K; Puri, R K

    2015-01-01

    The rapidly changing field of gene therapy promises a number of innovative treatments for cancer patients. Advances in genetic modification of cancer and immune cells and the use of oncolytic viruses and bacteria have led to numerous clinical trials for cancer therapy, with several progressing to late-stage product development. At the time of this writing, no gene therapy product has been approved by the United States Food and Drug Administration (FDA). Some of the key scientific and regulatory issues include understanding of gene transfer vector biology, safety of vectors in vitro and in animal models, optimum gene transfer, long-term persistence or integration in the host, shedding of a virus and ability to maintain transgene expression in vivo for a desired period of time. Because of the biological complexity of these products, the FDA encourages a flexible, data-driven approach for preclinical safety testing programs. The clinical trial design should be based on the unique features of gene therapy products, and should ensure the safety of enrolled subjects. This article focuses on regulatory considerations for gene therapy product development and also discusses guidance documents that have been published by the FDA. PMID:26584531

  12. Gene expression in maturing neurons: regulatory mechanisms and related neurodevelopmental disorders.

    PubMed

    Ding, Baojin

    2015-04-25

    During central nervous system (CNS) development, the interactions between intrinsic genes and the extrinsic environment ensure that each neuronal developmental stage (e.g., neuronal proliferation, differentiation, migration, axon extension, dendritogenesis and formation of functional synapses) occurs with the proper timing and sequence. This coordination requires that numerous groups of genes are exquisitely regulated in a spatiotemporal manner by various regulatory mechanisms, including sequence-specific DNA-binding proteins, histone modifications, DNA methylation, chromatin remodeling, and microRNAs (miRNAs). By targeting chromatin structure, transcription and translation processes, these mechanisms form a regulatory network to accomplish the fine regulation of gene expression in response to environmental stimuli at different developmental stages. Dysregulation of gene expression during neuronal development has been implicated in a number of neurodevelopmental disorders, such as autism spectrum disorders (ASD), Rett syndrome (RTT), Fragile-X syndrome (FXS) and other genetic diseases. Further understanding of the regulation of gene expression during neuronal development may provide new approaches for the diagnosis and treatment of these disorders. PMID:25896042

  13. Conserved cis-regulatory modules in promoters of genes encoding wheat high-molecular-weight glutenin subunits

    PubMed Central

    Ravel, Catherine; Fiquet, Samuel; Boudet, Julie; Dardevet, Mireille; Vincent, Jonathan; Merlino, Marielle; Michard, Robin; Martre, Pierre

    2014-01-01

    The concentration and composition of the gliadin and glutenin seed storage proteins (SSPs) in wheat flour are the most important determinants of its end-use value. In cereals, the synthesis of SSPs is predominantly regulated at the transcriptional level by a complex network involving at least five cis-elements in gene promoters. The high-molecular-weight glutenin subunits (HMW-GS) are encoded by two tightly linked genes located on the long arms of group 1 chromosomes. Here, we sequenced and annotated the HMW-GS gene promoters of 22 electrophoretic wheat alleles to identify putative cis-regulatory motifs. We focused on 24 motifs known to be involved in SSP gene regulation. Most of them were identified in at least one HMW-GS gene promoter sequence. A common regulatory framework was observed in all the HMW-GS gene promoters, as they shared conserved cis-regulatory modules (CCRMs) including all the five motifs known to regulate the transcription of SSP genes. This common regulatory framework comprises a composite box made of the GATA motifs and GCN4-like Motifs (GLMs) and was shown to be functional as the GLMs are able to bind a bZIP transcriptional factor SPA (Storage Protein Activator). In addition to this regulatory framework, each HMW-GS gene promoter had additional motifs organized differently. The promoters of most highly expressed x-type HMW-GS genes contain an additional box predicted to bind R2R3-MYB transcriptional factors. However, the differences in annotation between promoter alleles could not be related to their level of expression. In summary, we identified a common modular organization of HMW-GS gene promoters but the lack of correlation between the cis-motifs of each HMW-GS gene promoter and their level of expression suggests that other cis-elements or other mechanisms regulate HMW-GS gene expression. PMID:25429295

  14. Using gene expression programming to infer gene regulatory networks from time-series data.

    PubMed

    Zhang, Yongqing; Pu, Yifei; Zhang, Haisen; Su, Yabo; Zhang, Lifang; Zhou, Jiliu

    2013-12-01

    Gene regulatory network inference is currently a topic under heavy research in the systems biology field. In this paper, gene regulatory networks are inferred via an evolutionary model based on time-series microarray data. A non-linear differential equation model is adopted. Gene expression programming (GEP) is applied to identify the structure of the model and least mean square (LMS) is used to optimize the parameters in ordinary differential equations (ODEs). The proposed work was first verified on synthetic noise-free and noisy time-series data, and its effectiveness was then confirmed on three real time-series expression datasets. Finally, a gene regulatory network was constructed with 12 yeast genes. Experimental results demonstrate that our model can improve the prediction accuracy of microarray time-series data effectively. PMID:24140883

  15. Effects of Four Different Regulatory Mechanisms on the Dynamics of Gene Regulatory Cascades

    PubMed Central

    Hansen, Sabine; Krishna, Sandeep; Semsey, Szabolcs; Lo Svenningsen, Sine

    2015-01-01

    Gene regulatory cascades (GRCs) are common motifs in cellular molecular networks. A given logical function in these cascades, such as the repression of the activity of a transcription factor, can be implemented by a number of different regulatory mechanisms. The potential consequences for the dynamic performance of the GRC of choosing one mechanism over another have not been analysed systematically. Here, we report the construction of a synthetic GRC in Escherichia coli, which allows us for the first time to directly compare and contrast the dynamics of four different regulatory mechanisms, affecting the transcription, translation, stability, or activity of a transcriptional repressor. We developed a biologically motivated mathematical model which is sufficient to reproduce the response dynamics determined by experimental measurements. Using the model, we explored the potential response dynamics that the constructed GRC can perform. We conclude that dynamic differences between regulatory mechanisms at an individual step in a GRC are often concealed in the overall performance of the GRC, and suggest that the presence of a given regulatory mechanism in a certain network environment does not necessarily mean that it represents a single optimal evolutionary solution. PMID:26184971

  16. Effects of Four Different Regulatory Mechanisms on the Dynamics of Gene Regulatory Cascades

    NASA Astrophysics Data System (ADS)

    Hansen, Sabine; Krishna, Sandeep; Semsey, Szabolcs; Lo Svenningsen, Sine

    2015-07-01

    Gene regulatory cascades (GRCs) are common motifs in cellular molecular networks. A given logical function in these cascades, such as the repression of the activity of a transcription factor, can be implemented by a number of different regulatory mechanisms. The potential consequences for the dynamic performance of the GRC of choosing one mechanism over another have not been analysed systematically. Here, we report the construction of a synthetic GRC in Escherichia coli, which allows us for the first time to directly compare and contrast the dynamics of four different regulatory mechanisms, affecting the transcription, translation, stability, or activity of a transcriptional repressor. We developed a biologically motivated mathematical model which is sufficient to reproduce the response dynamics determined by experimental measurements. Using the model, we explored the potential response dynamics that the constructed GRC can perform. We conclude that dynamic differences between regulatory mechanisms at an individual step in a GRC are often concealed in the overall performance of the GRC, and suggest that the presence of a given regulatory mechanism in a certain network environment does not necessarily mean that it represents a single optimal evolutionary solution.

  17. From System-Wide Differential Gene Expression to Perturbed Regulatory Factors: A Combinatorial Approach

    PubMed Central

    Mahajan, Gaurang; Mande, Shekhar C.

    2015-01-01

    High-throughput experiments such as microarrays and deep sequencing provide large scale information on the pattern of gene expression, which undergoes extensive remodeling as the cell dynamically responds to varying environmental cues or has its function disrupted under pathological conditions. An important initial step in the systematic analysis and interpretation of genome-scale expression alteration involves identification of a set of perturbed transcriptional regulators whose differential activity can provide a proximate hypothesis to account for these transcriptomic changes. In the present work, we propose an unbiased and logically natural approach to transcription factor enrichment. It involves overlaying a list of experimentally determined differentially expressed genes on a background regulatory network coming from e.g. literature curation or computational motif scanning, and identifying that subset of regulators whose aggregated target set best discriminates between the altered and the unaffected genes. In other words, our methodology entails testing of all possible regulatory subnetworks, rather than just the target sets of individual regulators as is followed in most standard approaches. We have proposed an iterative search method to efficiently find such a combination, and benchmarked it on E. coli microarray and regulatory network data available in the public domain. Comparative analysis carried out on artificially generated differential expression profiles, as well as empirical factor overexpression data for M. tuberculosis, shows that our methodology provides marked improvement in accuracy of regulatory inference relative to the standard method that involves evaluating factor enrichment in an individual manner. PMID:26562430
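
    The combinatorial idea in this abstract can be sketched in Python. This is a hedged illustration, not the authors' algorithm: the scoring rule (the number of genes whose altered/unaffected status matches membership in the aggregated target set), the toggle-based iterative search, and the toy regulator and gene names are all assumptions.

```python
def score(selected, network, altered, genes):
    """Count genes for which 'is a target of any selected regulator' agrees with 'is altered'."""
    targets = set().union(*(network[r] for r in selected))
    return sum((g in targets) == (g in altered) for g in genes)


def iterative_search(network, altered, genes):
    """Toggle regulators in or out of the subset while the score keeps improving."""
    selected = set()
    best = score(selected, network, altered, genes)
    improved = True
    while improved:
        improved = False
        for r in sorted(network):
            trial = selected ^ {r}  # add r if absent, drop it if present
            s = score(trial, network, altered, genes)
            if s > best:
                selected, best, improved = trial, s, True
    return selected, best


# Toy data (hypothetical): regulator C covers the altered genes but also
# two unaffected ones, so the combination of A and B discriminates better.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
network = {
    "A": {"g1", "g2"},
    "B": {"g3"},
    "C": {"g1", "g2", "g3", "g5", "g6"},
    "D": {"g4"},
}
altered = {"g1", "g2", "g3"}
subset, matched = iterative_search(network, altered, genes)
```

    In this toy case the search settles on regulators A and B, whose combined targets match the altered genes exactly, whereas scoring regulators one at a time could not reach a perfect match. A local search of this kind can stop at a local optimum, so it is a sketch of the approach rather than a guarantee of the best subset.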

  18. Sequence and gene expression evolution of paralogous genes in willows.

    PubMed

    Harikrishnan, Srilakshmy L; Pucholt, Pascal; Berlin, Sofia

    2015-01-01

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive and hence functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows. PMID:26689951

  19. Sequence and gene expression evolution of paralogous genes in willows

    PubMed Central

    Harikrishnan, Srilakshmy L.; Pucholt, Pascal; Berlin, Sofia

    2015-01-01

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive and hence functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows. PMID:26689951

  20. Identification of C4 photosynthesis metabolism and regulatory-associated genes in Eleocharis vivipara by SSH.

    PubMed

    Chen, Taiyu; Ye, Rongjian; Fan, Xiaolei; Li, Xianghua; Lin, Yongjun

    2011-09-01

    This is the first effort to investigate the candidate genes involved in Kranz developmental regulation and C4 metabolic fluxes in Eleocharis vivipara, which is a leafless freshwater amphibious plant and possesses distinct culm anatomy and photosynthetic patterns in contrasting environments. A terrestrial-specific SSH library was constructed to investigate the genes involved in the developmental regulation of Kranz anatomy and in C4 metabolic fluxes. A total of 73 ESTs and 56 unigenes in 384 clones were identified by array hybridization and sequencing. In total, 50 unigenes had homologous genes in the rice and Arabidopsis databases. The real-time quantitative PCR results showed that most of the genes accumulated in terrestrial culms and ABA-induced culms. The C4 marker genes accumulated stably during culm development in terrestrial culms. Compared with C3 culms, C4 photosynthetic metabolism required many more transporters and translocators related to ion metabolism, organic acid and carbohydrate metabolism, phosphate metabolism, amino acid metabolism, and lipid metabolism. Additionally, ten regulatory genes, including five transcription factors, four receptor-like proteins, and one BURP protein, were identified. These regulatory genes, which co-accumulated with the culm developmental stages, may play important roles in the developmental regulation of culm structure, bundle sheath chloroplast maturation, and environmental response. These results shed new light on the C4 metabolic fluxes, environmental response, and developmental regulation of anatomical structure in E. vivipara. PMID:21739352

  1. Roles of lignin biosynthesis and regulatory genes in plant development.

    PubMed

    Yoon, Jinmi; Choi, Heebak; An, Gynheung

    2015-11-01

    Lignin is an important factor affecting agricultural traits, biofuel production, and the pulping industry. Most lignin biosynthesis genes and their regulatory genes are expressed mainly in the vascular bundles of stems and leaves, preferentially in tissues undergoing lignification. Other genes are poorly expressed during normal stages of development, but are strongly induced by abiotic or biotic stresses. Some are expressed in non-lignifying tissues such as the shoot apical meristem. Alterations in lignin levels affect plant development. Suppression of lignin biosynthesis genes causes abnormal phenotypes such as collapsed xylem, bending stems, and growth retardation. The loss of expression by genes that function early in the lignin biosynthesis pathway results in more severe developmental phenotypes when compared with plants that have mutations in later genes. Defective lignin deposition is also associated with phenotypes of seed shattering or brittle culm. MYB and NAC transcriptional factors function as switches, and some homeobox proteins negatively control lignin biosynthesis genes. Ectopic deposition caused by overexpression of lignin biosynthesis genes or master switch genes induces curly leaf formation and dwarfism. PMID:26297385

  2. Demystifying the secret mission of enhancers: linking distal regulatory elements to target genes

    PubMed Central

    Yao, Lijing; Berman, Benjamin P.; Farnham, Peggy J.

    2015-01-01

    Enhancers are short regulatory sequences bound by sequence-specific transcription factors and play a major role in the spatiotemporal specificity of gene expression patterns in development and disease. While it is now possible to identify enhancer regions genomewide in both cultured cells and primary tissues using epigenomic approaches, it has been more challenging to develop methods to understand the function of individual enhancers because enhancers are located far from the gene(s) that they regulate. However, it is essential to identify target genes of enhancers not only so that we can understand the role of enhancers in disease but also because this information will assist in the development of future therapeutic options. After reviewing models of enhancer function, we discuss recent methods for identifying target genes of enhancers. First, we describe chromatin structure-based approaches for directly mapping interactions between enhancers and promoters. Second, we describe the use of correlation-based approaches to link enhancer state with the activity of nearby promoters and/or gene expression. Third, we describe how to test the function of specific enhancers experimentally by perturbing enhancer–target relationships using high-throughput reporter assays and genome editing. Finally, we conclude by discussing as yet unanswered questions concerning how enhancers function, how target genes can be identified, and how to distinguish direct from indirect changes in gene expression mediated by individual enhancers. PMID:26446758

  3. Functional Studies of Regulatory Genes in the Sea Urchin Embryo

    NASA Astrophysics Data System (ADS)

    Cavalieri, Vincenzo; Bernardo, Maria Di; Spinelli, Giovanni

    Sea urchin embryos are characterized by an extremely simple mode of development, rapid cleavage, high transparency, and well-defined cell lineage. Although they are not suitable for genetic studies, other approaches are successfully used to unravel mechanisms and molecules involved in cell fate specification and morphogenesis. Microinjection is the elective method to study gene function in sea urchin embryos. It is used to deliver precise amounts of DNA, RNA, oligonucleotides, peptides, or antibodies into the eggs or even into blastomeres. Here we describe microinjection as it is currently applied in our laboratory and show how it has been used in gene perturbation analyses and dissection of cis-regulatory DNA elements.

  4. PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation

    PubMed Central

    Portales-Casamar, Elodie; Kirov, Stefan; Lim, Jonathan; Lithwick, Stuart; Swanson, Magdalena I; Ticoll, Amy; Snoddy, Jay; Wasserman, Wyeth W

    2007-01-01

    PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation with associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR, located at , is open for business. PMID:17916232

  5. An Arabidopsis Gene Regulatory Network for Secondary Cell Wall Synthesis

    PubMed Central

    Taylor-Teeples, M; Lin, L; de Lucas, M; Turco, G; Toal, TW; Gaudinier, A; Young, NF; Trabucco, GM; Veling, MT; Lamothe, R; Handakumbura, PP; Xiong, G; Wang, C; Corwin, J; Tsoukalas, A; Zhang, L; Ware, D; Pauly, M; Kliebenstein, DJ; Dehesh, K; Tagkopoulos, I; Breton, G; Pruneda-Paz, JL; Ahnert, SE; Kay, SA; Hazen, SP; Brady, SM

    2014-01-01

    The plant cell wall is an important factor for determining cell shape, function and response to the environment. Secondary cell walls, such as those found in xylem, are composed of cellulose, hemicelluloses and lignin and account for the bulk of plant biomass. The coordination between transcriptional regulation of synthesis for each polymer is complex and vital to cell function. A regulatory hierarchy of developmental switches has been proposed, although the full complement of regulators remains unknown. Here, we present a protein-DNA network between Arabidopsis transcription factors and secondary cell wall metabolic genes with gene expression regulated by a series of feed-forward loops. This model allowed us to develop and validate new hypotheses about secondary wall gene regulation under abiotic stress. Distinct stresses are able to perturb targeted genes to potentially promote functional adaptation. These interactions will serve as a foundation for understanding the regulation of a complex, integral plant component. PMID:25533953

  6. Phase transitions in the evolution of gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Skanata, Antun; Kussell, Edo

    The role of gene regulatory networks is to respond to environmental conditions and optimize growth of the cell. A typical example is found in bacteria, where metabolic genes are activated in response to nutrient availability, and are subsequently turned off to conserve energy when their specific substrates are depleted. However, in fluctuating environmental conditions, regulatory networks could experience strong evolutionary pressures not only to turn the right genes on and off, but also to respond optimally under a wide spectrum of fluctuation timescales. The outcome of evolution is predicted by the long-term growth rate, which differentiates between optimal strategies. Here we present an analytic computation of the long-term growth rate in randomly fluctuating environments, by using mean-field and higher order expansion in the environmental history. We find that optimal strategies correspond to distinct regions in the phase space of fluctuations, separated by first and second order phase transitions. The statistics of environmental randomness are shown to dictate the possible evolutionary modes, which either change the structure of the regulatory network abruptly, or gradually modify and tune the interactions between its components.

  7. Transcriptional Targeting in the Airway Using Novel Gene Regulatory Elements

    PubMed Central

    Burnight, Erin R.; Wang, Guoshun; McCray, Paul B.

    2012-01-01

    The delivery of cystic fibrosis transmembrane conductance regulator (CFTR) to airway epithelia is a goal of many gene therapy strategies to treat cystic fibrosis. Because the native regulatory elements of the CFTR are not well characterized, the development of vectors with heterologous promoters of varying strengths and specificity would aid in our selection of optimal reagents for the appropriate expression of the vector-delivered CFTR gene. Here we contrasted the performance of several novel gene-regulatory elements. Based on airway expression analysis, we selected putative regulatory elements from BPIFA1 and WDR65 to investigate. In addition, we selected a human CFTR promoter region (∼ 2 kb upstream of the human CFTR transcription start site) to study. Using feline immunodeficiency virus vectors containing the candidate elements driving firefly luciferase, we transduced murine nasal epithelia in vivo. Luciferase expression persisted for 30 weeks, which was the duration of the experiment. Furthermore, when the nasal epithelium was ablated using the detergent polidocanol, the mice showed a transient loss of luciferase expression that returned 2 weeks after administration, suggesting that our vectors transduced a progenitor cell population. Importantly, the hWDR65 element drove sufficient CFTR expression to correct the anion transport defect in CFTR-null epithelia. These results will guide the development of optimal vectors for sufficient, sustained CFTR expression in airway epithelia. PMID:22447971

  8. Duplication of floral regulatory genes in the Lamiales.

    PubMed

    Aagaard, Jan E; Olmstead, Richard G; Willis, John H; Phillips, Patrick C

    2005-08-01

    Duplication of some floral regulatory genes has occurred repeatedly in angiosperms, whereas others are thought to be single-copy in most lineages. We selected three genes that interact in a pathway regulating floral development conserved among higher tricolpates (LFY/FLO, UFO/FIM, and AP3/DEF) and screened for copy number among families of Lamiales that are closely related to the model species Antirrhinum majus. We show that two of three genes have duplicated at least twice in the Lamiales. Phylogenetic analyses of paralogs suggest that an ancient whole genome duplication shared among many families of Lamiales occurred after the ancestor of these families diverged from the lineage leading to Veronicaceae (including the single-copy species A. majus). Duplication is consistent with previous patterns among angiosperm lineages for AP3/DEF, but this is the first report of functional duplicate copies of LFY/FLO outside of tetraploid species. We propose Lamiales taxa will be good models for understanding mechanisms of duplicate gene preservation and how floral regulatory genes may contribute to morphological diversity. PMID:21646149

  9. cDNA SEQUENCE OF CHANNEL CATFISH PEROXIREDOXIN 6 GENE

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Peroxiredoxin 6 gene (Prdx6) of channel catfish, Ictalurus punctatus, was cloned and sequenced. Total RNA from channel catfish tissues was isolated, reverse transcribed and amplified. The sequence of the channel catfish Prdx6 gene consists of 1003 nucleotides. Analysis of the nucleotide sequence ...

  10. Exceptionally high heterologous protein levels in transgenic dicotyledonous seeds using Phaseolus vulgaris regulatory sequences.

    PubMed

    De Jaeger, Geert; Angenon, Geert; Depicker, Ann

    2003-01-01

    Seeds are concentrated sources of protein and thus may be ideal 'bioreactors' for the production of heterologous proteins. For this application, strong seed-specific expression signals are required. A set of expression cassettes was designed using 5' and 3' regulatory sequences of the seed storage protein gene arcelin 5-I (arc5-I) from Phaseolus vulgaris, and evaluated for the production of heterologous proteins in dicotyledonous plant species. A murine single-chain variable fragment (scFv) was chosen as model protein because of the current industrial interest in producing antibodies and derived fragments in crops. Because the highest scFv accumulation in seed had previously been achieved in the endoplasmic reticulum (ER), the scFv-encoding sequence was provided with signal sequences for accumulation in the ER. Transgenic Arabidopsis seed stocks, expressing the scFv under control of the 35S promoter, contained scFv accumulation levels in the range of 1% of total soluble protein (TSP). However, the seed storage promoter constructs boosted the scFv to exceptionally high levels. Maximum scFv levels were obtained in homozygous seed stocks, being 12.5% of TSP under control of the arc5-I regulatory sequences and even up to 36.5% of TSP upon replacing the arc5-I promoter by the beta-phaseolin promoter of Phaseolus vulgaris. Even at such very high levels, the scFv proteins retain their full antigen-binding activity. Moreover, the presence of very high scFv levels has only minor effects on seed germination and no effect on seed production. These results demonstrate that the expression levels of arcelin 5-I and beta-phaseolin seed storage protein genes can be transferred to heterologous proteins, giving exceptionally high levels of heterologous proteins, which can be of great value for the molecular farming industry by raising production yield and lowering bio-mass production and purification costs. Finally, the feasibility of heterologous protein production using the

  11. iRegulon: From a Gene List to a Gene Regulatory Network Using Large Motif and Track Collections

    PubMed Central

    Imrichová, Hana; Van de Sande, Bram; Standaert, Laura; Christiaens, Valerie; Hulselmans, Gert; Herten, Koen; Naval Sanchez, Marina; Potier, Delphine; Svetlichnyy, Dmitry; Kalender Atak, Zeynep; Fiers, Mark; Marine, Jean-Christophe; Aerts, Stein

    2014-01-01

    Identifying master regulators of biological processes and mapping their downstream gene networks are key challenges in systems biology. We developed a computational method, called iRegulon, to reverse-engineer the transcriptional regulatory network underlying a co-expressed gene set using cis-regulatory sequence analysis. iRegulon implements a genome-wide ranking-and-recovery approach to detect enriched transcription factor motifs and their optimal sets of direct targets. We increase the accuracy of network inference by using very large motif collections of up to ten thousand position weight matrices collected from various species, and linking these to candidate human TFs via a motif2TF procedure. We validate iRegulon on gene sets derived from ENCODE ChIP-seq data with increasing levels of noise, and we compare iRegulon with existing motif discovery methods. Next, we use iRegulon on more challenging types of gene lists, including microRNA target sets, protein-protein interaction networks, and genetic perturbation data. In particular, we over-activate p53 in breast cancer cells, followed by RNA-seq and ChIP-seq, and could identify an extensive up-regulated network controlled directly by p53. Similarly we map a repressive network with no indication of direct p53 regulation but rather an indirect effect via E2F and NFY. Finally, we generalize our computational framework to include regulatory tracks such as ChIP-seq data and show how motif and track discovery can be combined to map functional regulatory interactions among co-expressed genes. iRegulon is available as a Cytoscape plugin from http://iregulon.aertslab.org. PMID:25058159

  12. Genome-wide analysis reveals regulatory role of G4 DNA in gene transcription

    PubMed Central

    Du, Zhuo; Zhao, Yiqiang; Li, Ning

    2008-01-01

    G-quadruplex or G4 DNA, a four-stranded DNA structure formed in G-rich sequences, has been hypothesized to be a structural motif involved in gene regulation. In this study, we examined the regulatory role of potential G4 DNA motifs (PG4Ms) located in the putative transcriptional regulatory region (TRR, –500 to +500) of genes across the human genome. We found that PG4Ms in the 500-bp region downstream of the annotated transcription start site (TSS; PG4MD500) are associated with gene expression. Generally, PG4MD500-positive genes are expressed at higher levels than PG4MD500-negative genes, and an increased number of PG4MD500 provides a cumulative effect. This observation was validated by controlling for attributes, including gene family, function, and promoter similarity. We also observed an asymmetric pattern of PG4MD500 distribution between strands, whereby the frequency of PG4MD500 in the coding strand is generally higher than that in the template strand. Further analysis showed that the presence of PG4MD500 and its strand asymmetry are associated with significant enrichment of RNAP II at the putative TRR. On the basis of these results, we propose a model of G4 DNA-mediated stimulation of transcription with the hypothesis that PG4MD500 contributes to gene transcription by maintaining the DNA in an open conformation, while the asymmetric distribution of PG4MD500 considerably reduces the probability of blocking the progression of the RNA polymerase complex on the template strand. Our findings provide a comprehensive view of the regulatory function of G4 DNA in gene transcription. PMID:18096746
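
    The strand-specific counting of potential G4 motifs described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the pattern (four runs of at least three G separated by loops of 1 to 7 nt) is a commonly used quadruplex consensus and is an assumption here, since the abstract does not state the exact rule. The function names and the example sequence are hypothetical.

```python
import re

# Assumed consensus: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (not taken from the abstract).
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")

# Translation table used to build the reverse complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_g4(seq: str) -> tuple[int, int]:
    """Count non-overlapping potential G4 motifs on each strand.

    Returns (count on the given strand, count on the opposite strand).
    The opposite strand is scanned via the reverse complement.
    """
    seq = seq.upper()
    same_strand = len(G4_PATTERN.findall(seq))
    opposite_strand = len(G4_PATTERN.findall(reverse_complement(seq)))
    return same_strand, opposite_strand


if __name__ == "__main__":
    # Hypothetical 19-nt fragment with one G-rich motif on the given strand.
    print(count_g4("TTGGGAGGGTGGGAGGGTT"))
```

    Applied to the 500-bp window downstream of each annotated TSS, the two counts would give a per-gene measure of motif number and strand asymmetry.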

  13. GSEL version 2, an online genome-wide query system of operon organization and regulatory sequence elements of Geobacter sulfurreducens.

    PubMed

    Qu, Yanhua; Brown, Peter; Barbe, Jose F; Puljic, Marko; Merino, Enrique; Adkins, Ronald M; Lovley, Derek R; Krushkal, Julia

    2009-10-01

    Geobacter sulfurreducens is a model organism within the delta-Proteobacterial family Geobacteraceae, members of which can participate in environmental bioremediation of metal and organic waste contaminants and in production of bioenergy. In this report, we describe a new, significantly expanded and updated, version 2 of the GSEL (Geobacter Sequence Elements) database ( http://geobacter.org/research/gsel2/ and http://geobacter.org/refs/gsel2/ ) and its accompanying online query system, which compiles information on operon organization and regulatory sequence elements in the genome of G. sulfurreducens. It incorporates a new online graphical browser, provides novel search capabilities, and includes updated operon predictions along with new information on predicted and experimentally validated genome regulatory sites. The GSEL database and online search system provides a unique and comprehensive tool cataloging information about gene regulation in G. sulfurreducens, aiding in investigation of mechanisms that regulate its ability to generate electric power, bioremediate environmental waste, and adapt to environmental changes. PMID:19792871

  14. Regulatory elements responsible for inducible expression of the granulocyte colony-stimulating factor gene in macrophages.

    PubMed Central

    Nishizawa, M; Nagata, S

    1990-01-01

    Granulocyte colony-stimulating factor (G-CSF) plays an essential role in granulopoiesis during bacterial infection. Macrophages produce G-CSF in response to bacterial endotoxins such as lipopolysaccharide (LPS). To elucidate the mechanism of the induction of G-CSF gene in macrophages or macrophage-monocytes, we have examined regulatory cis elements in the promoter of mouse G-CSF gene. Analyses of linker-scanning and internal deletion mutants of the G-CSF promoter by the chloramphenicol acetyltransferase assay have indicated that at least three regulatory elements are indispensable for the LPS-induced expression of the G-CSF gene in macrophages. When one of the three elements was reiterated and placed upstream of the TATA box of the G-CSF promoter, it mediated inducibility as a tissue-specific and orientation-independent enhancer. Although this element contains a conserved NF-kappa B-like binding site, the gel retardation assay and DNA footprint analysis with nuclear extracts from macrophage cell lines demonstrated that nuclear proteins bind to the DNA sequence downstream of the NF-kappa B-like element, but not to the conserved element itself. The DNA sequence of the binding site was found to have some similarities to the LPS-responsive element which was recently identified in the promoter of the mouse class II major histocompatibility gene. PMID:1691438

  15. Organisation of regulatory elements in two closely spaced Drosophila genes with common expression characteristics.

    PubMed

    Gigliotti, S; Balz, V; Malva, C; Schäfer, M A

    1997-11-01

    Sperm tail proteins that are components of a specific structure formed late during spermatid elongation have been found to be encoded by the Mst(3)CGP gene family. These genes have been demonstrated to be regulated at both the transcriptional and the translational level. We report here on the dissection of the regulatory regions for two members of the gene family, Mst84Da and Mst84Db. While high level transcription and negative translational control of Mst84Da is mediated by a short gene segment of 205 nt (-152/+53), Mst84Db expression is controlled by a number of distinct regulatory elements with different effects that all reside within the gene itself. We identify a transcriptional control element between +154 and +216, a translational repression element around +216 to +275 and an RNA stability element within the 3'UTR. Irrespective of the final common expression characteristics, correct regulation for any individual member of the gene family seems to be achieved by very different means. This confirms earlier observations that did not detect any other sequence elements in common apart from the TCE (translational control element). PMID:9431808

  16. The complete nucleotide sequence and structure of the gene encoding bovine phenylethanolamine N-methyltransferase.

    PubMed

    Batter, D K; D'Mello, S R; Turzai, L M; Hughes, H B; Gioio, A E; Kaplan, B B

    1988-03-01

    A cDNA clone for bovine adrenal phenylethanolamine N-methyltransferase (PNMT) was used to screen a Charon 28 genomic library. One phage was identified, designated lambda P1, which included the entire PNMT gene. Construction of a restriction map, with subsequent Southern blot analysis, allowed the identification of exon-containing fragments. Dideoxy sequence analysis of these fragments, and several more further upstream, indicates that the bovine PNMT gene is 1,594 base pairs in length, consisting of three exons and two introns. The transcription initiation site was identified by two independent methods and is located approximately 12 base pairs upstream from the ATG translation start site. The 3' untranslated region is 88 base pairs in length and contains the expected polyadenylation signal (AATAAA). A putative promoter sequence (TATA box) is located about 25 base pairs upstream from the transcription initiation site. Computer comparison of the nucleotide sequence data with the consensus sequences of known regulatory elements revealed potential binding sites for glucocorticoid receptors and the Sp1 regulatory protein in the 5' flanking region of the gene. Additionally, comparison of the sequence of the exons of the PNMT gene with cDNA sequences for other enzymes involved in biogenic amine synthesis revealed no significant homology, indicating that PNMT is not a member of a multigene family of catecholamine biosynthetic enzymes. PMID:3379652
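
    The comparison of a nucleotide sequence with consensus sequences of known regulatory elements can be sketched as a simple pattern scan. This is an illustrative sketch only, not the program used in the study. The polyadenylation signal AATAAA is taken from the abstract; the TATA box consensus TATAWAW and the Sp1 GC box GGGCGG are commonly cited forms and are assumptions here, as are the function names and the test sequence.

```python
import re

# Subset of IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

# AATAAA is named in the abstract; the other two consensus strings are assumed.
MOTIFS = {
    "polyA_signal": "AATAAA",
    "TATA_box": "TATAWAW",
    "Sp1_GC_box": "GGGCGG",
}


def consensus_to_regex(consensus: str) -> str:
    """Translate an IUPAC consensus string into a regex fragment."""
    return "".join(IUPAC[base] for base in consensus)


def scan(seq: str, motifs: dict[str, str] = MOTIFS) -> dict[str, list[int]]:
    """Return 0-based start positions of every (possibly overlapping) hit."""
    seq = seq.upper()
    hits = {}
    for name, consensus in motifs.items():
        # A zero-width lookahead lets overlapping matches be reported.
        lookahead = re.compile("(?=" + consensus_to_regex(consensus) + ")")
        hits[name] = [m.start() for m in lookahead.finditer(seq)]
    return hits


if __name__ == "__main__":
    # Hypothetical sequence containing one copy of each motif.
    print(scan("CCGGGCGGTTTATAAAAGGCATGCCCAATAAACC"))
```

    Exact consensus matching of this kind only flags candidate sites; a hit is a potential binding site, not evidence of function.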

  17. Gene and translation initiation site prediction in metagenomic sequences

    SciTech Connect

    Hyatt, Philip Douglas; LoCascio, Philip F; Hauser, Loren John; Uberbacher, Edward C

    2012-01-01

    Gene prediction in metagenomic sequences remains a difficult problem. Current sequencing technologies do not achieve sufficient coverage to assemble the individual genomes in a typical sample; consequently, sequencing runs produce a large number of short sequences whose exact origin is unknown. Since these sequences are usually smaller than the average length of a gene, algorithms must make predictions based on very little data. We present MetaProdigal, a metagenomic version of the gene prediction program Prodigal, that can identify genes in short, anonymous coding sequences with a high degree of accuracy. The novel value of the method consists of enhanced translation initiation site identification, ability to identify sequences that use alternate genetic codes and confidence values for each gene call. We compare the results of MetaProdigal with other methods and conclude with a discussion of future improvements.

  18. Identification of genes in genomic and EST sequences

    SciTech Connect

    Fields, C.; Adams, M.D.; Kerlavage, A.R.; Dubnick, M.; McCombie, W.R.; Martin-Gallardo, A.; Venter, J.C.; White, O.

    1993-12-31

    Currently available software tools are capable of predicting the locations of most protein-coding genes in anonymous genomic DNA sequences. The use of predicted exons to select primers for PCR amplification from cDNA libraries allows the complete structures of novel genes to be determined efficiently. As the number of expressed sequence tag (EST) sequences increases, the fraction of genes that can be localized in genomic sequences by searching EST databases will rapidly approach unity. The challenge for automated DNA sequence analysis is now to develop methods for accurately predicting gene structure and alternative splicing patterns. Substantially improving current accuracies in gene structure prediction will require retrospective comparative analysis of sequences from different organisms and gene families.

  19. Partitioning of genetic variation between regulatory and coding gene segments: the predominance of software variation in genes encoding introvert proteins.

    PubMed

    Mitchison, A

    1997-01-01

    In considering genetic variation in eukaryotes, a fundamental distinction can be made between variation in regulatory (software) and coding (hardware) gene segments. For quantitative traits the bulk of variation, particularly that near the population mean, appears to reside in regulatory segments. The main exceptions to this rule concern proteins which handle extrinsic substances, here termed extrovert proteins. The immune system includes an unusually large proportion of this exceptional category, but even so its chief source of variation may well be polymorphism in regulatory gene segments. The main evidence for this view emerges from genome scanning for quantitative trait loci (QTL), which in the case of the immune system points to a major contribution of pro-inflammatory cytokine genes. Further support comes from sequencing of major histocompatibility complex (Mhc) class II promoters, where a high level of polymorphism has been detected. These Mhc promoters appear to act, in part at least, by gating the back-signal from T cells into antigen-presenting cells. Both these forms of polymorphism are likely to be sustained by the need for flexibility in the immune response. Future work on promoter polymorphism is likely to benefit from the input from genome informatics. PMID:9148788

  20. Strong early seed-specific gene regulatory region

    DOEpatents

    Broun, Pierre; Somerville, Chris

    2002-01-01

    Nucleic acid sequences and methods for their use are described which provide for early seed-specific transcription, in order to modulate or modify expression of foreign or endogenous genes in seeds, particularly embryo cells. The method finds particular use in conjunction with modifying fatty acid production in seed tissue.

  1. Strong early seed-specific gene regulatory region

    DOEpatents

    Broun, Pierre; Somerville, Chris

    1999-01-01

    Nucleic acid sequences and methods for their use are described which provide for early seed-specific transcription, in order to modulate or modify expression of foreign or endogenous genes in seeds, particularly embryo cells. The method finds particular use in conjunction with modifying fatty acid production in seed tissue.

  2. Predictive modelling of gene expression from transcriptional regulatory elements.

    PubMed

    Budden, David M; Hurley, Daniel G; Crampin, Edmund J

    2015-07-01

    Predictive modelling of gene expression provides a powerful framework for exploring the regulatory logic underpinning transcriptional regulation. Recent studies have demonstrated the utility of such models in identifying dysregulation of gene and miRNA expression associated with abnormal patterns of transcription factor (TF) binding or nucleosomal histone modifications (HMs). Despite the growing popularity of such approaches, a comparative review of the various modelling algorithms and feature extraction methods is lacking. We define and compare three methods of quantifying pairwise gene-TF/HM interactions and discuss their suitability for integrating the heterogeneous chromatin immunoprecipitation (ChIP)-seq binding patterns exhibited by TFs and HMs. We then construct log-linear and ϵ-support vector regression models from various mouse embryonic stem cell (mESC) and human lymphoblastoid (GM12878) data sets, considering both ChIP-seq- and position weight matrix- (PWM)-derived in silico TF-binding. The two algorithms are evaluated both in terms of their modelling prediction accuracy and ability to identify the established regulatory roles of individual TFs and HMs. Our results demonstrate that TF-binding and HMs are highly predictive of gene expression as measured by mRNA transcript abundance, irrespective of algorithm or cell type selection and considering both ChIP-seq and PWM-derived TF-binding. As we encourage other researchers to explore and develop these results, our framework is implemented using open-source software and made available as a preconfigured bootable virtual environment. PMID:25231769

  3. Brain-specific genes have identifier sequences in their introns.

    PubMed Central

    Milner, R J; Bloom, F E; Lai, C; Lerner, R A; Sutcliffe, J G

    1984-01-01

    The 82-nucleotide identifier (ID) sequence is present in the rat genome in 1-1.5 × 10^5 copies and in cDNA clones of precursors of brain-specific mRNAs. One brain-specific gene contains more than one ID sequence in its introns. There is an excess of ID sequences to brain genes, and some ID sequences appear to have been inserted as mobile elements into other genetic locations. Therefore, brain genes contain ID sequences in their introns, but not all ID sequences are located in brain gene introns. A brain ID consensus sequence has been obtained by comparing 8 ID nucleotide sequences. PMID:6583673

  4. Proximal and distal sequences control UV cone pigment gene expression in transgenic zebrafish.

    PubMed

    Luo, Wenqin; Williams, John; Smallwood, Philip M; Touchman, Jeffrey W; Roman, Laura M; Nathans, Jeremy

    2004-04-30

    The molecular basis of cone photoreceptor-specific gene expression is largely unknown. In this study, we define cis-acting DNA sequences that control the cell type-specific expression of the zebrafish UV cone pigment gene by transient expression of green fluorescent protein transgenes following their injection into zebrafish embryos. These experiments show that 4.8 kb of 5'-flanking sequences from the zebrafish UV pigment gene direct expression specifically to UV cones and that this activity requires both distal and proximal sequences. In addition, we demonstrate that a proximal region located between -215 and -110 bp (with respect to the initiator methionine codon) can function in the context of a zebrafish rhodopsin promoter to convert its specificity from rod-only expression to rod and UV cone expression. These experiments demonstrate the power of transient transgenesis in zebrafish to efficiently define cis-acting regulatory sequences in an intact vertebrate. PMID:14966125

  5. An ant colony optimization based algorithm for identifying gene regulatory elements.

    PubMed

    Liu, Wei; Chen, Hanwu; Chen, Ling

    2013-08-01

    Identifying the regulatory elements in gene sequences is one of the most important tasks in bioinformatics. Most of the existing algorithms for identifying regulatory elements tend to converge to a local optimum and have high time complexity. Ant Colony Optimization (ACO) is a meta-heuristic method based on swarm intelligence and is derived from a model inspired by the collective foraging behavior of real ants. Taking advantage of ACO traits such as self-organization and robustness, this paper designs and implements an ACO-based algorithm named ACRI (ant-colony-regulatory-identification) for identifying all possible transcription factor binding sites in the upstream regions of co-expressed genes. To accelerate the ants' searching process, a local optimization strategy is presented to adjust the ants' start positions on the searched sequences. By exploiting the powerful optimization ability of ACO, the ACRI algorithm can not only improve the precision of the results, but also achieve a very high speed. Experimental results on real-world datasets show that ACRI can outperform other traditional algorithms in terms of speed and quality of solutions. PMID:23746735

  6. Imogene: identification of motifs and cis-regulatory modules underlying gene co-regulation

    PubMed Central

    Rouault, Hervé; Santolini, Marc; Schweisguth, François; Hakim, Vincent

    2014-01-01

    Cis-regulatory modules (CRMs) and motifs play a central role in tissue and condition-specific gene expression. Here we present Imogene, an ensemble of statistical tools that we have developed to facilitate their identification and implemented in a publicly available software package. Starting from a small training set of mammalian or fly CRMs that drive similar gene expression profiles, Imogene determines de novo cis-regulatory motifs that underlie this co-expression. It can then predict on a genome-wide scale other CRMs with a regulatory potential similar to the training set. Imogene bypasses the need for large datasets for statistical analyses by making central use of the information provided by the sequenced genomes of multiple species, based on the developed statistical tools and explicit models for transcription factor binding site evolution. We test Imogene on characterized tissue-specific mouse developmental CRMs. Its ability to identify CRMs with the same specificity based on its de novo created motifs is comparable to that of previously evaluated ‘motif-blind’ methods. We further show, both in flies and in mammals, that Imogene de novo generated motifs are sufficient to discriminate CRMs related to different developmental programs. Notably, purely relying on sequence data, Imogene performs as well in this discrimination task as a previously reported learning algorithm based on Chromatin Immunoprecipitation (ChIP) data for multiple transcription factors at multiple developmental stages. PMID:24682824

  7. Noise Control in Gene Regulatory Networks with Negative Feedback.

    PubMed

    Hinczewski, Michael; Thirumalai, D

    2016-07-01

    Genes and proteins regulate cellular functions through complex circuits of biochemical reactions. Fluctuations in the components of these regulatory networks result in noise that invariably corrupts the signal, possibly compromising function. Here, we create a practical formalism based on ideas introduced by Wiener and Kolmogorov (WK) for filtering noise in engineered communications systems to quantitatively assess the extent to which noise can be controlled in biological processes involving negative feedback. Application of the theory, which reproduces the previously proven scaling of the lower bound for noise suppression in terms of the number of signaling events, shows that a tetracycline repressor-based negative-regulatory gene circuit behaves as a WK filter. For the class of Hill-like nonlinear regulatory functions, this type of filter provides the optimal reduction in noise. Our theoretical approach can be readily combined with experimental measurements of response functions in a wide variety of genetic circuits, to elucidate the general principles by which biological networks minimize noise. PMID:27095600

  8. Propagation of genetic variation in gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Plahte, Erik; Gjuvsland, Arne B.; Omholt, Stig W.

    2013-08-01

    A future quantitative genetics theory should link genetic variation to phenotypic variation in a causally cohesive way based on how genes actually work and interact. We provide a theoretical framework for predicting and understanding the manifestation of genetic variation in haploid and diploid regulatory networks with arbitrary feedback structures and intra-locus and inter-locus functional dependencies. Using results from network and graph theory, we define propagation functions describing how genetic variation in a locus is propagated through the network, and show how their derivatives are related to the network’s feedback structure. Similarly, feedback functions describe the effect of genotypic variation of a locus on itself, either directly or mediated by the network. A simple sign rule relates the sign of the derivative of the feedback function of any locus to the feedback loops involving that particular locus. We show that the sign of the phenotypically manifested interaction between alleles at a diploid locus is equal to the sign of the dominant feedback loop involving that particular locus, in accordance with recent results for a single locus system. Our results provide tools by which one can use observable equilibrium concentrations of gene products to disclose structural properties of the network architecture. Our work is a step towards a theory capable of explaining the pleiotropy and epistasis features of genetic variation in complex regulatory networks as functions of regulatory anatomy and functional location of the genetic variation.

  9. Detection of Weakly Conserved Ancestral Mammalian Regulatory Sequences by Primate Comparisons

    SciTech Connect

    Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng, Jan-Fang; Rubin, Edward M.; Boffelli, Dario

    2006-06-01

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect cryptic functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  10. From genes to shape: regulatory interactions in leaf development.

    PubMed

    Barkoulas, Michalis; Galinha, Carla; Grigg, Stephen P; Tsiantis, Miltos

    2007-12-01

    In the past two years novel connections were described between auxin function and transcription factor patterning systems involved in both leaf initiation and elaboration of leaf axial patterning. A cascade of small RNA-based regulatory steps was suggested to facilitate delimitation of cell types comprising the upper versus lower parts of the leaf. Developmental regulation of cellular growth emerged as a crucial component in regulation of leaf form with TCP and CUC2 transcription factors playing a key role in this process. Finally, cis-regulatory evolution of developmental genes emerged as a process that likely contributed to diversification of leaf form, while studies in seedless land plants have begun to elucidate the ancestral and derived aspects of leaf development pathways. PMID:17869569

  11. Stable intronic sequence RNAs have possible regulatory roles in Drosophila melanogaster

    PubMed Central

    Osman, Ismail; Tay, Mandy Li-Ian; Zheng, Ruther Teo

    2015-01-01

    Stable intronic sequence RNAs (sisRNAs) have been found in Xenopus tropicalis, human cell lines, and Epstein-Barr virus; however, the biological significance of sisRNAs remains poorly understood. We identify sisRNAs in Drosophila melanogaster by deep sequencing, reverse transcription polymerase chain reaction, and Northern blotting. We characterize a sisRNA (sisR-1) from the regena (rga) locus and show that it can be processed from the precursor messenger RNA (pre-mRNA). We also document a cis-natural antisense transcript (ASTR) from the rga locus, which is highly expressed in early embryos. During embryogenesis, ASTR promotes robust rga pre-mRNA expression. Interestingly, sisR-1 represses ASTR, with consequential effects on rga pre-mRNA expression. Our results suggest a model in which sisR-1 modulates its host gene expression by repressing ASTR during embryogenesis. We propose that sisR-1 belongs to a class of sisRNAs with probable regulatory activities in Drosophila. PMID:26504165

  12. Colorectal cancer risk genes are functionally enriched in regulatory pathways

    PubMed Central

    Lu, Xi; Cao, Mingming; Han, Su; Yang, Youlin; Zhou, Jin

    2016-01-01

    Colorectal cancer (CRC) is a common complex disease caused by the combination of genetic variants and environmental factors. Genome-wide association studies (GWAS) have been performed and have reported some novel CRC susceptibility variants. However, the potential genetic mechanisms for newly identified CRC susceptibility variants are still unclear. Here, we selected 85 CRC susceptibility variants with suggestive association P < 1.00E-05 from the National Human Genome Research Institute GWAS catalog. To investigate the underlying genetic pathways where these newly identified CRC susceptibility genes are significantly enriched, we conducted a functional annotation. Using two SNP-to-gene mapping methods, the nearest upstream and downstream gene method and ProxyGeneLD, we obtained 128 unique CRC susceptibility genes. We then conducted a pathway analysis in the GO database using the corresponding 128 genes. We identified 44 GO categories, 17 of which are regulatory pathways. We believe that our results may provide further insight into the underlying genetic mechanisms for these newly identified CRC susceptibility variants. PMID:27146020
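
    A pathway analysis of this kind is often based on an over-representation test. The sketch below uses a one-sided hypergeometric test, which is an assumption: the abstract does not say which statistic the authors applied. Only the gene-set size of 128 comes from the abstract; the background size, category size, and overlap in the usage example are hypothetical numbers chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           sample: int, overlap: int) -> float:
    """One-sided over-representation p-value, P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes that belong to the category
    sample:     size of the gene list being tested
    overlap:    genes in the list that belong to the category
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probabilities over the upper tail.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # 128 genes is from the abstract; 20000, 500 and 12 are hypothetical.
    p = hypergeom_enrichment_p(population=20000, annotated=500,
                               sample=128, overlap=12)
    print(f"enrichment p-value: {p:.3g}")
```

    When many GO categories are tested, the resulting p-values would normally be corrected for multiple testing before any category is reported as enriched.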

  13. Reverse Engineering of Genome-wide Gene Regulatory Networks from Gene Expression Data

    PubMed Central

    Liu, Zhi-Ping

    2015-01-01

    Transcriptional regulation plays vital roles in many fundamental biological processes. Reverse engineering of genome-wide regulatory networks from high-throughput transcriptomic data provides a promising way to characterize the global scenario of regulatory relationships between regulators and their targets. In this review, we summarize and categorize the main frameworks and methods currently available for inferring transcriptional regulatory networks from microarray gene expression profiling data. We give an overview of each strategy and introduce representative methods. Their assumptions, advantages, shortcomings, and possible improvements and extensions are also discussed. PMID:25937810

  14. Evolutionary and Topological Properties of Genes and Community Structures in Human Gene Regulatory Networks.

    PubMed

    Szedlak, Anthony; Smith, Nicholas; Liu, Li; Paternostro, Giovanni; Piermarocchi, Carlo

    2016-06-01

    The diverse, specialized genes present in today's lifeforms evolved from a common core of ancient, elementary genes. However, these genes did not evolve individually: gene expression is controlled by a complex network of interactions, and alterations in one gene may drive reciprocal changes in its proteins' binding partners. Like many complex networks, these gene regulatory networks (GRNs) are composed of communities, or clusters of genes with relatively high connectivity. A deep understanding of the relationship between the evolutionary history of single genes and the topological properties of the underlying GRN is integral to evolutionary genetics. Here, we show that the topological properties of an acute myeloid leukemia GRN and a general human GRN are strongly coupled with its genes' evolutionary properties. Slowly evolving ("cold"), old genes tend to interact with each other, as do rapidly evolving ("hot"), young genes. This naturally causes genes to segregate into community structures with relatively homogeneous evolutionary histories. We argue that gene duplication placed old, cold genes and communities at the center of the networks, and young, hot genes and communities at the periphery. We demonstrate this with single-node centrality measures and two new measures of efficiency, the set efficiency and the interset efficiency. We conclude that these methods for studying the relationships between a GRN's community structures and its genes' evolutionary properties provide new perspectives for understanding evolutionary genetics. PMID:27359334

  15. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    PubMed

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes because more and more gene expression and genomic sequence data are becoming available. Knowledge of the specific cis-sequences, and of their enrichment and arrangement within promoters, facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter illustrates the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied to identify cis-sequences in any set of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and to install and run a software package that identifies overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771
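
One common way to decide whether a motif is overrepresented in a set of coregulated promoters is a one-sided hypergeometric test against all promoters in the genome. The sketch below shows that calculation only; it is not the software package used in the chapter, and the function name, argument names, and example counts are hypothetical.

```python
from math import comb


def motif_enrichment_pvalue(n_background, n_background_with_motif,
                            n_set, n_set_with_motif):
    """One-sided hypergeometric tail probability P(X >= n_set_with_motif).

    n_background            total number of promoters considered
    n_background_with_motif promoters among them that contain the motif
    n_set                   size of the coregulated promoter set
    n_set_with_motif        promoters in that set that contain the motif
    """
    if not 0 <= n_background_with_motif <= n_background:
        raise ValueError("motif count must lie between 0 and the background size")
    if not 0 <= n_set <= n_background:
        raise ValueError("set size must lie between 0 and the background size")
    upper = min(n_set, n_background_with_motif)
    if not 0 <= n_set_with_motif <= upper:
        raise ValueError("observed count is impossible for these totals")
    without_motif = n_background - n_background_with_motif
    # Sum the probabilities of seeing the observed count or anything larger.
    tail = sum(
        comb(n_background_with_motif, i) * comb(without_motif, n_set - i)
        for i in range(n_set_with_motif, upper + 1)
    )
    return tail / comb(n_background, n_set)


if __name__ == "__main__":
    # Hypothetical counts: 1000 promoters, 50 with the motif,
    # 40 drought-responsive promoters, 10 of which carry the motif.
    print(motif_enrichment_pvalue(1000, 50, 40, 10))
```

When many motifs are tested at once, the resulting p-values would normally need a multiple-testing correction before any motif is called enriched.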

  16. Inferring Gene Regulatory Networks Using Conditional Regulation Pattern to Guide Candidate Genes

    PubMed Central

    Xiao, Fei; Gao, Lin; Ye, Yusen; Hu, Yuxuan; He, Ruijie

    2016-01-01

    Path consistency (PC) algorithms combined with conditional mutual information (CMI) are widely used in the reconstruction of gene regulatory networks. CMI has many advantages over the Pearson correlation coefficient in measuring non-linear dependence to infer gene regulatory networks. It can also discriminate direct regulations from indirect ones. However, it is still a challenge to select the conditional genes in an optimal way, which affects the performance and computational complexity of the PC algorithm. In this study, we develop a novel conditional mutual information-based algorithm, namely RPNI (Regulation Pattern based Network Inference), to infer gene regulatory networks. For conditional gene selection, we define the co-regulation pattern, indirect-regulation pattern and mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. To demonstrate the potential of our algorithm, we apply it to gene expression data from the DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in both accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noisy interference analysis using different types of noise. PMID:27171286
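
To make concrete how conditional mutual information can separate direct from indirect dependence, here is a minimal sketch that estimates I(X;Y|Z) from discretized expression levels using plug-in frequencies. It illustrates the quantity only and is not the RPNI algorithm; the function name and the toy profiles are assumptions, and practical methods often use continuous (for example Gaussian) estimators instead.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in bits from three equal-length discrete sequences."""
    n = len(x)
    if n == 0 or len(y) != n or len(z) != n:
        raise ValueError("x, y and z must be non-empty and of equal length")
    count_xyz = Counter(zip(x, y, z))
    count_xz = Counter(zip(x, z))
    count_yz = Counter(zip(y, z))
    count_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), n_xyz in count_xyz.items():
        # p(x,y,z) * log2[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ]; the 1/n factors cancel in the ratio.
        ratio = n_xyz * count_z[zi] / (count_xz[(xi, zi)] * count_yz[(yi, zi)])
        cmi += (n_xyz / n) * log2(ratio)
    return cmi


if __name__ == "__main__":
    # Hypothetical binarized profiles: gene z drives both x and y,
    # so x and y are dependent only through z.
    z = [0, 0, 1, 1, 0, 0, 1, 1]
    x = list(z)
    y = list(z)
    print(conditional_mutual_information(x, y, z))
```

In a PC-style procedure, an edge between two genes would be removed when the estimate falls below a chosen threshold for some conditioning gene; choosing which genes to condition on is the problem the abstract addresses.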

  17. Prediction and Validation of Gene Regulatory Elements Activated During Retinoic Acid Induced Embryonic Stem Cell Differentiation.

    PubMed

    Simandi, Zoltan; Horvath, Attila; Nagy, Peter; Nagy, Laszlo

    2016-01-01

    Embryonic development is a multistep process involving activation and repression of many genes. Enhancer elements in the genome are known to contribute to tissue and cell-type specific regulation of gene expression during cellular differentiation. Thus, their identification and further investigation are important in order to understand how cell fate is determined. Integration of gene expression data (e.g., microarray or RNA-seq) and results of chromatin immunoprecipitation (ChIP)-based genome-wide studies (ChIP-seq) allows large-scale identification of these regulatory regions. However, functional validation of cell-type specific enhancers requires further in vitro and in vivo experimental procedures. Here we describe how active enhancers can be identified and validated experimentally. This protocol provides a step-by-step workflow that includes: 1) identification of regulatory regions by ChIP-seq data analysis, 2) cloning and experimental validation of putative regulatory potential of the identified genomic sequences in a reporter assay, and 3) determination of enhancer activity in vivo by measuring enhancer RNA transcript level. The presented protocol is detailed enough to help anyone to set up this workflow in the lab. Importantly, the protocol can be easily adapted to and used in any cellular model system. PMID:27403939

  18. Using shotgun sequence data to find active restriction enzyme genes.

    PubMed

    Zheng, Yu; Posfai, Janos; Morgan, Richard D; Vincze, Tamas; Roberts, Richard J

    2009-01-01

    Whole genome shotgun sequence analysis has become the standard method for beginning to determine a genome sequence. The preparation of the shotgun sequence clones is, in fact, a biological experiment. It determines which segments of the genome can be cloned into Escherichia coli and which cannot. By analyzing the complete set of sequences from such an experiment, it is possible to identify genes lethal to E. coli. Among this set are genes encoding restriction enzymes which, when active in E. coli, lead to cell death by cleaving the E. coli genome at the restriction enzyme recognition sites. By analyzing shotgun sequence data sets we show that this is a reliable method to detect active restriction enzyme genes in newly sequenced genomes, thereby facilitating functional annotation. Active restriction enzyme genes have been identified, and their activity demonstrated biochemically, in the sequenced genomes of Methanocaldococcus jannaschii, Bacillus cereus ATCC 10987 and Methylococcus capsulatus. PMID:18988632

  19. Nucleotide sequence of the gene for human prothrombin

    SciTech Connect

    Degen, S.J.F.; Davie, E.W.

    1987-09-22

    A human genomic DNA library was screened for the gene coding for human prothrombin with a cDNA coding for the human protein. Eighty-one positive lambda phage were identified, and three were chosen for further characterization. These three phage hybridized with 5' and/or 3' probes prepared from the prothrombin cDNA. The complete DNA sequence of 21 kilobases of the human prothrombin gene was determined and included a 4.9-kilobase region that was previously sequenced. The gene for human prothrombin contains 14 exons separated by 13 intervening sequences. The exons range in size from 25 to 315 base pairs, while the introns range from 84 to 9447 base pairs. Ninety percent of the gene is composed of intervening sequence. All the intron splice junctions are consistent with sequences found in other eukaryotic genes, except for the presence of GC rather than GT on the 5' end of intervening sequence L. Thirty copies of Alu repetitive DNA and two copies of partial KpnI repeats were identified in clusters within several of the intervening sequences, and these repeats represent 40% of the DNA sequence of the gene. The size, distribution, and sequence homology of the introns within the gene were then compared to those of the genes for the other vitamin K dependent proteins and several other serine proteases.

  20. Mining expressed sequence tags of rapeseed (Brassica napus L.) to predict the drought responsive regulatory network.

    PubMed

    Shamloo-Dashtpagerdi, Roohollah; Razi, Hooman; Ebrahimie, Esmaeil

    2015-07-01

    It is of great significance to understand the regulatory mechanisms by which plants deal with drought stress. Two EST libraries derived from rapeseed (Brassica napus) leaves in non-stressed and drought stress conditions were analyzed in order to obtain the transcriptomic landscape of drought-exposed B. napus plants, and also to identify and characterize significant drought responsive regulatory genes and microRNAs. The functional ontology analysis revealed a substantial shift in the B. napus transcriptome to govern cellular drought responsiveness via different stress-activated mechanisms. The activity of transcription factor and protein kinase modules generally increased in response to drought stress. Twenty-six regulatory genes, consisting of 17 transcription factor genes, eight protein kinase genes and one protein phosphatase gene, were identified as showing significant alterations in their expression in response to drought stress. We also found six microRNAs that were differentially expressed during drought stress, supporting the involvement of a post-transcriptional level of regulation in the B. napus drought response. The drought responsive regulatory network shed light on the significance of some regulatory components involved in biosynthesis and signaling of various plant hormones (abscisic acid, auxin and brassinosteroids), ubiquitin proteasome system, and signaling through Reactive Oxygen Species (ROS). Our findings suggested a complex and multi-level regulatory system modulating response to drought stress in B. napus. PMID:26261397

  1. Sequence Requirements for Myosin Gene Expression and Regulation in Caenorhabditis Elegans

    PubMed Central

    Okkema, P. G.; Harrison, S. W.; Plunger, V.; Aryana, A.; Fire, A.

    1993-01-01

    Four Caenorhabditis elegans genes encode muscle-type specific myosin heavy chain isoforms: myo-1 and myo-2 are expressed in the pharyngeal muscles; unc-54 and myo-3 are expressed in body wall muscles. We have used transformation-rescue and lacZ fusion assays to determine sequence requirements for regulated myosin gene expression during development. Multiple tissue-specific activation elements are present for all four genes. For each of the four genes, sequences upstream of the coding region are tissue-specific promoters, as shown by their ability to drive expression of a reporter gene (lacZ) in the appropriate muscle type. Each gene contains at least one additional tissue-specific regulatory element, as defined by the ability to enhance expression of a heterologous promoter in the appropriate muscle type. In rescue experiments with unc-54, two further requirements apparently independent of tissue specificity were found: sequences within the 3' non-coding region are essential for activity while an intron near the 5' end augments expression levels. The general intron stimulation is apparently independent of intron sequence, indicating a mechanistic effect of splicing. To further characterize the myosin gene promoters and to examine the types of enhancer sequences in the genome, we have initiated a screen of C. elegans genomic DNA for fragments capable of enhancing the myo-2 promoter. The properties of enhancers recovered from this screen suggest that the promoter is limited to muscle cells in its ability to respond to enhancers. PMID:8244003

  2. The human actin-regulatory protein Cap G: Gene structure and chromosome location

    SciTech Connect

    Mishra, V.S.; Southwick, F.S.; Henske, E.P.; Kwiatkowski, D.J.

    1994-10-01

    Cap G (formerly called macrophage capping protein or gCap39) is a member of the gelsolin/villin family of actin-regulatory proteins. Unlike all other members of this family, Cap G caps the barbed ends of actin filaments, but does not sever them. This protein is half the molecular weight and contains half the number of repeat subunits (3 vs. 6) of gelsolin and villin, suggesting that these two proteins may have arisen by gene duplication of the Cap G gene. To investigate this possibility we have cloned and sequenced the human Cap G gene (gene symbol CAPG). The gene is 16.6 kb in size, contains 10 exons and 9 introns, and is located on the proximal short arm of chromosome 2. The open reading frame is 6.9 kb, having 9 exons and 8 introns. This region contains 3 splice sites that are nearly identical to the human gelsolin gene, but shares only one with villin, indicating that CAPG is more closely related to gelsolin. Further comparisons of these three genes, however, indicate that the evolutionary steps resulting in human gelsolin and villin are likely to have been more complex than a simple tandem duplication of the Cap G gene. 30 refs., 4 figs., 2 tabs.

  3. Redeployment of a conserved gene regulatory network during Aedes aegypti development.

    PubMed

    Suryamohan, Kushal; Hanson, Casey; Andrews, Emily; Sinha, Saurabh; Scheel, Molly Duman; Halfon, Marc S

    2016-08-15

    Changes in gene regulatory networks (GRNs) underlie the evolution of morphological novelty and developmental system drift. The fruitfly Drosophila melanogaster and the dengue and Zika vector mosquito Aedes aegypti have substantially similar nervous system morphology. Nevertheless, they show significant divergence in a set of genes co-expressed in the midline of the Drosophila central nervous system, including the master regulator single minded and downstream genes including short gastrulation, Star, and NetrinA. In contrast to Drosophila, we find that midline expression of these genes is either absent or severely diminished in A. aegypti. Instead, they are co-expressed in the lateral nervous system. This suggests that in A. aegypti this "midline GRN" has been redeployed to a new location while lost from its previous site of activity. In order to characterize the relevant GRNs, we employed the SCRMshaw method we previously developed to identify transcriptional cis-regulatory modules in both species. Analysis of these regulatory sequences in transgenic Drosophila suggests that the altered gene expression observed in A. aegypti is the result of trans-dependent redeployment of the GRN, potentially stemming from cis-mediated changes in the expression of sim and other as-yet unidentified regulators. Our results illustrate a novel "repeal, replace, and redeploy" mode of evolution in which a conserved GRN acquires a different function at a new site while its original function is co-opted by a different GRN. This represents a striking example of developmental system drift in which the dramatic shift in gene expression does not result in gross morphological changes, but in more subtle differences in development and function of the late embryonic nervous system. PMID:27341759

  4. Identification of a DNA methylation-dependent activator sequence in the pseudoxanthoma elasticum gene, ABCC6.

    PubMed

    Arányi, Tamás; Ratajewski, Marcin; Bardóczy, Viola; Pulaski, Lukasz; Bors, András; Tordai, Attila; Váradi, András

    2005-05-13

    ABCC6 encodes MRP6, a member of the ABC protein family with an unknown physiological role. The human ABCC6 and its two pseudogenes share 99% identical DNA sequence. Loss-of-function mutations of ABCC6 are associated with the development of pseudoxanthoma elasticum (PXE), a recessive hereditary disorder affecting the elastic tissues. Various disease-causing mutations were found in the coding region; however, the mutation detection rate in the ABCC6 coding region of bona fide PXE patients is only approximately 80%. This suggests that polymorphisms or mutations in the regulatory regions may contribute to the development of the disease. Here, we report the first characterization of the ABCC6 gene promoter. Phylogenetic in silico analysis of the 5' regulatory regions revealed the presence of two evolutionarily conserved sequence elements embedded in CpG islands. The study of DNA methylation of ABCC6 and the pseudogenes identified a correlation between the methylation of the CpG island in the proximal promoter and the ABCC6 expression level in cell lines. Both activator and repressor sequences were uncovered in the proximal promoter by reporter gene assays. The most potent activator sequence was one of the conserved elements protected by DNA methylation on the endogenous gene in non-expressing cells. Finally, in vitro methylation of this sequence inhibits the transcriptional activity of the luciferase promoter constructs. Altogether these results identify a DNA methylation-dependent activator sequence in the ABCC6 promoter. PMID:15760889

  5. Prediction of Regulatory Interactions from Genome Sequences Using a Biophysical Model for the Arabidopsis LEAFY Transcription Factor

    PubMed Central

    Moyroud, Edwige; Minguet, Eugenio Gómez; Ott, Felix; Yant, Levi; Posé, David; Monniaux, Marie; Blanchet, Sandrine; Bastien, Olivier; Thévenon, Emmanuel; Weigel, Detlef; Schmid, Markus; Parcy, François

    2011-01-01

    Despite great advances in sequencing technologies, generating functional information for nonmodel organisms remains a challenge. One solution lies in an improved ability to predict genetic circuits based on primary DNA sequence in combination with detailed knowledge of regulatory proteins that have been characterized in model species. Here, we focus on the LEAFY (LFY) transcription factor, a conserved master regulator of floral development. Starting with biochemical and structural information, we built a biophysical model describing LFY DNA binding specificity in vitro that accurately predicts in vivo LFY binding sites in the Arabidopsis thaliana genome. Applying the model to other plant species, we could follow the evolution of the regulatory relationship between LFY and the AGAMOUS (AG) subfamily of MADS box genes and show that this link predates the divergence between monocots and eudicots. Remarkably, our model succeeds in detecting the connection between LFY and AG homologs despite extensive variation in binding sites. This demonstrates that the cis-element fluidity recently observed in animals also exists in plants, but the challenges it poses can be overcome with predictions grounded in a biophysical model. Therefore, our work opens new avenues to deduce the structure of regulatory networks from mere inspection of genomic sequences. PMID:21515819
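
Sequence-based prediction of binding sites is often introduced through a position weight matrix (PWM) that scores every window of a sequence. The sketch below shows that generic technique as a point of reference; it is not the biophysical LFY model described in the abstract, whose details are not given here. The count matrix, pseudocount, uniform background, and function names are all hypothetical.

```python
from math import log2

BASES = "ACGT"


def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds scores against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.get(base, 0) for base in BASES) + 4 * pseudocount
        pwm.append({
            base: log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def best_site(sequence, pwm):
    """Return (start, score) of the highest-scoring window, or None if the sequence is too short.

    Bases outside A/C/G/T (for example N) are given a neutral score of 0.
    """
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        score = sum(pwm[j].get(sequence[start + j], 0.0) for j in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best


if __name__ == "__main__":
    # Hypothetical 3-bp motif built from ten aligned sites that all read "ACG".
    pwm = pwm_from_counts([{"A": 10}, {"C": 10}, {"G": 10}])
    print(best_site("TTACGTT", pwm))
```

A model of this kind treats positions as independent, which is a simplifying assumption of the example.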

  6. Ensemble Inference and Inferability of Gene Regulatory Networks

    PubMed Central

    Ud-Dean, S. M. Minhaz; Gunawan, Rudiyanto

    2014-01-01

    The inference of gene regulatory networks (GRNs) from gene expression data is an unsolved problem of great importance. This inference has been stated, though not proven, to be underdetermined, implying that there could be many equivalent (indistinguishable) solutions. Motivated by this fundamental limitation, we have developed a new framework and algorithm, called TRaCE, for the ensemble inference of GRNs. The ensemble corresponds to the inherent uncertainty associated with discriminating direct and indirect gene regulations from steady-state data of gene knock-out (KO) experiments. We applied TRaCE to analyze the inferability of random GRNs and the GRNs of E. coli and yeast from single- and double-gene KO experiments. The results showed that, with the exception of networks with very few edges, GRNs are typically not inferable even when the data are ideal (unbiased and noise-free). Finally, we compared the performance of TRaCE with the top-performing methods of the DREAM4 in silico network inference challenge. PMID:25093509

  7. Diverse Gene Expression in Human Regulatory T Cell Subsets Uncovers Connection between Regulatory T Cell Genes and Suppressive Function.

    PubMed

    Hua, Jing; Davis, Scott P; Hill, Jonathan A; Yamagata, Tetsuya

    2015-10-15

    Regulatory T (Treg) cells have a critical role in the control of immunity, and their diverse subpopulations may allow adaptation to different types of immune responses. In this study, we analyzed human Treg cell subpopulations in the peripheral blood by performing genome-wide expression profiling of 40 Treg cell subsets from healthy donors. We found that the human peripheral blood Treg cell population is comprised of five major genomic subgroups, represented by 16 tractable subsets with a particular cell surface phenotype. These subsets possess a range of suppressive function and cytokine secretion and can exert a genomic footprint on target effector T (Teff) cells. Correlation analysis of variability in gene expression in the subsets identified several cell surface molecules associated with Treg suppressive function, and pharmacological interrogation revealed a set of genes having causative effect. The five genomic subgroups of Treg cells imposed a preserved pattern of gene expression on Teff cells, with a varying degree of genes being suppressed or induced. Notably, there was a cluster of genes induced by Treg cells that bolstered an autoinhibitory effect in Teff cells, and this induction appears to be governed by a different set of genes than ones involved in counteracting Teff activation. Our work shows an example of exploiting the diversity within human Treg cell subpopulations to dissect Treg cell biology. PMID:26371251

  8. Transcriptomic Sequencing Reveals a Set of Unique Genes Activated by Butyrate-Induced Histone Modification.

    PubMed

    Li, Cong-Jun; Li, Robert W; Baldwin, Ransom L; Blomberg, Le Ann; Wu, Sitao; Li, Weizhong

    2016-01-01

    Butyrate is a nutritional element with strong epigenetic regulatory activity as a histone deacetylase inhibitor. Based on the analysis of differentially expressed genes in bovine epithelial cells using RNA sequencing technology, a set of unique genes that are activated only after butyrate treatment was revealed. A complementary bioinformatics analysis of the functional category, pathway, and integrated network, using Ingenuity Pathways Analysis, indicated that these genes activated by butyrate treatment are related to major cellular functions, including cell morphological changes, cell cycle arrest, and apoptosis. Our results offered insight into the butyrate-induced transcriptomic changes and will accelerate our understanding of the molecular fundamentals of epigenomic regulation. PMID:26819550

  9. Transcriptomic Sequencing Reveals a Set of Unique Genes Activated by Butyrate-Induced Histone Modification

    PubMed Central

    Li, Cong-Jun; Li, Robert W.; Baldwin, Ransom L.; Blomberg, Le Ann; Wu, Sitao; Li, Weizhong

    2016-01-01

    Butyrate is a nutritional element with strong epigenetic regulatory activity as a histone deacetylase inhibitor. Based on the analysis of differentially expressed genes in bovine epithelial cells using RNA sequencing technology, a set of unique genes that are activated only after butyrate treatment was revealed. A complementary bioinformatics analysis of the functional category, pathway, and integrated network, using Ingenuity Pathways Analysis, indicated that these genes activated by butyrate treatment are related to major cellular functions, including cell morphological changes, cell cycle arrest, and apoptosis. Our results offered insight into the butyrate-induced transcriptomic changes and will accelerate our understanding of the molecular fundamentals of epigenomic regulation. PMID:26819550

  10. Recognition of Yeast Species from Gene Sequence Comparisons

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This review discusses recognition of yeast species from gene sequence comparisons, which have been responsible for doubling the number of known yeasts over the past decade. The resolution provided by various single gene sequences is examined for both ascomycetous and basidiomycetous species, and th...

  11. Reclassification of ascomycetous yeasts from gene sequence analyses

    Technology Transfer Automated Retrieval System (TEKTRAN)

    During the past decade, identification of yeasts and their classification has been based almost exclusively on gene sequence analysis. Primarily as a result of using diagnostic gene sequences, such as D1/D2 LSU and ITS ribosomal RNAs, the number of known species has doubled. With the faster sequen...

  12. Inference of Gene Regulatory Network Based on Local Bayesian Networks

    PubMed Central

    Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Chen, Luonan

    2016-01-01

    The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During the past decades, numerous computational approaches have been introduced for inferring GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome these limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using a network decomposition strategy and a false-positive edge elimination scheme. Specifically, the LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, the BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce

  13. Inference of Gene Regulatory Network Based on Local Bayesian Networks.

    PubMed

    Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Wei, Ze-Gang; Chen, Luonan

    2016-08-01

    The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During the past decades, numerous computational approaches have been introduced for inferring GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome these limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using a network decomposition strategy and a false-positive edge elimination scheme. Specifically, the LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, the BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce

  14. Evolutionary and Topological Properties of Genes and Community Structures in Human Gene Regulatory Networks

    PubMed Central

    Szedlak, Anthony; Smith, Nicholas; Liu, Li; Paternostro, Giovanni; Piermarocchi, Carlo

    2016-01-01

    The diverse, specialized genes present in today’s lifeforms evolved from a common core of ancient, elementary genes. However, these genes did not evolve individually: gene expression is controlled by a complex network of interactions, and alterations in one gene may drive reciprocal changes in its proteins’ binding partners. Like many complex networks, these gene regulatory networks (GRNs) are composed of communities, or clusters of genes with relatively high connectivity. A deep understanding of the relationship between the evolutionary history of single genes and the topological properties of the underlying GRN is integral to evolutionary genetics. Here, we show that the topological properties of an acute myeloid leukemia GRN and a general human GRN are strongly coupled with its genes’ evolutionary properties. Slowly evolving (“cold”), old genes tend to interact with each other, as do rapidly evolving (“hot”), young genes. This naturally causes genes to segregate into community structures with relatively homogeneous evolutionary histories. We argue that gene duplication placed old, cold genes and communities at the center of the networks, and young, hot genes and communities at the periphery. We demonstrate this with single-node centrality measures and two new measures of efficiency, the set efficiency and the interset efficiency. We conclude that these methods for studying the relationships between a GRN’s community structures and its genes’ evolutionary properties provide new perspectives for understanding evolutionary genetics. PMID:27359334

  15. Sequence evolution and expression regulation of stress-responsive genes in natural populations of wild tomato.

    PubMed

    Fischer, Iris; Steige, Kim A; Stephan, Wolfgang; Mboup, Mamadou

    2013-01-01

    The wild tomato species Solanum chilense and S. peruvianum are a valuable non-model system for studying plant adaptation since they grow in diverse environments facing many abiotic constraints. Here we investigate the sequence evolution of regulatory regions of drought and cold responsive genes and their expression regulation. The coding regions of these genes were previously shown to exhibit signatures of positive selection. Expression profiles and sequence evolution of regulatory regions of members of the Asr (ABA/water stress/ripening induced) gene family and the dehydrin gene pLC30-15 were analyzed in wild tomato populations from contrasting environments. For S. chilense, we found that Asr4 and pLC30-15 appear to respond much faster to drought conditions in accessions from very dry environments than accessions from more mesic locations. Sequence analysis suggests that the promoter of Asr2 and the downstream region of pLC30-15 are under positive selection in some local populations of S. chilense. By investigating gene expression differences at the population level we provide further support of our previous conclusions that Asr2, Asr4, and pLC30-15 are promising candidates for functional studies of adaptation. Our analysis also demonstrates the power of the candidate gene approach in evolutionary biology research and highlights the importance of wild Solanum species as a genetic resource for their cultivated relatives. PMID:24205149

  16. Sequence Evolution and Expression Regulation of Stress-Responsive Genes in Natural Populations of Wild Tomato

    PubMed Central

    Fischer, Iris; Steige, Kim A.; Stephan, Wolfgang; Mboup, Mamadou

    2013-01-01

    The wild tomato species Solanum chilense and S. peruvianum are a valuable non-model system for studying plant adaptation since they grow in diverse environments facing many abiotic constraints. Here we investigate the sequence evolution of regulatory regions of drought and cold responsive genes and their expression regulation. The coding regions of these genes were previously shown to exhibit signatures of positive selection. Expression profiles and sequence evolution of regulatory regions of members of the Asr (ABA/water stress/ripening induced) gene family and the dehydrin gene pLC30-15 were analyzed in wild tomato populations from contrasting environments. For S. chilense, we found that Asr4 and pLC30-15 appear to respond much faster to drought conditions in accessions from very dry environments than accessions from more mesic locations. Sequence analysis suggests that the promoter of Asr2 and the downstream region of pLC30-15 are under positive selection in some local populations of S. chilense. By investigating gene expression differences at the population level we provide further support of our previous conclusions that Asr2, Asr4, and pLC30-15 are promising candidates for functional studies of adaptation. Our analysis also demonstrates the power of the candidate gene approach in evolutionary biology research and highlights the importance of wild Solanum species as a genetic resource for their cultivated relatives. PMID:24205149

  17. A Trans-Acting Regulatory Gene That Inversely Affects the Expression of the White, Brown and Scarlet Loci in Drosophila

    PubMed Central

    Rabinow, L.; Nguyen-Huynh, A. T.; Birchler, J. A.

    1991-01-01

    A trans-acting regulatory gene, Inr-a, that alters the level of expression of the white eye color locus as an inverse function of the number of its functional copies is described. Several independent lines of evidence demonstrate that this regulatory gene interacts with white via the promoter sequences. Among these are the observations that the inverse regulatory effect is conferred to the Adh gene when fused to the white promoter and that cis-regulatory mutants of white fail to respond. The phenotypic response to Inr-a is found in all tissues in which white is expressed, and mutants of the regulator exhibit a recessive lethality during larval periods. Increased white messenger RNA levels in pupal stages are found in Inr-a/+ individuals versus +/+ and a coordinate response is observed for mRNA levels from the brown and scarlet loci. All are structurally related and participate in pigment deposition. These experiments demonstrate that a single regulatory gene can exert an inverse effect on a target structural locus, a situation postulated from segmental aneuploid studies of gene expression and dosage compensation. PMID:1743487

  18. The molecular evolution of terminal ear1, a regulatory gene in the genus Zea.

    PubMed Central

    White, S E; Doebley, J F

    1999-01-01

    Nucleotide diversity in the terminal ear1 (te1) gene, a regulatory locus hypothesized to be involved in the morphological evolution of maize (Zea mays ssp. mays), was investigated for evidence of past selection. Nucleotide polymorphism in a 1.4-kb region of te1 was analyzed for a sample of 26 sequences isolated from 12 maize lines, five populations of the maize progenitor, Z. mays ssp. parviglumis, six other Zea populations, and two Tripsacum species. Although nucleotide diversity in te1 in maize is reduced relative to ssp. parviglumis, phylogenetic and statistical analyses of the pattern of polymorphism among these sequences provided no evidence of past selection, indicating that the region of the gene studied was probably not involved in maize evolution. The level of reduction in genetic diversity in te1 in maize relative to its progenitor is comparable to that found in previous reports for isozymes and other neutrally evolving maize genes and is consistent with a genome-wide reduction of genetic diversity resulting from a domestication bottleneck. An estimate of the age (1.2-1.4 million yr) of the maize gene pool based on te1 is roughly consistent with previous estimates based on other neutral genes, but may be biased by the apparently slow synonymous substitution rate at te1. PMID:10545473
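
    The nucleotide diversity statistic discussed in this abstract is commonly computed as the average number of pairwise differences per site across a set of aligned sequences. The sketch below illustrates that standard definition only; the function name and the toy alignment are hypothetical, gaps and missing data are not handled, and nothing here reproduces the te1 data or the authors' actual analysis.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Average pairwise differences per site (pi) for equal-length aligned sequences."""
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if length == 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    pairs = list(combinations(aligned_seqs, 2))
    # Count mismatching sites for every unordered pair of sequences.
    total_diffs = sum(
        sum(1 for a, b in zip(s1, s2) if a != b) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment invented for illustration (assumption: not real te1 sequences).
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
pi = nucleotide_diversity(toy_alignment)
```

    Comparing this value between a crop sample and a progenitor sample gives the kind of relative reduction in diversity that the abstract describes.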

  19. Fusion genes and their discovery using high throughput sequencing.

    PubMed

    Annala, M J; Parker, B C; Zhang, W; Nykter, M

    2013-11-01

    Fusion genes are hybrid genes that combine parts of two or more original genes. They can form as a result of chromosomal rearrangements or abnormal transcription, and have been shown to act as drivers of malignant transformation and progression in many human cancers. The biological significance of fusion genes together with their specificity to cancer cells has made them into excellent targets for molecular therapy. Fusion genes are also used as diagnostic and prognostic markers to confirm cancer diagnosis and monitor response to molecular therapies. High-throughput sequencing has enabled the systematic discovery of fusion genes in a wide variety of cancer types. In this review, we describe the history of fusion genes in cancer and the ways in which fusion genes form and affect cellular function. We also describe computational methodologies for detecting fusion genes from high-throughput sequencing experiments, and the most common sources of error that lead to false discovery of fusion genes. PMID:23376639

  20. Neurogenic gene regulatory pathways in the sea urchin embryo.

    PubMed

    Wei, Zheng; Angerer, Lynne M; Angerer, Robert C

    2016-01-15

    During embryogenesis the sea urchin early pluteus larva differentiates 40-50 neurons marked by expression of the pan-neural marker synaptotagmin B (SynB) that are distributed along the ciliary band, in the apical plate and pharyngeal endoderm, and 4-6 serotonergic neurons that are confined to the apical plate. Development of all neurons has been shown to depend on the function of Six3. Using a combination of molecular screens and tests of gene function by morpholino-mediated knockdown, we identified SoxC and Brn1/2/4, which function sequentially in the neurogenic regulatory pathway and are also required for the differentiation of all neurons. Misexpression of Brn1/2/4 at low dose caused an increase in the number of serotonin-expressing cells and at higher dose converted most of the embryo to a neurogenic epithelial sphere expressing the Hnf6 ciliary band marker. A third factor, Z167, was shown to work downstream of the Six3 and SoxC core factors and to define a branch specific for the differentiation of serotonergic neurons. These results provide a framework for building a gene regulatory network for neurogenesis in the sea urchin embryo. PMID:26657764

  1. Regulatory network of microRNAs, target genes, transcription factors and host genes in endometrial cancer.

    PubMed

    Xue, Lu-Chen; Xu, Zhi-Wen; Wang, Kun-Hao; Wang, Ning; Zhang, Xiao-Xu; Wang, Shang

    2015-01-01

    Genes and microRNAs (miRNAs) have important roles in human oncology. However, most of the biological factors are reported in dispersed form, which makes it hard to discover the pathology. In this study, genes and miRNAs involved in human endometrial cancer (EC) were collected and formed into regulatory networks following their interactive relations, including miRNAs targeting genes, transcription factors (TFs) regulating miRNAs and miRNAs included in their host genes. Networks are constructed hierarchically at three levels: differentially expressed, related and global. Among the three, the differentially expressed network is the most important and fundamental network that contains the key genes and miRNAs in EC. The target genes, TFs and miRNAs are differentially expressed in EC so that any mutation in them may impact EC development. Some key pathways in networks were highlighted to analyze how they interactively influence other factors and carcinogenesis. Upstream and downstream pathways of the differentially expressed genes and miRNAs were compared and analyzed. The purpose of this study was to partially reveal the deep regulatory mechanisms in EC using a new method that combines comprehensive genes and miRNAs together with their relationships. It may contribute to cancer prevention and gene therapy of EC. PMID:25684474

  2. Regulatory Oversight of Cell and Gene Therapy Products in Canada.

    PubMed

    Ridgway, Anthony; Agbanyo, Francisca; Wang, Jian; Rosu-Myles, Michael

    2015-01-01

    Health Canada regulates gene therapy products and many cell therapy products as biological drugs under the Canadian Food and Drugs Act and its attendant regulations. Cellular products that meet certain criteria, including minimal manipulation and homologous use, may be subjected to a standards-based approach under the Safety of Human Cells, Tissues and Organs for Transplantation Regulations. The manufacture and clinical testing of cell and gene therapy products (CGTPs) presents many challenges beyond those for protein biologics. Cells cannot be subjected to pathogen removal or inactivation procedures and must frequently be administered shortly after final formulation. Viral vector design and manufacturing control are critically important to overall product quality and linked to safety and efficacy in patients through concerns such as replication competence, vector integration, and vector shedding. In addition, for many CGTPs, the value of nonclinical studies is largely limited to providing proof of concept, and the first meaningful data relating to appropriate dosing, safety parameters, and validity of surrogate or true determinants of efficacy must come from carefully designed clinical trials in patients. Addressing these numerous challenges requires application of various risk mitigation strategies and meeting regulatory expectations specifically adapted to the product types. Regulatory cooperation and harmonisation at an international level are essential for progress in the development and commercialisation of these products. However, particularly in the area of cell therapy, new regulatory paradigms may be needed to harness the benefits of clinical progress in situations where the resources and motivation to pursue a typical drug product approval pathway may be lacking. PMID:26374212

  3. Using machine learning to predict gene expression and discover sequence motifs

    NASA Astrophysics Data System (ADS)

    Li, Xuejing

    Recently, large amounts of experimental data for complex biological systems have become available. We use tools and algorithms from machine learning to build data-driven predictive models. We first present a novel algorithm to discover gene sequence motifs associated with temporal expression patterns of genes. Our algorithm, which is based on partial least squares (PLS) regression, is able to directly model the flow of information, from gene sequence to gene expression, to learn cis regulatory motifs and characterize associated gene expression patterns. Our algorithm outperforms traditional computational methods e.g. clustering in motif discovery. We then present a study of extending a machine learning model for transcriptional regulation predictive of genetic regulatory response to Caenorhabditis elegans. We show meaningful results both in terms of prediction accuracy on the test experiments and biological information extracted from the regulatory program. The model discovers DNA binding sites ab initio. We also present a case study where we detect a signal of lineage-specific regulation. Finally we present a comparative study on learning predictive models for motif discovery, based on different boosting algorithms: Adaptive Boosting (AdaBoost), Linear Programming Boosting (LPBoost) and Totally Corrective Boosting (TotalBoost). We evaluate and compare the performance of the three boosting algorithms via both statistical and biological validation, for hypoxia response in Saccharomyces cerevisiae.

  4. Characterization of Putative cis-Regulatory Elements in Genes Preferentially Expressed in Arabidopsis Male Meiocytes

    PubMed Central

    Li, Mingjun

    2014-01-01

    Meiosis is essential for plant reproduction because it is the process during which homologous chromosome pairing, synapsis, and meiotic recombination occur. The meiotic transcriptome is difficult to investigate because of the size of meiocytes and the confines of anther lobes. The recent development of isolation techniques has enabled the characterization of transcriptional profiles in male meiocytes of Arabidopsis. Gene expression in male meiocytes shows unique features. The direct interaction of transcription factors (TFs) with DNA regulatory sequences forms the basis for the specificity of transcriptional regulation. Here, we identified putative cis-regulatory elements (CREs) associated with male meiocyte-expressed genes using in silico tools. The upstream regions (1 kb) of the top 50 genes preferentially expressed in Arabidopsis meiocytes possessed conserved motifs. These motifs are putative binding sites of TFs, some of which share common functions, such as roles in cell division. In combination with cell-type-specific analysis, our findings could be a substantial aid for the identification and experimental verification of the protein-DNA interactions for the specific TFs that drive gene expression in meiocytes. PMID:25250331

  5. Metatranscriptomic insights on gene expression and regulatory controls in Candidatus Accumulibacter phosphatis.

    PubMed

    Oyserman, Ben O; Noguera, Daniel R; del Rio, Tijana Glavina; Tringe, Susannah G; McMahon, Katherine D

    2016-04-01

    Previous studies on enhanced biological phosphorus removal (EBPR) have focused on reconstructing genomic blueprints for the model polyphosphate-accumulating organism Candidatus Accumulibacter phosphatis. Here, a time series metatranscriptome generated from enrichment cultures of Accumulibacter was used to gain insight into anaerobic/aerobic metabolism and regulatory mechanisms within an EBPR cycle. Co-expressed gene clusters were identified displaying ecologically relevant trends consistent with batch cycle phases. Transcripts displaying increased abundance during anaerobic acetate contact were functionally enriched in energy production and conversion, including upregulation of both cytoplasmic and membrane-bound hydrogenases demonstrating the importance of transcriptional regulation to manage energy and electron flux during anaerobic acetate contact. We hypothesized and demonstrated hydrogen production after anaerobic acetate contact, a previously unknown strategy for Accumulibacter to maintain redox balance. Genes involved in anaerobic glycine utilization were identified and phosphorus release after anaerobic glycine contact demonstrated, suggesting that Accumulibacter routes diverse carbon sources to acetyl-CoA formation via previously unrecognized pathways. A comparative genomics analysis of sequences upstream of co-expressed genes identified two statistically significant putative regulatory motifs. One palindromic motif was identified upstream of genes involved in PHA synthesis and acetate activation and is hypothesized to be a phaR binding site, hence representing a hypothetical PHA modulon. A second motif was identified ~35 base pairs (bp) upstream of a large and diverse array of genes and hence may represent a sigma factor binding site. This analysis provides a basis and framework for further investigations into Accumulibacter metabolism and the reconstruction of regulatory networks in uncultured organisms. PMID:26555245
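
    In the DNA sense used here, a palindromic motif is a sequence that equals its own reverse complement. A minimal sketch of that check follows; the helper names and the example sequences are hypothetical and are not the motif reported in the abstract.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)


# Illustrative sequences only (assumption: unrelated to the reported motif).
example_palindrome = is_palindromic("GAATTC")
example_non_palindrome = is_palindromic("GATTC")
```

    Real regulatory palindromes are often imperfect, so a practical screen would likely tolerate a few mismatches rather than require exact equality.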

  6. Metatranscriptomic insights on gene expression and regulatory controls in Candidatus Accumulibacter phosphatis

    PubMed Central

    Oyserman, Ben O; Noguera, Daniel R; del Rio, Tijana Glavina; Tringe, Susannah G; McMahon, Katherine D

    2016-01-01

    Previous studies on enhanced biological phosphorus removal (EBPR) have focused on reconstructing genomic blueprints for the model polyphosphate-accumulating organism Candidatus Accumulibacter phosphatis. Here, a time series metatranscriptome generated from enrichment cultures of Accumulibacter was used to gain insight into anaerobic/aerobic metabolism and regulatory mechanisms within an EBPR cycle. Co-expressed gene clusters were identified displaying ecologically relevant trends consistent with batch cycle phases. Transcripts displaying increased abundance during anaerobic acetate contact were functionally enriched in energy production and conversion, including upregulation of both cytoplasmic and membrane-bound hydrogenases demonstrating the importance of transcriptional regulation to manage energy and electron flux during anaerobic acetate contact. We hypothesized and demonstrated hydrogen production after anaerobic acetate contact, a previously unknown strategy for Accumulibacter to maintain redox balance. Genes involved in anaerobic glycine utilization were identified and phosphorus release after anaerobic glycine contact demonstrated, suggesting that Accumulibacter routes diverse carbon sources to acetyl-CoA formation via previously unrecognized pathways. A comparative genomics analysis of sequences upstream of co-expressed genes identified two statistically significant putative regulatory motifs. One palindromic motif was identified upstream of genes involved in PHA synthesis and acetate activation and is hypothesized to be a phaR binding site, hence representing a hypothetical PHA modulon. A second motif was identified ~35 base pairs (bp) upstream of a large and diverse array of genes and hence may represent a sigma factor binding site. This analysis provides a basis and framework for further investigations into Accumulibacter metabolism and the reconstruction of regulatory networks in uncultured organisms. PMID:26555245

  7. Construction of gene regulatory networks using biclustering and bayesian networks

    PubMed Central

    2011-01-01

    Background: Understanding gene interactions in complex living systems can be seen as the ultimate goal of the systems biology revolution. Hence, to elucidate disease ontology fully and to reduce the cost of drug development, gene regulatory networks (GRNs) have to be constructed. During the last decade, many GRN inference algorithms based on genome-wide data have been developed to unravel the complexity of gene regulation. Time series transcriptomic data measured by genome-wide DNA microarrays are traditionally used for GRN modelling. One of the major problems with microarrays is that a dataset consists of relatively few time points with respect to the large number of genes. Dimensionality is one of the interesting problems in GRN modelling. Results: In this paper, we develop a biclustering function enrichment analysis toolbox (BicAT-plus) to study the effect of biclustering in reducing data dimensions. The network generated from our system was validated via available interaction databases and was compared with previous methods. The results revealed the performance of our proposed method. Conclusions: Because of the sparse nature of GRNs, the results of biclustering techniques differ significantly from those of previous methods. PMID:22018164

  8. Identification of a gene regulatory network associated with prion replication

    PubMed Central

    Marbiah, Masue M; Harvey, Anna; West, Billy T; Louzolo, Anais; Banerjee, Priya; Alden, Jack; Grigoriadis, Anita; Hummerich, Holger; Kan, Ho-Man; Cai, Ying; Bloom, George S; Jat, Parmjit; Collinge, John; Klöhn, Peter-Christian

    2014-01-01

    Prions consist of aggregates of abnormal conformers of the cellular prion protein (PrPC). They propagate by recruiting host-encoded PrPC although the critical interacting proteins and the reasons for the differences in susceptibility of distinct cell lines and populations are unknown. We derived a lineage of cell lines with markedly differing susceptibilities, unexplained by PrPC expression differences, to identify such factors. Transcriptome analysis of prion-resistant revertants, isolated from highly susceptible cells, revealed a gene expression signature associated with susceptibility and modulated by differentiation. Several of these genes encode proteins with a role in extracellular matrix (ECM) remodelling, a compartment in which disease-related PrP is deposited. Silencing nine of these genes significantly increased susceptibility. Silencing of Papss2 led to undersulphated heparan sulphate and increased PrPC deposition at the ECM, concomitantly with increased prion propagation. Moreover, inhibition of fibronectin 1 binding to integrin α8 by RGD peptide inhibited metalloproteinases (MMP)-2/9 whilst increasing prion propagation. In summary, we have identified a gene regulatory network associated with prion propagation at the ECM and governed by the cellular differentiation state. PMID:24843046

  9. Cross-Tissue Regulatory Gene Networks in Coronary Artery Disease.

    PubMed

    Talukdar, Husain A; Foroughi Asl, Hassan; Jain, Rajeev K; Ermel, Raili; Ruusalepp, Arno; Franzén, Oscar; Kidd, Brian A; Readhead, Ben; Giannarelli, Chiara; Kovacic, Jason C; Ivert, Torbjörn; Dudley, Joel T; Civelek, Mete; Lusis, Aldons J; Schadt, Eric E; Skogsberg, Josefin; Michoel, Tom; Björkegren, Johan L M

    2016-03-23

    Inferring molecular networks can reveal how genetic perturbations interact with environmental factors to cause common complex diseases. We analyzed genetic and gene expression data from seven tissues relevant to coronary artery disease (CAD) and identified regulatory gene networks (RGNs) and their key drivers. By integrating data from genome-wide association studies, we identified 30 CAD-causal RGNs interconnected in vascular and metabolic tissues, and we validated them with corresponding data from the Hybrid Mouse Diversity Panel. As proof of concept, by targeting the key drivers AIP, DRAP1, POLR2I, and PQBP1 in a cross-species-validated, arterial-wall RGN involving RNA-processing genes, we re-identified this RGN in THP-1 foam cells and independent data from CAD macrophages and carotid lesions. This characterization of the molecular landscape in CAD will help better define the regulation of CAD candidate genes identified by genome-wide association studies and is a first step toward achieving the goals of precision medicine. PMID:27135365

  10. Compartmentalized gene regulatory network of the pathogenic fungus Fusarium graminearum.

    PubMed

    Guo, Li; Zhao, Guoyi; Xu, Jin-Rong; Kistler, H Corby; Gao, Lixin; Ma, Li-Jun

    2016-07-01

    Head blight caused by Fusarium graminearum threatens world-wide wheat production, resulting in both yield loss and mycotoxin contamination. We reconstructed the global F. graminearum gene regulatory network (GRN) from a large collection of transcriptomic data using Bayesian network inference, a machine-learning algorithm. This GRN reveals connectivity between key regulators and their target genes. Focusing on key regulators, this network contains eight distinct but interwoven modules. Enriched for unique functions, such as cell cycle, DNA replication, transcription, translation and stress responses, each module exhibits distinct expression profiles. Evolutionarily, the F. graminearum genome can be divided into core regions shared with closely related species and variable regions harboring genes that are unique to F. graminearum and perform species-specific functions. Interestingly, the inferred top regulators regulate genes that are significantly enriched from the same genomic regions (P < 0.05), revealing a compartmentalized network structure that may reflect network rewiring related to specific adaptation of this plant pathogen. This first-ever reconstructed filamentous fungal GRN primes our understanding of pathogenicity at the systems biology level and provides enticing prospects for novel disease control strategies involving the targeting of master regulators in pathogens. The program can be used to construct GRNs of other plant pathogens. PMID:26990214

  11. Sequence homologies in the protamine gene family of rainbow trout.

    PubMed Central

    Aiken, J M; McKenzie, D; Zhao, H Z; States, J C; Dixon, G H

    1983-01-01

    We have sequenced five different rainbow trout protamine genes plus their flanking regions. The genes are not clustered and do not contain intervening sequences. There is an extremely high degree of sequence conservation in the coding and 3' untranslated regions of the gene. Downstream sequences exhibit little homology though conserved regions are found 250 base pairs 3' to the gene. There are four regions upstream of the gene that are highly conserved in the six clones, including the canonical Goldberg-Hogness box which is 45 base pairs 5' to the coding region. A second homologous region is found 90 bases upstream. Although in the same approximate location as the CAAT box found upstream of other genes, it does not contain the canonical CAAT sequence. Further upstream of the protamine genes at -115 there is an A-T rich sequence while a 25 base pair conserved sequence is located 150 bases upstream. In addition we report the presence of a potential Z-DNA region of predominantly A-C repeats approximately one kilobase downstream of one of the genes. PMID:6308564

  12. Cis- and Trans-Regulatory Mechanisms of Gene Expression in the ASJ Sensory Neuron of Caenorhabditis elegans

    PubMed Central

    González-Barrios, María; Fierro-González, Juan Carlos; Krpelanova, Eva; Mora-Lorca, José Antonio; Pedrajas, José Rafael; Peñate, Xenia; Chavez, Sebastián; Swoboda, Peter; Jansen, Gert; Miranda-Vizuete, Antonio

    2015-01-01

    The identity of a given cell type is determined by the expression of a set of genes sharing common cis-regulatory motifs and being regulated by shared transcription factors. Here, we identify cis and trans regulatory elements that drive gene expression in the bilateral sensory neuron ASJ, located in the head of the nematode Caenorhabditis elegans. For this purpose, we have dissected the promoters of the only two genes so far reported to be exclusively expressed in ASJ, trx-1 and ssu-1. We hereby identify the ASJ motif, a functional cis-regulatory bipartite promoter region composed of two individual 6 bp elements separated by a 3 bp linker. The first element is a 6 bp CG-rich sequence that presumably binds the Sp family member zinc-finger transcription factor SPTF-1. Interestingly, within the C. elegans nervous system SPTF-1 is also found to be expressed only in ASJ neurons where it regulates expression of other genes in these neurons and ASJ cell fate. The second element of the bipartite motif is a 6 bp AT-rich sequence that is predicted to potentially bind a transcription factor of the homeobox family. Together, our findings identify a specific promoter signature and SPTF-1 as a transcription factor that functions as a terminal selector gene to regulate gene expression in C. elegans ASJ sensory neurons. PMID:25769980
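
    The bipartite layout described above (a 6 bp CG-rich element, a 3 bp linker, then a 6 bp AT-rich element) can be sketched as a sliding-window scan. The abstract gives no numeric cutoff for "rich", so the `min_rich` threshold, the function name, and the test sequence below are all assumptions made for illustration, not the authors' definition of the ASJ motif.

```python
def find_bipartite_candidates(seq, element_len=6, linker_len=3, min_rich=5):
    """Report windows shaped like: CG-rich element + linker + AT-rich element.

    min_rich is an assumed cutoff: the minimum count of C/G bases in the first
    element and of A/T bases in the second element.
    """
    seq = seq.upper()
    span = 2 * element_len + linker_len
    hits = []
    for i in range(len(seq) - span + 1):
        first = seq[i:i + element_len]
        linker = seq[i + element_len:i + element_len + linker_len]
        second = seq[i + element_len + linker_len:i + span]
        cg_count = sum(base in "CG" for base in first)
        at_count = sum(base in "AT" for base in second)
        if cg_count >= min_rich and at_count >= min_rich:
            hits.append((i, first, linker, second))
    return hits


# Invented promoter fragment (assumption: not a trx-1 or ssu-1 sequence).
toy_promoter = "TTTGCGCGCTGAATATTAGGG"
strict_hits = find_bipartite_candidates(toy_promoter, min_rich=6)
relaxed_hits = find_bipartite_candidates(toy_promoter)
```

    With the relaxed default cutoff, neighbouring windows that overlap the same site are also reported, so a real screen would probably merge overlapping hits before counting candidates.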

  13. cis regulatory requirements for hypodermal cell-specific expression of the Caenorhabditis elegans cuticle collagen gene dpy-7.

    PubMed Central

    Gilleard, J S; Barry, J D; Johnstone, I L

    1997-01-01

    The Caenorhabditis elegans cuticle collagens are encoded by a multigene family of between 50 and 100 members and are the major component of the nematode cuticular exoskeleton. They are synthesized in the hypodermis prior to secretion and incorporation into the cuticle and exhibit complex patterns of spatial and temporal expression. We have investigated the cis regulatory requirements for tissue- and stage-specific expression of the cuticle collagen gene dpy-7 and have identified a compact regulatory element which is sufficient to specify hypodermal cell reporter gene expression. This element appears to be a true tissue-specific promoter element, since it encompasses the dpy-7 transcription initiation sites and functions in an orientation-dependent manner. We have also shown, by interspecies transformation experiments, that the dpy-7 cis regulatory elements are functionally conserved between C. elegans and C. briggsae, and comparative sequence analysis supports the importance of the regulatory sequence that we have identified by reporter gene analysis. All of our data suggest that the spatial expression of the dpy-7 cuticle collagen gene is established essentially by a small tissue-specific promoter element and does not require upstream activator or repressor elements. In addition, we have found the DPY-7 polypeptide is very highly conserved between the two species and that the C. briggsae polypeptide can function appropriately within the C. elegans cuticle. This finding suggests a remarkably high level of conservation of individual cuticle components, and their interactions, between these two nematode species. PMID:9121480

  14. Identification of novel regulatory factor X (RFX) target genes by comparative genomics in Drosophila species

    PubMed Central

    Laurençon, Anne; Dubruille, Raphaëlle; Efimenko, Evgeni; Grenier, Guillaume; Bissett, Ryan; Cortier, Elisabeth; Rolland, Vivien; Swoboda, Peter; Durand, Bénédicte

    2007-01-01

    Background: Regulatory factor X (RFX) transcription factors play a key role in ciliary assembly in nematode, Drosophila and mouse. Using the tremendous advantages of comparative genomics in closely related species, we identified novel genes regulated by dRFX in Drosophila. Results: We first demonstrate that a subset of known ciliary genes in Caenorhabditis elegans and Drosophila are regulated by dRFX and have a conserved RFX binding site (X-box) in their promoters in two highly divergent Drosophila species. We then designed an X-box consensus sequence and carried out a genome-wide computer screen to identify novel genes under RFX control. We found 412 genes that share a conserved X-box upstream of the ATG in both species, with 83 genes presenting a more restricted consensus. We analyzed 25 of these 83 genes, 16 of which are indeed RFX target genes. Two of them have never been described as involved in ciliogenesis. In addition, reporter construct expression analysis revealed that three of the identified genes encode proteins specifically localized in ciliated endings of Drosophila sensory neurons. Conclusion: Our X-box search strategy led to the identification of novel RFX target genes in Drosophila that are involved in sensory ciliogenesis. We also established a highly valuable Drosophila cilia and basal body dataset. These results demonstrate the accuracy of the X-box screen and will be useful for the identification of candidate genes for human ciliopathies, as several human homologs of RFX target genes are known to be involved in diseases, such as Bardet-Biedl syndrome. PMID:17875208
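
    A consensus-based screen of the kind described above can be sketched by expanding a degenerate consensus written in IUPAC nucleotide codes into a regular expression, scanning the upstream region of each gene, and keeping genes with a match in both species. The consensus string `GTTNCC`, the function names, and the sequences below are placeholders chosen for illustration; the abstract does not state the X-box consensus the authors designed, and this sketch scans one strand only.

```python
import re

# IUPAC nucleotide ambiguity codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile a consensus into a lookahead pattern so overlapping sites are found."""
    body = "".join(IUPAC[code] for code in consensus.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(upstream, consensus):
    """Return (position, matched sequence) for each forward-strand match."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(upstream.upper())]


def conserved_in_both(upstream_a, upstream_b, consensus):
    """True if the consensus occurs upstream of the gene in both species."""
    return bool(find_sites(upstream_a, consensus)) and bool(
        find_sites(upstream_b, consensus)
    )


# Placeholder consensus and sequences (assumptions, not data from the study).
TOY_CONSENSUS = "GTTNCC"
sites = find_sites("TTGTTACCATGG", TOY_CONSENSUS)
shared = conserved_in_both("TTGTTACCATGG", "AAGTTGCCTT", TOY_CONSENSUS)
```

    A fuller screen would likely also scan the reverse complement and restrict matches to a fixed window upstream of the ATG.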

  15. On the power and limits of evolutionary conservation—unraveling bacterial gene regulatory networks

    PubMed Central

    Baumbach, Jan

    2010-01-01

    The National Center for Biotechnology Information (NCBI) recently announced ‘1000 prokaryotic genomes are now completed and available in the Genome database’. The increasing trend will provide us with thousands of sequenced microbial organisms over the next years. However, this is only the first step in understanding how cells survive, reproduce and adapt their behavior while being exposed to changing environmental conditions. One major control mechanism is transcriptional gene regulation. Here, striking is the direct juxtaposition of the handful of bacterial model organisms to the 1000 prokaryotic genomes. Next-generation sequencing technologies will further widen this gap drastically. However, several computational approaches have proven to be helpful. The main idea is to use the known transcriptional regulatory network of reference organisms as template in order to unravel evolutionarily conserved gene regulations in newly sequenced species. This transfer essentially depends on the reliable identification of several types of conserved DNA sequences. We decompose this problem into three computational processes, review the state of the art and illustrate future perspectives. PMID:20699275

  16. Optimization of gene sequences under constant mutational pressure and selection

    NASA Astrophysics Data System (ADS)

    Kowalczuk, M.; Gierlik, A.; Mackiewicz, P.; Cebrat, S.; Dudek, M. R.

    1999-12-01

    We have analyzed the influence of constant mutational pressure and selection on the nucleotide composition of DNA sequences of various sizes, which were represented by the genes of the Borrelia burgdorferi genome. With the help of MC simulations we have found that longer DNA sequences accumulate far fewer base substitutions per sequence length than short sequences. This leads us to the conclusion that the accuracy of replication may determine the size of the genome.

  17. Nucleotide sequence of SHV-2 beta-lactamase gene

    SciTech Connect

    Garbarg-Chenon, A.; Godard, V.; Labia, R.; Nicolas, J.C.

    1990-07-01

    The nucleotide sequence of plasmid-mediated beta-lactamase SHV-2 from Salmonella typhimurium (SHV-2pHT1) was determined. The gene was very similar to chromosomally encoded beta-lactamase LEN-1 of Klebsiella pneumoniae. Compared with the sequence of the Escherichia coli SHV-2 enzyme (SHV-2E.coli) obtained by protein sequencing, the deduced amino acid sequence of SHV-2pHT1 differed by three amino acid substitutions.

  18. Identification of novel putative regulatory genes induced during alfalfa nodule development with a cold-plaque screening procedure.

    PubMed

    Frugier, F; Kondorosi, A; Crespi, M

    1998-05-01

    Until now very few plant genes with possible regulatory functions during nodule development have been isolated. We have used a modified cold-plaque screening method to identify new transcripts expressed at low levels that are induced during nodulation. Several clones were isolated and characterized by their mRNA expression patterns during nodule development and in spontaneous nodules. Sequence homology with known genes of other organisms indicated that transcripts corresponded to (i) "basic" genes probably required during the growth of the nodule organ (e.g., structural proteins), (ii) genes related to the metabolic adaptations taking place during nodule morphogenesis and function (e.g., carbonic anhydrase), and (iii) genes containing regulatory motifs and/or homologies (three clones out of the 20 identified). The latter genes encode a zinc-finger-containing protein, a putative protein kinase, and a Wilms' tumor (WT) suppressor homologue, respectively. Expression of the kinase and WT suppressor homologues was induced early in nodulation, although the latter was activated transiently. Accumulation of the Zn-finger gene transcripts was detected at a later stage of development and seems to be regulated in a complex manner. Hence, using a cold-plaque screening procedure, we could identify genes that may play regulatory roles in the signal transduction pathways activated during nodule development. PMID:9574504

  19. Molecular structure of uvrC gene of Escherichia coli: identification of DNA sequences required for transcription of the uvrC gene.

    PubMed Central

    Sharma, S; Dowhan, W; Moses, R E

    1982-01-01

    We have carried out experiments to identify the regulatory regions of the uvrC gene of Escherichia coli. A uvrC+ plasmid, pUV7, containing the intact transcriptional unit for the uvrC gene, was used to subclone either the structural gene or combinations of the structural gene and 5'-flanking sequences. The plasmids so constructed were tested for ability to restore the UV-resistant phenotype to uvrC- cells as an indication of expression of the uvrC gene. The chromosomal DNA in plasmid pUV7 was probed for strong binding with E. coli RNA polymerase in an attempt to identify a restriction fragment which bears the regulatory sequences for the uvrC transcriptional unit. The results indicate that DNA sequences at least 0.9 kb upstream from the structural gene, but not the 5'-proximal sequences, regulate expression of the uvrC gene. Analysis of protein synthesis encoded by plasmid pUV7 and its derivatives suggests that there may be another gene that lies between the promoter and the uvrC gene and codes for a 27,000-Mr protein. The relation of this gene to uvrC function is not clear. PMID:6292835

  20. Evolution of regulatory genes governing biodegradation in Acinetobacter calcoaceticus. Final report, 15 July 1991-31 December 1994

    SciTech Connect

    Ornston, L.N.

    1995-02-22

    The Acinetobacter calcoaceticus pca-qui-pob supraoperonic gene cluster encodes bacterial enzymes that metabolize aromatic and hydroaromatic compounds in the environment. Our investigation is directed to understanding how mutation, gene rearrangement and selection contributed to evolution of the transcriptional controls exercised over genes in the cluster. The complete nucleotide sequence of the 18 kbp gene cluster has been determined, and genetic manipulations have been used to explore mechanisms contributing to expression of the genes. The results reveal that structural gene expression is governed by complex interactions between the products of different regulatory genes, some of which share common ancestry. Additional controls appear to be exercised by compartmentation of some catabolic enzymes outside the inner cell membrane. Recombination appears to have made a major contribution to the evolution of existing control mechanisms, and their maintenance may be influenced by continuing recombination. Contributions of recombination to mutation and repair are under investigation, as are specific molecular mechanisms underlying transcriptional controls.

  1. Acinetobacter cyclohexanone monooxygenase: gene cloning and sequence determination.

    PubMed Central

    Chen, Y C; Peoples, O P; Walsh, C T

    1988-01-01

    The gene coding for cyclohexanone monooxygenase from Acinetobacter sp. strain NCIB 9871 was isolated by immunological screening methods. We located and determined the nucleotide sequence of the gene. The structural gene is 1,626 nucleotides long and codes for a polypeptide of 542 amino acids; 389 nucleotides 5' and 108 nucleotides 3' of the coding region are also reported. The complete amino acid sequence of the enzyme was derived by translation of the nucleotide sequence. From a comparison of the amino acid sequence with consensus sequences of nucleotide-binding folds, we identified a potential flavin-binding site at the NH2 terminus of the enzyme (residues 6 to 18) and a potential nicotinamide-binding site extending from residue 176 to residue 208 of the protein. An overproduction system for the gene to facilitate genetic manipulations was also constructed by using the tac promoter vector pKK223-3 in Escherichia coli. PMID:3338974

  2. Complex Dynamic Behavior in Simple Gene Regulatory Networks

    NASA Astrophysics Data System (ADS)

    Santillán Zerón, Moisés

    2007-02-01

    Knowing the complete genome of a given species is just a piece of the puzzle. To fully unveil the systems behavior of an organism, an organ, or even a single cell, we need to understand the underlying gene regulatory dynamics. Given the complexity of the whole system, the ultimate goal is unattainable for the moment. But perhaps, by analyzing the simplest genetic systems, we may be able to develop the mathematical techniques and procedures required to tackle more complex genetic networks in the near future. In the present work, the techniques for developing mathematical models of simple bacterial gene networks, like the tryptophan and lactose operons, are introduced. Despite all of the underlying assumptions, such models can provide valuable information regarding gene regulation dynamics. Here, we pay special attention to robustness as an emergent property. These notes are organized as follows. In the first section, the long historical relation between mathematics, physics, and biology is briefly reviewed. Recently, the multidisciplinary work in biology has received great attention in the form of systems biology. The main concepts of this novel science are discussed in the second section. A very brief introduction to the essential concepts of molecular biology is given in the third section. In the fourth section, a brief introduction to chemical kinetics is presented. Finally, in the fifth section, a mathematical model for the lactose operon is developed and analyzed.

  3. Novel phytochrome sequences in Arabidopsis thaliana: Structure, evolution, and differential expression of a plant regulatory photoreceptor family

    SciTech Connect

    Sharrock, R.A.; Quail, P.H.

    1989-01-01

    Phytochrome is a plant regulatory photoreceptor that mediates red light effects on a wide variety of physiological and molecular responses. DNA blot analysis indicates that the Arabidopsis thaliana genome contains four to five phytochrome-related gene sequences. The authors have isolated and sequenced cDNA clones corresponding to three of these genes and have deduced the amino acid sequence of the full-length polypeptide encoded in each case. One of these proteins (phyA) shows 65-80% amino acid sequence identity with the major, etiolated-tissue phytochrome apoproteins described previously in other plant species. The other two polypeptides (phyB and phyC) are unique in that they have low sequence identity with each other, with phyA, and with all previously described phytochromes. The phyA, phyB, and phyC proteins are of similar molecular mass, have related hydropathic profiles, and contain a conserved chromophore attachment region. However, the sequence comparison data indicate that the three phy genes diverged early in plant evolution, well before the divergence of the two major groups of angiosperms, the monocots and dicots. The steady-state level of the phyA transcript is high in dark-grown A. thaliana seedlings and is down-regulated by light. In contrast, the phyB and phyC transcripts are present at lower levels and are not strongly light-regulated. These findings indicate that the red/far red light-responsive phytochrome photoreceptor system in A. thaliana, and perhaps in all higher plants, consists of a family of chromoproteins that are heterogeneous in structure and regulation.

  4. Minimal gene regulatory circuits that can count like bacteriophage lambda.

    PubMed

    Avlund, M; Dodd, Ian B; Sneppen, K; Krishna, S

    2009-12-11

    The behavior of living systems is dependent on large dynamical gene regulatory networks (GRNs). However, the functioning of even the smallest GRNs is difficult to predict. The bistable GRN of bacteriophage lambda is able to count to make a decision between lysis and lysogeny on the basis of the number of phages infecting the cell, even though replication of the phage genome eliminates this initial difference. By simulating the behavior of a large number of random transcriptional GRNs, we show that a surprising variety of GRNs can carry out this complex task, including simple CI-Cro-like mutual repression networks. Thus, our study extends the repertoire of simple GRNs. Counterintuitively, the major effect of the addition of CII-like regulation, generally thought to be needed for counting by lambda, was to improve the ability of the networks to complete a simulated prophage induction. Our study suggests that additional regulatory mechanisms to decouple Cro and CII levels may exist in lambda and that infection counting could be widespread among temperate bacteriophages, many of which contain CI-Cro-like circuits. PMID:19796646
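
    As an illustration of the bistability that a CI-Cro-like mutual repression network can show, the following is a minimal sketch of a generic two-gene toggle switch. It is not the model or the random-network simulation used by the authors, and it does not reproduce infection counting; the function name, the Hill-type repression form and the parameter values (alpha, n, dt, steps) are all assumptions chosen for illustration.

```python
def simulate_toggle(x0, y0, alpha=10.0, n=2.0, dt=0.01, steps=5000):
    """Forward-Euler integration of a generic mutual-repression circuit.

    Hypothetical toy model (not taken from the abstract above):
        dx/dt = alpha / (1 + y**n) - x
        dy/dt = alpha / (1 + x**n) - y
    x and y stand for two repressors that inhibit each other's synthesis.
    """
    x, y = x0, y0
    for _ in range(steps):
        dx = alpha / (1.0 + y ** n) - x
        dy = alpha / (1.0 + x ** n) - y
        x, y = x + dt * dx, y + dt * dy
    return x, y


if __name__ == "__main__":
    # Two nearby starting points on opposite sides of the diagonal x == y
    # settle into opposite stable states.
    print(simulate_toggle(2.0, 1.0))
    print(simulate_toggle(1.0, 2.0))
```

    With these assumed parameters the symmetric state is unstable, so a small initial bias toward one repressor decides which of the two stable states the circuit reaches.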

  5. Dynamic features of gene expression control by small regulatory RNAs.

    PubMed

    Mitarai, Namiko; Benjamin, Julie-Anna M; Krishna, Sandeep; Semsey, Szabolcs; Csiszovszki, Zsolt; Massé, Eric; Sneppen, Kim

    2009-06-30

    Small regulatory RNAs (sRNAs) in eukaryotes and bacteria play an important role in the regulation of gene expression either by binding to regulatory proteins or directly to target mRNAs. Two of the best-characterized bacterial sRNAs, Spot42 and RyhB, form a complementary pair with the ribosome binding region of their target mRNAs, thereby inhibiting translation or promoting mRNA degradation. To investigate the steady-state and dynamic potential of such sRNAs, we examine the 2 key parameters characterizing sRNA regulation: the capacity to overexpress the sRNA relative to its target mRNA and the speed at which the target mRNA is irreversibly inactivated. We demonstrate different methods to determine these 2 key parameters, for Spot42 and RyhB, which combine biochemical and genetic experiments with computational analysis. We have developed a mathematical model that describes the functional properties of sRNAs with various characteristic parameters. We observed that Spot42 and RyhB function in distinctive parameter regimes, which result in divergent mechanisms. PMID:19541626

  6. Topological effects of data incompleteness of gene regulatory networks

    PubMed Central

    2012-01-01

    Background: The topological analysis of biological networks has been a prolific topic in network science during the last decade. A persistent problem with this approach is the inherent uncertainty and noisy nature of the data. One of the cases in which this situation is more marked is that of transcriptional regulatory networks (TRNs) in bacteria. The datasets are incomplete because regulatory pathways associated with a relevant fraction of bacterial genes remain unknown. Furthermore, direction, strengths and signs of the links are sometimes unknown or simply overlooked. Finally, the experimental approaches to infer the regulations are highly heterogeneous, in a way that induces the appearance of systematic experimental-topological correlations. And yet, the quality of the available data increases constantly. Results: In this work we capitalize on these advances to point out the influence of data (in)completeness and quality on some classical results on topological analysis of TRNs, especially regarding modularity at different levels. Conclusions: In doing so, we identify the most relevant factors affecting the validity of previous findings, highlighting important caveats for future topological analysis of prokaryotic TRNs. PMID:22920968

  7. Single molecule targeted sequencing for cancer gene mutation detection.

    PubMed

    Gao, Yan; Deng, Liwei; Yan, Qin; Gao, Yongqian; Wu, Zengding; Cai, Jinsen; Ji, Daorui; Li, Gailing; Wu, Ping; Jin, Huan; Zhao, Luyang; Liu, Song; Ge, Liangjin; Deem, Michael W; He, Jiankui

    2016-01-01

    With the rapid decline in the cost of sequencing, it is now affordable to examine multiple genes in a single disease-targeted clinical test using next generation sequencing. Current targeted sequencing methods require a separate step of targeted capture enrichment during sample preparation before sequencing. Although there are fast sample preparation methods available on the market, the library preparation process is still relatively complicated for physicians to use routinely. Here, we introduced an amplification-free Single Molecule Targeted Sequencing (SMTS) technology, which combined targeted capture and sequencing in one step. We demonstrated that this technology can detect low-frequency mutations using an artificially synthesized DNA sample. SMTS has several potential advantages, including simple sample preparation, so that no biases and errors are introduced by the PCR reaction. SMTS has the potential to be an easy and quick sequencing technology for clinical diagnosis such as cancer gene mutation detection, infectious disease detection, inherited condition screening and noninvasive prenatal diagnosis. PMID:27193446

  8. Single molecule targeted sequencing for cancer gene mutation detection

    PubMed Central

    Gao, Yan; Deng, Liwei; Yan, Qin; Gao, Yongqian; Wu, Zengding; Cai, Jinsen; Ji, Daorui; Li, Gailing; Wu, Ping; Jin, Huan; Zhao, Luyang; Liu, Song; Ge, Liangjin; Deem, Michael W.; He, Jiankui

    2016-01-01

    With the rapid decline in the cost of sequencing, it is now affordable to examine multiple genes in a single disease-targeted clinical test using next generation sequencing. Current targeted sequencing methods require a separate step of targeted capture enrichment during sample preparation before sequencing. Although there are fast sample preparation methods available on the market, the library preparation process is still relatively complicated for physicians to use routinely. Here, we introduced an amplification-free Single Molecule Targeted Sequencing (SMTS) technology, which combined targeted capture and sequencing in one step. We demonstrated that this technology can detect low-frequency mutations using an artificially synthesized DNA sample. SMTS has several potential advantages, including simple sample preparation, so that no biases and errors are introduced by the PCR reaction. SMTS has the potential to be an easy and quick sequencing technology for clinical diagnosis such as cancer gene mutation detection, infectious disease detection, inherited condition screening and noninvasive prenatal diagnosis. PMID:27193446

  9. Eric Davidson: Steps to a gene regulatory network for development.

    PubMed

    Rothenberg, Ellen V

    2016-04-15

    Eric Harris Davidson was a unique and creative intellectual force who grappled with the diversity of developmental processes used by animal embryos and wrestled them into an intelligible set of principles, then spent his life translating these process elements into molecularly definable terms through the architecture of gene regulatory networks. He took speculative risks in his theoretical writing but ran a highly organized, rigorous experimental program that yielded an unprecedentedly full characterization of a developing organism. His writings created logical order and a framework for mechanism from the complex phenomena at the heart of advanced multicellular organism development. This is a reminiscence of intellectual currents in his work as observed by the author through the last 30-35 years of Davidson's life. PMID:26825392

  10. Putative cis-Regulatory Elements Associated with Heat Shock Genes Activated During Excystation of Cryptosporidium parvum

    PubMed Central

    Lara, Ana M.; Serrano, Myrna; Sheth, Nihar; Buck, Gregory

    2010-01-01

    Background: Cryptosporidiosis is a ubiquitous infectious disease, caused by the protozoan parasites Cryptosporidium hominis and C. parvum, leading to acute, persistent and chronic diarrhea worldwide. Although the complications of this disease can be serious, even fatal, in immunocompromised patients of any age, they have also been found to lead to long term effects, including growth inhibition and impaired cognitive development, in infected immunocompetent children. The Cryptosporidium life cycle alternates between a dormant stage, the oocyst, and a highly replicative phase that includes both asexual vegetative stages as well as sexual stages, implying fine genetic regulatory mechanisms. The parasite is extremely difficult to study because it cannot be cultured in vitro and animal models are equally challenging. The recent publication of the genome sequence of C. hominis and C. parvum has, however, significantly advanced our understanding of the biology and pathogenesis of this parasite. Methodology/Principal Findings: Herein, our goal was to identify cis-regulatory elements associated with heat shock response in Cryptosporidium using a combination of in silico and real time RT-PCR strategies. Analysis with Gibbs-Sampling algorithms of upstream non-translated regions of twelve genes annotated as heat shock proteins in the Cryptosporidium genome identified a highly conserved over-represented sequence motif in eleven of them. RT-PCR analyses, described herein and also by others, show that these eleven genes bearing the putative element are induced concurrent with excystation of parasite oocysts via heat shock. Conclusions/Significance: Our analyses suggest that occurrences of a motif identified in the upstream regions of the Cryptosporidium heat shock genes represent parts of the transcriptional apparatus and function as stress response elements that activate expression of these genes during excystation, and possibly at other stages in the life cycle of the parasite

  11. Regulatory elements in the first intron contribute to transcriptional control of the human α1(I) collagen gene

    SciTech Connect

    Bornstein, P.; McKay, J.; Morishima, J.K.; Devarayalu, S.; Gelinas, R.E.

    1987-12-01

    Several lines of evidence have suggested that the regulation of type I collagen gene transcription is complex and that important regulatory elements reside 5' to, and within, the first intron of the α1(I) gene. The authors therefore sequenced a 2.3-kilobase HindIII fragment that encompasses 804 base pairs of 5' flanking sequence, the first exon, and most of the first intron of the α1(I) human collagen gene. A 274-base-pair intronic sequence, flanked by Ava I sites (A274), contained a sequence identical to a high-affinity decanucleotide binding site for transcription factor Sp1 and a viral core enhancer sequence. DNase I protection experiments indicated zones of protection that corresponded to these motifs. When A274 was cloned 5' to the chloramphenicol acetyltransferase (CAT) gene, driven by an α1(I) collagen promoter sequence, and expression was assessed by transfection, significant orientation-specific inhibition of CAT activity was observed. This effect was most apparent in chicken tendon fibroblasts, which modulate their level of collagen synthesis in culture. They propose that normal regulation of α1(I) collagen gene transcription results from an interplay of positive and negative elements present in the promoter region and within the first intron.

  12. The analysis of Gene Regulatory Networks in plant evo-devo.

    PubMed

    Vialette-Guiraud, Aurélie C M; Andres-Robin, Amélie; Chambrier, Pierre; Tavares, Raquel; Scutt, Charles P

    2016-04-01

    We provide an overview of methods and workflows that can be used to investigate the topologies of Gene Regulatory Networks (GRNs) in the context of plant evolutionary-developmental (evo-devo) biology. Many of the species that occupy key positions in plant phylogeny are poorly adapted as laboratory models and so we focus here on techniques that can be efficiently applied to both model and non-model species of interest to plant evo-devo. We outline methods that can be used to describe gene expression patterns and also to elucidate the transcriptional, post-transcriptional, and epigenetic regulatory mechanisms underlying these patterns, in any plant species with a sequenced genome. We furthermore describe how the technique of Protein Resurrection can be used to confirm inferences on ancestral GRNs and also to provide otherwise-inaccessible points of reference in evolutionary histories by exploiting paralogues generated in gene and whole genome duplication events. Finally, we argue for the better integration of molecular data with information from paleobotanical, paleoecological, and paleogeographical studies to provide the fullest possible picture of the processes that have shaped the evolution of plant development. PMID:27006484

  13. A compilation of composite regulatory elements affecting gene transcription in vertebrates.

    PubMed Central

    Kel, O V; Romaschenko, A G; Kel, A E; Wingender, E; Kolchanov, N A

    1995-01-01

    Over the past years, evidence has been accumulating for a fundamental role of protein-protein interactions between transcription factors in gene-specific transcription regulation. Many of these interactions run within composite elements containing binding sites for several factors. We have selected 101 composite regulatory elements identified experimentally in the regulatory regions of 64 genes of vertebrates and of their viruses and briefly described them in a compilation. Of these, 82 composite elements are of the synergistic type and 19 of the antagonistic type. Within the synergistic type composite elements, transcription factors bind to the corresponding sites simultaneously, thus cooperatively activating transcription. The factors, binding to their target sites within antagonistic type composite elements, produce opposing effects on transcription. The nucleotide sequence and localization in the genes, the names and brief description of transcription factors, are provided for each composite element, including a representation of experimental data on its functioning. Most of the composite elements (3/4) fall between -250 bp and the transcription start site. The distance between the binding sites within the composite elements described varies from complete overlapping to 80 bp. The compilation of composite elements is presented in the database COMPEL which is electronically accessible by anonymous ftp via internet. PMID:7479071

  14. Genome-Wide Analysis of Wilms' Tumor 1-Controlled Gene Expression in Podocytes Reveals Key Regulatory Mechanisms.

    PubMed

    Kann, Martin; Ettou, Sandrine; Jung, Youngsook L; Lenz, Maximilian O; Taglienti, Mary E; Park, Peter J; Schermer, Bernhard; Benzing, Thomas; Kreidberg, Jordan A

    2015-09-01

    The transcription factor Wilms' tumor suppressor 1 (WT1) is key to podocyte development and viability; however, WT1 transcriptional networks in podocytes remain elusive. We provide a comprehensive analysis of the genome-wide WT1 transcriptional network in podocytes in vivo using chromatin immunoprecipitation followed by sequencing (ChIPseq) and RNA sequencing techniques. Our data show a specific role for WT1 in regulating the podocyte-specific transcriptome through binding to both promoters and enhancers of target genes. Furthermore, we inferred a podocyte transcription factor network consisting of WT1, LMX1B, TCF21, Fox-class and TEAD family transcription factors, and MAFB that uses tissue-specific enhancers to control podocyte gene expression. In addition to previously described WT1-dependent target genes, ChIPseq identified novel WT1-dependent signaling systems. These targets included components of the Hippo signaling system, underscoring the power of genome-wide transcriptional-network analyses. Together, our data elucidate a comprehensive gene regulatory network in podocytes suggesting that WT1 gene regulatory function and podocyte cell-type specification can best be understood in the context of transcription factor-regulatory element network interplay. PMID:25636411

  15. In silico analysis of the regulatory region of the Yellowtail Kingfish and Zebrafish Kiss and Kiss receptor genes.

    PubMed

    Nocillado, J N; Mechaly, A S; Elizur, A

    2013-02-01

    We have cloned and analysed the partial putative promoter sequences of the Yellowtail Kingfish (Seriola lalandi) Kiss2 and Kiss2r genes (380 and 420 bp, respectively). We obtained in silico 1.5 kb of the zebrafish (Danio rerio) Kiss1, Kiss2, Kiss1r and zfKiss2r sequences upstream of the putative transcriptional initiation site. Bioinformatic analysis revealed promoter regulatory elements including AP-1, Sp1, GR, ER, PR, AR, GATA-1, TTF-1, YY1 and C/EBP. These regulatory elements may mediate novel roles of the Kiss genes and their receptors in addition to their established role in reproductive function. PMID:22527613

  16. Isolation and phylogenetic footprinting analysis of the 5'-regulatory region of the floral homeotic gene OrcPI from Orchis italica (Orchidaceae).

    PubMed

    Aceto, Serena; Cantone, Carmela; Chiaiese, Pasquale; Ruotolo, Gianluca; Sica, Maria; Gaudio, Luciano

    2010-01-01

    The nucleotide sequences of regulatory elements from homologous genes can be strongly divergent. Phylogenetic footprinting, a comparative analysis of noncoding regions, can detect putative transcription factor binding sites (TFBSs) shared among the regulatory regions of 2 or more homologous genes. These conserved motifs have the potential to serve the same regulatory function in distantly related taxa. We isolated the 5'-noncoding region of the OrcPI gene, a MADS-box transcription factor involved in flower development in Orchis italica, using the thermal asymmetric interlaced polymerase chain reaction technique. This region (comprising 1352 bp) induced transient beta-glucuronidase expression in the petal tissue of white Rosa hybrida flowers and represents the 5'-regulatory sequence of the OrcPI gene. Phylogenetic footprinting analysis detected conserved regions within the 5'-regulatory sequence of OrcPI and the homologous regions of Oryza sativa, Lilium regale, and Arabidopsis thaliana. Some of these sequences are known TFBSs described in databases of plant regulatory elements. Nucleotide sequence data reported are available in the DDBJ/EMBL/GenBank databases under the following accession numbers: AF198055 promoter region of the PISTILLATA (PI) gene of A. thaliana; AB094985 cDNA of OrcPI (PI/GLOBOSA [PI/GLO] homologue) of O. italica; AB378089 5'-regulatory region of the OrcPI gene of O. italica; AP008211 putative promoter region of OSMADS2 (PI/GLO homologue) of O. sativa; AP008207 putative promoter region of OSMADS4 (PI/GLO homologue) of O. sativa; and AB158292 putative promoter region of the PI/GLO homologue of L. regale. PMID:19861638

  17. Mapping the DNA-binding domain and target sequences of the Streptomyces peucetius daunorubicin biosynthesis regulatory protein, DnrI.

    PubMed

    Sheldon, Paul J; Busarow, Sara B; Hutchinson, C Richard

    2002-04-01

    Streptomyces antibiotic regulatory proteins (SARPs) constitute a novel family of transcriptional activators that control the expression of several diverse antibiotic biosynthetic gene clusters. The Streptomyces peucetius DnrI protein, one of only a handful of these proteins yet discovered, controls the biosynthesis of the polyketide antitumour antibiotics daunorubicin and doxorubicin. Recently, comparative analyses have revealed significant similarities among the predicted DNA-binding domains of the SARPs and the C-terminal DNA-binding domain of the OmpR family of regulatory proteins. Using the crystal structure of the OmpR-binding domain as a template, DnrI was mapped by truncation and site-directed mutagenesis. Several highly conserved residues within the N-terminus are crucial for DNA binding and protein function. Tandemly arranged heptameric imperfect repeat sequences are found within the -35 promoter regions of target genes. Substitutions for each nucleotide within the repeats of the dnrG-dpsABCD promoter were performed by site-directed mutagenesis. The mutant promoter fragments were found to have modified binding characteristics in gel mobility shift assays. The spacing between the repeat target sequences is also critical for successful occupation by DnrI and, therefore, competent transcriptional activation of the dnrG-dpsABCD operon. PMID:11972782

  18. The qa repressor gene of Neurospora crassa: wild-type and mutant nucleotide sequences.

    PubMed Central

    Huiet, L; Giles, N H

    1986-01-01

    The qa-1S gene, one of two regulatory genes in the qa gene cluster of Neurospora crassa, encodes the qa repressor. The qa-1S gene together with the qa-1F gene, which encodes the qa activator protein, control the expression of all seven qa genes, including those encoding the inducible enzymes responsible for the utilization of quinic acid as a carbon source. The nucleotide sequence of the qa-1S gene and its flanking regions has been determined. The deduced coding sequence for the qa-1S protein encodes 918 amino acids with a calculated molecular weight of 100,650 and is interrupted by a single 66-base-pair intervening sequence. Both constitutive and noninducible mutants occur in the qa-1S gene and two different mutations of each type have been cloned and sequenced. All four mutations occur within the predicted coding region of the qa-1S gene. This result strongly supports the hypothesis that the qa-1S gene encodes a repressor. All four mutations are located within codons for the last 300 amino acids of the qa-1S protein. The mutations in three of the mutants involve amino acid substitutions, while the fourth mutant, which has a constitutive phenotype, contains a frameshift mutation. The two constitutive mutations occur in the most distal region of the gene, possibly implicating the COOH-terminal region of the qa repressor in binding to its target. The two noninducible mutations occur in a region proximal to the constitutive mutations, possibly implicating this region of the qa repressor in binding the inducer. PMID:3010294

  19. Evolution of the CNS myelin gene regulatory program.

    PubMed

    Li, Huiliang; Richardson, William D

    2016-06-15

    Myelin is a specialized subcellular structure that evolved uniquely in vertebrates. A myelinated axon conducts action potentials many times faster than an unmyelinated axon of the same diameter; for the same conduction speed, the unmyelinated axon would need a much larger diameter and volume than its myelinated counterpart. Hence myelin speeds information transfer and saves space, allowing the evolution of a powerful yet portable brain. Myelination in the central nervous system (CNS) is controlled by a gene regulatory program that features a number of master transcriptional regulators including Olig1, Olig2 and Myrf. Olig family genes evolved from a single ancestral gene in non-chordates. Olig2, which executes multiple functions with regard to oligodendrocyte identity and development in vertebrates, might have evolved functional versatility through post-translational modification, especially phosphorylation, as illustrated by its evolutionarily conserved serine/threonine phospho-acceptor sites and its accumulation of serine residues during more recent stages of vertebrate evolution. Olig1, derived from a duplicated copy of Olig2 in early bony fish, is involved in oligodendrocyte development and is critical to remyelination in bony vertebrates, but is lost in birds. The origin of Myrf orthologs might be the result of DNA integration between an invading phage or bacterium and an early protist, producing a fusion protein capable of self-cleavage and DNA binding. Myrf seems to have adopted new functions in early vertebrates - initiation of the CNS myelination program as well as the maintenance of mature oligodendrocyte identity and myelin structure - by developing new ways to interact with DNA motifs specific to myelin genes. This article is part of a Special Issue entitled SI: Myelin Evolution. PMID:26474911

  20. A regulatory gene network related to the porcine umami taste receptor (TAS1R1/TAS1R3).

    PubMed

    Kim, J M; Ren, D; Reverter, A; Roura, E

    2016-02-01

    Taste perception plays an important role in the mediation of food choices in mammals. The first porcine taste receptor genes identified, sequenced and characterized, TAS1R1 and TAS1R3, were related to the dimeric receptor for umami taste. However, little is known about their regulatory network. The objective of this study was to unfold the genetic network involved in porcine umami taste perception. We performed a meta-analysis of 20 gene expression studies spanning 480 porcine microarray chips and screened 328 taste-related genes by selective mining steps among the available 12,320 genes. A porcine umami taste-specific regulatory network was constructed based on the normalized coexpression data of the 328 genes across 27 tissues. From the network, we revealed the 'taste module' and identified a coexpression cluster for the umami taste according to the first connector with the TAS1R1/TAS1R3 genes. Our findings identify several taste-related regulatory genes and extend previous genetic background of porcine umami taste. PMID:26554867

  1. Degenerative primer design and gene sequencing validation for select turkey genes.

    PubMed

    Hutsko, Stephanie L; Lilburn, Michael S; Wick, Macdonald

    2016-06-01

    We successfully designed and validated degenerative primers for turkey genes MUC2, RPS13, TBP and TFF2 based on chicken sequences in order to use gene transcription analysis to evaluate (quantify) the mucin transcription response to probiotic supplementation in turkeys. Primers were designed for the genes MUC2, TFF2, RPS13 and TBP using a degenerative primer design method based on the available Gallus gallus sequences. All primer sets, which produced a single PCR amplicon of the expected sizes, were cloned into the TOPO® vector and then transformed into TOP10® competent cells. Plasmid DNA isolation was performed on the TOP10® cell culture and sent for sequencing. Sequences were analyzed using NCBI BLAST. All genes sequenced had over 90% homology with both the chicken and predicted turkey sequences. The sequences were used to design new 100% homologous primer sets for the genes of interest. PMID:27053625

  2. Gene regulatory networks and developmental plasticity in the early sea urchin embryo: alternative deployment of the skeletogenic gene regulatory network.

    PubMed

    Ettensohn, Charles A; Kitazawa, Chisato; Cheers, Melani S; Leonard, Jennifer D; Sharma, Tara

    2007-09-01

    Cell fates in the sea urchin embryo are remarkably labile, despite the fact that maternal polarity and zygotic programs of differential gene expression pattern the embryo from the earliest stages. Recent work has focused on transcriptional gene regulatory networks (GRNs) deployed in specific embryonic territories during early development. The micromere-primary mesenchyme cell (PMC) GRN drives the development of the embryonic skeleton. Although normally deployed only by presumptive PMCs, every lineage of the early embryo has the potential to activate this pathway. Here, we focus on one striking example of regulative activation of the skeletogenic GRN: the transfating of non-skeletogenic mesoderm (NSM) cells to a PMC fate during gastrulation. We show that transfating is accompanied by the de novo expression of terminal, biomineralization-related genes in the PMC GRN, as well as genes encoding two upstream transcription factors, Lvalx1 and Lvtbr. We report that Lvalx1, a key component of the skeletogenic GRN in the PMC lineage, plays an essential role in the regulative pathway both in NSM cells and in animal blastomeres. MAPK signaling is required for the expression of Lvalx1 and downstream skeletogenic genes in NSM cells, mirroring its role in the PMC lineage. We also demonstrate that Lvalx1 regulates the signal from PMCs that normally suppresses NSM transfating. Significantly, misexpression of Lvalx1 in macromeres (the progenitors of NSM cells) is sufficient to activate the skeletogenic GRN. We suggest that NSM cells normally deploy a basal mesodermal pathway and require only an Lvalx1-mediated sub-program to express a PMC fate. Finally, we provide evidence that, in contrast to the normal pathway, activation of the skeletogenic GRN in NSM cells is independent of Lvpmar1. Our studies reveal that, although most features of the micromere-PMC GRN are recapitulated in transfating NSM cells, different inputs activate this GRN during normal and regulative development. PMID

  3. Cis-Regulatory Elements Determine Germline Specificity and Expression Level of an Isopentenyltransferase Gene in Sperm Cells of Arabidopsis.

    PubMed

    Zhang, Jinghua; Yuan, Tong; Duan, Xiaomeng; Wei, Xiaoping; Shi, Tao; Li, Jia; Russell, Scott D; Gou, Xiaoping

    2016-03-01

    Flowering plant sperm cells transcribe a divergent and complex complement of genes. To examine promoter function, we chose an isopentenyltransferase gene known as PzIPT1. This gene is highly selectively transcribed in one sperm cell morphotype of Plumbago zeylanica, which preferentially fuses with the central cell during fertilization and is thus a founding cell of the primary endosperm. In transgenic Arabidopsis (Arabidopsis thaliana), the PzIPT1 promoter displays activity in both sperm cells, and progressive promoter truncation from the 5'-end results in a progressive decrease in reporter production, consistent with the occurrence of multiple enhancer sites. Cytokinin-dependent protein binding motifs are identified in the promoter sequence, which respond with stimulation by cytokinin. Expression of the PzIPT1 promoter in sperm cells confers specificity independently of the previously reported Germline Restrictive Silencer Factor binding sequence. Instead, a cis-acting regulatory region consisting of two duplicated 6-bp Male Gamete Selective Activation (MGSA) motifs occurs near the site of transcription initiation. Disruption of this sequence-specific site inactivates expression of a GFP reporter gene in sperm cells. Multiple copies of the MGSA motif fused with the minimal CaMV35S promoter elements confer reporter gene expression in sperm cells. Similar duplicated MGSA motifs are also identified from promoter sequences of sperm cell-expressed genes in Arabidopsis, suggesting selective activation is possibly a common mechanism for regulation of gene expression in sperm cells of flowering plants. PMID:26739233

  4. Core cell cycle regulatory genes in rice and their expression profiles across the growth zone of the leaf.

    PubMed

    Pettkó-Szandtner, A; Cserháti, M; Barrôco, R M; Hariharan, S; Dudits, D; Beemster, G T S

    2015-11-01

    Rice (Oryza sativa L.) as a model and crop plant with a sequenced genome offers an outstanding experimental system for discovering and functionally analyzing the major cell cycle control elements in a cereal species. In this study, we identified the core cell cycle genes in the rice genome through a hidden Markov model search and multiple alignments supported with the use of short protein sequence probes. In total we present 55 rice putative cell cycle genes with locus identity, chromosomal location, approximate chromosome position and EST accession number. These cell cycle genes include nine cyclin-dependent kinase (CDK) genes, 27 cyclin genes, one CKS gene, two RBR genes, nine E2F/DP/DEL genes, six KRP genes, and one WEE gene. We also provide characteristic protein sequence signatures encoded by CDK and cyclin gene variants. Promoter analysis by the FootPrinter program discovered several motifs in the regulatory region of the core cell cycle genes. As a first step towards functional characterization we performed transcript analysis by RT-PCR to determine gene-specific variation in transcript levels along the rice leaves. The meristematic zone of the leaves where cells are actively dividing was identified based on kinematic analysis and flow cytometry. As expected, expression of the majority of cell cycle genes was exclusively associated with the meristematic region. However, genes such as different D-type cyclins, DEL1, KRP1/3, and RBR2 were also expressed in leaf segments representing the transition zone in which cells start differentiation. PMID:26459328

  5. Coordinated regulation of biosynthetic and regulatory genes coincides with anthocyanin accumulation in developing eggplant fruit

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Violet to black pigmentation of eggplant (Solanum melongena) fruit is attributed to anthocyanin accumulation. Model systems support the interaction of biosynthetic and regulatory genes for anthocyanin biosynthesis. Anthocyanin structural gene transcription requires the expression of at least one m...

  6. rpoB Gene Sequencing for Identification of Corynebacterium Species

    PubMed Central

    Khamis, Atieh; Raoult, Didier; La Scola, Bernard

    2004-01-01

    The genus Corynebacterium is a heterogeneous group of species comprising human and animal pathogens and environmental bacteria. It is defined on the basis of several phenotypic characters and the results of DNA-DNA relatedness and, more recently, 16S rRNA gene sequencing. However, the 16S rRNA gene is not polymorphic enough to ensure reliable phylogenetic studies and needs to be completely sequenced for accurate identification. The almost complete rpoB sequences of 56 Corynebacterium species were determined by both PCR and genome walking methods. In all cases the percent similarities between different species were lower than those observed by 16S rRNA gene sequencing, even for those species with degrees of high similarity. Several clusters supported by high bootstrap values were identified. In order to propose a method for strain identification which does not require sequencing of the complete rpoB sequence (approximately 3,500 bp), we identified an area with a high degree of polymorphism, bordered by conserved sequences that can be used as universal primers for PCR amplification and sequencing. The sequence of this fragment (434 to 452 bp) allows accurate species identification and may be used in the future for routine sequence-based identification of Corynebacterium species. PMID:15364970

  7. The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome

    PubMed Central

    Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A.

    2015-01-01

    A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser. PMID:25324314
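
    The definition of a TTS as a polypurine stretch can be illustrated with a minimal sketch that scans one strand of a sequence for runs of A/G. The function name and the `min_len` threshold are assumptions made for illustration; the abstract does not state the length, stability or specificity criteria that TTSMI actually applies, and sites on the opposite strand (which appear as polypyrimidine runs here) are not reported.

```python
import re

def find_polypurine_tracts(seq, min_len=15):
    """Return (start, end, tract) for every run of purines (A or G)
    that is at least min_len bases long. Coordinates are 0-based,
    end-exclusive. min_len is an illustrative threshold only."""
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [(m.start(), m.end(), m.group())
            for m in pattern.finditer(seq.upper())]

# Example on a short made-up sequence with a lowered threshold.
hits = find_polypurine_tracts("CCAGGAGAAGGTT", min_len=5)
```

    Each hit could then be intersected with annotation intervals (promoters, enhancers and so on) to study co-localization, which is the kind of analysis the database is described as supporting.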

  8. The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome.

    PubMed

    Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A

    2015-01-01

    A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser. PMID:25324314

  9. GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences.

    PubMed

    Antonov, Ivan; Baranov, Pavel; Borodovsky, Mark

    2013-01-01

    Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively little attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) could be caused by sequencing errors or indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events (that evolved to control expression of some genes). Earlier, we developed an algorithm and software program, GeneTack, for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at http://topaz.gatech.edu/GeneTack/db.html) containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206 991 fs-genes from 1106 complete prokaryotic genomes and 45 295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position and frameshift direction (-1, +1). The fs-genes can be retrieved by similarity search to a given query sequence via a web interface, by fs-gene cluster browsing, etc. Clusters of fs-genes are characterized with respect to their likely origin, such as pseudogenization, phase variation, etc. The largest clusters contain fs-genes with programmed frameshifts (related to recoding events). PMID:23161689

  10. An algebra-based method for inferring gene regulatory networks

    PubMed Central

    2014-01-01

    Background: The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. Results: This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also
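
    To make the idea of a Boolean polynomial dynamical system concrete, the following is a minimal sketch in which Boolean update rules are written as polynomials over the two-element field (arithmetic modulo 2). The three-gene network, its wiring and all function names are hypothetical and chosen only for illustration; this is not the inference algorithm of the paper, only a forward simulation of the kind of model such an algorithm would search over.

```python
# Boolean logic expressed as polynomials over F2 (arithmetic modulo 2).
def f2_not(x):
    return (1 + x) % 2          # NOT x  ->  1 + x

def f2_and(x, y):
    return (x * y) % 2          # x AND y  ->  x*y

def f2_or(x, y):
    return (x + y + x * y) % 2  # x OR y  ->  x + y + x*y

def step(state):
    """One synchronous update of a hypothetical 3-gene network."""
    x1, x2, x3 = state
    return (
        f2_not(x3),      # gene 1 is repressed by gene 3
        x1,              # gene 2 follows gene 1
        f2_and(x1, x2),  # gene 3 needs both gene 1 and gene 2
    )

def trajectory(state, n_steps):
    """Return the initial state followed by n_steps successive states."""
    states = [state]
    for _ in range(n_steps):
        state = step(state)
        states.append(state)
    return states

# A simulated time series starting from "only gene 1 on".
series = trajectory((1, 0, 0), 5)
```

    A binary time series such as `series` has the same shape as the discretized time course data that the abstract names as input; an inference method would work in the opposite direction, proposing polynomials whose trajectories match the observed states.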

  11. ARMADA: Using motif activity dynamics to infer gene regulatory networks from gene expression data.

    PubMed

    Pemberton-Ross, Peter J; Pachkov, Mikhail; van Nimwegen, Erik

    2015-09-01

    Analysis of gene expression data remains one of the most promising avenues toward reconstructing genome-wide gene regulatory networks. However, the large dimensionality of the problem prohibits the fitting of explicit dynamical models of gene regulatory networks, whereas machine learning methods for dimensionality reduction such as clustering or principal component analysis typically fail to provide mechanistic interpretations of the reduced descriptions. To address this, we recently developed a general methodology called motif activity response analysis (MARA) that, by modeling gene expression patterns in terms of the activities of concrete regulators, accomplishes dramatic dimensionality reduction while retaining mechanistic biological interpretations of its predictions (Balwierz, 2014). Here we extend MARA by presenting ARMADA, which models the activity dynamics of regulators across a time course, and infers the causal interactions between the regulators that drive the dynamics of their activities across time. We have implemented ARMADA as part of our ISMARA webserver, ismara.unibas.ch, allowing any researcher to automatically apply it to any gene expression time course. To illustrate the method, we apply ARMADA to a time course of human umbilical vein endothelial cells treated with TNF. Remarkably, ARMADA is able to reproduce the complex observed motif activity dynamics using a relatively small set of interactions between the key regulators in this system. In addition, we show that ARMADA successfully infers many of the key regulatory interactions known to drive this inflammatory response and discuss several novel interactions that ARMADA predicts. In combination with ISMARA, ARMADA provides a powerful approach to generating plausible hypotheses for the key interactions between regulators that control gene expression in any system for which time course measurements are available. PMID:26164700

  12. Comparative genomics reveals functional transcriptional control sequences in the Prop1 gene

    PubMed Central

    Ward, Robert D.; Davis, Shannon W.; Cho, MinChul; Esposito, Constance; Lyons, Robert H.; Cheng, Jan-Fang; Rubin, Edward M.; Rhodes, Simon J.; Raetzman, Lori T.; Smith, Timothy P. L.

    2007-01-01

    Mutations in PROP1 are a common genetic cause of multiple pituitary hormone deficiency (MPHD). We used a comparative genomics approach to predict the transcriptional regulatory domains of Prop1 and tested them in cell culture and mice. A BAC transgene containing Prop1 completely rescues the Prop1 mutant phenotype, demonstrating that the regulatory elements necessary for proper PROP1 transcription are contained within the BAC. We generated DNA sequences from the PROP1 genes in lemur, pig, and five different primate species. Comparison of these with available human and mouse PROP1 sequences identified three putative regulatory sequences that are highly conserved. These are located in the PROP1 promoter proximal region, within the first intron of PROP1, and downstream of PROP1. Each of the conserved elements elicited orientation-specific enhancer activity in the context of the Drosophila alcohol dehydrogenase minimal promoter in both heterologous and pituitary-derived cell lines. The intronic element is sufficient to confer dorsal expansion of the pituitary expression domain of a transgene, suggesting that this element is important for the normal spatial expression of endogenous Prop1 during pituitary development. This study illustrates the usefulness of a comparative genomics approach in the identification of regulatory elements that may be the site of mutations responsible for some cases of MPHD. PMID:17557180

  13. Subfunctionalization of Duplicated Zebrafish pax6 Genes by cis-Regulatory Divergence

    PubMed Central

    Gautier, Philippe; Dahm, Ralf; Schonthaler, Helia B; Damante, Giuseppe; Seawright, Anne; Hever, Ann M; Yeyati, Patricia L; van Heyningen, Veronica; Coutinho, Pedro

    2008-01-01

    Gene duplication is a major driver of evolutionary divergence. In most vertebrates a single PAX6 gene encodes a transcription factor required for eye, brain, olfactory system, and pancreas development. In zebrafish, following a postulated whole-genome duplication event in an ancestral teleost, duplicates pax6a and pax6b jointly fulfill these roles. Mapping of the homozygously viable eye mutant sunrise identified a homeodomain missense change in pax6b, leading to loss of target binding. The mild phenotype emphasizes role-sharing between the co-orthologues. Meticulous mapping of isolated BACs identified perturbed synteny relationships around the duplicates. This highlights the functional conservation of pax6 downstream (3′) control sequences, which in most vertebrates reside within the introns of a ubiquitously expressed neighbour gene, ELP4, whose pax6a-linked exons have been lost in zebrafish. Reporter transgenic studies in both mouse and zebrafish, combined with analysis of vertebrate sequence conservation, reveal loss and retention of specific cis-regulatory elements, correlating strongly with the diverged expression of co-orthologues, and providing clear evidence for evolution by subfunctionalization. PMID:18282108

  14. Subfunctionalization of duplicated zebrafish pax6 genes by cis-regulatory divergence.

    PubMed

    Kleinjan, Dirk A; Bancewicz, Ruth M; Gautier, Philippe; Dahm, Ralf; Schonthaler, Helia B; Damante, Giuseppe; Seawright, Anne; Hever, Ann M; Yeyati, Patricia L; van Heyningen, Veronica; Coutinho, Pedro

    2008-02-01

    Gene duplication is a major driver of evolutionary divergence. In most vertebrates a single PAX6 gene encodes a transcription factor required for eye, brain, olfactory system, and pancreas development. In zebrafish, following a postulated whole-genome duplication event in an ancestral teleost, duplicates pax6a and pax6b jointly fulfill these roles. Mapping of the homozygously viable eye mutant sunrise identified a homeodomain missense change in pax6b, leading to loss of target binding. The mild phenotype emphasizes role-sharing between the co-orthologues. Meticulous mapping of isolated BACs identified perturbed synteny relationships around the duplicates. This highlights the functional conservation of pax6 downstream (3') control sequences, which in most vertebrates reside within the introns of a ubiquitously expressed neighbour gene, ELP4, whose pax6a-linked exons have been lost in zebrafish. Reporter transgenic studies in both mouse and zebrafish, combined with analysis of vertebrate sequence conservation, reveal loss and retention of specific cis-regulatory elements, correlating strongly with the diverged expression of co-orthologues, and providing clear evidence for evolution by subfunctionalization. PMID:18282108

  15. Genes under weaker stabilizing selection increase network evolvability and rapid regulatory adaptation to an environmental shift.

    PubMed

    Laarits, T; Bordalo, P; Lemos, B

    2016-08-01

    Regulatory networks play a central role in the modulation of gene expression, the control of cellular differentiation, and the emergence of complex phenotypes. Regulatory networks could constrain or facilitate evolutionary adaptation in gene expression levels. Here, we model the adaptation of regulatory networks and gene expression levels to a shift in the environment that alters the optimal expression level of a single gene. Our analyses show signatures of natural selection on regulatory networks that both constrain and facilitate rapid evolution of gene expression level towards new optima. The analyses are interpreted from the standpoint of neutral expectations and illustrate the challenge to making inferences about network adaptation. Furthermore, we examine the consequence of variable stabilizing selection across genes on the strength and direction of interactions in regulatory networks and in their subsequent adaptation. We observe that directional selection on a highly constrained gene previously under strong stabilizing selection was more efficient when the gene was embedded within a network of partners under relaxed stabilizing selection pressure. The observation leads to the expectation that evolutionarily resilient regulatory networks will contain optimal ratios of genes whose expression is under weak and strong stabilizing selection. Altogether, our results suggest that the variable strengths of stabilizing selection across genes within regulatory networks might itself contribute to the long-term adaptation of complex phenotypes. PMID:27213992

  16. A silent composite hemoglobinopathy characterized by gene sequencing.

    PubMed

    Zorai, A; Moumni, I; Benmansour, I; Chaouachi, D; Ghanem, A; Abbes, S

    2011-01-01

    We report the case of a 35-year-old Tunisian woman with chronic anemia that had gone uninvestigated for a long time. Laboratory analysis using DNA sequencing revealed a compound heterozygote for Hb O Arab and codon 39 β°-thalassemia. This is the first time that such a genotype has been characterized by gene sequencing. PMID:23461145

  17. Flagellin gene sequence variation in the genus Pseudomonas.

    PubMed

    Bellingham, N F; Morgan, J A; Saunders, J R; Winstanley, C

    2001-07-01

    Flagellin gene (fliC) sequences from 18 strains of Pseudomonas sensu stricto representing 8 different species, and 9 representative fliC sequences from other members of the gamma sub-division of proteobacteria, were compared. Analysis was performed on N-terminal, C-terminal and whole fliC sequences. The fliC analyses confirmed the inferred relationship between P. mendocina, P. oleovorans and P. aeruginosa based on 16S rRNA sequence comparisons. In addition, the analyses indicated that P. putida PRS2000 was closely related to P. fluorescens SBW25 and P. fluorescens NCIMB 9046T, but suggested that P. putida PaW8 and P. putida PRS2000 were more closely related to other Pseudomonas spp. than they were to each other. There were a number of inconsistencies in inferred evolutionary relationships between strains, depending on the analysis performed. In particular, whole flagellin gene comparisons often differed from those obtained using N- and C-terminal sequences. However, there were also inconsistencies between the terminal region analyses, suggesting that phylogenetic relationships inferred on the basis of fliC sequence should be treated with caution. Although the central domain of fliC is highly variable between Pseudomonas strains, there was evidence of sequence similarities between the central domains of different Pseudomonas fliC sequences. This indicates the possibility of recombination in the central domain of fliC genes within Pseudomonas species, and between these genes and those from other bacteria. PMID:11518318

  18. Overproduction of lactimidomycin by cross-overexpression of genes encoding Streptomyces antibiotic regulatory proteins.

    PubMed

    Zhang, Bo; Yang, Dong; Yan, Yijun; Pan, Guohui; Xiang, Wensheng; Shen, Ben

    2016-03-01

    The glutarimide-containing polyketides represent a fascinating class of natural products that exhibit a multitude of biological activities. We have recently cloned and sequenced the biosynthetic gene clusters for three members of the glutarimide-containing polyketides: iso-migrastatin (iso-MGS) from Streptomyces platensis NRRL 18993, lactimidomycin (LTM) from Streptomyces amphibiosporus ATCC 53964, and cycloheximide (CHX) from Streptomyces sp. YIM56141. Comparative analysis of the three clusters identified mgsA and chxA, from the mgs and chx gene clusters, respectively, that were predicted to encode the PimR-like Streptomyces antibiotic regulatory proteins (SARPs) but failed to reveal any regulatory gene from the ltm gene cluster. Overexpression of mgsA or chxA in S. platensis NRRL 18993, Streptomyces sp. YIM56141 or SB11024, and a recombinant strain of Streptomyces coelicolor M145 carrying the intact mgs gene cluster has no significant effect on iso-MGS or CHX production, suggesting that MgsA or ChxA regulation may not be rate-limiting for iso-MGS and CHX production in these producers. In contrast, overexpression of mgsA or chxA in S. amphibiosporus ATCC 53964 resulted in a significant increase in LTM production, with LTM titer reaching 106 mg/L, which is five-fold higher than that of the wild-type strain. These results support MgsA and ChxA as members of the SARP family of positive regulators for the iso-MGS and CHX biosynthetic machinery and demonstrate the feasibility of improving glutarimide-containing polyketide production in Streptomyces strains by exploiting common regulators. PMID:26552797

  19. Regulatory region in choline acetyltransferase gene directs developmental and tissue-specific expression in transgenic mice.

    PubMed Central

    Lönnerberg, P; Lendahl, U; Funakoshi, H; Arhlund-Richter, L; Persson, H; Ibáñez, C F

    1995-01-01

    Acetylcholine, one of the main neurotransmitters in the nervous system, is synthesized by the enzyme choline acetyltransferase (ChAT; acetyl-CoA:choline O-acetyltransferase, EC 2.3.1.6). The molecular mechanisms controlling the establishment, maintenance, and plasticity of the cholinergic phenotype in vivo are largely unknown. A previous report showed that a 3800-bp, but not a 1450-bp, 5' flanking segment from the rat ChAT gene promoter directed cell type-specific expression of a reporter gene in cholinergic cells in vitro. Now we have characterized a distal regulatory region of the ChAT gene that confers cholinergic specificity on a heterologous downstream promoter in a cholinergic cell line and in transgenic mice. A 2342-bp segment from the 5' flanking region of the ChAT gene behaved as an enhancer in cholinergic cells but as a repressor in noncholinergic cells in an orientation-independent manner. Combined with a heterologous basal promoter, this fragment targeted transgene expression to several cholinergic regions of the central nervous system of transgenic mice, including basal forebrain, cortex, pons, and spinal cord. In eight independent transgenic lines, the pattern of transgene expression paralleled qualitatively and quantitatively that displayed by endogenous ChAT mRNA in various regions of the rat central nervous system. In the lumbar enlargement of the spinal cord, 85-90% of the transgene expression was targeted to the ventral part of the cord, where cholinergic alpha-motor neurons are located. Transgene expression in the spinal cord was developmentally regulated and responded to nerve injury in a similar way as the endogenous ChAT gene, indicating that the 2342-bp regulatory sequence contains elements controlling the plasticity of the cholinergic phenotype in developing and injured neurons. PMID:7732028

  20. Alu sequence involvement in transcriptional insulation of the keratin 18 gene in transgenic mice.

    PubMed Central

    Thorey, I S; Ceceña, G; Reynolds, W; Oshima, R G

    1993-01-01

    The human keratin 18 (K18) gene is expressed in a variety of adult simple epithelial tissues, including liver, intestine, lung, and kidney, but is not normally found in skin, muscle, heart, spleen, or most of the brain. Transgenic animals derived from the cloned K18 gene express the transgene in appropriate tissues at levels directly proportional to the copy number and independently of the sites of integration. We have investigated in transgenic mice the dependence of K18 gene expression on the distal 5' and 3' flanking sequences and upon the RNA polymerase III promoter of an Alu repetitive DNA transcription unit immediately upstream of the K18 promoter. Integration site-independent expression of tandemly duplicated K18 transgenes requires the presence of either an 825-bp fragment of the 5' flanking sequence or the 3.5-kb 3' flanking sequence. Mutation of the RNA polymerase III promoter of the Alu element within the 825-bp fragment abolishes copy number-dependent expression in kidney but does not abolish integration site-independent expression when assayed in the absence of the 3' flanking sequence of the K18 gene. The characteristics of integration site-independent expression and copy number-dependent expression are separable. In addition, the formation of the chromatin state of the K18 gene, which likely restricts the tissue-specific expression of this gene, is not dependent upon the distal flanking sequences of the 10-kb K18 gene but rather may depend on internal regulatory regions of the gene. PMID:7692231

  1. Structure and sequence divergence of two archaebacterial genes

    SciTech Connect

    Cue, D.; Beckler, G.S.; Reeve, J.N.; Konisky, J.

    1985-06-01

    The DNA sequences of a region that includes the hisA gene of two related methanogenic archaebacteria, Methanococcus voltae and Methanococcus vannielii, have been compared. Both organisms show a similar genome organization in this region, displaying three open reading frames (ORFs) separated by regions of very high A+T content. Two of the ORFs, including ORFHisA, show significant DNA sequence homology. As might be expected for organisms having a genome that is A+T-rich, there is a high preference for A and U as the third base in codons. A ribosome binding site, G-G-T-G, is located 6 base pairs preceding the ATG translation initiation sequence of both hisA genes. The sequences upstream of the two hisA genes show only limited sequence homology. The M. voltae intergenic region contains four tandemly arranged repetitions of an 11-base-pair sequence, whereas the M. vannielii sequence contains both direct and inverted repetitive sequences. Based on the degree of hisA sequence homology, the authors conclude that M. voltae and M. vannielii are less closely related taxonomically than are members of the enteric group of eubacteria.
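
    The third-base preference described above can be quantified directly from an open reading frame. The following is a minimal sketch, not the authors' procedure: the function name and the toy sequence are assumptions made for illustration, and the toy ORF is not a real hisA sequence.

```python
from collections import Counter


def third_base_at_fraction(orf: str) -> float:
    """Return the fraction of codons whose third base is A or T (U in mRNA)."""
    seq = orf.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    thirds = Counter(seq[i + 2] for i in range(0, usable, 3))
    total = sum(thirds.values())
    if total == 0:
        raise ValueError("sequence contains no complete codon")
    return (thirds["A"] + thirds["T"]) / total


# Toy ORF invented for illustration (hypothetical, not a real hisA sequence).
toy_orf = "ATGAAAGTTGCATTAGATTAA"
print(round(third_base_at_fraction(toy_orf), 2))  # 0.86 (6 of 7 codons)
```

    A value well above 0.5 across many ORFs would be consistent with the A/U third-base preference expected for an A+T-rich genome.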

  2. Mechanism of gene amplification via yeast autonomously replicating sequences.

    PubMed

    Sehgal, Shelly; Kaul, Sanjana; Dhar, M K

    2015-01-01

    The present investigation was aimed at understanding the molecular mechanism of gene amplification. The interplay of fragile sites in promoting gene amplification was also elucidated. The amplification-promoting sequences (APS) were chosen from the Saccharomyces cerevisiae ARS, the 5S rRNA regions of Plantago ovata and P. lagopus, proposed sites of replication pausing at the Ste20 gene locus of S. cerevisiae, and the bent DNA sequences within fragile site FRA11A in humans. The gene amplification assays showed that plasmids bearing APS from yeast and humans led to enhanced protein concentration compared with the wild type. Both the in silico and in vitro analyses pointed to the strong bending potential of these APS. In addition, high mitotic stability and the presence of TTTT repeats and SAR among these sequences encourage gene amplification. Phylogenetic analysis of S. cerevisiae ARS was also conducted. The combinatorial power of the different aspects of APS analyzed in the present investigation was harnessed to reach a consensus about the factors that stimulate gene expression in the presence of these sequences. It was concluded that gene amplification occurs because AT-rich tracts present in fragile sites of yeast serve as binding sites for MAR/SAR and DNA unwinding elements. The DNA-protein interactions necessary for ORC activation are facilitated by DNA bending. These specific bindings at ORC promote repeated rounds of DNA replication, leading to gene amplification. PMID:25685838

  3. Mechanism of Gene Amplification via Yeast Autonomously Replicating Sequences

    PubMed Central

    Dhar, M. K.

    2015-01-01

    The present investigation was aimed at understanding the molecular mechanism of gene amplification. The interplay of fragile sites in promoting gene amplification was also elucidated. The amplification-promoting sequences (APS) were chosen from the Saccharomyces cerevisiae ARS, the 5S rRNA regions of Plantago ovata and P. lagopus, proposed sites of replication pausing at the Ste20 gene locus of S. cerevisiae, and the bent DNA sequences within fragile site FRA11A in humans. The gene amplification assays showed that plasmids bearing APS from yeast and humans led to enhanced protein concentration compared with the wild type. Both the in silico and in vitro analyses pointed to the strong bending potential of these APS. In addition, high mitotic stability and the presence of TTTT repeats and SAR among these sequences encourage gene amplification. Phylogenetic analysis of S. cerevisiae ARS was also conducted. The combinatorial power of the different aspects of APS analyzed in the present investigation was harnessed to reach a consensus about the factors that stimulate gene expression in the presence of these sequences. It was concluded that gene amplification occurs because AT-rich tracts present in fragile sites of yeast serve as binding sites for MAR/SAR and DNA unwinding elements. The DNA-protein interactions necessary for ORC activation are facilitated by DNA bending. These specific bindings at ORC promote repeated rounds of DNA replication, leading to gene amplification. PMID:25685838

  4. Nucleotide sequence of a human tRNA gene heterocluster

    SciTech Connect

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-05-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both [3'-³²P]-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these γ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  5. Pilus genes of Neisseria gonorrheae: chromosomal organization and DNA sequence.

    PubMed

    Meyer, T F; Billyard, E; Haas, R; Storzbach, S; So, M

    1984-10-01

    We have mapped two regions of the Neisseria gonorrheae genome, pilE1 and pilE2, which are involved in pilus expression. When the cells are in the piliated P+ state, these two loci carry sequences necessary for pilin production. A silent locus, pilS1, also maps near pilE1 and pilE2. pilS1 contains structural gene information but lacks pilus promoter sequences. The pilus gene sequences in pilE1 and pilE2 are identical in strain MS11. PMID:6148752

  6. Sequence of the phosphothreonyl regulatory site peptide from inactive maize leaf pyruvate, orthophosphate dikinase

    SciTech Connect

    Roeske, C.A.; Kutny, R.M.; Budde, R.J.A.; Chollet, R.

    1988-05-15

    The regulatory site peptide sequence of phosphorylated inactive pyruvate, orthophosphate dikinase from maize leaf tissue was determined by automated Edman degradation analysis of ³²P-labeled peptides purified by reversed-phase high performance liquid chromatography. The overlapping phosphopeptides were products of a digestion of the [β-³²P]ADP-inactivated dikinase with either trypsin or Pronase E. The sequence is Thr-Glu-Arg-Gly-Gly-Met-Thr(P)-Ser-His-Ala-Ala-Val-Val-Ala-Arg. The phosphothreonine residue, which appeared as either an anomalous proline or an unidentifiable phenylthiohydantoin derivative during sequencing, was verified by two-dimensional phosphoamino acid analysis of the phosphopeptides and by resequencing the tryptic peptide after dephosphorylation with exogenous alkaline phosphatase. This sequence, starting at position 4, is completely homologous to the previously published sequence of the tryptic dodecapeptide harboring the catalytically essential (phospho)histidyl residue in the active-site domain of the dikinase from the nonphotosynthetic bacterium, Bacteroides symbiosus. These comparative results indicate that the regulatory phosphothreonine causing complete inactivation of maize leaf dikinase is separated from the critical active-site (phospho)histidine by just one intervening residue in the primary sequence.

  7. Gene identification and classification in the Synechocystis genomic sequence by recursive gene mark analysis.

    PubMed

    Hirosawa, M; Isono, K; Hayes, W; Borodovsky, M

    1997-01-01

    The GeneMark method has proven to be an efficient gene-finding tool for the analysis of prokaryotic genomic sequence data. We have developed a procedure for deriving and utilizing several GeneMark models in order to get better gene-detection performance. Upon applying this procedure to the 1.0 Mb contiguous DNA sequence of Synechocystis sp. strain PCC6803, we were able to cluster predicted genes into distinct classes and to produce the class-specific GeneMark models reflecting statistical characteristics of each gene class. One gene class apparently includes genes of exogenous origin. Using class-specific models reduces the gene underprediction error rate to 1.7%, compared with the 8.1% reported in the previous study, in which only one GeneMark model was used. PMID:9522117

  8. The regions of sequence variation in caulimovirus gene VI.

    PubMed

    Sanger, M; Daubert, S; Goodman, R M

    1991-06-01

    The sequence of gene VI from figwort mosaic virus (FMV) clone x4 was determined and compared with that previously published for FMV clone DxS. Both clones originated from the same virus isolation, but the virus used to clone DxS was propagated extensively in a host of a different family prior to cloning whereas that used to clone x4 was not. Differences in the amino acid sequence inferred from the DNA sequences occurred in two clusters. An N-terminal conserved region preceded two regions of variation separated by a central conserved region. Variation in cauliflower mosaic virus (CaMV) gene VI sequences, all of which were derived from virus isolates from hosts from one host family, was similar to that seen in the FMV comparison, though the extent of variation was less. Alignment of gene VI domains from FMV and CaMV revealed regions of amino acid sequence identical in both viruses within the conserved regions. The similarity in the pattern of conserved and variable domains of these two viruses suggests common host-interactive functions in caulimovirus gene VI homologues, and possibly an analogy between caulimoviruses and certain animal viruses in the influence of the host on sequence variability of viral genes. PMID:2024500

  9. Cloning and sequencing of the gene for human β-casein

    SciTech Connect

    Loennerdal, B.; Bergstroem, S.; Andersson, Y.; Hialmarsson, K.; Sundgyist, A.; Hernell, O.

    1990-02-26

    Human β-casein is a major protein in human milk. This protein is part of the casein micelle and has been suggested to have several physiological functions in the newborn. Since there is limited information on β-casein and the factors that affect its concentration in human milk, the authors have isolated and sequenced the gene for this protein. A human mammary gland cDNA library (Clontech) in λgt11 was screened by plaque hybridization using a 42-mer synthetic ³²P-labelled oligonucleotide. Positive clones were identified and isolated, DNA was prepared and the gene isolated by cleavage with EcoRI. Following subcloning (pUC18), restriction mapping and Southern blotting, DNA for sequencing was prepared. The gene was sequenced by the dideoxy method. Human β-casein has 212 amino acids, and the amino acid sequence deduced from the nucleotide sequence is 91% identical to the published sequence for human β-casein. The sequences show a high degree of conservation at the leader peptide and the highly phosphorylated sequences, but also deletions and divergence at several positions. These results provide insight into the structure of the human β-casein gene and will facilitate studies on factors affecting its expression.

  10. Assay for transposase-accessible chromatin and circularized chromosome conformation capture, two methods to explore the regulatory landscapes of genes in zebrafish.

    PubMed

    Fernández-Miñán, A; Bessa, J; Tena, J J; Gómez-Skarmeta, J L

    2016-01-01

    Accurate transcriptional control of genes is fundamental for the correct functioning of organs and developmental processes. This control depends on the interplay between the promoter of genes and other noncoding sequences, whose interaction is mediated by 3D chromatin arrangements. Thus, the detailed description of transcriptional regulatory landscapes is essential to understand the mechanisms of transcriptional regulation. However, to achieve that, two important challenges have to be faced: (1) the identification of the noncoding sequences that contribute to gene transcription and (2) the association of these sequences to the respective genes they control. In this chapter, we describe two protocols that allow overcoming these important challenges: the assay for transposase-accessible chromatin using sequencing (ATAC-seq) and circularized chromosome conformation capture (4C-seq). ATAC-seq is a very efficient technique that, using a very low number of cells as starting material, allows the identification of active chromatin regions genome wide, whereas 4C-seq detects the subset of sequences that interact specifically with the promoter of a given gene. When combined, both techniques provide a comprehensive snapshot of the regulatory landscapes of developmental genes. The protocols we present here have been optimized for teleost fish samples, zebrafish and medaka, allowing the in-depth study of transcriptional regulation in these two emerging animal models. Given the amenability and easy genetic manipulation of these two experimental systems, we anticipate that they will be important in revealing general principles of the vertebrate regulatory genome. PMID:27443938

  11. Finding regulatory elements using joint likelihoods for sequence and expression profile data.

    SciTech Connect

    IAN HOLMES, UC BERKELEY, CA, WILLIAM J. BRUNO, LANL

    2000-08-20

    A recent, popular method of finding promoter sequences is to look for conserved motifs upstream of genes clustered on the basis of expression data. This method presupposes that the clustering is correct. Theoretically, one should be better able to find promoter sequences and create more relevant gene clusters by taking a unified approach to these two problems. We present a likelihood function for a sequence-expression model giving a joint likelihood for a promoter sequence and its corresponding expression levels. An algorithm to estimate sequence-expression model parameters using Gibbs sampling and Expectation/Maximization is described. A program, called kimono, that implements this algorithm has been developed and the source code is freely available over the internet.
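
    To make the idea of a joint likelihood concrete, the sketch below scores one gene under one cluster by adding a sequence term and an expression term. It is an illustration under stated assumptions, not the kimono model: the position weight matrix motif, the uniform background, the independent Gaussian expression model, and all function and variable names are hypothetical choices made here.

```python
import math


def log_seq_likelihood(seq, pwm, background=0.25):
    """Log P(seq) with one motif occurrence, marginalized over start positions.

    Assumption: uniform prior over motif starts, uniform base background.
    """
    w = len(pwm)
    n_pos = len(seq) - w + 1
    terms = []
    for start in range(n_pos):
        lp = 0.0
        for i, base in enumerate(seq):
            if start <= i < start + w:
                lp += math.log(pwm[i - start][base])  # base inside the motif
            else:
                lp += math.log(background)  # base outside the motif
        terms.append(lp)
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms)) - math.log(n_pos)


def log_expr_likelihood(profile, cluster_mean, sd):
    """Log density of an expression profile under independent Gaussians."""
    return sum(
        -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mu) ** 2 / (2 * sd ** 2)
        for x, mu in zip(profile, cluster_mean)
    )


def joint_log_likelihood(seq, profile, pwm, cluster_mean, sd=1.0):
    """Joint log-likelihood: sequence and expression are assumed
    conditionally independent given the cluster."""
    return log_seq_likelihood(seq, pwm) + log_expr_likelihood(profile, cluster_mean, sd)


def row(favoured):
    """Hypothetical PWM column that strongly favours one base."""
    return {b: (0.85 if b == favoured else 0.05) for b in "ACGT"}


toy_pwm = [row(b) for b in "ACGT"]  # toy motif "ACGT"
toy_mean = [0.0, 1.0, 2.0]          # toy cluster expression centroid
```

    In an EM or Gibbs scheme of the kind the abstract mentions, a score like this would be evaluated for each gene under each candidate cluster, so that cluster assignment and motif estimation inform each other.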

  12. SxtA gene sequence analysis of dinoflagellate Alexandrium minutum

    NASA Astrophysics Data System (ADS)

    Norshaha, Safida Anira; Latib, Norhidayu Abdul; Usup, Gires; Yusof, Nurul Yuziana Mohd

    2015-09-01

    The dinoflagellate Alexandrium minutum is typically known for the production of potent neurotoxins such as saxitoxin, affecting the health of human seafood consumers via paralytic shellfish poisoning (PSP). This phenomenon is related to harmful algal blooms (HABs), which are believed to be influenced by environmental and nutritional factors. A previous study revealed that sxtA is a starting gene involved in the saxitoxin production pathway. The aim of this study was to analyse the sequence of the sxtA gene in A. minutum. The dinoflagellate culture was grown at 26°C with a 16:8-hour light:dark photocycle. After the samples were harvested, RNA was extracted, and complementary DNA (cDNA) was synthesised and amplified by polymerase chain reaction (PCR). The PCR products were then purified and cloned before being sequenced. The sxtA sequence obtained was then analyzed in order to confirm the presence of the sxtA gene in Alexandrium minutum.

  13. Precise cis-regulatory control of spatial and temporal expression of the alx-1 gene in the skeletogenic lineage of S. purpuratus.

    PubMed

    Damle, Sagar; Davidson, Eric H

    2011-09-15

    Deployment of the gene-regulatory network (GRN) responsible for skeletogenesis in the embryo of the sea urchin Strongylocentrotus purpuratus is restricted to the large micromere lineage by a double negative regulatory gate. The gate consists of a GRN subcircuit composed of the pmar1 and hesC genes, which encode repressors and are wired in tandem, plus a set of target regulatory genes under hesC control. The skeletogenic cell state is specified initially by micromere-specific expression of these regulatory genes, viz. alx1, ets1, tbrain and tel, plus the gene encoding the Notch ligand Delta. Here we use a recently developed high throughput methodology for experimental cis-regulatory analysis to elucidate the genomic regulatory system controlling alx1 expression in time and embryonic space. The results entirely confirm the double negative gate control system at the cis-regulatory level, including definition of the functional HesC target sites, and add the crucial new information that the drivers of alx1 expression are initially Ets1, and then Alx1 itself plus Ets1. Cis-regulatory analysis demonstrates that these inputs quantitatively account for the magnitude of alx1 expression. Furthermore, the Alx1 gene product not only performs an auto-regulatory role, promoting a fast rise in alx1 expression, but also, when at high levels, it behaves as an auto-repressor. A synthetic experiment indicates that this behavior is probably due to dimerization. In summary, the results we report provide the sequence level basis for control of alx1 spatial expression by the double negative gate GRN architecture, and explain the rising, then falling temporal expression profile of the alx1 gene in terms of its auto-regulatory genetic wiring. PMID:21723273

  14. Precise cis-regulatory control of spatial and temporal expression of the alx-1 gene in the skeletogenic lineage of S. purpuratus

    PubMed Central

    Damle, Sagar; Davidson, Eric H.

    2011-01-01

    Deployment of the gene regulatory network (GRN) responsible for skeletogenesis in the embryo of the sea urchin Strongylocentrotus purpuratus is restricted to the large micromere lineage by a double negative regulatory gate. The gate consists of a GRN subcircuit composed of the pmar1 and hesC genes, which encode repressors and are wired in tandem, plus a set of target regulatory genes under hesC control. The skeletogenic cell state is specified initially by micromere-specific expression of these regulatory genes, viz. alx1, ets1, tbrain and tel, plus the gene encoding the Notch ligand Delta. Here we use a recently developed high throughput methodology for experimental cis-regulatory analysis to elucidate the genomic regulatory system controlling alx1 expression in time and embryonic space. The results entirely confirm the double negative gate control system at the cis-regulatory level, including definition of the functional HesC target sites, and add the crucial new information that the drivers of alx1 expression are initially Ets1, and then Alx1 itself plus Ets1. Cis-regulatory analysis demonstrates that these inputs quantitatively account for the magnitude of alx1 expression. Furthermore, the Alx1 gene product not only performs an auto-regulatory role, promoting a fast rise in alx1 expression, but also, when at high levels, it behaves as an autorepressor. A synthetic experiment indicates that this behavior is probably due to dimerization. In summary, the results we report provide the sequence level basis for control of alx1 spatial expression by the double negative gate GRN architecture, and explain the rising, then falling temporal expression profile of the alx1 gene in terms of its auto-regulatory genetic wiring. PMID:21723273

  15. Biased distribution of DNA uptake sequences towards genome maintenance genes.

    PubMed

    Davidsen, Tonje; Rødland, Einar A; Lagesen, Karin; Seeberg, Erling; Rognes, Torbjørn; Tønjum, Tone

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H. influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions. These results imply that the high frequency of DUS in genome maintenance genes is conserved among phylogenetically divergent species and is thus of significant biological importance. Increased DUS density is expected to enhance DNA uptake and the over-representation of DUS in genome maintenance genes might reflect facilitated recovery of genome preserving functions. For example, transient and beneficial increase in genome instability can be allowed during pathogenesis simply through loss of antimutator genes, since these DUS-containing sequences will be preferentially recovered. Furthermore, uptake of such genes could provide a mechanism for facilitated recovery from DNA damage after genotoxic stress. PMID:14960717
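
    A search for the most frequent 9-10mers in coding regions can be sketched as a simple k-mer tally. This is a simplified illustration, not the authors' pipeline: it counts one strand only, applies no statistical test, and the function name, the toy sequences, and the repeated motif in them are invented for the example.

```python
from collections import Counter


def top_kmers(coding_seqs, k_values=(9, 10), n=3):
    """Count every k-mer (for each k in k_values) across the given coding
    sequences and return the n most frequent as (kmer, count) pairs."""
    counts = Counter()
    for seq in coding_seqs:
        seq = seq.upper()
        for k in k_values:
            for i in range(len(seq) - k + 1):
                counts[seq[i:i + k]] += 1
    return counts.most_common(n)


# Toy "coding regions" sharing a hypothetical 10-mer, ACGTTGCAAG.
toy_genes = ["TTACGTTGCAAGCC", "GGACGTTGCAAGTT", "CAACGTTGCAAGAT"]
for kmer, count in top_kmers(toy_genes):
    print(kmer, count)
```

    Comparing per-gene-group densities (occurrences per kilobase) of the top motif would be the natural next step toward the skewed-distribution result the abstract reports.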

  16. Inference of gene regulatory subnetworks from time course gene expression data

    PubMed Central

    2012-01-01

    Background: Identifying gene regulatory networks (GRNs) from time course gene expression data has attracted increasing attention. Because of the computational complexity, most approaches for GRN reconstruction are limited to a small number of genes and low connectivity of the underlying networks. These approaches can only identify a single network for a given set of genes. However, a large-scale gene network may contain multiple potential sub-networks, in which genes are functionally related only to other genes in the same sub-network. Results: We propose the network and community identification (NCI) method for identifying multiple subnetworks from gene expression data by incorporating community structure information into GRN inference. The proposed algorithm iteratively solves two optimization problems, and can promisingly be applied to large-scale GRNs. Furthermore, we present the efficient Block PCA method for searching communities in GRNs. Conclusions: The NCI method is effective in identifying multiple subnetworks in a large-scale GRN. With the splitting algorithm, the Block PCA method is a promising approach for exploring communities in a large-scale GRN. PMID:22901088

  17. Vertebrate paralogous conserved noncoding sequences may be related to gene expressions in brain.

    PubMed

    Matsunami, Masatoshi; Saitou, Naruya

    2013-01-01

    Vertebrate genomes include gene regulatory elements in protein-noncoding regions. Some gene regulatory elements are expected to be conserved according to their functional importance, so that evolutionarily conserved noncoding sequences (CNSs) might be good candidates for those elements. In addition, paralogous CNSs, which are highly conserved among both orthologous loci and paralogous loci, have the possibility of controlling overlapping expression patterns of their adjacent paralogous protein-coding genes. The two-round whole-genome duplications (2R WGDs), which most probably occurred in the vertebrate common ancestors, generated large numbers of paralogous protein-coding genes and their regulatory elements. These events could contribute to the emergence of vertebrate features. However, the evolutionary history and influences of the 2R WGDs are still unclear, especially in noncoding regions. To address this issue, we identified paralogous CNSs. Region-focused Basic Local Alignment Search Tool (BLAST) search of each synteny block revealed 7,924 orthologous CNSs and 309 paralogous CNSs conserved among eight high-quality vertebrate genomes. The paralogous CNSs we found comprised 115 previously reported and 194 newly detected ones. Through comparisons with VISTA Enhancer Browser and available ChIP-seq data, one-third (103) of paralogous CNSs detected in this study showed gene regulatory activity in the brain at several developmental stages. Their genomic locations are highly enriched near the transcription factor-coding regions, which are expressed in brain and neural systems. These results suggest that paralogous CNSs are conserved mainly because they maintain gene expression in the vertebrate brain. PMID:23267051

  18. Regulatory and Structural Genes for Lysozymes of Mice

    PubMed Central

    Hammer, Michael F.; Wilson, Allan C.

    1987-01-01

    The molecular and genetic basis of large differences in the concentration of P lysozyme in the small intestine has been investigated by crossing inbred strains of two species of house mouse (genus Mus). The concentration of P in domesticus is about 130-fold higher than in castaneus. An autosomal genetic element determining the concentration of P has been identified and named the P lysozyme regulator, Lzp-r. The level of P in interspecific hybrids (domesticus x castaneus) as well as in certain classes of backcross progeny is intermediate relative to parental levels, which shows that the two alleles of Lzp-r are inherited additively. There are two forms of P lysozyme in the intestine of the interspecific hybrid: one having the heat stability of domesticus P, the other being more stable and presumably the product of the castaneus P locus. These two forms occur in equal amounts, and it appears that Lzp-r acts in trans. The linkage of Lzp-r to three structural genes (Lzp-s, Lzm-s1, and Lzm-s2), one specifying P lysozyme and two specifying M lysozymes, was shown by electrophoretic analysis of backcrosses involving domesticus and castaneus and also domesticus and spretus. The role of regulatory mutations in evolution is discussed in light of these results. PMID:3569879

  19. Close sequence comparisons are sufficient to identify human cis-regulatory elements.

    PubMed

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M; Couronne, Olivier; Pennacchio, Len A

    2006-07-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons. To address this problem, we identified evolutionarily conserved noncoding regions in primate, mammalian, and more distant comparisons using a uniform approach (Gumby) that facilitates unbiased assessment of the impact of evolutionary distance on predictive power. We benchmarked computational predictions against previously identified cis-regulatory elements at diverse genomic loci and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using an in vivo enhancer assay in transgenic mice. Human regulatory elements were identified with acceptable sensitivity (53%-80%) and true-positive rate (27%-67%) by comparison with one to five other eutherian mammals or six other simian primates. More distant comparisons (marsupial, avian, amphibian, and fish) failed to identify many of the empirically defined functional noncoding elements. Our results highlight the practical utility of close sequence comparisons, and the loss of sensitivity entailed by more distant comparisons. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole-genome comparative analysis that explains most of the observations from empirical benchmarking. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for in vivo testing at embryonic time points. PMID:16769978

  20. Genomic structure and complete nucleotide sequence of the Batten disease gene, CLN3

    SciTech Connect

    Mitchison, H.M.; Munroe, P.B.; O'Rawe, A.M.

    1997-03-01

    We recently cloned a cDNA for CLN3, the gene for juvenile-onset neuronal ceroid lipofuscinosis or Batten disease. To resolve the genomic organization we used a cosmid clone containing CLN3 to sequence the entire gene in addition to 1.1 kb 5' of the start of the published CLN3 cDNA and 0.3 kb 3' to the polyadenylation site. CLN3 is organized into at least 15 exons spanning 15 kb and ranging from 47 to 356 bp. The 14 introns vary from 80 to 4227 bp, and all exon/intron junction sequences conform to the GT-AG rule. Numerous repetitive Alu elements are present within the introns and 5'- and 3'-untranslated regions. The 5' region of the CLN3 gene contains several potential transcription regulatory elements but no consensus TATA-1 box was identified. CLN3 is homologous to 27 deposited human ESTs, and sequence comparisons suggest alternative splicing of the gene and the existence of transcribed sequences upstream to the start of the published CLN3 cDNA. 19 refs., 2 figs., 1 tab.
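
    The junction check reported above (every intron begins with GT and ends with AG) is easy to express in code. The sketch below is a hypothetical illustration, not the authors' analysis: the coordinates are 0-based half-open intervals chosen here for convenience, the function names are invented, and the toy sequence is not CLN3.

```python
def introns_from_exons(exons):
    """Given sorted, non-overlapping exon intervals (start, end), return the
    intron intervals between consecutive exons; n exons yield n - 1 introns."""
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


def gt_ag_violations(genomic, exons):
    """Return the intron intervals that do not start with GT and end with AG."""
    seq = genomic.upper()
    bad = []
    for start, end in introns_from_exons(exons):
        intron = seq[start:end]
        if not (intron.startswith("GT") and intron.endswith("AG")):
            bad.append((start, end))
    return bad


# Toy gene: three exons, one canonical intron and one GC...AG intron.
toy_genomic = "ATGGCC" + "GTAAGTTTTCAG" + "GGATCC" + "GCAAGTTTTCAG" + "TAA"
toy_exons = [(0, 6), (18, 24), (36, 39)]
print(gt_ag_violations(toy_genomic, toy_exons))  # [(24, 36)]
```

    For a gene with 15 exons, `introns_from_exons` returns 14 intervals, matching the exon and intron counts in the abstract; an empty violation list corresponds to all junctions conforming to the rule.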

  1. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: Combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance

    SciTech Connect

    Wu, Gang; Nie, Lei; Zhang, Weiwen

    2006-05-26

    The context-dependent expression of genes is the core for biological activities, and significant attention has been given to identification of various factors contributing to gene expression at genomic scale. However, so far this type of analysis has focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs or on the correlation between mRNA abundance and non-random features in coding sequences (e.g. codon usage and amino acid usage). In this study multiple regression analyses of the mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal of investigating how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together...

  2. Sequence Variability in Staphylococcal Enterotoxin Genes seb, sec, and sed

    PubMed Central

    Johler, Sophia; Sihto, Henna-Maria; Macori, Guerrino; Stephan, Roger

    2016-01-01

    Ingestion of staphylococcal enterotoxins preformed by Staphylococcus aureus in food leads to staphylococcal food poisoning, the most prevalent foodborne intoxication worldwide. There are five major staphylococcal enterotoxins: SEA, SEB, SEC, SED, and SEE. While variants of these toxins have been described and were linked to specific hosts or levels of enterotoxin production, data on sequence variation is still limited. In this study, we aim to extend the knowledge on promoter and gene variants of the major enterotoxins SEB, SEC, and SED. To this end, we determined seb, sec, and sed promoter and gene sequences of a well-characterized set of enterotoxigenic Staphylococcus aureus strains originating from foodborne outbreaks, human infections, human nasal colonization, rabbits, and cattle. New nucleotide sequence variants were detected for all three enterotoxins and a novel amino acid sequence variant of SED was detected in a strain associated with human nasal colonization. While the seb promoter and gene sequences exhibited a high degree of variability, the sec and sed promoter and gene were more conserved. Interestingly, a truncated variant of sed was detected in all tested sed harboring rabbit strains. The generated data represents a further step towards improved understanding of strain-specific differences in enterotoxin expression and host-specific variation in enterotoxin sequences. PMID:27258311

  3. cis-Regulatory control of the initial neurogenic pattern of onecut gene expression in the sea urchin embryo.

    PubMed

    Barsi, Julius C; Davidson, Eric H

    2016-01-01

    Specification of the ciliated band (CB) of echinoid embryos executes three spatial functions essential for postgastrular organization. These are establishment of a band about 5 cells wide which delimits and bounds other embryonic territories; definition of a neurogenic domain within this band; and generation within it of arrays of ciliary cells that bear the special long cilia from which the structure derives its name. In Strongylocentrotus purpuratus the spatial coordinates of the future ciliated band are initially and exactly determined by the disposition of a ring of cells that transcriptionally activate the onecut homeodomain regulatory gene, beginning in blastula stage, long before the appearance of the CB per se. Thus the cis-regulatory apparatus that governs onecut expression in the blastula directly reveals the genomic sequence code by which these aspects of the spatial organization of the embryo are initially determined. We screened the entire onecut locus and its flanking region for transcriptionally active cis-regulatory elements, and by means of BAC recombineered deletions identified three separated and required cis-regulatory modules that execute different functions. The operating logic of the crucial spatial control module accounting for the spectacularly precise and beautiful early onecut expression domain depends on spatial repression. Previously predicted oral ectoderm and aboral ectoderm repressors were identified by cis-regulatory mutation as the products of goosecoid and irxa genes respectively, while the pan-ectodermal activator SoxB1 supplies a transcriptional driver function. PMID:26522848

  4. Pi class glutathione S-transferase genes are regulated by Nrf 2 through an evolutionarily conserved regulatory element in zebrafish

    PubMed Central

    Suzuki, Takafumi; Takagi, Yaeko; Osanai, Hitoshi; Li, Li; Takeuchi, Miki; Katoh, Yasutake; Kobayashi, Makoto; Yamamoto, Masayuki

    2005-01-01

    Pi class GSTs (glutathione S-transferases) are a member of the vertebrate GST family of proteins that catalyse the conjugation of GSH to electrophilic compounds. The expression of Pi class GST genes can be induced by exposure to electrophiles. We demonstrated previously that the transcription factor Nrf 2 (NF-E2 p45-related factor 2) mediates this induction, not only in mammals, but also in fish. In the present study, we have isolated the genomic region of zebrafish containing the genes gstp1 and gstp2. The regulatory regions of zebrafish gstp1 and gstp2 have been examined by GFP (green fluorescent protein)-reporter gene analyses using microinjection into zebrafish embryos. Deletion and point-mutation analyses of the gstp1 promoter showed that an ARE (antioxidant-responsive element)-like sequence is located 50 bp upstream of the transcription initiation site which is essential for Nrf 2 transactivation. Using EMSA (electrophoretic mobility-shift assay) analysis we showed that zebrafish Nrf 2–MafK heterodimer specifically bound to this sequence. All the vertebrate Pi class GST genes harbour a similar ARE-like sequence in their promoter regions. We propose that this sequence is a conserved target site for Nrf 2 in the Pi class GST genes. PMID:15654768

  5. CTCF binding site sequence differences are associated with unique regulatory and functional trends during embryonic stem cell differentiation.

    PubMed

    Plasschaert, Robert N; Vigneau, Sébastien; Tempera, Italo; Gupta, Ravi; Maksimoska, Jasna; Everett, Logan; Davuluri, Ramana; Mamorstein, Ronen; Lieberman, Paul M; Schultz, David; Hannenhalli, Sridhar; Bartolomei, Marisa S

    2014-01-01

    CTCF (CCCTC-binding factor) is a highly conserved multifunctional DNA-binding protein with thousands of binding sites genome-wide. Our previous work suggested that differences in CTCF's binding site sequence may affect the regulation of CTCF recruitment and its function. To investigate this possibility, we characterized changes in genome-wide CTCF binding and gene expression during differentiation of mouse embryonic stem cells. After separating CTCF sites into three classes (LowOc, MedOc and HighOc) based on similarity to the consensus motif, we found that developmentally regulated CTCF binding occurs preferentially at LowOc sites, which have lower similarity to the consensus. By measuring the affinity of CTCF for selected sites, we show that sites lost during differentiation are enriched in motifs associated with weaker CTCF binding in vitro. Specifically, enrichment for T at the 18th position of the CTCF binding site is associated with regulated binding in the LowOc class and can predictably reduce CTCF affinity for binding sites. Finally, by comparing changes in CTCF binding with changes in gene expression during differentiation, we show that LowOc and HighOc sites are associated with distinct regulatory functions. Our results suggest that the regulatory control of CTCF is dependent in part on specific motifs within its binding site. PMID:24121688

  6. CTCF binding site sequence differences are associated with unique regulatory and functional trends during embryonic stem cell differentiation

    PubMed Central

    Plasschaert, Robert N.; Vigneau, Sébastien; Tempera, Italo; Gupta, Ravi; Maksimoska, Jasna; Everett, Logan; Davuluri, Ramana; Mamorstein, Ronen; Lieberman, Paul M.; Schultz, David; Hannenhalli, Sridhar; Bartolomei, Marisa S.

    2014-01-01

    CTCF (CCCTC-binding factor) is a highly conserved multifunctional DNA-binding protein with thousands of binding sites genome-wide. Our previous work suggested that differences in CTCF’s binding site sequence may affect the regulation of CTCF recruitment and its function. To investigate this possibility, we characterized changes in genome-wide CTCF binding and gene expression during differentiation of mouse embryonic stem cells. After separating CTCF sites into three classes (LowOc, MedOc and HighOc) based on similarity to the consensus motif, we found that developmentally regulated CTCF binding occurs preferentially at LowOc sites, which have lower similarity to the consensus. By measuring the affinity of CTCF for selected sites, we show that sites lost during differentiation are enriched in motifs associated with weaker CTCF binding in vitro. Specifically, enrichment for T at the 18th position of the CTCF binding site is associated with regulated binding in the LowOc class and can predictably reduce CTCF affinity for binding sites. Finally, by comparing changes in CTCF binding with changes in gene expression during differentiation, we show that LowOc and HighOc sites are associated with distinct regulatory functions. Our results suggest that the regulatory control of CTCF is dependent in part on specific motifs within its binding site. PMID:24121688

  7. Sequence Determinants of Circadian Gene Expression Phase in Cyanobacteria

    PubMed Central

    Vijayan, Vikram

    2013-01-01

    The cyanobacterium Synechococcus elongatus PCC 7942 exhibits global biphasic circadian oscillations in gene expression under constant-light conditions. Class I genes are maximally expressed in the subjective dusk, whereas class II genes are maximally expressed in the subjective dawn. Here, we identify sequence features that encode the phase of circadian gene expression. We find that, for multiple genes, an ∼70-nucleotide promoter fragment is sufficient to specify class I or II phase. We demonstrate that the gene expression phase can be changed by random mutagenesis and that a single-nucleotide substitution is sufficient to change the phase. Our study provides insight into how the gene expression phase is encoded in the cyanobacterial genome. PMID:23204469

  8. Gene Discovery through Genomic Sequencing of Brucella abortus

    PubMed Central

    Sánchez, Daniel O.; Zandomeni, Ruben O.; Cravero, Silvio; Verdún, Ramiro E.; Pierrou, Ester; Faccio, Paula; Diaz, Gabriela; Lanzavecchia, Silvia; Agüero, Fernán; Frasch, Alberto C. C.; Andersson, Siv G. E.; Rossetti, Osvaldo L.; Grau, Oscar; Ugalde, Rodolfo A.

    2001-01-01

    Brucella abortus is the etiological agent of brucellosis, a disease that affects bovines and humans. We generated random DNA sequences from the genome of B. abortus strain 2308 in order to characterize molecular targets that might be useful for developing immunological or chemotherapeutic strategies against this pathogen. The partial sequencing of 1,899 clones allowed the identification of 1,199 genomic sequence surveys (GSSs) with high homology (BLAST expect value < 10⁻⁵) to sequences deposited in the GenBank databases. Among them, 925 represent putative novel genes for the Brucella genus. Out of 925 nonredundant GSSs, 470 were classified in 15 categories based on cellular function. Seven hundred GSSs showed no significant database matches and remain available for further studies in order to identify their function. A high number of GSSs with homology to Agrobacterium tumefaciens and Rhizobium meliloti proteins were observed, thus confirming their close phylogenetic relationship. Among them, several GSSs showed high similarity with genes related to nodule nitrogen fixation, synthesis of nod factors, nodulation protein symbiotic plasmid, and nodule bacteroid differentiation. We have also identified several B. abortus homologs of virulence and pathogenesis genes from other pathogens, including a homolog to both the Shda gene from Salmonella enterica serovar Typhimurium and the AidA-1 gene from Escherichia coli. Other GSSs displayed significant homologies to genes encoding components of the type III and type IV secretion machineries, suggesting that Brucella might also have an active type III secretion machinery. PMID:11159979

  9. Sequence analysis of two genomic regions containing the KIT and the FMS receptor tyrosine kinase genes

    SciTech Connect

    Andre, C.; Hampe, A.; Lachaume, P.

    1997-01-15

    The KIT and FMS tyrosine kinase receptors, which are implicated in the control of cell growth and differentiation, stem through duplications from a common ancestor. We have conducted a detailed structural analysis of the two loci containing the KIT and FMS genes. The sequence of the ~90-kb KIT locus reveals the position and size of the 21 introns and of the 5′ regulatory region of the KIT gene. The introns and the 3′-untranslated parts of KIT and FMS have been analyzed in parallel. Comparison of the two sequences shows that, while introns of both genes have extensively diverged in size and sequence, this divergence is, at least in part, due to intron expansion through internal duplications, as suggested by the discrete extant analogies. Repetitive elements as well as exon predictions obtained using the GRAIL and GENEFINDER programs are described in detail. These programs led us to identify a novel gene, designated SMF, immediately downstream of FMS, in the opposite orientation. This finding emphasizes the gene-rich characteristic of this genomic region. 49 refs., 4 figs., 7 tabs.

  10. Existence of a True Phosphofructokinase in Bacillus sphaericus: Cloning and Sequencing of the pfk Gene

    PubMed Central

    Alice, Alejandro F.; Pérez-Martínez, Gaspar; Sánchez-Rivas, Carmen

    2002-01-01

    Some strains of Bacillus sphaericus are entomopathogenic to mosquito larvae, which transmit diseases, such as filariasis and malaria, affecting millions of people worldwide. This species is unable to use hexoses and pentoses as unique carbon sources, which was proposed to be due to the lack of glycolytic enzymes, such as 6-phosphofructokinase (PFK). In this study, PFK activity was detected and the pfk gene was cloned and sequenced. Furthermore, this gene was shown to be present in strains belonging to all the homology groups of this heterogeneous species, in which PFK activity was also detected. A careful sequence analysis revealed the conservation of different catalytic and regulatory residues, as well as the enzyme's phylogenetic affiliation with the family of allosteric ATP-PFK enzymes. PMID:12450869

  11. Combinatorial Pooling Enables Selective Sequencing of the Barley Gene Space

    PubMed Central

    Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R.; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J.

    2013-01-01

    For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered due to the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding. PMID:23592960

  12. Massively parallel decoding of mammalian regulatory sequences supports a flexible organizational model.

    PubMed

    Smith, Robin P; Taher, Leila; Patwardhan, Rupali P; Kim, Mee J; Inoue, Fumitaka; Shendure, Jay; Ovcharenko, Ivan; Ahituv, Nadav

    2013-09-01

    Despite continual progress in the cataloging of vertebrate regulatory elements, little is known about their organization and regulatory architecture. Here we describe a massively parallel experiment to systematically test the impact of copy number, spacing, combination and order of transcription factor binding sites on gene expression. A complex library of ∼5,000 synthetic regulatory elements containing patterns from 12 liver-specific transcription factor binding sites was assayed in mice and in HepG2 cells. We find that certain transcription factors act as direct drivers of gene expression in homotypic clusters of binding sites, independent of spacing between sites, whereas others function only synergistically. Heterotypic enhancers are stronger than their homotypic analogs and favor specific transcription factor binding site combinations, mimicking putative native enhancers. Exhaustive testing of binding site permutations suggests that there is flexibility in binding site order. Our findings provide quantitative support for a flexible model of regulatory element activity and suggest a framework for the design of synthetic tissue-specific enhancers. PMID:23892608

  13. Identifying Functional Gene Regulatory Network Phenotypes Underlying Single Cell Transcriptional Variability

    PubMed Central

    Park, James; Ogunnaike, Babatunde; Schwaber, James; Vadigepalli, Rajanikanth

    2014-01-01

    Recent analysis of single-cell transcriptomic data has revealed a surprising organization of the transcriptional variability pervasive across individual neurons. In response to distinct combinations of synaptic input-type, a new organization of neuronal subtypes emerged based on transcriptional states that were aligned along a gradient of correlated gene expression. Individual neurons traverse across these transcriptional states in response to cellular inputs. However, the regulatory network interactions driving these changes remain unclear. Here we present a novel fuzzy logic-based approach to infer quantitative gene regulatory network models from highly variable single-cell gene expression data. Our approach involves developing an a priori regulatory network that is then trained against in vivo single-cell gene expression data in order to identify causal gene interactions and corresponding quantitative model parameters. Simulations of the inferred gene regulatory network response to experimentally observed stimuli levels mirrored the pattern and quantitative range of gene expression across individual neurons remarkably well. In addition, the network identification results revealed that distinct regulatory interactions, coupled with differences in the regulatory network stimuli, drive the variable gene expression patterns observed across the neuronal subtypes. We also identified a key difference between the neuronal subtype-specific networks with respect to negative feedback regulation, with the catecholaminergic subtype network lacking such interactions. Furthermore, by varying regulatory network stimuli over a wide range, we identified several cases in which divergent neuronal subtypes could be driven towards similar transcriptional states by distinct stimuli operating on subtype-specific regulatory networks. Based on these results, we conclude that heterogeneous single-cell gene expression profiles should be interpreted through a regulatory

  14. Regulatory Divergence between Parental Alleles Determines Gene Expression Patterns in Hybrids

    PubMed Central

    Combes, Marie-Christine; Hueber, Yann; Dereeper, Alexis; Rialle, Stéphanie; Herrera, Juan-Carlos; Lashermes, Philippe

    2015-01-01

    Both hybridization and allopolyploidization generate novel phenotypes by conciliating divergent genomes and regulatory networks in the same cellular context. To understand the rewiring of gene expression in hybrids, the total expression of 21,025 genes and the allele-specific expression of over 11,000 genes were quantified in interspecific hybrids and their parental species, Coffea canephora and Coffea eugenioides, using RNA-seq technology. Between parental species, cis- and trans-regulatory divergences affected around 32% and 35% of analyzed genes, respectively, with nearly 17% of them showing both. The relative importance of trans-regulatory divergences between both species could be related to their low genetic divergence and perennial habit. In hybrids, among divergently expressed genes between parental species and hybrids, 77% were expressed like one parent (expression level dominance), including 65% like C. eugenioides. Gene expression was shown to result from the expression of both alleles affected by intertwined parental trans-regulatory factors. A strong impact of C. eugenioides trans-regulatory factors on the upregulation of C. canephora alleles was revealed. The gene expression patterns appeared determined by complex combinations of cis- and trans-regulatory divergences. In particular, the observed biased expression level dominance seemed to be derived from the asymmetric effects of trans-regulatory parental factors on regulation of alleles. More generally, this study illustrates the effects of divergent trans-regulatory parental factors on the gene expression pattern in hybrids. The characteristics of the transcriptional response to hybridization appear to be determined by the compatibility of gene regulatory networks and therefore depend on genetic divergences between the parental species and their evolutionary history. PMID:25819221

  15. The nucleotide sequence of the bacteriophage T5 ltf gene.

    PubMed

    Kaliman, A V; Kulshin, V E; Shlyapnikov, M G; Ksenzenko, V N; Kryukov, V M

    1995-06-01

    The nucleotide sequence of the bacteriophage T5 BglII-BamHI fragment (4,835 bp in length), known to carry a gene encoding the LTF protein which forms the phage L-shaped tail fibers, was determined. It was shown to contain an open reading frame for 1,396 amino acid residues that corresponds to a protein of 147.8 kDa. The coding region of the ltf gene is preceded by a typical Shine-Dalgarno sequence. Downstream from the ltf gene there is a strong transcription terminator. Data bank analysis of the LTF protein sequence reveals 55.1% identity to the hypothetical protein ORF 401 of bacteriophage lambda over a 118-amino-acid overlapping segment. PMID:7789514

  16. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

    PubMed Central

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

    2013-01-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
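
    The abstract above describes using k-mer sequence features as input to a support vector machine. As a rough illustration of the feature-extraction step only, the sketch below turns a DNA sequence into a normalized k-mer frequency vector. The function names are hypothetical, this is not the kmer-SVM server's code, and details such as reverse-complement handling and the SVM training itself are deliberately left out.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq: str, k: int) -> Counter:
    """Count overlapping k-mers in seq, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return Counter(w for w in windows if set(w) <= set("ACGT"))


def kmer_vector(seq: str, k: int) -> list:
    """Return k-mer frequencies in a fixed order (AA..., AC..., ..., TT...)."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values()) or 1  # avoid division by zero for short input
    return [counts["".join(p)] / total for p in product("ACGT", repeat=k)]


# Toy example sequence, chosen only for illustration.
vec = kmer_vector("ACGTAC", 2)
print(len(vec), vec[1])
```

    A vector of this kind, computed for each bound and each background region, is the sort of input a classifier could be trained on; the fixed ordering keeps feature positions comparable across sequences.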

  17. Identification of cis-acting repressive sequences within the negative regulatory element of human immunodeficiency virus type 1.

    PubMed Central

    Lu, Y C; Touzjian, N; Stenzel, M; Dorfman, T; Sodroski, J G; Haseltine, W A

    1990-01-01

    The negative regulatory element of human immunodeficiency virus type 1 is a 260-nucleotide-long sequence that decreases the rate of RNA transcription initiation specified by the long terminal repeat. This region has the potential to bind several cellular transcription factors. Here it is shown that sequences which recognize the NFAT-1 and USF cellular transcription factors contribute to this negative regulatory effect. The sequences within the negative regulatory element which resemble the AP-1 site and the URS do not negatively regulate human immunodeficiency virus long terminal repeat transcription initiation. PMID:2398545

  18. Diverse nucleotide compositions and sequence fluctuation in Rubisco protein genes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Dehipawala, S.; Cheung, E.; Bienaime, R.; Ye, J.; Tremberger, G., Jr.; Schneider, P.; Lieberman, D.; Cheung, T.

    2011-10-01

    The Rubisco protein-enzyme is arguably the most abundant protein on Earth. The biology dogma of transcription and translation necessitates the study of the Rubisco genes and Rubisco-like genes in various species. Stronger correlation of fractal dimension of the atomic number fluctuation along a DNA sequence with Shannon entropy has been observed in the studied Rubisco-like gene sequences, suggesting a more diverse evolutionary pressure and constraints in the Rubisco sequences. The strategy of using metal for structural stabilization appears to be an ancient mechanism, with data from the porphobilinogen deaminase gene in Capsaspora owczarzaki and Monosiga brevicollis. Using the chi-square distance probability, our analysis supports the conjecture that the more ancient Rubisco-like sequence in Microcystis aeruginosa would have experienced very different evolutionary pressure and bio-chemical constraint as compared to Bordetella bronchiseptica, the two microbes occupying either end of the correlation graph. Our exploratory study would indicate that high fractal dimension Rubisco sequence would support high carbon dioxide rate via the Michaelis-Menten coefficient; with implication for the control of the whooping cough pathogen Bordetella bronchiseptica, a microbe containing a high fractal dimension Rubisco-like sequence (2.07). Using the internal comparison of chi-square distance probability for 16S rRNA (~ E-22) versus radiation repair Rec-A gene (~ E-05) in high GC content Deinococcus radiodurans, our analysis supports the conjecture that high GC content microbes containing Rubisco-like sequence are likely to include an extra-terrestrial origin, relative to Deinococcus radiodurans. Similar photosynthesis process that could utilize host star radiation would not compete with radiation resistant process from the biology dogma perspective in environments such as Mars and exoplanets.

  19. Sequence and organization of pXO1, the large Bacillus anthracis plasmid harboring the anthrax toxin genes.

    PubMed

    Okinaka, R T; Cloud, K; Hampton, O; Hoffmaster, A R; Hill, K K; Keim, P; Koehler, T M; Lamke, G; Kumano, S; Mahillon, J; Manter, D; Martinez, Y; Ricke, D; Svensson, R; Jackson, P J

    1999-10-01

    The Bacillus anthracis Sterne plasmid pXO1 was sequenced by random, "shotgun" cloning. A circular sequence of 181,654 bp was generated. One hundred forty-three open reading frames (ORFs) were predicted using GeneMark and GeneMark.hmm, comprising only 61% (110,817 bp) of the pXO1 DNA sequence. The overall guanine-plus-cytosine content of the plasmid is 32.5%. The most recognizable feature of the plasmid is a "pathogenicity island," defined by a 44.8-kb region that is bordered by inverted IS1627 elements at each end. This region contains the three toxin genes (cya, lef, and pagA), regulatory elements controlling the toxin genes, three germination response genes, and 19 additional ORFs. Nearly 70% of the ORFs on pXO1 do not have significant similarity to sequences available in open databases. Absent from the pXO1 sequence are homologs to genes that are typically required to drive theta replication and to maintain stability of large plasmids in Bacillus spp. Among the ORFs with a high degree of similarity to known sequences are a collection of putative transposases, resolvases, and integrases, suggesting an evolution involving lateral movement of DNA among species. Among the remaining ORFs, there are three sequences that may encode enzymes responsible for the synthesis of a polysaccharide capsule usually associated with serotype-specific virulent streptococci. PMID:10515943
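
    Two of the summary statistics in the abstract above are simple ratios: the coding fraction (ORF base pairs over plasmid length) and the guanine-plus-cytosine content. The sketch below recomputes the first from the reported numbers and shows one way the second could be computed from a sequence string; the helper name and the toy sequence are hypothetical, and the real pXO1 sequence is not included here.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


# Figures reported in the abstract.
PLASMID_BP = 181_654  # length of the circular pXO1 sequence
ORF_BP = 110_817      # base pairs covered by predicted ORFs

coding_fraction = ORF_BP / PLASMID_BP
print(f"coding fraction: {coding_fraction:.1%}")

# Toy sequence for illustration only (not pXO1).
print(gc_fraction("GGCCAATT"))
```

    The recomputed coding fraction rounds to 61%, matching the abstract; running gc_fraction on the full plasmid sequence would be the corresponding check for the reported 32.5%.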

  20. Sequences contained within the promoter of the human thymidine kinase gene can direct cell-cycle regulation of heterologous fusion genes.

    PubMed Central

    Kim, Y K; Wells, S; Lau, Y F; Lee, A S

    1988-01-01

    Recent evidence on the transcriptional regulation of the human thymidine kinase (TK) gene raises the possibility that cell-cycle regulatory sequences may be localized within its promoter. A hybrid gene that combines the TK 5' flanking sequence and the coding region of the bacterial neomycin-resistance gene (neo) has been constructed. Upon transfection into a hamster fibroblast cell line K12, the hybrid gene exhibits cell-cycle-dependent expression. Deletion analysis reveals that the region important for cell-cycle regulation is within -441 to -63 nucleotides from the transcriptional initiation site. This region (-441 to -63) also confers cell-cycle regulation to the herpes simplex virus thymidine kinase (HSVtk) promoter, which is not expressed in a cell-cycle manner. We conclude that the -441 to -63 sequence within the human TK promoter is important for cell-cycle-dependent expression. PMID:3413063

  1. Sequence heterogeneity and differential expression of the alpha-Amy2 gene family in wheat.

    PubMed

    Huttly, A K; Martienssen, R A; Baulcombe, D C

    1988-10-01

    The alpha-Amy2 genes of wheat are a multigene family which is expressed in the aleurone cells of germinating grain under control of the plant hormone gibberellin. A subset of the genes are also expressed in developing grain. Comparison of five genomic clones containing alpha-Amy2 genes, using DNA sequence analysis and Southern hybridisation, showed that the extent of similarity between genes differed. Two of the most heterogeneous genes compared were located to the same group 7 chromosome while the most similar genes alpha-Amy2/54 and alpha-Amy2/8 were located to different ones; hence sequence variation could not be correlated to the ancestry of the alpha-Amy2 genes during the separate existence of the constituent genomes of hexaploid wheat. Expression of the cloned genes was measured using an S1 nuclease protection assay and this identified alpha-Amy2/54 and alpha-Amy2/8 as part of the subset of alpha-Amy2 genes expressed in both the developing grain and in aleurone cells. Comparison of the 5' upstream regions of all five genes showed high similarity, with the exception of one gene, up to -280 nucleotides from the transcriptional start, while similarity between alpha-Amy2/54 and alpha-Amy2/8 extended a further 90 bp upstream of this point. It is suggested that regulatory elements responsible for tissue specificity and gibberellin regulation may be located within these regions of similarity. PMID:2467183

  2. Natural Selection on Coding and Noncoding DNA Sequences Is Associated with Virulence Genes in a Plant Pathogenic Fungus

    PubMed Central

    Rech, Gabriel E.; Sanz-Martín, José M.; Anisimova, Maria; Sukno, Serenella A.; Thon, Michael R.

    2014-01-01

    Natural selection leaves imprints on DNA, offering the opportunity to identify functionally important regions of the genome. Identifying the genomic regions affected by natural selection within pathogens can aid in the pursuit of effective strategies to control diseases. In this study, we analyzed genome-wide patterns of selection acting on different classes of sequences in a worldwide sample of eight strains of the model plant-pathogenic fungus Colletotrichum graminicola. We found evidence of selective sweeps, balancing selection, and positive selection affecting both protein-coding and noncoding DNA of pathogenicity-related sequences. Genes encoding putative effector proteins and secondary metabolite biosynthetic enzymes show evidence of positive selection acting on the coding sequence, consistent with an Arms Race model of evolution. The 5′ untranslated regions (UTRs) of genes coding for effector proteins and genes upregulated during infection show an excess of high-frequency polymorphisms likely the consequence of balancing selection and consistent with the Red Queen hypothesis of evolution acting on these putative regulatory sequences. Based on the findings of this work, we propose that even though adaptive substitutions on coding sequences are important for proteins that interact directly with the host, polymorphisms in the regulatory sequences may confer flexibility of gene expression in the virulence processes of this important plant pathogen. PMID:25193312

  3. Molecular cloning, sequencing analysis, and chromosomal localization of the human protease inhibitor 4 (Kallistatin) gene (PI4)

    SciTech Connect

    Chai, K.X.; Chao, J.; Chao, L.; Ward, D.C.

    1994-09-15

    The gene encoding human protease inhibitor 4 (kallistatin; gene symbol PI4), a novel serine proteinase inhibitor (serpin), has been isolated and completely sequenced. The kallistatin gene is 9618 bp in length and contains five exons and four introns. The structure and organization of the kallistatin gene are similar to those of the genes encoding α1-antichymotrypsin. The kallistatin gene is also similar to the genes encoding rat and mouse kallikrein-binding proteins. The first exon of the kallistatin gene is a noncoding 89-bp fragment, as determined by primer extension. The fifth exon, which contains 308 bp of noncoding sequence, encodes the reactive center of kallistatin. In the 5'-flanking region of the kallistatin gene, 1125 bp have been sequenced and a consensus promoter segment with potential transcription regulatory sites, including CAAT and TATA boxes, an AP-2 binding site, a GC-rich region, a cAMP response element, and an AP-1 binding site, has been identified within this region. The kallistatin gene was localized by in situ hybridization to human chromosome 14q31-q32.1, close to the serpin genes encoding α1-antichymotrypsin, protein C inhibitor, α1-antitrypsin, and corticosteroid-binding globulin. In a genomic DNA Southern blot, kallistatin-related genes were identified in monkey, mouse, rat, bovine, dog, cat, and a ground mole. The patterns of hybridization revealed clues of human serpin evolution. 34 refs., 6 figs.

  4. A Collection of Conserved Noncoding Sequences to Study Gene Regulation in Flowering Plants

    PubMed Central

    2016-01-01

    Transcription factors (TFs) regulate gene expression by binding cis-regulatory elements, of which the identification remains an ongoing challenge owing to the prevalence of large numbers of nonfunctional TF binding sites. Powerful comparative genomics methods, such as phylogenetic footprinting, can be used for the detection of conserved noncoding sequences (CNSs), which are functionally constrained and can greatly help in reducing the number of false-positive elements. In this study, we applied a phylogenetic footprinting approach for the identification of CNSs in 10 dicot plants, yielding 1,032,291 CNSs associated with 243,187 genes. To annotate CNSs with TF binding sites, we made use of binding site information for 642 TFs originating from 35 TF families in Arabidopsis (Arabidopsis thaliana). In three species, the identified CNSs were evaluated using TF chromatin immunoprecipitation sequencing data, resulting in significant overlap for the majority of data sets. To identify ultraconserved CNSs, we included genomes of additional plant families and identified 715 binding sites for 501 genes conserved in dicots, monocots, mosses, and green algae. Additionally, we found that genes that are part of conserved mini-regulons have a higher coherence in their expression profile than other divergent gene pairs. All identified CNSs were integrated in the PLAZA 3.0 Dicots comparative genomics platform (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/) together with new functionalities facilitating the exploration of conserved cis-regulatory elements and their associated genes. The availability of this data set in a user-friendly platform enables the exploration of functional noncoding DNA to study gene regulation in a variety of plant species, including crops. PMID:27261064

  5. A Collection of Conserved Noncoding Sequences to Study Gene Regulation in Flowering Plants.

    PubMed

    Van de Velde, Jan; Van Bel, Michiel; Vaneechoutte, Dries; Vandepoele, Klaas

    2016-08-01

    Transcription factors (TFs) regulate gene expression by binding cis-regulatory elements, of which the identification remains an ongoing challenge owing to the prevalence of large numbers of nonfunctional TF binding sites. Powerful comparative genomics methods, such as phylogenetic footprinting, can be used for the detection of conserved noncoding sequences (CNSs), which are functionally constrained and can greatly help in reducing the number of false-positive elements. In this study, we applied a phylogenetic footprinting approach for the identification of CNSs in 10 dicot plants, yielding 1,032,291 CNSs associated with 243,187 genes. To annotate CNSs with TF binding sites, we made use of binding site information for 642 TFs originating from 35 TF families in Arabidopsis (Arabidopsis thaliana). In three species, the identified CNSs were evaluated using TF chromatin immunoprecipitation sequencing data, resulting in significant overlap for the majority of data sets. To identify ultraconserved CNSs, we included genomes of additional plant families and identified 715 binding sites for 501 genes conserved in dicots, monocots, mosses, and green algae. Additionally, we found that genes that are part of conserved mini-regulons have a higher coherence in their expression profile than other divergent gene pairs. All identified CNSs were integrated in the PLAZA 3.0 Dicots comparative genomics platform (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/) together with new functionalities facilitating the exploration of conserved cis-regulatory elements and their associated genes. The availability of this data set in a user-friendly platform enables the exploration of functional noncoding DNA to study gene regulation in a variety of plant species, including crops. PMID:27261064

  6. Heme binds to a short sequence that serves a regulatory function in diverse proteins.

    PubMed Central

    Zhang, L; Guarente, L

    1995-01-01

    Heme is a prosthetic group for numerous enzymes, cytochromes and globins, and it binds tightly, sometimes covalently, to these proteins. Interestingly, heme also potentiates binding of the yeast transcriptional activator HAP1 to DNA and inhibits mitochondrial import of the mammalian delta-aminolevulinate synthase (ALAS) and the catalytic activity of the reticulocyte kinase, HRI. All three of these proteins contain a short sequence, the heme regulatory motif (HRM), that occurs six times adjacent to the HAP1 DNA binding domain, twice in the leader targeting sequence of ALAS and twice near the catalytic domain of the HRI kinase. Here we show that a 10 amino acid peptide containing the HRM consensus binds to heme in the micromolar range, and shifts the heme absorption spectrum to a longer wavelength, a direction opposite to the change caused by cytochromes or globins. Further, we show that a single HRM regulates the acidic activation domains of HAP1 and GAL4 independently of regulation of DNA binding of the transcription factors. These findings thus establish a novel heme binding sequence which is structurally distinct from sequences in globins or cytochromes and which has a regulatory function. PMID:7835342

  7. An integrative ChIP-chip and gene expression profiling to model SMAD regulatory modules

    PubMed Central

    Qin, Huaxia; Chan, Michael WY; Liyanarachchi, Sandya; Balch, Curtis; Potter, Dustin; Souriraj, Irene J; Cheng, Alfred SL; Agosto-Perez, Francisco J; Nikonova, Elena V; Yan, Pearlly S; Lin, Huey-Jen; Nephew, Kenneth P; Saltz, Joel H; Showe, Louise C; Huang, Tim HM; Davuluri, Ramana V

    2009-01-01

    Background The TGF-β/SMAD pathway is part of a broader signaling network in which crosstalk between pathways occurs. While the molecular mechanisms of TGF-β/SMAD signaling pathway have been studied in detail, the global networks downstream of SMAD remain largely unknown. The regulatory effect of SMAD complex likely depends on transcriptional modules, in which the SMAD binding elements and partner transcription factor binding sites (SMAD modules) are present in specific context. Results To address this question and develop a computational model for SMAD modules, we simultaneously performed chromatin immunoprecipitation followed by microarray analysis (ChIP-chip) and mRNA expression profiling to identify TGF-β/SMAD regulated and synchronously coexpressed gene sets in ovarian surface epithelium. Intersecting the ChIP-chip and gene expression data yielded 150 direct targets, of which 141 were grouped into 3 co-expressed gene sets (sustained up-regulated, transient up-regulated and down-regulated), based on their temporal changes in expression after TGF-β activation. We developed a data-mining method driven by the Random Forest algorithm to model SMAD transcriptional modules in the target sequences. The predicted SMAD modules contain SMAD binding element and up to 2 of 7 other transcription factor binding sites (E2F, P53, LEF1, ELK1, COUPTF, PAX4 and DR1). Conclusion Together, the computational results further the understanding of the interactions between SMAD and other transcription factors at specific target promoters, and provide the basis for more targeted experimental verification of the co-regulatory modules. PMID:19615063
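
    The intersection step in this abstract (genes bound in ChIP-chip that also fall into a co-expressed set) can be illustrated with a minimal Python sketch. The function name, the set labels, and the gene identifiers below are hypothetical placeholders, not data or code from the study, and the Random Forest modelling step is not shown.

```python
def direct_targets(chip_bound, expression_sets):
    """Intersect ChIP-bound genes with each co-expressed gene set.

    chip_bound: iterable of gene identifiers with a binding signal.
    expression_sets: dict mapping a set label to an iterable of genes.
    Returns a dict mapping each label to the sorted list of shared genes.
    """
    bound = set(chip_bound)
    return {label: sorted(bound & set(genes))
            for label, genes in expression_sets.items()}


# Hypothetical toy input (placeholder gene names only).
chip = ["geneA", "geneB", "geneC"]
expr = {
    "sustained_up": ["geneA", "geneD"],
    "transient_up": ["geneC"],
    "down": ["geneE"],
}
targets = direct_targets(chip, expr)
```

    Keeping the result keyed by expression pattern mirrors the grouping into sustained up-regulated, transient up-regulated and down-regulated sets described above.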

  8. EXONSAMPLER: a computer program for genome-wide and candidate gene exon sampling for targeted next-generation sequencing.

    PubMed

    Cosart, Ted; Beja-Pereira, Albano; Luikart, Gordon

    2014-11-01

    The computer program EXONSAMPLER automates the sampling of thousands of exon sequences from publicly available reference genome sequences and gene annotation databases. It was designed to provide exon sequences for the efficient, next-generation gene sequencing method called exon capture. The exon sequences can be sampled by a list of gene name abbreviations (e.g. IFNG, TLR1), or by sampling exons from genes spaced evenly across chromosomes. It provides a list of genomic coordinates (a bed file), as well as a set of sequences in fasta format. User-adjustable parameters for collecting exon sequences include a minimum and maximum acceptable exon length, maximum number of exonic base pairs (bp) to sample per gene, and maximum total bp for the entire collection. It allows for partial sampling of very large exons. It can preferentially sample upstream (5 prime) exons, downstream (3 prime) exons, both external exons, or all internal exons. It is written in the Python programming language using its free libraries. We describe the use of EXONSAMPLER to collect exon sequences from the domestic cow (Bos taurus) genome for the design of an exon-capture microarray to sequence exons from related species, including the zebu cow and wild bison. We collected ~10% of the exome (~3 million bp), including 155 candidate genes, and ~16,000 exons evenly spaced genomewide. We prioritized the collection of 5 prime exons to facilitate discovery and genotyping of SNPs near upstream gene regulatory DNA sequences, which control gene expression and are often under natural selection. PMID:24751285
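
    The user-adjustable limits listed in this abstract (minimum and maximum exon length, maximum bp per gene, maximum total bp, partial sampling of very large exons, preference for 5 prime exons) can be sketched in Python as follows. This is an illustrative sketch only, not EXONSAMPLER's code: the `Exon` class, the function names, the default thresholds, and the toy coordinates are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    chrom: str
    start: int  # 0-based start, BED style
    end: int    # exclusive end, BED style
    gene: str
    rank: int   # 1 = most upstream (5 prime) exon; assumed to be precomputed

    @property
    def length(self) -> int:
        return self.end - self.start


def sample_exons(exons, min_len=80, max_len=400,
                 max_bp_per_gene=800, max_total_bp=3000):
    """Pick exons, 5 prime exons first, under hypothetical bp budgets."""
    picked, per_gene, total = [], {}, 0
    for exon in sorted(exons, key=lambda e: (e.gene, e.rank)):
        if exon.length < min_len:
            continue  # too short to be useful
        # Partial sampling: keep only the first max_len bases of a large exon
        # (simplification: by coordinate, ignoring strand).
        end = min(exon.end, exon.start + max_len)
        size = end - exon.start
        if per_gene.get(exon.gene, 0) + size > max_bp_per_gene:
            continue  # would exceed the per-gene budget
        if total + size > max_total_bp:
            continue  # would exceed the total budget
        picked.append(Exon(exon.chrom, exon.start, end, exon.gene, exon.rank))
        per_gene[exon.gene] = per_gene.get(exon.gene, 0) + size
        total += size
    return picked


def to_bed(exons):
    """Format sampled exons as tab-separated BED lines."""
    return [f"{e.chrom}\t{e.start}\t{e.end}\t{e.gene}_exon{e.rank}"
            for e in exons]


# Toy coordinates, invented for illustration.
exons = [
    Exon("chr1", 100, 300, "IFNG", 1),
    Exon("chr1", 500, 550, "IFNG", 2),    # shorter than min_len, skipped
    Exon("chr1", 1000, 2000, "IFNG", 3),  # truncated to 400 bp
    Exon("chr4", 50, 450, "TLR1", 1),
]
picked = sample_exons(exons)
bed_lines = to_bed(picked)
```

    Sorting by rank within each gene is one simple way to express a preference for upstream exons; a real tool would also need strand-aware coordinates from the gene annotation.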

  9. A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks

    PubMed Central

    Santra, Tapesh

    2014-01-01

    Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances. PMID:25152886

  10. Vitamin C deficiency improves somatic embryo development through distinct gene regulatory networks in Arabidopsis.

    PubMed

    Becker, Michael G; Chan, Ainsley; Mao, Xingyu; Girard, Ian J; Lee, Samantha; Elhiti, Mohamed; Stasolla, Claudio; Belmonte, Mark F

    2014-11-01

    Changes in the endogenous ascorbate redox status through genetic manipulation of cellular ascorbate levels were shown to accelerate cell proliferation during the induction phase and improve maturation of somatic embryos in Arabidopsis. Mutants defective in ascorbate biosynthesis such as vtc2-5 contained ~70 % less cellular ascorbate compared with their wild-type (WT; Columbia-0) counterparts. Depletion of cellular ascorbate accelerated cell division processes and cellular reorganization and improved the number and quality of mature somatic embryos grown in culture by 6-fold compared with WT tissues. To gain insight into the molecular mechanisms underlying somatic embryogenesis (SE), we profiled dynamic changes in the transcriptome and analysed dominant patterns of gene activity in the WT and vtc2-5 lines across the somatic embryo culturing process. Our results provide insight into the gene regulatory networks controlling SE in Arabidopsis based on the association of transcription factors with DNA sequence motifs enriched in biological processes of large co-expressed gene sets. These data provide the first detailed account of temporal changes in the somatic embryo transcriptome starting with the zygotic embryo, through tissue dedifferentiation, and ending with the mature somatic embryo, and impart insight into possible mechanisms for the improved culture of somatic embryos in the vtc2-5 mutant line. PMID:25151615

  11. Vitamin C deficiency improves somatic embryo development through distinct gene regulatory networks in Arabidopsis

    PubMed Central

    Becker, Michael G.; Chan, Ainsley; Mao, Xingyu; Girard, Ian J.; Lee, Samantha; Elhiti, Mohamed; Stasolla, Claudio; Belmonte, Mark F.

    2014-01-01

    Changes in the endogenous ascorbate redox status through genetic manipulation of cellular ascorbate levels were shown to accelerate cell proliferation during the induction phase and improve maturation of somatic embryos in Arabidopsis. Mutants defective in ascorbate biosynthesis such as vtc2-5 contained ~70 % less cellular ascorbate compared with their wild-type (WT; Columbia-0) counterparts. Depletion of cellular ascorbate accelerated cell division processes and cellular reorganization and improved the number and quality of mature somatic embryos grown in culture by 6-fold compared with WT tissues. To gain insight into the molecular mechanisms underlying somatic embryogenesis (SE), we profiled dynamic changes in the transcriptome and analysed dominant patterns of gene activity in the WT and vtc2-5 lines across the somatic embryo culturing process. Our results provide insight into the gene regulatory networks controlling SE in Arabidopsis based on the association of transcription factors with DNA sequence motifs enriched in biological processes of large co-expressed gene sets. These data provide the first detailed account of temporal changes in the somatic embryo transcriptome starting with the zygotic embryo, through tissue dedifferentiation, and ending with the mature somatic embryo, and impart insight into possible mechanisms for the improved culture of somatic embryos in the vtc2-5 mutant line. PMID:25151615

  12. Targeting of AID-mediated sequence diversification to immunoglobulin genes.

    PubMed

    Kothapalli, Naga Rama; Fugmann, Sebastian D

    2011-04-01

    Activation-induced cytidine deaminase (AID) is a key enzyme for antibody-mediated immune responses. Antibodies are encoded by the immunoglobulin genes and AID acts as a transcription-dependent DNA mutator on these genes to improve antibody affinity and effector functions. An emerging theme in the field is that many transcribed genes are potential targets of AID, presenting an obvious danger to genomic integrity. Thus there are mechanisms in place to ensure that mutagenic outcomes of AID activity are specifically restricted to the immunoglobulin loci. Cis-regulatory targeting elements mediate this effect and their mode of action is probably a combination of immunoglobulin gene specific activation of AID and a perversion of faithful DNA repair towards error-prone outcomes. PMID:21295456

  13. Targeting of AID-mediated sequence diversification to immunoglobulin genes

    PubMed Central

    Kothapalli, Naga Rama; Fugmann, Sebastian D.

    2011-01-01

    Activation-induced cytidine deaminase (AID) is a key enzyme for antibody-mediated immune responses. Antibodies are encoded by the immunoglobulin genes and AID acts as a transcription-dependent DNA mutator on these genes to improve antibody affinity and effector functions. An emerging theme in the field is that many transcribed genes are potential targets of AID, presenting an obvious danger to genomic integrity. Thus there are mechanisms in place to ensure that mutagenic outcomes of AID activity are specifically restricted to the immunoglobulin loci. Cis-regulatory targeting elements mediate this effect and their mode of action is likely a combination of immunoglobulin gene specific activation of AID and a perversion of faithful DNA repair towards error-prone outcomes. PMID:21295456

  14. Illumina MiSeq sequencing disfavours a sequence motif in the GFP reporter gene

    PubMed Central

    Van den Hoecke, Silvie; Verhelst, Judith; Saelens, Xavier

    2016-01-01

    Green fluorescent protein (GFP) is one of the most used reporter genes. We have used next-generation sequencing (NGS) to analyse the genetic diversity of a recombinant influenza A virus that expresses GFP and found a remarkable coverage dip in the GFP coding sequence. This coverage dip was present when virus-derived RT-PCR product or the parental plasmid DNA was used as starting material for NGS and regardless of whether Nextera XT transposase or Covaris shearing was used for DNA fragmentation. Therefore, the sequence coverage dip in the GFP coding sequence was not the result of emerging GFP mutant viruses or a bias introduced by Nextera XT fragmentation. Instead, we found that the Illumina MiSeq sequencing method disfavours the ‘CCCGCC’ motif in the GFP coding sequence. PMID:27193250
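
    To check a construct for the motif named in this abstract, one can scan both strands for 'CCCGCC'. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical and the toy sequence is invented, not taken from GFP or from the study.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_motif(seq, motif="CCCGCC"):
    """Return sorted (position, strand) hits, 0-based on the forward strand.

    A '-' hit marks where the reverse complement of the motif occurs in the
    forward sequence, i.e. the motif lies on the opposite strand.
    """
    seq = seq.upper()
    hits = []
    for pattern, strand in ((motif, "+"), (reverse_complement(motif), "-")):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Invented toy sequence for illustration only.
toy = "ATGCCCGCCAAGGCGGGTT"
hits = find_motif(toy)
```

    Positions found this way could then be compared against a per-base coverage track to see whether dips coincide with the motif.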

  15. Illumina MiSeq sequencing disfavours a sequence motif in the GFP reporter gene.

    PubMed

    Van den Hoecke, Silvie; Verhelst, Judith; Saelens, Xavier

    2016-01-01

    Green fluorescent protein (GFP) is one of the most used reporter genes. We have used next-generation sequencing (NGS) to analyse the genetic diversity of a recombinant influenza A virus that expresses GFP and found a remarkable coverage dip in the GFP coding sequence. This coverage dip was present when virus-derived RT-PCR product or the parental plasmid DNA was used as starting material for NGS and regardless of whether Nextera XT transposase or Covaris shearing was used for DNA fragmentation. Therefore, the sequence coverage dip in the GFP coding sequence was not the result of emerging GFP mutant viruses or a bias introduced by Nextera XT fragmentation. Instead, we found that the Illumina MiSeq sequencing method disfavours the 'CCCGCC' motif in the GFP coding sequence. PMID:27193250

  16. Exploiting single-molecule transcript sequencing for eukaryotic gene prediction.

    PubMed

    Minoche, André E; Dohm, Juliane C; Schneider, Jessica; Holtgräwe, Daniela; Viehöver, Prisca; Montfort, Magda; Sörensen, Thomas Rosleff; Weisshaar, Bernd; Himmelbauer, Heinz

    2015-01-01

    We develop a method to predict and validate gene models using PacBio single-molecule, real-time (SMRT) cDNA reads. Ninety-eight percent of full-insert SMRT reads span complete open reading frames. Gene model validation using SMRT reads is developed as an automated process. Optimized training and prediction settings and mRNA-seq noise reduction of assisting Illumina reads result in increased gene prediction sensitivity and precision. Additionally, we present an improved gene set for sugar beet (Beta vulgaris) and the first genome-wide gene set for spinach (Spinacia oleracea). The workflow and guidelines are a valuable resource to obtain comprehensive gene sets for newly sequenced genomes of non-model eukaryotes. PMID:26328666

  17. Cell type-selective disease-association of genes under high regulatory load.

    PubMed

    Galhardo, Mafalda; Berninger, Philipp; Nguyen, Thanh-Phuong; Sauter, Thomas; Sinkkonen, Lasse

    2015-10-15

    We previously showed that disease-linked metabolic genes are often under combinatorial regulation. Using the genome-wide ChIP-Seq binding profiles for 93 transcription factors in nine different cell lines, we show that genes under high regulatory load are significantly enriched for disease-association across cell types. We find that transcription factor load correlates with the enhancer load of the genes and thereby allows the identification of genes under high regulatory load by epigenomic mapping of active enhancers. Identification of the high enhancer load genes across 139 samples from 96 different cell and tissue types reveals a consistent enrichment for disease-associated genes in a cell type-selective manner. The underlying genes are not limited to super-enhancer genes and show several types of disease-association evidence beyond genetic variation (such as biomarkers). Interestingly, the high regulatory load genes are involved in more KEGG pathways than expected by chance, exhibit increased betweenness centrality in the interaction network of liver disease genes, and carry longer 3' UTRs with more microRNA (miRNA) binding sites than genes on average, suggesting a role as hubs integrating signals within regulatory networks. In summary, epigenetic mapping of active enhancers presents a promising and unbiased approach for identification of novel disease genes in a cell type-selective manner. PMID:26338775
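
    One simple reading of "regulatory load" is the number of distinct transcription factors with a binding event assigned to each gene. The Python sketch below illustrates only that counting step; how peaks are assigned to genes is not described in the abstract, so the input pairs are assumed to be precomputed, and the TF and gene names are placeholders.

```python
from collections import defaultdict


def regulatory_load(bindings):
    """Count distinct TFs per gene from (tf, gene) pairs."""
    tfs_per_gene = defaultdict(set)
    for tf, gene in bindings:
        tfs_per_gene[gene].add(tf)  # a set ignores repeated peaks of one TF
    return {gene: len(tfs) for gene, tfs in tfs_per_gene.items()}


def top_loaded(load, n=1):
    """Return the n genes with the highest load (ties broken by name)."""
    return sorted(load, key=lambda g: (-load[g], g))[:n]


# Placeholder pairs; not real ChIP-Seq results.
pairs = [
    ("TF1", "geneA"), ("TF2", "geneA"), ("TF2", "geneA"),
    ("TF3", "geneA"), ("TF1", "geneB"),
]
load = regulatory_load(pairs)
```

    Using a set per gene means several peaks from the same factor count once, which is one possible convention among several.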

  18. Cell type-selective disease-association of genes under high regulatory load

    PubMed Central

    Galhardo, Mafalda; Berninger, Philipp; Nguyen, Thanh-Phuong; Sauter, Thomas; Sinkkonen, Lasse

    2015-01-01

    We previously showed that disease-linked metabolic genes are often under combinatorial regulation. Using the genome-wide ChIP-Seq binding profiles for 93 transcription factors in nine different cell lines, we show that genes under high regulatory load are significantly enriched for disease-association across cell types. We find that transcription factor load correlates with the enhancer load of the genes and thereby allows the identification of genes under high regulatory load by epigenomic mapping of active enhancers. Identification of the high enhancer load genes across 139 samples from 96 different cell and tissue types reveals a consistent enrichment for disease-associated genes in a cell type-selective manner. The underlying genes are not limited to super-enhancer genes and show several types of disease-association evidence beyond genetic variation (such as biomarkers). Interestingly, the high regulatory load genes are involved in more KEGG pathways than expected by chance, exhibit increased betweenness centrality in the interaction network of liver disease genes, and carry longer 3′ UTRs with more microRNA (miRNA) binding sites than genes on average, suggesting a role as hubs integrating signals within regulatory networks. In summary, epigenetic mapping of active enhancers presents a promising and unbiased approach for identification of novel disease genes in a cell type-selective manner. PMID:26338775

  19. Transcriptome Analysis of an Insecticide Resistant Housefly Strain: Insights about SNPs and Regulatory Elements in Cytochrome P450 Genes

    PubMed Central

    Asp, Torben; Kristensen, Michael

    2016-01-01

    Background Insecticide resistance in the housefly, Musca domestica, has been investigated for more than 60 years. It will enter a new era after the recent publication of the housefly genome and the development of multiple next generation sequencing technologies. The genetic background of the xenobiotic response can now be investigated in greater detail. Here, we investigate the 454-pyrosequencing transcriptome of the spinosad-resistant 791spin strain in relation to the housefly genome with focus on P450 genes. Results The de novo assembly of clean reads gave 35,834 contigs consisting of 21,780 sequences of the spinosad resistant strain. The 3,648 sequences were annotated with an enzyme code EC number and were mapped to 124 KEGG pathways with metabolic processes as most highly represented pathway. One hundred and twenty contigs were annotated as P450s covering 44 different P450 genes of housefly. Eight differentially expressed P450 genes were identified and investigated for SNPs, CpG islands and common regulatory motifs in promoter and coding regions. Functional annotation clustering of metabolic related genes and motif analysis of P450s revealed their association with epigenetic, transcription and gene expression related functions. The sequence variation analysis resulted in 12 SNPs, and eight of them were found in cyp6d1. There is variation in location, size and frequency of CpG islands and specific motifs were also identified in these P450s. Moreover, identified motifs were associated to GO terms and transcription factors using bioinformatic tools. Conclusion Transcriptome data of a spinosad resistant strain provide together with genome data fundamental support for future research to understand evolution of resistance in houseflies. Here, we report for the first time the SNPs, CpG islands and common regulatory motifs in differentially expressed P450s. Taken together, our findings will serve as a stepping stone to advance understanding of the mechanism and role of P450s

  20. Expression of an early gene in the flagellar regulatory hierarchy is sensitive to an interruption in DNA replication.

    PubMed Central

    Dingwall, A; Zhuang, W Y; Quon, K; Shapiro, L

    1992-01-01

    Genes involved in the biogenesis of the flagellum in Caulobacter crescentus are expressed in a temporal order and are controlled by a trans-acting regulatory hierarchy. Strains with mutations in one of these genes, flaS, cannot transcribe flagellar structural genes and divide abnormally. This gene was cloned, and it was found that its transcription is initiated early in the cell cycle. Subclones that restored motility to FlaS mutants also restored normal cell division. Although transcription of flaS was not dependent on any other known gene in the flagellar hierarchy, it was autoregulated and subject to mild negative control by other genes at the same level of the hierarchy. An additional level of control was revealed when it was found that an interruption of DNA replication caused the inhibition of flaS transcription. The flaS transcript initiation site was identified, and an apparently unique promoter sequence was found to be highly conserved among the genes at the same level of the hierarchy. The flagellar genes with this conserved 5' region all initiate transcription early in the cell cycle and are all sensitive to a disruption in DNA replication. Mutations in these genes also cause an aberrant cell division phenotype. Therefore, flagellar genes at or near the top of the hierarchy may be controlled, in part, by a unique transcription factor and may be responsive to the same DNA replication cues that mediate other cell cycle events, such as cell division. PMID:1372311

  1. Nucleotide sequence corresponding to five chemotaxis genes in Escherichia coli.

    PubMed Central

    Mutoh, N; Simon, M I

    1986-01-01

    The nucleotide sequence of DNA which contains five chemotaxis-related genes of Escherichia coli, cheW, cheR, cheB, cheY, and cheZ, and part of the cheA gene was determined. Molecular weights of the polypeptides encoded by these genes were calculated from translated amino acid sequences, and they were 18,100 for cheW, 32,700 for cheR, 37,500 for cheB, 14,100 for cheY, and 24,000 for cheZ. Nucleotide sequences which could act as ribosome-binding sites were found in the upstream region of each gene. After the termination codon of the cheW gene, a typical rho-independent transcription termination signal was observed. There are no other open reading frames long enough to encode polypeptides in this region except those which code for the two previously reported genes tar and tap. PMID:3510184
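
    The calculation of polypeptide molecular weights from translated amino acid sequences can be sketched in Python. This is a generic illustration, not the authors' procedure: it sums standard average residue masses (in daltons) and adds one water molecule for the free termini, and the function name is hypothetical.

```python
# Average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the N- and C-terminal H and OH


def molecular_weight(protein):
    """Approximate average molecular weight of an unmodified polypeptide."""
    protein = protein.upper().rstrip("*")  # drop a trailing stop symbol
    if not protein:
        raise ValueError("empty sequence")
    unknown = set(protein) - RESIDUE_MASS.keys()
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER


# Sanity checks against known small molecules: glycine and Gly-Ala.
mw_gly = molecular_weight("G")
mw_gly_ala = molecular_weight("GA")
```

    Values computed this way ignore post-translational modifications and removal of the initiator methionine, so they are estimates of the kind reported for the che gene products rather than measured masses.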

  2. Role of basic leucine zipper proteins in transcriptional regulation of the steroidogenic acute regulatory protein gene

    PubMed Central

    Manna, Pulak R.; Dyson, Matthew T.; Stocco, Douglas M.

    2016-01-01

    The regulation of steroidogenic acute regulatory protein (StAR) gene transcription by cAMP-dependent mechanisms occurs in the absence of a consensus cAMP response element (CRE, TGACGTGA). This regulation is coordinated by multiple transcription factors that bind to sequence-specific elements located approximately 150 bp upstream of the transcription start site. Among the proteins that bind within this region, the basic leucine zipper (bZIP) family of transcription factors, i.e. CRE binding protein (CREB)/CRE modulator (CREM)/activating transcription factor (ATF), activator protein 1 (AP-1; Fos/Jun), and CCAAT enhancer binding protein β (C/EBPβ), interact with an overlapping region (−81/−72 bp) in the StAR promoter, mediate stimulus-transcription coupling of cAMP signaling and play integral roles in regulating StAR gene expression. These bZIP proteins are structurally similar and bind to DNA sequences as dimers; however, they exhibit discrete transcriptional activities, interact with several transcription factors, and have other properties that contribute to their regulatory functions. The 5′-flanking −81/−72 bp region of the StAR gene appears to function as a key element within a complex cAMP response unit by binding to different bZIP members, and the StAR promoter displays variable states of cAMP responsivity contingent upon the occupancy of these cis-elements with these transcription factors. The expression and activities of CREB/CREM/ATF, Fos/Jun and C/EBPβ have been demonstrated to be mediated by a plethora of extracellular signals, and the phosphorylation of these proteins at several Ser and Thr residues allows recruitment of the transcriptional coactivator CREB binding protein (CBP) or its functional homolog p300 to the StAR promoter. This review will focus on the current level of understanding of the roles of selective bZIP family proteins within the complex series of processes involved in regulating StAR gene transcription. PMID:19150388

  3. Six homeoproteins directly activate Myod expression in the gene regulatory networks that control early myogenesis.

    PubMed

    Relaix, Frédéric; Demignon, Josiane; Laclef, Christine; Pujol, Julien; Santolini, Marc; Niro, Claire; Lagha, Mounia; Rocancourt, Didier; Buckingham, Margaret; Maire, Pascal

    2013-04-01

    In mammals, several genetic pathways have been characterized that govern engagement of multipotent embryonic progenitors into the myogenic program through the control of the key myogenic regulatory gene Myod. Here we demonstrate the involvement of Six homeoproteins. We first targeted into a Pax3 allele a sequence encoding a negative form of Six4 that binds DNA but cannot interact with essential Eya co-factors. The resulting embryos present hypoplastic skeletal muscles and impaired Myod activation in the trunk in the absence of Myf5/Mrf4. At the axial level, we further show that Myod is still expressed in compound Six1/Six4:Pax3 but not in Six1/Six4:Myf5 triple mutant embryos, demonstrating that Six1/4 participates in the Pax3-Myod genetic pathway. Myod expression and head myogenesis are preserved in Six1/Six4:Myf5 triple mutant embryos, illustrating that upstream regulators of Myod in different embryonic territories are distinct. We show that Myod regulatory regions are directly controlled by Six proteins and that, in the absence of Six1 and Six4, Six2 can compensate. PMID:23637613

  4. Molecular cloning and characterization of a chlorophyll degradation regulatory gene (ZjSGR) from Zoysia japonica.

    PubMed

    Teng, K; Chang, Z H; Xiao, G Z; Guo, W E; Xu, L X; Chao, Y H; Han, L B

    2016-01-01

    The stay-green gene (SGR) is a key regulatory factor for chlorophyll degradation and senescence. However, to date, little is known about SGR in Zoysia japonica. In this study, ZjSGR was cloned, using rapid amplification of cDNA ends-polymerase chain reaction (PCR). The target sequence is 831 bp in length, corresponding to 276 amino acids. Protein BLAST results showed that ZjSGR belongs to the stay-green superfamily. A phylogenetic analysis implied that ZjSGR is most closely related to ZmSGR1. The subcellular localization of ZjSGR was investigated, using an Agrobacterium-mediated transient expression assay in Nicotiana benthamiana. Our results demonstrated that ZjSGR protein is localized in the chloroplasts. Quantitative real time PCR was carried out to investigate the expression characteristics of ZjSGR. The expression level of ZjSGR was found to be highest in leaves, and could be strongly induced by natural senescence, darkness, abscisic acid (ABA), and methyl jasmonate treatment. Moreover, an in vivo function analysis indicated that transient overexpression of ZjSGR could accelerate chlorophyll degradation, up-regulate the expression of SAG113, and activate ABA biosynthesis. Taken together, these results provide evidence that ZjSGR could play an important regulatory role in leaf chlorophyll degradation and senescence in plants at the molecular level. PMID:27173268
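
    The 831 bp and 276 amino acid figures above are consistent with the triplet code if the 831 bp sequence is an open reading frame that includes its stop codon. That is an assumption here; the abstract does not say so. A minimal sketch of the arithmetic, using a hypothetical helper name:

```python
def residues_from_orf(orf_length_bp: int, includes_stop: bool = True) -> int:
    """Return the number of encoded amino acids for an ORF of the given length.

    Hypothetical helper: assumes the ORF is read in frame from the start codon.
    """
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_length_bp // 3
    # The stop codon occupies one triplet but encodes no residue.
    return codons - 1 if includes_stop else codons


print(residues_from_orf(831))
```

    Under that assumption, 831 bp gives 277 codons, one of which is the stop codon.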

  5. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species for which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and the 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in intron 1. In the 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was identically observed only in ovine, with one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed a large sequence identity. In particular, the octapeptide repeat region was observed in all the species except frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences. PMID:18397498

  6. Adeno-associated virus type 2 rep gene-mediated inhibition of basal gene expression of human immunodeficiency virus type 1 involves its negative regulatory functions.

    PubMed Central

    Oelze, I; Rittner, K; Sczakiel, G

    1994-01-01

    Adeno-associated virus type 2 (AAV-2), a human parvovirus which is apathogenic in adults, inhibits replication and gene expression of human immunodeficiency virus type 1 (HIV-1) in human cells. The rep gene of AAV-2, which was shown earlier to be sufficient for this negative interference, also down-regulated the expression of heterologous sequences driven by the long terminal repeat (LTR) of HIV-1. This effect was observed in the absence of the HIV-1 transactivator Tat, i.e., at basal levels of LTR-driven transcription. In this work, we studied the involvement of functional subsequences of the HIV-1 LTR in rep-mediated inhibition in the absence of Tat. Mutated LTRs driving an indicator gene (cat) were cointroduced into human SW480 cells together with rep alone or with double-stranded DNA fragments or RNA containing sequences of the HIV-1 LTR. The results indicate that rep strongly enhances the function of negative regulatory elements of the LTR. In addition, the experiments revealed a transcribed sequence element located within the TAR-coding sequence termed AHHH (AAV-HIV homology element derived from HIV-1) which is involved in rep-mediated inhibition. The AHHH element is also involved in down-regulation of basal expression levels in the absence of rep, suggesting that AHHH also contributes to negative regulatory functions of the LTR of HIV-1. In contrast, positive regulatory elements of the HIV-1 LTR such as the NF kappa B and SP1 binding sites have no significant influence on the rep-mediated inhibition. PMID:8289357

  7. Gene regulatory network inference using fused LASSO on multiple data sets

    PubMed Central

    Omranian, Nooshin; Eloundou-Mbebi, Jeanne M. O.; Mueller-Roeber, Bernd; Nikoloski, Zoran

    2016-01-01

    Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed into a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from prokaryotic species, Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions. PMID:26864687
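
    For orientation, a generic fused LASSO objective over K data sets can be written as below. This is a textbook-style sketch, not the authors' exact formulation: the symbols are assumptions introduced here, with y^(k) the expression of one target gene in data set k, X^(k) the expression of candidate transcription factor genes, and beta^(k) the regulatory coefficients.

```latex
\min_{\beta^{(1)},\dots,\beta^{(K)}}
\sum_{k=1}^{K} \bigl\| y^{(k)} - X^{(k)} \beta^{(k)} \bigr\|_2^2
+ \lambda_1 \sum_{k=1}^{K} \bigl\| \beta^{(k)} \bigr\|_1
+ \lambda_2 \sum_{k<k'} \bigl\| \beta^{(k)} - \beta^{(k')} \bigr\|_1
```

    In this sketch the lambda_1 term favors few regulators per gene, in the spirit of constraint (1), and the lambda_2 "fusion" term favors similar networks across data sets, in the spirit of constraint (2). Constraint (3) is not represented.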

  8. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications.

    PubMed

    Herzog, M; Maroteaux, L

    1986-11-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage. PMID:16578795

  9. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications

    PubMed Central

    Herzog, Michel; Maroteaux, Luc

    1986-01-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage. PMID:16578795

  10. DNA sequence of the Serratia marcescens lipoprotein gene

    PubMed Central

    Nakamura, Kenzo; Inouye, Masayori

    1980-01-01

    The Serratia marcescens gene for the outer membrane lipoprotein (lpp) was cloned in λ phage vector Charon 14. The recombinant phage was very unstable, and the lpp gene with a 300-base-pair deletion at the transcription termination site was further cloned in pBR322. The DNA sequence of 834 base pairs encompassing the lpp gene was determined and compared with that of the Escherichia coli lpp gene. The sequence comparisons exhibit several unique features. (i) The promoter region is highly conserved (84% homology) and has an extremely high A+T content (78%) as in E. coli (80%). (ii) The 5′ nontranslated region of the lipoprotein mRNA is also highly conserved (95% homology). (iii) In the DNA sequence corresponding to the signal peptide of this secretory protein, there are three drastic changes, including addition of one base pair and deletion of four base pairs in S. marcescens as compared to E. coli. The resultant alterations in the amino acid sequence, however, do not change the basic properties of the signal peptide, which are assumed to be essential for its function in the secretory mechanism. (iv) The DNA sequence from the amino terminus to the 51st residue of the mature lipoprotein is highly conserved (95% homology) and there is no amino acid substitution. (v) The DNA sequence corresponding to the seven amino acid residues at the carboxyl terminus has only 42% homology, resulting in four amino acid substitutions. (vi) Within the section of 40 base pairs beginning with the termination codon (UAA) and ending immediately before the oligo(T) transcription termination site in the E. coli lpp gene, there is about 60% homology. However, after this section, there is no obvious homology between the two sequences, probably because of a deletion of 300 base pairs at this region. (vii) Seven stable stem-and-loop structures could be formed in the mRNA region. (viii) Alterations in the third position of codons used in the lpp gene suggest that the gene has evolved somewhat

  11. Regulation of SHOOT MERISTEMLESS genes via an upstream-conserved noncoding sequence coordinates leaf development

    PubMed Central

    Uchida, Naoyuki; Townsley, Brad; Chung, Kook-Hyun; Sinha, Neelima

    2007-01-01

    The indeterminate shoot apical meristem of plants is characterized by the expression of the Class 1 KNOTTED1-LIKE HOMEOBOX (KNOX1) genes. KNOX1 genes have been implicated in the acquisition and/or maintenance of meristematic fate. One of the earliest indicators of a switch in fate from indeterminate meristem to determinate leaf primordium is the down-regulation of KNOX1 genes orthologous to SHOOT MERISTEMLESS (STM) in Arabidopsis (hereafter called STM genes) in the initiating primordia. In simple leafed plants, this down-regulation persists during leaf formation. In compound leafed plants, however, KNOX1 gene expression is reestablished later in the developing primordia, creating an indeterminate environment for leaflet formation. Despite this knowledge, most aspects of how STM gene expression is regulated remain largely unknown. Here, we identify two evolutionarily conserved noncoding sequences within the 5′ upstream region of STM genes in both simple and compound leafed species across monocots and dicots. We show that one of these elements is involved in the regulation of the persistent repression and/or the reestablishment of STM expression in the developing leaves but is not involved in the initial down-regulation in the initiating primordia. We also show evidence that this regulation is developmentally significant for leaf formation in the pathway involving ASYMMETRIC LEAVES1/2 (AS1/2) gene expression; these genes are known to function in leaf development. Together, these findings reveal a regulatory point of leaf development mediated through a conserved, noncoding sequence in STM genes. PMID:17898165

  12. Using evolutionary computations to understand the design and evolution of gene and cell regulatory networks

    PubMed Central

    Spirov, Alexander; Holloway, David

    2013-01-01

    This paper surveys modeling approaches for studying the evolution of gene regulatory networks (GRNs). Modeling of the design or ‘wiring’ of GRNs has become increasingly common in developmental and medical biology, as a means of quantifying gene-gene interactions, the response to perturbations, and the overall dynamic motifs of networks. Drawing from developments in GRN ‘design’ modeling, a number of groups are now using simulations to study how GRNs evolve, both for comparative genomics and to uncover general principles of evolutionary processes. Such work can generally be termed evolution in silico. Complementary to these biologically-focused approaches, a now well-established field of computer science is Evolutionary Computations (EC), in which highly efficient optimization techniques are inspired from evolutionary principles. In surveying biological simulation approaches, we discuss the considerations that must be taken with respect to: a) the precision and completeness of the data (e.g. are the simulations for very close matches to anatomical data, or are they for more general exploration of evolutionary principles); b) the level of detail to model (we proceed from ‘coarse-grained’ evolution of simple gene-gene interactions to ‘fine-grained’ evolution at the DNA sequence level); c) to what degree is it important to include the genome’s cellular context; and d) the efficiency of computation. With respect to the latter, we argue that developments in computer science EC offer the means to perform more complete simulation searches, and will lead to more comprehensive biological predictions. PMID:23726941

  13. The Association between Infants' Self-Regulatory Behavior and MAOA Gene Polymorphism

    ERIC Educational Resources Information Center

    Zhang, Minghao; Chen, Xinyin; Way, Niobe; Yoshikawa, Hirokazu; Deng, Huihua; Ke, Xiaoyan; Yu, Weiwei; Chen, Ping; He, Chuan; Chi, Xia; Lu, Zuhong

    2011-01-01

    Self-regulatory behavior in early childhood is an important characteristic that has considerable implications for the development of adaptive and maladaptive functioning. The present study investigated the relations between a functional polymorphism in the upstream region of monoamine oxidase A gene (MAOA) and self-regulatory behavior in a sample…

  14. A gene-specific non-enhancer sequence is critical for expression from the promoter of the small heat shock protein gene αB-crystallin

    PubMed Central

    2014-01-01

    Background: Deciphering of the information content of eukaryotic promoters has remained confined to universal landmarks and conserved sequence elements such as enhancers and transcription factor binding motifs, which are considered sufficient for gene activation and regulation. Gene-specific sequences, interspersed between the canonical transacting factor binding sites or adjoining them within a promoter, are generally taken to be devoid of any regulatory information and have therefore been largely ignored. An unanswered question therefore is, do gene-specific sequences within a eukaryotic promoter have a role in gene activation? Here, we present an exhaustive experimental analysis of a gene-specific sequence adjoining the heat shock element (HSE) in the proximal promoter of the small heat shock protein gene, αB-crystallin (cryab). These sequences are highly conserved between rodents and humans. Results: Using human retinal pigment epithelial cells in culture as the host, we have identified a 10-bp gene-specific promoter sequence (GPS), which, unlike an enhancer, controls expression from the promoter of this gene only when in appropriate position and orientation. Notably, the data suggest that GPS in comparison with the HSE works in a context-independent fashion. Additionally, when moved upstream, about a nucleosome length of DNA (−154 bp) from the transcription start site (TSS), the activity of the promoter is markedly inhibited, suggesting its involvement in local promoter access. Importantly, we demonstrate that deletion of the GPS results in complete loss of cryab promoter activity in transgenic mice. Conclusions: These data suggest that gene-specific sequences such as the GPS, identified here, may have critical roles in regulating gene-specific activity from eukaryotic promoters. PMID:24589182

  15. Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

    PubMed Central

    Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael

    2013-01-01

    Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343

  16. A saturation screen for cis-acting regulatory DNA in the Hox genes of Ciona intestinalis

    SciTech Connect

    Keys, David N.; Lee, Byung-in; Di Gregorio, Anna; Harafuji, Naoe; Detter, Chris; Wang, Mei; Kahsai, Orsalem; Ahn, Sylvia; Arellano, Andre; Zhang, Quin; Trong, Stephan; Doyle, Sharon A.; Satoh, Noriyuki; Satou, Yutaka; Saiga, Hidetoshi; Christian, Allen; Rokhsar, Dan; Hawkins, Trevor L.; Levine, Mike; Richardson, Paul

    2005-01-05

    A screen for the systematic identification of cis-regulatory elements within large (>100 kb) genomic domains containing Hox genes was performed by using the basal chordate Ciona intestinalis. Randomly generated DNA fragments from bacterial artificial chromosomes containing two clusters of Hox genes were inserted into a vector upstream of a minimal promoter and lacZ reporter gene. A total of 222 resultant fusion genes were separately electroporated into fertilized eggs, and their regulatory activities were monitored in larvae. In sum, 21 separable cis-regulatory elements were found. These include eight Hox linked domains that drive expression in nested anterior-posterior domains of ectodermally derived tissues. In addition to vertebrate-like CNS regulation, the discovery of cis-regulatory domains that drive epidermal transcription suggests that C. intestinalis has arthropod-like Hox patterning in the epidermis.

  17. Discrete dynamical system modelling for gene regulatory networks of 5-hydroxymethylfurfural tolerance for ethanologenic yeast

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Composed of linear difference equations, a discrete dynamic system model was designed to reconstruct transcriptional regulations in gene regulatory networks in response to 5-hydroxymethylfurfural, a bioethanol conversion inhibitor for ethanologenic yeast Saccharomyces cerevisiae. The modeling aims ...
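
    To illustrate what a model "composed of linear difference equations" looks like in general, the sketch below iterates x[t+1] = A x[t] for a made-up two-gene network. The matrix, the gene roles and the function name are hypothetical and are not taken from the record above.

```python
def simulate(A, x0, steps):
    """Iterate the linear difference equation x[t+1] = A x[t].

    A is a list of rows (interaction weights), x0 the initial expression
    vector. Returns the trajectory [x[0], x[1], ..., x[steps]].
    """
    trajectory = [list(x0)]
    for _ in range(steps):
        x = trajectory[-1]
        trajectory.append(
            [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
        )
    return trajectory


# Hypothetical network: gene 0 decays; gene 1 decays and is activated by gene 0.
A = [[0.5, 0.0],
     [0.5, 0.5]]
traj = simulate(A, x0=[1.0, 0.0], steps=3)
for t, state in enumerate(traj):
    print(t, state)
```

    Fitting the entries of A to measured expression time series is the reconstruction step; that part is not shown here.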

  18. Analyzing S-adenosylhomocysteine hydrolase gene sequences in deuterostome genomes.

    PubMed

    Zhao, Jing-Nan; Wang, Yuan; Zhao, Bo-Sheng; Chen, Ling-Ling

    2009-12-01

    S-adenosylhomocysteine hydrolase (SAHH) gene sequences of sea-urchin, two amphioxus, sea-squirt and eight vertebrates are compared in this study. Although SAHH protein sequences are highly conserved in these species, their nucleotide sequences differ greatly in length, ranging from 5,446 bp in amphioxus to 40,174 bp in zebrafish. The length divergence is mainly caused by distinct introns in some species. SAHH genes in amphioxus (or sea-urchin), sea-squirt and vertebrates are composed of eight, nine and ten exons, respectively. Sequence alignment shows that exon 3 in amphioxus and sea-urchin is similar to exons 3 + 4 in vertebrates, exon 5 in amphioxus and sea-urchin is similar to exons 5 + 6 in sea-squirt, and the two exons are fused into exon 6 in vertebrates. Furthermore, exon 7 in sea-squirt is similar to exons 7 + 8 in vertebrates, indicating that exon-fission and exon-fusion events have taken place during the evolution of deuterostome SAHH genes. Active sites and NAD+-binding sites are located in exons 2-7 in amphioxus, and are dispersed into many more exons in the course of vertebrate evolution. It is speculated that the ten-exon organization of the SAHH gene arose after the separation of invertebrates and vertebrates. Synonymous and non-synonymous substitution analysis shows that negative selection plays a dominant role in the evolution of SAHH genes. Phylogenetic analysis shows that SAHH genes in amphioxus, sea-urchin and sea-squirt form a cluster located at the base of the neighbor-joining tree, suggesting that they are the archetype of vertebrate SAHH genes. PMID:19795919

  19. Epigenomic annotation of gene regulatory alterations during evolution of the primate brain.

    PubMed

    Vermunt, Marit W; Tan, Sander C; Castelijns, Bas; Geeven, Geert; Reinink, Peter; de Bruijn, Ewart; Kondova, Ivanela; Persengiev, Stephan; Bontrop, Ronald; Cuppen, Edwin; de Laat, Wouter; Creyghton, Menno P

    2016-03-01

    Although genome sequencing has identified numerous noncoding alterations between primate species, which of those are regulatory and potentially relevant to the evolution of the human brain is unclear. Here we annotated cis-regulatory elements (CREs) in the human, rhesus macaque and chimpanzee genomes using chromatin immunoprecipitation followed by sequencing (ChIP-seq) in different anatomical regions of the adult brain. We found high similarity in the genomic positioning of rhesus macaque and human CREs, suggesting that the majority of these elements were already present in a common ancestor 25 million years ago. Most of the observed regulatory changes between humans and rhesus macaques occurred before the ancestral separation of humans and chimpanzees, leaving a modest set of regulatory elements with predicted human specificity. Our data refine previous predictions and hypotheses on the consequences of genomic changes between primate species and allow the identification of regulatory alterations relevant to the evolution of the brain. PMID:26807951

  20. Delineation of the regulatory region sequences of Agrobacterium tumefaciens virB operon.

    PubMed Central

    Das, A; Pazour, G J

    1989-01-01

    A virB-lacZ translational fusion was constructed to monitor expression of the Agrobacterium tumefaciens virB operon. Expression of the fusion gene was dependent on the presence of pTiA6 virA, virG, and a plant factor, acetosyringone. Analysis of deletion mutants, constructed by exonuclease Bal31 digestion, showed that 68 residues upstream of the virB transcription initiation site were necessary for its expression. A TT→CC substitution at positions -62 and -61 led to a 7-fold reduction in virB expression. The virB upstream region contains a tetradecameric sequence, dPuT/ATDCAATGHAAPy (D = A, G or T; H = A, C or T), that is conserved in the non-transcribed regions of all vir genes. Alteration of the position of this sequence relative to the promoter region sequences had a drastic negative effect on virB expression. PMID:2748333
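
    A degenerate consensus such as the one quoted above can be searched for by translating each degenerate position into a regular-expression character class. The sketch below is illustrative only: it transcribes the consensus exactly as printed in the abstract, writes "Pu", "T/A" and "Py" with the single-letter codes R, W and Y, and uses hypothetical function names and a made-up test sequence.

```python
import re

# Degenerate-base codes. R (purine), W (T or A) and Y (pyrimidine) stand in
# for the abstract's "Pu", "T/A" and "Py"; D and H are as defined there.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]",
    "D": "[AGT]", "H": "[ACT]",
}


def consensus_to_regex(consensus):
    """Compile a degenerate consensus string into a regular expression."""
    return re.compile("".join(CODES[base] for base in consensus))


# Consensus as printed in the abstract, rewritten with single-letter codes.
VIR_CONSENSUS = consensus_to_regex("RWTDCAATGHAAY")


def find_sites(sequence):
    """Return 0-based start positions of non-overlapping consensus matches."""
    return [m.start() for m in VIR_CONSENSUS.finditer(sequence.upper())]


# Made-up sequence containing one site that fits the consensus.
print(find_sites("ccATTACAATGCAATgg"))
```

    Only the given strand is scanned; a fuller search would also scan the reverse complement.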

  1. Centrality Analysis Methods for Biological Networks and Their Application to Gene Regulatory Networks

    PubMed Central

    Koschützki, Dirk; Schreiber, Falk

    2008-01-01

    The structural analysis of biological networks includes the ranking of the vertices based on the connection structure of a network. To support this analysis we discuss centrality measures which indicate the importance of vertices, and demonstrate their applicability on a gene regulatory network. We show that common centrality measures result in different valuations of the vertices and that novel measures tailored to specific biological investigations are useful for the analysis of biological networks, in particular gene regulatory networks. PMID:19787083
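
    The point that common centrality measures can rank vertices differently can be seen on a toy directed network. The sketch below is not taken from the paper: the network, the vertex names and the choice of measures (total degree versus harmonic closeness along outgoing edges) are assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy network: an edge (u, v) means "u regulates v".
EDGES = [("M", "H1"), ("M", "H2"),
         ("H1", "a"), ("H1", "b"), ("H1", "c"),
         ("H2", "d"), ("H2", "e")]


def degree_centrality(edges):
    """Total degree (incoming plus outgoing edges) of each vertex."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree


def harmonic_out_closeness(edges):
    """Sum of 1/distance to every vertex reachable along outgoing edges."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
        successors.setdefault(v, [])
    score = {}
    for source in successors:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            node = queue.popleft()
            for nxt in successors[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        score[source] = sum(1.0 / d for d in dist.values() if d > 0)
    return score


degree = degree_centrality(EDGES)
closeness = harmonic_out_closeness(EDGES)
top_by_degree = max(degree, key=degree.get)
top_by_closeness = max(closeness, key=closeness.get)
print(top_by_degree, top_by_closeness)
```

    In this toy case the hub with the most direct connections and the upstream regulator that reaches the most vertices are different vertices, so the two measures disagree on which vertex is most important.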

  2. The nucleosome landscape of Plasmodium falciparum reveals chromatin architecture and dynamics of regulatory sequences

    PubMed Central

    Kensche, Philip Reiner; Hoeijmakers, Wieteke Anna Maria; Toenhake, Christa Geeke; Bras, Maaike; Chappell, Lia; Berriman, Matthew; Bártfai, Richárd

    2016-01-01

    In eukaryotes, the chromatin architecture has a pivotal role in regulating all DNA-associated processes and it is central to the control of gene expression. For Plasmodium falciparum, a causative agent of human malaria, the nucleosome positioning profile of regulatory regions deserves particular attention because of their extreme AT-content. With the aid of a highly controlled MNase-seq procedure we reveal how positioning of nucleosomes provides a structural and regulatory framework to the transcriptional unit by demarcating landmark sites (transcription/translation start and end sites). In addition, our analysis provides strong indications for the function of positioned nucleosomes in splice site recognition. Transcription start sites (TSSs) are bordered by a small nucleosome-depleted region, but lack the stereotypic downstream nucleosome arrays, highlighting a key difference in chromatin organization compared to model organisms. Furthermore, we observe transcription-coupled eviction of nucleosomes on strong TSSs during intraerythrocytic development and demonstrate that nucleosome positioning and dynamics can be predictive for the functionality of regulatory DNA elements. Collectively, the strong nucleosome positioning over splice sites and surrounding putative transcription factor binding sites highlights the regulatory capacity of the nucleosome landscape in this deadly human pathogen. PMID:26578577

  3. Inferring gene regulatory networks via nonlinear state-space models and exploiting sparsity.

    PubMed

    Noor, Amina; Serpedin, Erchin; Nounou, Mohamed; Nounou, Hazem N

    2012-01-01

    This paper considers the problem of learning the structure of gene regulatory networks from gene expression time series data. A more realistic scenario when the state space model representing a gene network evolves nonlinearly is considered while a linear model is assumed for the microarray data. To capture the nonlinearity, a particle filter-based state estimation algorithm is considered instead of the contemporary linear approximation-based approaches. The parameters characterizing the regulatory relations among various genes are estimated online using a Kalman filter. Since a particular gene interacts with a few other genes only, the parameter vector is expected to be sparse. The state estimates delivered by the particle filter and the observed microarray data are then subjected to a LASSO-based least squares regression operation which yields a parsimonious and efficient description of the regulatory network by setting the irrelevant coefficients to zero. The performance of the aforementioned algorithm is compared with the extended Kalman filter (EKF) and Unscented Kalman Filter (UKF) employing the Mean Square Error (MSE) as the fidelity criterion in recovering the parameters of gene regulatory networks from synthetic data and real biological data. Extensive computer simulations illustrate that the proposed particle filter-based network inference algorithm outperforms EKF and UKF, and therefore, it can serve as a natural framework for modeling gene regulatory networks with nonlinear and sparse structure. PMID:22350207
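
    The way a LASSO penalty sets irrelevant coefficients to zero can be shown with the soft-thresholding operator, which is the closed-form LASSO solution in the simplified case of an orthonormal design. The sketch below covers only that special case; it is not the particle filter pipeline described above, and the coefficient values and threshold are made up.

```python
def soft_threshold(value: float, lam: float) -> float:
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


# Hypothetical least-squares estimates of one gene's regulatory coefficients.
least_squares = [1.0, -0.125, 0.0625, -0.75]
lam = 0.25  # hypothetical penalty strength

sparse = [soft_threshold(b, lam) for b in least_squares]
print(sparse)
```

    Small coefficients are removed entirely while large ones are shrunk, which gives the sparse parameter vector expected when each gene interacts with only a few others.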

  4. Spliced synthetic genes as internal controls in RNA sequencing experiments.

    PubMed

    Hardwick, Simon A; Chen, Wendy Y; Wong, Ted; Deveson, Ira W; Blackburn, James; Andersen, Stacey B; Nielsen, Lars K; Mattick, John S; Mercer, Tim R

    2016-09-01

    RNA sequencing (RNA-seq) can be used to assemble spliced isoforms, quantify expressed genes and provide a global profile of the transcriptome. However, the size and diversity of the transcriptome, the wide dynamic range in gene expression and inherent technical biases confound RNA-seq analysis. We have developed a set of spike-in RNA standards, termed 'sequins' (sequencing spike-ins), that represent full-length spliced mRNA isoforms. Sequins have an entirely artificial sequence with no homology to natural reference genomes, but they align to gene loci encoded on an artificial in silico chromosome. The combination of multiple sequins across a range of concentrations emulates alternative splicing and differential gene expression, and it provides scaling factors for normalization between samples. We demonstrate the use of sequins in RNA-seq experiments to measure sample-specific biases and determine the limits of reliable transcript assembly and quantification in accompanying human RNA samples. In addition, we have designed a complementary set of sequins that represent fusion genes arising from rearrangements of the in silico chromosome to aid in cancer diagnosis. RNA sequins provide a qualitative and quantitative reference with which to navigate the complexity of the human transcriptome. PMID:27502218
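
    One common way to use spike-ins of known concentration for normalization is to fit, per sample, a line relating log input concentration to log measured abundance and then compare the fits between samples. The sketch below shows that general idea only; it is not the authors' software, and all values and names are hypothetical.

```python
import math


def fit_log_log(known, measured):
    """Least-squares fit of log2(measured) = slope * log2(known) + intercept."""
    xs = [math.log2(k) for k in known]
    ys = [math.log2(m) for m in measured]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical spike-in data: input concentrations and measured abundances.
known = [1.0, 2.0, 4.0, 8.0]
sample_a = [3.0, 6.0, 12.0, 24.0]
sample_b = [6.0, 12.0, 24.0, 48.0]

slope_a, intercept_a = fit_log_log(known, sample_a)
slope_b, intercept_b = fit_log_log(known, sample_b)

# Factor that would bring sample B measurements onto sample A's scale,
# valid here because both fitted slopes are equal.
scale_b_to_a = 2 ** (intercept_a - intercept_b)
print(slope_a, slope_b, scale_b_to_a)
```

    A fitted slope that departs from 1 would indicate a concentration-dependent bias, which a single scaling factor cannot correct.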

  5. Identification of a regulatory domain controlling the Nppa-Nppb gene cluster during heart development and stress.

    PubMed

    Sergeeva, Irina A; Hooijkaas, Ingeborg B; Ruijter, Jan M; van der Made, Ingeborg; de Groot, Nina E; van de Werken, Harmen J G; Creemers, Esther E; Christoffels, Vincent M

    2016-06-15

    The paralogous genes Nppa and Nppb are organized in an evolutionarily conserved cluster and provide a valuable model for studying co-regulation and regulatory landscape organization during heart development and disease. Here, we analyzed the chromatin conformation, epigenetic status and enhancer potential of sequences of the Nppa-Nppb cluster in vivo. Our data indicate that the regulatory landscape of the cluster is present within a 60-kb domain centered around Nppb. Both promoters and several potential regulatory elements interact with each other in a similar manner in different tissues and developmental stages. The distribution of H3K27ac and the association of Pol2 across the locus changed during cardiac hypertrophy, revealing their potential involvement in stress-mediated gene regulation. Functional analysis of double-reporter transgenic mice revealed that Nppa and Nppb share developmental, but not stress-response, enhancers, responsible for their co-regulation. Moreover, the Nppb promoter was required, but not sufficient, for hypertrophy-induced Nppa expression. In summary, the developmental regulation and stress response of the Nppa-Nppb cluster involve the concerted action of multiple enhancers and epigenetic changes distributed across a structurally rigid regulatory domain. PMID:27048739

  6. Combining Hi-C data with phylogenetic correlation to predict the target genes of distal regulatory elements in human genome

    PubMed Central

    Lu, Yulan; Zhou, Yuanpeng; Tian, Weidong

    2013-01-01

    Defining the target genes of distal regulatory elements (DREs), such as enhancers, repressors and insulators, is a challenging task. The recently developed Hi-C technology is designed to capture chromosome conformation structure by high-throughput sequencing, and can be potentially used to determine the target genes of DREs. However, Hi-C data are noisy, making it difficult to directly use Hi-C data to identify DRE–target gene relationships. In this study, we show that DRE–gene pairs that are confirmed by Hi-C data are strongly phylogenetically correlated, and have thus developed a method that combines Hi-C read counts with phylogenetic correlation to predict long-range DRE–target gene relationships. Analysis of predicted DRE–target gene pairs shows that genes regulated by a large number of DREs tend to have essential functions, and genes regulated by the same DREs tend to be functionally related and co-expressed. In addition, we show with a couple of examples that the predicted target genes of DREs can help explain the causal roles of disease-associated single-nucleotide polymorphisms located in the DREs. As such, these predictions will be of importance not only for our understanding of the function of DREs but also for elucidating the causal roles of disease-associated noncoding single-nucleotide polymorphisms. PMID:24003029

  7. Tissue Specificity and Sex-Specific Regulatory Variation Permit the Evolution of Sex-Biased Gene Expression.

    PubMed

    Dean, Rebecca; Mank, Judith E

    2016-09-01

    Genetic correlations between males and females are often thought to constrain the evolution of sexual dimorphism. However, sexually dimorphic traits and the underlying sexually dimorphic gene expression patterns are often rapidly evolving. We explore this apparent paradox by measuring the genetic correlation in gene expression between males and females (Cmf) across broad evolutionary timescales, using two RNA-sequencing data sets spanning multiple populations and multiple species. We find that unbiased genes have higher Cmf than sex-biased genes, consistent with intersexual genetic correlations constraining the evolution of sexual dimorphism. However, we found that highly sex-biased genes (both male and female biased) also had higher tissue specificity, and unbiased genes had greater expression breadth, suggesting that pleiotropy may constrain the breakdown of intersexual genetic correlations. Finally, we show that genes with high Cmf showed some degree of sex-specific changes in gene expression in males and females. Together, our results suggest that genetic correlations between males and females may be less important in constraining the evolution of sex-biased gene expression than pleiotropy. Sex-specific regulatory variation and tissue specificity may resolve the paradox of widespread sex bias within a largely shared genome. PMID:27501094

  8. Gene identification in prokaryotic genomes, phages, metagenomes, and EST sequences with GeneMarkS suite.

    PubMed

    Borodovsky, Mark; Lomsadze, Alex

    2014-01-01

    This unit describes how to use several gene-finding programs from the GeneMark line developed for finding protein-coding ORFs in genomic DNA of prokaryotic species, in genomic DNA of eukaryotic species with intronless genes, in genomes of viruses and phages, and in prokaryotic metagenomic sequences, as well as in EST sequences with spliced-out introns. These bioinformatics tools were demonstrated to have state-of-the-art accuracy, and have been frequently used for gene annotation in novel nucleotide sequences. An additional advantage of these sequence-analysis tools is that the problem of algorithm parameterization is solved automatically, with parameters estimated by iterative self-training (unsupervised training). PMID:24510847

  9. Gene identification in prokaryotic genomes, phages, metagenomes, and EST sequences with GeneMarkS suite.

    PubMed

    Borodovsky, Mark; Lomsadze, Alex

    2011-09-01

    This unit describes how to use several gene-finding programs from the GeneMark line developed for finding protein-coding ORFs in genomic DNA of prokaryotic species, in genomic DNA of eukaryotic species with intronless genes, in genomes of viruses and phages, and in prokaryotic metagenomic sequences, as well as in EST sequences with spliced-out introns. These bioinformatics tools were demonstrated to have state-of-the-art accuracy and have been frequently used for gene annotation in novel nucleotide sequences. An additional advantage of these sequence-analysis tools is that the problem of algorithm parameterization is solved automatically, with parameters estimated by iterative self-training (unsupervised training). PMID:21901741

  10. Isolation of nine gene sequences induced by silica in murine macrophages

    SciTech Connect

    Segade, F.; Claudio, E.; Wrobel, K.; Ramos, S.; Lazo, P.S.

    1995-03-01

    Macrophage activation by silica is the initial step in the development of silicosis. To identify genes that might be involved in silica-mediated activation, RAW 264.7 mouse macrophages were treated with silica for 48 h, and a subtracted cDNA library enriched for silica-induced genes (SIG) was constructed and differentially screened. Nine cDNA clones (designated SIG-12, -14, -20, -41, -61, -81, -91, and -111) were partially sequenced and compared with sequences in GenBank/EMBL databases. SIG-12, -14, and -20 corresponded to the genes for ribosomal proteins L13A, L32, and L26, respectively. SIG-61 is the mouse homologue of p21 RhoC. SIG-91 is identical to the 67-kDa high-affinity laminin receptor. Four genes were not identified and are novel. All of the mRNAs corresponding to the nine cloned cDNAs were inducible by silica. Steady-state levels of mRNAs in RAW 264.7 cells treated with various macrophage activators and inducers of signal transduction pathways were determined. A complex pattern of induction and repression was found, indicating that upon phagocytosis of silica particles, many regulatory mechanisms of gene expression are simultaneously triggered. 55 refs., 4 figs., 1 tab.

  11. Sequence and structural organization of a nifA-like gene and part of a nifB-like gene of Herbaspirillum seropedicae strain Z78.

    PubMed

    Souza, E M; Funayama, S; Rigo, L U; Yates, M G; Pedrosa, F O

    1991-07-01

    The deduced amino acid sequence derived from the sequence of a fragment of DNA from the free-living diazotroph Herbaspirillum seropedicae was aligned to the homologous protein sequences encoded by the nifA genes from Azorhizobium caulinodans, Rhizobium leguminosarum, Rhizobium meliloti and Klebsiella pneumoniae. High similarity was found in the central domain and in the C-terminal region. The H. seropedicae putative NifA sequence was also found to contain an interdomain linker similar to that conserved among rhizobial NifA proteins, but not K. pneumoniae or Azotobacter vinelandii. Analysis of the regulatory sequences found 5' from nifA indicated that the expression of this gene in H. seropedicae is likely to be controlled by NifA, NtrC and RpoN, as judged by the presence of specific NifA- and NtrC-binding sites and characteristic -24/-12 promoters. Possible additional regulatory features included an 'anaerobox' and a site for integration host factor. The N-terminus of another open reading frame was found 3' from nifA and tentatively identified as nifB by amino acid sequence comparison. The putative nifB promoter sequence suggests that expression of H. seropedicae nifB may be activated by NifA and dependent on RpoN. PMID:1840608

  12. Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation

    PubMed Central

    Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P. M.; Zhu, Xin-Guang

    2016-01-01

    Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5′UTR, 3′UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5′UTR, 3′UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. PMID:27436282

  13. Widespread contribution of transposable elements to the innovation of gene regulatory networks.

    PubMed

    Sundaram, Vasavi; Cheng, Yong; Ma, Zhihai; Li, Daofeng; Xing, Xiaoyun; Edge, Peter; Snyder, Michael P; Wang, Ting

    2014-12-01

    Transposable elements (TEs) have been shown to contain functional binding sites for certain transcription factors (TFs). However, the extent to which TEs contribute to the evolution of TF binding sites is not well known. We comprehensively mapped binding sites for 26 pairs of orthologous TFs in two pairs of human and mouse cell lines (representing two cell lineages), along with epigenomic profiles, including DNA methylation and six histone modifications. Overall, we found that 20% of binding sites were embedded within TEs. This number varied across different TFs, ranging from 2% to 40%. We further identified 710 TF-TE relationships in which genomic copies of a TE subfamily contributed a significant number of binding peaks for a TF, and we found that LTR elements dominated these relationships in human. Importantly, TE-derived binding peaks were strongly associated with open and active chromatin signatures, including reduced DNA methylation and increased enhancer-associated histone marks. On average, 66% of TE-derived binding events were cell type-specific with a cell type-specific epigenetic landscape. Most of the binding sites contributed by TEs were species-specific, but we also identified binding sites conserved between human and mouse, the functional relevance of which was supported by a signature of purifying selection on DNA sequences of these TEs. Interestingly, several TFs had significantly expanded binding site landscapes only in one species, which were linked to species-specific gene functions, suggesting that TEs are an important driving force for regulatory innovation. Taken together, our data suggest that TEs have significantly and continuously shaped gene regulatory networks during mammalian evolution. PMID:25319995
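
    The headline figure above (the share of binding sites embedded within TEs) is an interval-containment count. A minimal sketch follows, assuming half-open coordinates on a single chromosome and non-overlapping TE annotations; the function name and the toy coordinates are hypothetical and are not taken from the study.

```python
from bisect import bisect_right


def fraction_embedded(peaks, elements):
    """Fraction of peaks lying entirely inside some annotated element.

    peaks and elements are lists of (start, end) half-open intervals on
    one chromosome; elements are assumed not to overlap each other.
    """
    if not peaks:
        return 0.0
    elements = sorted(elements)
    starts = [start for start, _ in elements]
    embedded = 0
    for peak_start, peak_end in peaks:
        # Last element starting at or before the peak start, if any.
        idx = bisect_right(starts, peak_start) - 1
        if idx >= 0 and peak_end <= elements[idx][1]:
            embedded += 1
    return embedded / len(peaks)


# Hypothetical coordinates for illustration only.
te_annotations = [(100, 200), (500, 900)]
binding_peaks = [(120, 150), (190, 210), (600, 650), (950, 990), (10, 20)]
print(fraction_embedded(binding_peaks, te_annotations))
```

    Requiring full containment is one possible reading of "embedded"; a study could equally count any overlap or overlap of the peak summit, which would change the resulting percentage.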

  14. Widespread contribution of transposable elements to the innovation of gene regulatory networks

    PubMed Central

    Sundaram, Vasavi; Cheng, Yong; Ma, Zhihai; Li, Daofeng; Xing, Xiaoyun; Edge, Peter

    2014-01-01

    Transposable elements (TEs) have been shown to contain functional binding sites for certain transcription factors (TFs). However, the extent to which TEs contribute to the evolution of TF binding sites is not well known. We comprehensively mapped binding sites for 26 pairs of orthologous TFs in two pairs of human and mouse cell lines (representing two cell lineages), along with epigenomic profiles, including DNA methylation and six histone modifications. Overall, we found that 20% of binding sites were embedded within TEs. This number varied across different TFs, ranging from 2% to 40%. We further identified 710 TF–TE relationships in which genomic copies of a TE subfamily contributed a significant number of binding peaks for a TF, and we found that LTR elements dominated these relationships in human. Importantly, TE-derived binding peaks were strongly associated with open and active chromatin signatures, including reduced DNA methylation and increased enhancer-associated histone marks. On average, 66% of TE-derived binding events were cell type-specific with a cell type-specific epigenetic landscape. Most of the binding sites contributed by TEs were species-specific, but we also identified binding sites conserved between human and mouse, the functional relevance of which was supported by a signature of purifying selection on DNA sequences of these TEs. Interestingly, several TFs had significantly expanded binding site landscapes only in one species, which were linked to species-specific gene functions, suggesting that TEs are an important driving force for regulatory innovation. Taken together, our data suggest that TEs have significantly and continuously shaped gene regulatory networks during mammalian evolution. PMID:25319995

  15. A validated gene regulatory network and GWAS identifies early regulators of T cell-associated diseases.

    PubMed

    Gustafsson, Mika; Gawel, Danuta R; Alfredsson, Lars; Baranzini, Sergio; Björkander, Janne; Blomgran, Robert; Hellberg, Sandra; Eklund, Daniel; Ernerudh, Jan; Kockum, Ingrid; Konstantinell, Aelita; Lahesmaa, Riita; Lentini, Antonio; Liljenström, H Robert I; Mattson, Lina; Matussek, Andreas; Mellergård, Johan; Mendez, Melissa; Olsson, Tomas; Pujana, Miguel A; Rasool, Omid; Serra-Musach, Jordi; Stenmarker, Margaretha; Tripathi, Subhash; Viitala, Miro; Wang, Hui; Zhang, Huan; Nestor, Colm E; Benson, Mikael

    2015-11-11

    Early regulators of disease may increase understanding of disease mechanisms and serve as markers for presymptomatic diagnosis and treatment. However, early regulators are difficult to identify because patients generally present after they are symptomatic. We hypothesized that early regulators of T cell-associated diseases could be found by identifying upstream transcription factors (TFs) in T cell differentiation and by prioritizing hub TFs that were enriched for disease-associated polymorphisms. A gene regulatory network (GRN) was constructed by time series profiling of the transcriptomes and methylomes of human CD4(+) T cells during in vitro differentiation into four helper T cell lineages, in combination with sequence-based TF binding predictions. The TFs GATA3, MAF, and MYB were identified as early regulators and validated by ChIP-seq (chromatin immunoprecipitation sequencing) and small interfering RNA knockdowns. Differential mRNA expression of the TFs and their targets in T cell-associated diseases supports their clinical relevance. To directly test if the TFs were altered early in disease, T cells from patients with two T cell-mediated diseases, multiple sclerosis and seasonal allergic rhinitis, were analyzed. Strikingly, the TFs were differentially expressed during asymptomatic stages of both diseases, whereas their targets showed altered expression during symptomatic stages. This analytical strategy to identify early regulators of disease by combining GRNs with genome-wide association studies may be generally applicable for functional and clinical studies of early disease development. PMID:26560356

  16. Sieve-based relation extraction of gene regulatory networks from biological literature

    PubMed Central

    2015-01-01

    Background: Relation extraction is an essential procedure in literature mining. It focuses on extracting semantic relations between parts of text, called mentions. Biomedical literature includes an enormous amount of textual descriptions of biological entities, their interactions and results of related experiments. To extract them in an explicit, computer readable format, these relations were at first extracted manually from databases. Manual curation was later replaced with automatic or semi-automatic tools with natural language processing capabilities. The current challenge is the development of information extraction procedures that can directly infer more complex relational structures, such as gene regulatory networks. Results: We develop a computational approach for extraction of gene regulatory networks from textual data. Our method is designed as a sieve-based system and uses linear-chain conditional random fields and rules for relation extraction. With this method we successfully extracted the sporulation gene regulation network in the bacterium Bacillus subtilis for the information extraction challenge at the BioNLP 2013 conference. To enable extraction of distant relations using first-order models, we transform the data into skip-mention sequences. We infer multiple models, each of which is able to extract different relationship types. Following the shared task, we conducted additional analysis using different system settings that resulted in reducing the reconstruction error of bacterial sporulation network from 0.73 to 0.68, measured as the slot error rate between the predicted and the reference network. We observe that all relation extraction sieves contribute to the predictive performance of the proposed approach. Also, features constructed by considering mention words and their prefixes and suffixes are the most important features for higher accuracy of extraction. Analysis of distances between different mention types in the text shows that our choice
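
    The slot error rate mentioned in this record is commonly defined as (substitutions + deletions + insertions) divided by the number of reference relations. A minimal sketch under that definition follows; treating a substitution as an edge present in both networks with a different relation type is an assumption, and the gene names are placeholders rather than real sporulation genes.

```python
def slot_error_rate(predicted, reference):
    """Slot error rate between two networks of typed edges.

    Each network maps an edge (source, target) to a relation type.
    Substitution: edge in both networks, types differ (assumed reading).
    Insertion: edge only in the prediction.
    Deletion: edge only in the reference.
    """
    if not reference:
        raise ValueError("reference network must contain at least one edge")
    substitutions = sum(
        1 for edge, kind in predicted.items()
        if edge in reference and reference[edge] != kind
    )
    insertions = sum(1 for edge in predicted if edge not in reference)
    deletions = sum(1 for edge in reference if edge not in predicted)
    return (substitutions + deletions + insertions) / len(reference)


# Placeholder networks for illustration only.
reference_net = {
    ("geneA", "geneB"): "activation",
    ("geneB", "geneC"): "inhibition",
    ("geneC", "geneD"): "activation",
    ("geneD", "geneE"): "activation",
}
predicted_net = {
    ("geneA", "geneB"): "activation",   # correct
    ("geneB", "geneC"): "activation",   # substitution
    ("geneX", "geneY"): "activation",   # insertion
}
print(slot_error_rate(predicted_net, reference_net))
```

    Because insertions are counted against the reference size, the score can exceed 1, and lower values are better.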

  17. Analyses of fugu hoxa2 genes provide evidence for subfunctionalization of neural crest cell and rhombomere cis-regulatory modules during vertebrate evolution.

    PubMed

    McEllin, Jennifer A; Alexander, Tara B; Tümpel, Stefan; Wiedemann, Leanne M; Krumlauf, Robb

    2016-01-15

    The Hoxa2 gene is a primary player in regulation of craniofacial programs of head development in vertebrates. Here we investigate the evolution of a Hoxa2 neural crest enhancer identified originally in mouse by comparing and contrasting the fugu hoxa2a and hoxa2b genes with their orthologous teleost and mammalian sequences. Using sequence analyses in combination with transgenic regulatory assays in zebrafish and mouse embryos we demonstrate subfunctionalization of regulatory activity for expression in hindbrain segments and neural crest cells between these two fugu co-orthologs. hoxa2a regulatory sequences have retained the ability to mediate expression in neural crest cells while those of hoxa2b include cis-elements that direct expression in rhombomeres. Functional dissection of the neural crest regulatory potential of the fugu hoxa2a and hoxa2b genes identifies the previously unknown cis-element NC5, which is implicated in generating the differential activity of the enhancers from these genes. The NC5 region plays a similar role in the ability of this enhancer to mediate reporter expression in mice, suggesting it is a conserved component involved in control of neural crest expression of Hoxa2 in vertebrate craniofacial development. PMID:26632170

  18. Phenotype Sequencing: Identifying the Genes That Cause a Phenotype Directly from Pooled Sequencing of Independent Mutants

    PubMed Central

    Harper, Marc A.; Chen, Zugen; Toy, Traci; Machado, Iara M. P.; Nelson, Stanley F.; Liao, James C.; Lee, Christopher J.

    2011-01-01

    Random mutagenesis and phenotype screening provide a powerful method for dissecting microbial functions, but their results can be laborious to analyze experimentally. Each mutant strain may contain 50–100 random mutations, necessitating extensive functional experiments to determine which one causes the selected phenotype. To solve this problem, we propose a “Phenotype Sequencing” approach in which genes causing the phenotype can be identified directly from sequencing of multiple independent mutants. We developed a new computational analysis method showing that (1) causal genes can be identified with high probability from even a modest number of mutant genomes; and (2) costs can be cut many-fold compared with a conventional genome sequencing approach via an optimized strategy of library-pooling (multiple strains per library) and tag-pooling (multiple tagged libraries per sequencing lane). We have performed extensive validation experiments on a set of E. coli mutants with increased isobutanol biofuel tolerance. We generated a range of sequencing experiments varying from 3 to 32 mutant strains, with pooling on 1 to 3 sequencing lanes. Our statistical analysis of these data (4099 mutations from 32 mutant genomes) successfully identified 3 genes (acrB, marC, acrA) that have been independently validated as causing this experimental phenotype. It must be emphasized that our approach reduces mutant sequencing costs enormously. Whereas a conventional genome sequencing experiment would have cost $7,200 in reagents alone, our Phenotype Sequencing design yielded the same information value for only $1,200. In fact, our smallest experiments reliably identified acrB and marC at a cost of only $110–$340. PMID:21364744
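
    The core idea above, that a causal gene is hit in more independent mutants than chance would predict, can be sketched as a per-gene count plus a binomial tail probability. This is not the authors' statistical method: the uniform per-strain hit probability, the function names and the toy mutant lists are assumptions for illustration.

```python
from collections import Counter
from math import comb


def strains_hit_per_gene(mutants):
    """Count, for each gene, how many independent mutant strains carry a hit."""
    counts = Counter()
    for mutated_genes in mutants:
        counts.update(set(mutated_genes))  # count each gene once per strain
    return counts


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more strains hit."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Toy data: four hypothetical strains, each listed with its mutated genes.
toy_mutants = [
    ["acrB", "gene_1"],
    ["acrB", "gene_2"],
    ["acrB", "marC"],
    ["marC", "gene_3"],
]
hits = strains_hit_per_gene(toy_mutants)
# Assumed background: any given gene is hit in a strain with probability 0.02.
p_value_acrB = binomial_tail(hits["acrB"], len(toy_mutants), 0.02)
print(hits.most_common(2), p_value_acrB)
```

    A realistic analysis would scale the hit probability by gene length and correct for the number of genes tested, both of which this sketch leaves out.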

  19. Reconstruction of the Regulatory Network for Bacillus subtilis and Reconciliation with Gene Expression Data

    PubMed Central

    Faria, José P.; Overbeek, Ross; Taylor, Ronald C.; Conrad, Neal; Vonstein, Veronika; Goelzer, Anne; Fromion, Vincent; Rocha, Miguel; Rocha, Isabel; Henry, Christopher S.

    2016-01-01

    We introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of Bacillus subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs, and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches, and small regulatory RNAs. Overall, regulatory information is included in the model for ∼2500 of the ∼4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same “ON” and “OFF” gene expression profiles across multiple samples of experimental data. We show how ARs for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how ARs can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking. During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome expression metadata relating to experimental

  20. Reconstruction of the Regulatory Network for Bacillus subtilis and Reconciliation with Gene Expression Data.

    PubMed

    Faria, José P; Overbeek, Ross; Taylor, Ronald C; Conrad, Neal; Vonstein, Veronika; Goelzer, Anne; Fromion, Vincent; Rocha, Miguel; Rocha, Isabel; Henry, Christopher S

    2016-01-01

    We introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of Bacillus subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs, and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches, and small regulatory RNAs. Overall, regulatory information is included in the model for ∼2500 of the ∼4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same "ON" and "OFF" gene expression profiles across multiple samples of experimental data. We show how ARs for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how ARs can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking. During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome expression metadata relating to experimental conditions
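
    The definition of Atomic Regulons given above (sets of genes sharing the same "ON" and "OFF" profile across samples) amounts to grouping genes by identical binary vectors. A minimal sketch of that grouping step follows; the function name and the toy ON/OFF calls are hypothetical, and how expression values are thresholded into ON and OFF is not covered here.

```python
from collections import defaultdict


def group_by_on_off_profile(calls):
    """Group genes whose ON/OFF calls are identical across all samples.

    calls maps a gene name to a sequence of 1 (ON) or 0 (OFF), one entry
    per sample. Only groups with at least two genes are returned.
    """
    groups = defaultdict(list)
    for gene, profile in calls.items():
        groups[tuple(profile)].append(gene)
    return sorted(sorted(genes) for genes in groups.values() if len(genes) > 1)


# Hypothetical ON/OFF calls for five genes over three samples.
toy_calls = {
    "gene_1": [1, 0, 1],
    "gene_2": [1, 0, 1],
    "gene_3": [0, 0, 1],
    "gene_4": [0, 0, 1],
    "gene_5": [1, 1, 1],
}
print(group_by_on_off_profile(toy_calls))
```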

  1. Genetic Variation of Goat Interferon Regulatory Factor 3 Gene and Its Implication in Goat Evolution.

    PubMed

    Okpeku, Moses; Esmailizadeh, Ali; Adeola, Adeniyi C; Shu, Liping; Zhang, Yesheng; Wang, Yangzi; Sanni, Timothy M; Imumorin, Ikhide G; Peters, Sunday O; Zhang, Jiajin; Dong, Yang; Wang, Wen

    2016-01-01

    Immune systems are fundamentally vital for the evolution and survival of species; as such, selection patterns in innate immune loci are of special interest in molecular evolutionary research. The interferon regulatory factor (IRF) gene family controls many different aspects of the innate and adaptive immune responses in vertebrates. Among these, IRF3 is known to take an active part in many biological processes. We assembled and evaluated 1356 base pairs of the IRF3 gene coding region in domesticated goats from Africa (Nigeria, Ethiopia and South Africa) and Asia (Iran and China) and the wild goat (Capra aegagrus). Five segregating sites with θ value of 0.0009 for this gene demonstrated a low diversity across the goat populations. Fu and Li tests were significantly positive but Tajima's D test was significantly negative, suggesting its deviation from neutrality. Neighbor joining tree of IRF3 gene in domesticated goats, wild goat and sheep showed that all domesticated goats are more closely related to one another than to the wild goat and sheep. Maximum likelihood tree of the gene showed that different domesticated goats share a common ancestor and suggests a single origin. Four unique haplotypes were observed across all the sequences, of which, one was particularly common to African goats (MOCH-K14-0425, Poitou and WAD). In assessing the evolution mode of the gene, we found that the codon model dN/dS ratio for all goats was greater than one. Phylogenetic Analysis by Maximum Likelihood (PAML) gave a ω0 (dN/dS) value of 0.067 with LnL value of -6900.3 for the first Model (M1) while ω2 = 1.667 in model M2 with LnL value of -6900.3 with positive selection inferred in 3 codon sites. Mechanistic empirical combination (MEC) model for evaluating adaptive selection pressure on particular codons also confirmed adaptive selection pressure in three codons (207, 358 and 408) in IRF3 gene. Positive diversifying selection inferred with recent evolutionary changes in domesticated goat IRF3

  2. A dual cis-regulatory code links IRF8 to constitutive and inducible gene expression in macrophages

    PubMed Central

    Mancino, Alessandra; Termanini, Alberto; Barozzi, Iros; Ghisletti, Serena; Ostuni, Renato; Prosperini, Elena; Ozato, Keiko

    2015-01-01

    The transcription factor (TF) interferon regulatory factor 8 (IRF8) controls both developmental and inflammatory stimulus-inducible genes in macrophages, but the mechanisms underlying these two different functions are largely unknown. One possibility is that these different roles are linked to the ability of IRF8 to bind alternative DNA sequences. We found that IRF8 is recruited to distinct sets of DNA consensus sequences before and after lipopolysaccharide (LPS) stimulation. In resting cells, IRF8 was mainly bound to composite sites together with the master regulator of myeloid development PU.1. Basal IRF8–PU.1 binding maintained the expression of a broad panel of genes essential for macrophage functions (such as microbial recognition and response to purines) and contributed to basal expression of many LPS-inducible genes. After LPS stimulation, increased expression of IRF8, other IRFs, and AP-1 family TFs enabled IRF8 binding to thousands of additional regions containing low-affinity multimerized IRF sites and composite IRF–AP-1 sites, which were not premarked by PU.1 and did not contribute to the basal IRF8 cistrome. While constitutively expressed IRF8-dependent genes contained only sites mediating basal IRF8/PU.1 recruitment, inducible IRF8-dependent genes contained variable combinations of constitutive and inducible sites. Overall, these data show at the genome scale how the same TF can be linked to constitutive and inducible gene regulation via distinct combinations of alternative DNA-binding sites. PMID:25637355

  3. Methylation of B-hordein genes in barley endosperm is inversely correlated with gene activity and affected by the regulatory gene Lys3.

    PubMed Central

    Sørensen, M B

    1992-01-01

    The methylation status of B-hordein genes in the developing barley endosperm was analyzed by digestion with methylation-sensitive restriction enzymes. Southern blotting revealed specific demethylation of Hpa II sites in DNA from wild-type endosperm, whereas leaf DNA and lys3a mutant endosperm DNA were highly methylated at these sites. Similar methylation patterns were observed at an Ava I site situated at position -260 in the B-hordein promoter. This differential methylation was confirmed by genomic sequencing with ligation-mediated PCR. The analyzed sequence covers most of the B-hordein promoter and includes 10 CpGs from the promoter and 4 CpGs from the adjacent coding region. These sites were all hypomethylated in wild-type endosperm, whereas, except for three partially methylated sites, full methylation was seen in leaf DNA. The four sites in the coding region were partially methylated in lys3a endosperm DNA, but the promoter sites remained highly methylated. The possible role of methylation in the regulatory function of the Lys3 gene product is discussed. PMID:1570338

  4. Sequence analysis of mouse vomeronasal receptor gene clusters reveals common promoter motifs and a history of recent expansion

    PubMed Central

    Lane, Robert P.; Cutforth, Tyler; Axel, Richard; Hood, Leroy; Trask, Barbara J.

    2002-01-01

    We have analyzed the organization and sequence of 73 V1R genes encoding putative pheromone receptors to identify regulatory features and characterize the evolutionary history of the V1R family. The 73 V1Rs arose from seven ancestral genes around the time of mouse–rat speciation through large local duplications, and this expansion may contribute to speciation events. Orthologous V1R genes appear to have been lost during primate evolution. Exceptional noncoding homology is observed across four V1R subfamilies at one cluster and thus may be important for locus-specific transcriptional regulation. PMID:11752409

  5. Detecting sequence homology at the gene cluster level with MultiGeneBlast.

    PubMed

    Medema, Marnix H; Takano, Eriko; Breitling, Rainer

    2013-05-01

    The genes encoding many biomolecular systems and pathways are genomically organized in operons or gene clusters. With MultiGeneBlast, we provide a user-friendly and effective tool to perform homology searches with operons or gene clusters as basic units, instead of single genes. The contextualization offered by MultiGeneBlast allows users to get a better understanding of the function, evolutionary history, and practical applications of such genomic regions. The tool is fully equipped with applications to generate search databases from GenBank or from the user's own sequence data. Finally, an architecture search mode allows searching for gene clusters with novel configurations, by detecting genomic regions with any user-specified combination of genes. Sources, precompiled binaries, and a graphical tutorial of MultiGeneBlast are freely available from http://multigeneblast.sourceforge.net/. PMID:23412913
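    The architecture search described above looks for genomic regions that contain a user-specified combination of genes. The following is a minimal sketch of that idea only, not MultiGeneBlast's actual algorithm or API: it assumes each gene on a contig has already been assigned a family label by some homology search, and it reports the shortest windows of consecutive genes that contain every required family. The function name, the family labels and the max_genes parameter are all hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def find_cluster_hits(
    gene_families: Sequence[str], required: Set[str], max_genes: int
) -> List[Tuple[int, int]]:
    """Return (start, end) gene-index windows, inclusive, that span at most
    max_genes consecutive genes and contain every family in `required`."""
    hits: List[Tuple[int, int]] = []
    for start, family in enumerate(gene_families):
        if family not in required:
            continue  # a minimal window always starts on a required gene
        seen: Set[str] = set()
        stop = min(start + max_genes, len(gene_families))
        for end in range(start, stop):
            if gene_families[end] in required:
                seen.add(gene_families[end])
            if seen == required:
                # A later start with the same end is a tighter window: keep it.
                if hits and hits[-1][1] == end:
                    hits[-1] = (start, end)
                else:
                    hits.append((start, end))
                break
    return hits


# Hypothetical contig: one family label per gene, in genomic order.
contig = ["pks", "transporter", "regulator", "hypothetical", "pks", "p450", "regulator"]
hits = find_cluster_hits(contig, {"pks", "p450", "regulator"}, max_genes=4)
print(hits)
```

    Treating the gene order as a list of labels keeps the sketch independent of any particular file format; a real search would also have to handle strands, contig boundaries and partial matches.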

  6. Reconstruction of large-scale gene regulatory networks using Bayesian model averaging.

    PubMed

    Kim, Haseong; Gelenbe, Erol

    2012-09-01

    Gene regulatory networks provide a systematic view of molecular interactions in a complex living system. However, constructing large-scale gene regulatory networks is one of the most challenging problems in systems biology. Also, large sets of biological data require a proper integration technique for reliable gene regulatory network construction. Here we present a new reverse engineering approach based on Bayesian model averaging which attempts to combine all the appropriate models describing interactions among genes. This Bayesian approach with a prior based on the Gibbs distribution provides an efficient means to integrate multiple sources of biological data. In a simulation study with a maximum of 2000 genes, our method shows better sensitivity than previous elastic-net and Gaussian graphical models, with a fixed specificity of 0.99. The study also shows that the proposed method outperforms the other standard methods for a DREAM dataset generated by nonlinear stochastic models. In brain tumor data analysis, three large-scale networks consisting of 4422 genes were built using the gene expression of non-tumor, low and high grade tumor mRNA expression samples, along with DNA-protein binding affinity information. We found that genes having a large variation of degree distribution among the three tumor networks are the ones that seem most involved in regulatory and developmental processes, which possibly gives a novel insight concerning conventional differentially expressed gene analysis. PMID:22987132

  7. Hypoxia-induced protein binding to O2-responsive sequences on the tyrosine hydroxylase gene.

    PubMed

    Norris, M L; Millhorn, D E

    1995-10-01

    We reported recently that the gene that encodes tyrosine hydroxylase (TH), the rate-limiting enzyme in the biosynthesis of catecholamines, is regulated by hypoxia in the dopaminergic cells of the mammalian carotid body (Czyzyk-Krzeska, M. F., Bayliss, D. A., Lawson, E. E. & Millhorn, D. E. (1992) J. Neurochem. 58, 1538-1546) and in pheochromocytoma (PC12) cells (Czyzyk-Krzeska, M. F., Furnari, B. A., Lawson, E. E. & Millhorn, D. E. (1994) J. Biol. Chem. 269, 760-764). Regulation of this gene during low O2 conditions occurs at both the level of transcription and RNA stability. Increased transcription during hypoxia is regulated by a region of the proximal promoter that extends from -284 to +27 bases, relative to the transcription start site. The present study was undertaken to further characterize the sequences that confer O2 responsiveness of the TH gene and to identify hypoxia-induced protein interactions with these sequences. Results from chloramphenicol acetyltransferase assays identified a region between bases -284 and -150 that contains the essential sequences for O2 regulation. This region contains a number of regulatory elements including AP1, AP2, and HIF-1. Gel shift assays revealed enhanced protein interactions at the AP1 and HIF-1 elements of the native gene. Further investigations using supershift and shift-Western analysis showed that c-Fos and JunB bind to the AP1 element during hypoxia and that these protein levels are stimulated by hypoxia. Mutation of the AP1 sequence prevented stimulation of transcription of the TH-chloramphenicol acetyltransferase reporter gene by hypoxia. PMID:7559551

  8. cis-Acting sequences required for expression of the divergently transcribed Drosophila melanogaster Sgs-7 and Sgs-8 glue protein genes

    SciTech Connect

    Hofmann, A.; Garfinkel, M.D.; Meyerowitz, E.M.

    1991-06-01

    The Sgs-7 and Sgs-8 glue genes at 68C are divergently transcribed and are separated by 475 bp. Fusion genes with Adh or lacZ coding sequences were constructed, and the expression of these genes, with different amounts of upstream sequences present, was tested by a transient expression procedure and by germ line transformation. A cis-acting element for both genes is located asymmetrically in the intergenic region between -211 and -43 bp relative to Sgs-7. It is required for correct expression of both genes. This element can confer the stage- and tissue-specific expression pattern of glue genes on a heterologous promoter. An 86-bp portion of the element, from -133 to -48 bp relative to Sgs-7, is shown to be capable of enhancing the expression of a truncated and therefore weakly expressed Sgs-3 fusion gene. Recently described common sequence motifs of glue gene regulatory elements.

  9. Distinct and Competitive Regulatory Patterns of Tumor Suppressor Genes and Oncogenes in Ovarian Cancer

    PubMed Central

    Zhao, Min; Sun, Jingchun; Zhao, Zhongming

    2012-01-01

    Background: So far, investigators have found numerous tumor suppressor genes (TSGs) and oncogenes (OCGs) that control cell proliferation and apoptosis during cancer development. Furthermore, TSGs and OCGs may act as modulators of transcription factors (TFs) to influence gene regulation. A comprehensive investigation of TSGs, OCGs, TFs, and their joint target genes at the network level may provide a deeper understanding of the post-translational modulation of TSGs and OCGs to TF gene regulation. Methodology/Principal Findings: In this study, we developed a novel computational framework for identifying target genes of TSGs and OCGs using TFs as bridges through the integration of protein-protein interactions and gene expression data. We applied this pipeline to ovarian cancer and constructed a three-layer regulatory network. In the network, the top layer was comprised of modulators (TSGs and OCGs), the middle layer included TFs, and the bottom layer contained target genes. Based on regulatory relationships in the network, we compiled TSG and OCG profiles and performed clustering analyses. Interestingly, we found TSGs and OCGs formed two distinct branches. The genes in the TSG branch were significantly enriched in DNA damage and repair, regulating macromolecule metabolism, cell cycle and apoptosis, while the genes in the OCG branch were significantly enriched in the ErbB signaling pathway. Remarkably, their specific targets showed a reversed functional enrichment in terms of apoptosis and the ErbB signaling pathway: the target genes regulated by OCGs only were enriched in anti-apoptosis and the target genes regulated by TSGs only were enriched in the ErbB signaling pathway. Conclusions/Significance: This study provides the first comprehensive investigation of the interplay of TSGs and OCGs in a regulatory network modulated by TFs. Our application in ovarian cancer revealed distinct regulatory patterns of TSGs and OCGs, suggesting a competitive regulatory mechanism acting
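    The three-layer network in this abstract (modulators on top, TFs in the middle, target genes at the bottom) can be sketched with two plain mappings. This is only an illustration of the data structure and of how "targets regulated by TSGs only" versus "by OCGs only" could be separated; it is not the authors' pipeline, and every gene and TF name below is a placeholder.

```python
from typing import Dict, Iterable, Set


def targets_of(
    modulators: Iterable[str],
    modulator_to_tfs: Dict[str, Set[str]],
    tf_to_targets: Dict[str, Set[str]],
) -> Set[str]:
    """Collect every target gene reachable from the modulators through a TF."""
    targets: Set[str] = set()
    for modulator in modulators:
        for tf in modulator_to_tfs.get(modulator, set()):
            targets |= tf_to_targets.get(tf, set())
    return targets


# Hypothetical toy network: top layer -> middle layer -> bottom layer.
modulator_to_tfs = {"TSG_A": {"TF1", "TF2"}, "OCG_B": {"TF2", "TF3"}}
tf_to_targets = {"TF1": {"g1", "g2"}, "TF2": {"g2", "g3"}, "TF3": {"g4"}}

tsg_targets = targets_of(["TSG_A"], modulator_to_tfs, tf_to_targets)
ocg_targets = targets_of(["OCG_B"], modulator_to_tfs, tf_to_targets)

tsg_only = tsg_targets - ocg_targets   # targets reached from TSGs only
ocg_only = ocg_targets - tsg_targets   # targets reached from OCGs only
shared = tsg_targets & ocg_targets     # targets reached from both
print(sorted(tsg_only), sorted(ocg_only), sorted(shared))
```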

  10. Nucleotide sequence of the tobacco (Nicotiana tabacum) anionic peroxidase gene

    SciTech Connect

    Diaz-De-Leon, F.; Klotz, K.L.; Lagrimini, L.M.

    1993-03-01

    Peroxidases have been implicated in numerous physiological processes including lignification (Grisebach, 1981), wound-healing (Espelie et al., 1986), phenol oxidation (Lagrimini, 1991), pathogen defense (Ye et al., 1990), and the regulation of cell elongation through the formation of interchain covalent bonds between various cell wall polymers (Fry, 1986; Goldberg et al., 1986; Bradley et al., 1992). However, a complete description of peroxidase action in vivo is not available because of the vast number of potential substrates and the existence of multiple isoenzymes. The tobacco anionic peroxidase is one of the better-characterized isoenzymes. This enzyme has been shown to oxidize a number of significant plant secondary compounds in vitro including cinnamyl alcohols, phenolic acids, and indole-3-acetic acid (Maeder, 1980; Lagrimini, 1991). A cDNA encoding the enzyme has been obtained, and this enzyme was shown to be expressed at the highest levels in lignifying tissues (xylem and tracheary elements) and also in epidermal tissue (Lagrimini et al., 1987). It was shown at this time that there were four distinct copies of the anionic peroxidase gene in tobacco (Nicotiana tabacum). A tobacco genomic DNA library was constructed in the λ phage EMBL3, from which two unique peroxidase genes were sequenced. One of these clones, λPOD1, was designated as a pseudogene when the exonic sequences were found to differ from the cDNA sequences by 1%, and several frame shifts in the coding sequences indicated a dysfunctional gene (the authors' unpublished results). The other clone, λPOD3, described in this manuscript, was designated as the functional tobacco anionic peroxidase gene because of 100% homology with the cDNA. Significant structural elements include an AS-2 box indicated in shoot-specific expression (Lam and Chua, 1989), a TATA box, and two intervening sequences. 10 refs., 1 tab.

  11. Sequence variations in the FAD2 gene in seeded pumpkins.

    PubMed

    Ge, Y; Chang, Y; Xu, W L; Cui, C S; Qu, S P

    2015-01-01

    Seeded pumpkins are important economic crops; the seeds contain various unsaturated fatty acids, such as oleic acid and linoleic acid, which are crucial for human and animal nutrition. The fatty acid desaturase-2 (FAD2) gene encodes delta-12 desaturase, which converts oleic acid to linoleic acid. However, little is known about sequence variations in FAD2 in seeded pumpkins. Twenty-seven FAD2 clones from 27 accessions of Cucurbita moschata, Cucurbita maxima, Cucurbita pepo, and Cucurbita ficifolia were obtained (1152 bp in total; a single gene without introns). More than 90% nucleotide identity was detected among the 27 FAD2 clones. Nucleotide substitution, rather than nucleotide insertion and deletion, led to sequence polymorphism in the 27 FAD2 clones. Furthermore, the 27 selected FAD2 clones all encoded the FAD2 enzyme (delta-12 desaturase) with amino acid sequence identities from 91.7 to 100% for 384 amino acids. The same main-function domain between amino acids 47 and 329 was identified. The four species clustered separately based on differences in the sequences that were identified using the unweighted pair group method with arithmetic mean. Geographic origin and species were found to be closely related to sequence variation in FAD2. PMID:26782391
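    The identity percentages quoted above are simple match counts over aligned positions. As a rough sketch, assuming the clones are already aligned and gap-free (which fits the abstract's statement that substitutions rather than insertions or deletions account for the polymorphism), pairwise identity can be computed as below. The example sequences are invented and are not FAD2 data.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match between two equal-length, pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Invented toy sequences, one mismatch in four positions.
print(percent_identity("ACGT", "ACGA"))

# Arithmetic check against the abstract: 1152 bp / 3 = 384 codons, and
# 352 identical residues out of 384 rounds to the reported lower bound.
codons = 1152 // 3
lower_bound = round(100.0 * 352 / codons, 1)
print(codons, lower_bound)
```

    Under these assumptions, the reported minimum of 91.7% over 384 residues corresponds to roughly 32 differing amino acids between the two most divergent clones.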

  12. The nucleotide sequences of several tRNA genes from rat mitochondria: common features and relatedness to homologous species.

    PubMed Central

    Cantatore, P; De Benedetto, C; Gadaleta, G; Gallerani, R; Kroon, A M; Holtrop, M; Lanave, C; Pepe, G; Quagliariello, C; Saccone, C; Sbisa, E

    1982-01-01

    We have determined the nucleotide sequences of thirteen rat mt tRNA genes. The features of the primary and secondary structures of these tRNAs show that those for Gln, Ser, and f-Met resemble, while those for Lys, Cys, and Trp depart strikingly from the universal type. The remainder are slightly abnormal. Among many mammalian mt DNA sequences, those of mt tRNA genes are highly conserved, thus suggesting for those genes an additional, perhaps regulatory, function. A simple evolutionary relationship between the tRNAs of animal mitochondria and those of eukaryotic cytoplasm, of lower eukaryotic mitochondria or of prokaryotes, is not evident owing to the extreme divergence of the tRNA sequences in the two groups. However, a slightly higher homology does exist between a few animal mt tRNAs and those from prokaryotes or from lower eukaryotic mitochondria. PMID:7099963

  13. Variable Genome Sequences of the Murine Pneumotropic Virus (Polyomaviridae) Regulatory Region Isolated from an Infected Mouse Tissue Viral Suspension

    PubMed Central

    Libbey, Jane E.

    2016-01-01

    The murine pneumotropic virus genome, isolated from an infected murine tissue homogenate, was sequenced to completion. The lungs, liver, spleen, and kidneys were the source of the tissue homogenate in order to mirror the heterogeneity of the virus population in vivo. The regulatory region sequence was found to be highly variable. PMID:27231357

  14. Variable Genome Sequences of the Murine Pneumotropic Virus (Polyomaviridae) Regulatory Region Isolated from an Infected Mouse Tissue Viral Suspension.

    PubMed

    Libbey, Jane E; Fujinami, Robert S

    2016-01-01

    The murine pneumotropic virus genome, isolated from an infected murine tissue homogenate, was sequenced to completion. The lungs, liver, spleen, and kidneys were the source of the tissue homogenate in order to mirror the heterogeneity of the virus population in vivo. The regulatory region sequence was found to be highly variable. PMID:27231357

  15. In transgenic rice, alpha- and beta-tubulin regulatory sequences control GUS amount and distribution through intron mediated enhancement and intron dependent spatial expression.

    PubMed

    Gianì, Silvia; Altana, Andrea; Campanoni, Prisca; Morello, Laura; Breviario, Diego

    2009-04-01

    The genomic upstream sequence of the rice tubulin gene OsTub6 has been cloned, sequenced and characterized. The 5'UTR sequence is interrupted by a 446 bp long leader intron. This feature is shared with two other rice beta-tubulin genes (OsTub4 and OsTub1) that, together with OsTub6, group in the same clade in the evolutionary phylogenetic tree of plant beta-tubulins. Similarly to OsTub4, the leader intron of OsTub6 is capable of sustaining intron mediated enhancement (IME) of gene expression in transient expression assays. A general picture is drawn for three rice alpha-tubulin and two rice beta-tubulin genes in which the first intron of the coding sequence for the former, and the intron present in the 5'UTR for the latter, are important elements for controlling gene expression. We used OsTua2:GUS, OsTua3:GUS, OsTub4:GUS and OsTub6:GUS chimeric constructs to investigate the in vivo pattern of beta-glucuronidase (GUS) expression in transgenic rice plants. The influence of the regulatory introns on expression patterns was evaluated for two of them, OsTua2 and OsTub4. We have thus characterized distinct patterns of expression attributable to each tubulin isotype and we have shown that the presence of the regulatory intron can greatly influence both the amount and the actual site of expression. We propose the term Intron Dependent Spatial Expression (IDSE) to highlight this latter effect. PMID:18668337

  16. Regulatory network analysis of transcription factors, microRNAs, target genes and host genes in human multiple myeloma.

    PubMed

    Huang, Zhuoyan; Xu, Zhiwen; Wang, Kunhao; Wang, Ning; Wang, Shang

    2015-11-01

    In recent years, molecular biologists have made great advances in microRNA (miRNA) and gene research on the pathogenesis of multiple myeloma (MM). Existing research data on transcription factors (TFs) and miRNAs are dispersed and unorganized, which prevents researchers from investigating the mechanism and analyzing the regulatory pathways of MM systematically. In our research, regulatory interactions among miRNAs, TFs, host genes and target genes were imported to construct regulatory networks at three levels: the abnormally expressed network, the related network and the global network. The abnormally expressed network was the primary focus because it was an experimentally validated topological network, and it systematically explained the regulatory mechanism of MM. Its significance lies in the suggestion that if each abnormally expressed gene and miRNA is corrected to its normal expression level by adjusting transcriptional control, the whole genetic expression network will return to a normal state, and MM may not relapse. Additionally, analyses and comparisons of the upstream and downstream elements of abnormally expressed miRNAs and genes in the three networks highlighted some important regulators and key signaling pathways. For example, STAT3 and hsa-miR-125b, and PIAS3 and hsa-miR-21, respectively formed self-adaptation feedback regulations. The current research proposes a novel perspective to systematically explain the regulatory mechanism of MM and may contribute to further research and therapy of carcinomas. PMID:26687742
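    The feedback regulations mentioned in this abstract pair a TF with a miRNA. One simple reading, assumed here and not stated in the abstract, is a two-node loop in a directed network, where A regulates B and B regulates A. The sketch below finds such reciprocal pairs in an edge list. The STAT3/hsa-miR-125b and PIAS3/hsa-miR-21 pairings come from the abstract, but the edge directions are written out only for illustration, and TF_X and gene_Y are placeholders.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]  # (regulator, regulated)


def feedback_pairs(edges: Iterable[Edge]) -> List[Tuple[str, str]]:
    """Return node pairs that regulate each other (two-node feedback loops)."""
    edge_set = set(edges)
    pairs = {
        tuple(sorted((source, target)))
        for source, target in edge_set
        if source != target and (target, source) in edge_set
    }
    return sorted(pairs)


# Illustrative edge list; directions are assumptions, see the note above.
edges = [
    ("STAT3", "hsa-miR-125b"),
    ("hsa-miR-125b", "STAT3"),
    ("PIAS3", "hsa-miR-21"),
    ("hsa-miR-21", "PIAS3"),
    ("TF_X", "gene_Y"),  # one-way edge, not a feedback loop
]
loops = feedback_pairs(edges)
print(loops)
```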

  17. Cloning, Sequencing, and Characterization of a Gene Cluster Involved in EDTA Degradation from the Bacterium BNC1

    PubMed Central

    Bohuslavek, Jan; Payne, Jason W.; Liu, Yong; Bolton, Harvey; Xun, Luying

    2001-01-01

    EDTA is a chelating agent, widely used in many industries. Because of its ability to mobilize heavy metals and radionuclides, it can be an environmental pollutant. The EDTA monooxygenases that initiate EDTA degradation have been purified and characterized in bacterial strains BNC1 and DSM 9103. However, the genes encoding the enzymes have not been reported. The EDTA monooxygenase gene was cloned by probing a genomic library of strain BNC1 with a probe generated from the N-terminal amino acid sequence of the monooxygenase. Sequencing of the cloned DNA fragment revealed a gene cluster containing eight genes. Two of the genes, emoA and emoB, were expressed in Escherichia coli, and the gene products, EmoA and EmoB, were purified and characterized. Both experimental data and sequence analysis showed that EmoA is a reduced flavin mononucleotide-utilizing monooxygenase and that EmoB is an NADH:flavin mononucleotide oxidoreductase. The two-enzyme system oxidized EDTA to ethylenediaminediacetate (EDDA) and nitrilotriacetate (NTA) to iminodiacetate (IDA) with the production of glyoxylate. The emoA and emoB genes were cotranscribed when BNC1 cells were grown on EDTA. Other genes in the cluster encoded a hypothetical transport system, a putative regulatory protein, and IDA oxidase that oxidizes IDA and EDDA. We concluded that this gene cluster is responsible for the initial steps of EDTA and NTA degradation. PMID:11157232

  18. A novel positive regulatory element for exfoliative toxin A gene expression in Staphylococcus aureus.

    PubMed

    Sakurai, Susumu; Suzuki, Hitoshi; Hata, Toshiaki; Yoshizawa, Yukio; Nakayama, Ritsuko; Machida, Katsuhiko; Masuda, Shogo; Tsukiyama, Takashi

    2004-04-01

    A 1.4 kb positive regulatory element (ETA(exp)) that controls staphylococcal exfoliative toxin A (sETA) transcription was cloned from Staphylococcus aureus. ETA(exp) is located upstream of the cloned 5.8 kb eta gene (etaJ1) obtained from the chromosomal DNA of S. aureus ZM, the standard ETA-producing strain. The cETA prepared from an Escherichia coli transformant into which the recombinant plasmid petaJ1 (5.8 kb eta/pUC9) had been introduced was expressed at high levels in the culture supernatant and the ammonium-sulfate-precipitated culture supernatant fraction as shown by immunoblotting and the single radial immunodiffusion test. However, cETA produced by the recombinant plasmid petaJ3 containing the 1.7 kb eta sequence (etaJ3) with a 1.45 kb ETA(exp)-deficient eta fragment (1.7 kb eta/pUC9) obtained from the 5.8 kb eta sequence by subcloning was not detected in either the culture supernatant or the ammonium-sulfate-precipitated culture supernatant fraction (167-fold concentrate of the culture supernatant) by immunoblotting or the single radial immunodiffusion test. A large amount of cETA was produced by the 1.7 kb eta sequence when it was linked to ETA(exp) amplified by PCR (1.7 kb eta-ETA(exp)/pUC9), regardless of the orientation of ETA(exp) insertion. Northern blot hybridization showed lower levels of the transcripts of the 1.7 kb eta sequence than of the 5.8 kb eta sequence. The rsETA prepared from an S. aureus transformant into which the recombinant plasmid 3.4 kb eta-ETA(exp)/pYT3 (pYT3-etaJ6) had been introduced was expressed at high levels in the culture supernatant fraction as shown by the latex agglutination test. However, the agglutination titre in the culture supernatant fraction of rsETA produced by the recombinant plasmid (1.7 kb eta/pYT3) containing the 1.7 kb eta sequence carrying the 1.4 kb ETA(exp)-deficient eta fragment (pYT3-etaJ3) was 2500-4000 times lower than that of pYT3-etaJ6. PMID:15073304

  19. Expanding the nitrogen regulatory protein superfamily: Homology detection at below random sequence identity.

    PubMed

    Kinch, Lisa N; Grishin, Nick V

    2002-07-01

    Nitrogen regulatory (PII) proteins are signal transduction molecules involved in controlling nitrogen metabolism in prokaryotes. PII proteins integrate the signals of intracellular nitrogen and carbon status into the control of enzymes involved in nitrogen assimilation. Using elaborate sequence similarity detection schemes, we show that five clusters of orthologs (COGs) and several small divergent protein groups belong to the PII superfamily and predict their structure to be a (betaalphabeta)(2) ferredoxin-like fold. Proteins from the newly emerged PII superfamily are present in all major phylogenetic lineages. The PII homologs are quite diverse, with below random (as low as 1%) pairwise sequence identities between some members of distant groups. Despite this sequence diversity, evidence suggests that the different subfamilies retain the PII trimeric structure important for ligand-binding site formation and maintain a conservation of conservations at residue positions important for PII function. Because most of the orthologous groups within the PII superfamily are composed entirely of hypothetical proteins, our remote homology-based structure prediction provides the only information about them. Analogous to structural genomics efforts, such prediction gives clues to the biological roles of these proteins and allows us to hypothesize about locations of functional sites on model structures or rationalize about available experimental information. For instance, conserved residues in one of the families map in close proximity to each other on PII structure, allowing for a possible metal-binding site in the proteins coded by the locus known to affect sensitivity to divalent metal ions. Presented analysis pushes the limits of sequence similarity searches and exemplifies one of the extreme cases of reliable sequence-based structure prediction. In conjunction with structural genomics efforts to shed light on protein function, our strategies make it possible to detect

  20. Genomic Locations of Conserved Noncoding Sequences and Their Proximal Protein-Coding Genes in Mammalian Expression Dynamics.

    PubMed

    Babarinde, Isaac Adeyemi; Saitou, Naruya

    2016-07-01

    Experimental studies have found the involvement of certain conserved noncoding sequences (CNSs) in the regulation of the proximal protein-coding genes in mammals. However, reported cases of long range enhancer activities and inter-chromosomal regulation suggest that proximity of CNSs to protein-coding genes might not be important for regulation. To test the importance of the CNS genomic location, we extracted the CNSs conserved between chicken and four mammalian species (human, mouse, dog, and cattle). These CNSs were confirmed to be under purifying selection. The intergenic CNSs are often found in clusters in gene deserts, where protein-coding genes are in paucity. The distribution pattern, ChIP-Seq, and RNA-Seq data suggested that the CNSs are more likely to be regulatory elements and not corresponding to long intergenic noncoding RNAs. Physical distances between CNS and their nearest protein coding genes were well conserved between human and mouse genomes, and CNS-flanking genes were often found in evolutionarily conserved genomic neighborhoods. ChIP-Seq signal and gene expression patterns also suggested that CNSs regulate nearby genes. Interestingly, genes with more CNSs have more evolutionarily conserved expression than those with fewer CNSs. These computationally obtained results suggest that the genomic locations of CNSs are important for their regulatory functions. In fact, various kinds of evolutionary constraints may be acting to maintain the genomic locations of CNSs and protein-coding genes in mammals to ensure proper regulation. PMID:27017584
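    The study above relies on the physical distance between each CNS and its nearest protein-coding gene. A minimal sketch of that measurement follows, assuming half-open (start, end) coordinates on a single chromosome and treating overlap as distance zero. The coordinates and gene names are invented; the abstract does not describe the authors' exact procedure.

```python
from typing import Dict, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def interval_gap(a: Interval, b: Interval) -> int:
    """Number of bases separating two intervals; 0 if they touch or overlap."""
    (a_start, a_end), (b_start, b_end) = a, b
    return max(0, b_start - a_end, a_start - b_end)


def nearest_gene(cns: Interval, genes: Dict[str, Interval]) -> Tuple[str, int]:
    """Return (gene name, distance) for the gene closest to the CNS."""
    if not genes:
        raise ValueError("no genes supplied")
    name = min(genes, key=lambda gene: interval_gap(cns, genes[gene]))
    return name, interval_gap(cns, genes[name])


# Invented coordinates for illustration only.
genes = {"geneA": (100, 500), "geneB": (1500, 2500)}
closest = nearest_gene((1000, 1200), genes)
print(closest)
```

    A linear scan is enough for a sketch; for genome-scale data one would normally sort the gene intervals and use binary search or an interval tree instead.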

  1. Informational structure of genetic sequences and nature of gene splicing

    NASA Astrophysics Data System (ADS)

    Trifonov, E. N.

    1991-10-01

    Only about 1/20 of the DNA of higher organisms codes for proteins, by means of the classical triplet code. The rest of the DNA sequences are largely silent, with unclear functions, if any. The triplet code is not the only code (message) carried by the sequences. There are three levels of molecular communication, where the same sequence "talks" to various biomolecules, while having, respectively, three different appearances: DNA, RNA and protein. Since the molecular structures and, hence, sequence-specific preferences of these are substantially different, the original DNA sequence has to carry simultaneously three types of sequence patterns (codes, messages), thus being a composite structure in which one and the same letter (nucleotide) is frequently involved in several overlapping codes of different nature. This multiplicity and overlapping of the codes is a unique feature of Gnomic, the language of genetic sequences. The coexisting codes have to be degenerate in various degrees to allow an optimal and concerted performance of all the encoded functions. There is an obvious conflict between the best possible performance of a given function and the necessity to compromise the quality of a given sequence pattern in favor of other patterns. It appears that the major role of various changes in the sequences on their "ontogenetic" way from DNA to RNA to protein, like RNA editing and splicing, or protein post-translational modifications, is to resolve such conflicts. New data are presented strongly indicating that gene splicing is such a device to resolve the conflict between the code of DNA folding in chromatin and the triplet code for protein synthesis.

  2. Statistical Inference and Reverse Engineering of Gene Regulatory Networks from Observational Expression Data

    PubMed Central

    Emmert-Streib, Frank; Glazko, Galina V.; Altay, Gökmen; de Matos Simoes, Ricardo

    2012-01-01

    In this paper, we present a systematic and conceptual overview of methods for inferring gene regulatory networks from observational gene expression data. Further, we discuss two classic approaches to infer causal structures and compare them with contemporary methods by providing a conceptual categorization thereof. We complement the above by surveying global and local evaluation measures for assessing the performance of inference algorithms. PMID:22408642
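    The abstract above mentions global measures for assessing inferred networks without naming them. As one commonly used example, and not necessarily the measures surveyed in the paper, predicted edges can be compared with a reference network using precision, recall and the F-score. The sketch treats edges as undirected pairs, and the toy networks are invented.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Edge = FrozenSet[str]


def undirected(pairs: Iterable[Tuple[str, str]]) -> Set[Edge]:
    """Turn (a, b) pairs into a set of orientation-free edges."""
    return {frozenset(pair) for pair in pairs}


def edge_scores(predicted: Set[Edge], truth: Set[Edge]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of predicted edges against the reference."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented reference and predicted networks.
truth = undirected([("a", "b"), ("b", "c"), ("c", "d")])
predicted = undirected([("b", "a"), ("c", "d"), ("a", "d")])
precision, recall, f1 = edge_scores(predicted, truth)
print(precision, recall, f1)
```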

  3. The acu-1 gene of Coprinus cinereus is a regulatory gene required for induction of acetate utilisation enzymes.

    PubMed

    Maconochie, M K; Connerton, I F; Casselton, L A

    1992-08-01

    We have isolated a gene from Coprinus cinereus which cross-hybridises to the facA and acu-5 genes of Aspergillus nidulans and Neurospora crassa, respectively. These genes encode acetyl-CoA synthetase, an enzyme which is inducible by acetate and required for growth on acetate as sole carbon source. We have designated the C. cinereus gene acs-1 and have used transformation to demonstrate its functional homology to the ascomycete genes by complementation of an N. crassa acu-5 mutation. The acs-1 gene has never been identified by mutation; mutations leading to loss of acetyl-CoA synthetase function map to another gene, acu-1. Using Northern analyses we have shown that acu-1 has a regulatory function that is required for acetate-induced transcription of acs-1 and of another acetate utilisation gene, acu-7, the isocitrate lyase structural gene. PMID:1354839

  4. Novel regulatory cascades controlling expression of nitrogen-fixation genes in Geobacter sulfurreducens

    PubMed Central

    Ueki, Toshiyuki; Lovley, Derek R.

    2010-01-01

    Geobacter species often play an important role in bioremediation of environments contaminated with metals or organics and show promise for harvesting electricity from waste organic matter in microbial fuel cells. The ability of Geobacter species to fix atmospheric nitrogen is an important metabolic feature for these applications. We identified novel regulatory cascades controlling nitrogen-fixation gene expression in Geobacter sulfurreducens. Unlike the regulatory mechanisms known in other nitrogen-fixing microorganisms, nitrogen-fixation gene regulation in G. sulfurreducens is controlled by two two-component His–Asp phosphorelay systems. One of these systems appears to be the master regulatory system that activates transcription of the majority of nitrogen-fixation genes and represses a gene encoding glutamate dehydrogenase during nitrogen fixation. The other system whose expression is directly activated by the master regulatory system appears to control by antitermination the expression of a subset of the nitrogen-fixation genes whose transcription is activated by the master regulatory system and whose promoter contains transcription termination signals. This study provides a new paradigm for nitrogen-fixation gene regulation. PMID:20660485

  5. Integrated microRNA-mRNA analyses reveal OPLL specific microRNA regulatory network using high-throughput sequencing.

    PubMed

    Xu, Chen; Chen, Yu; Zhang, Hao; Chen, Yuanyuan; Shen, Xiaolong; Shi, Changgui; Liu, Yang; Yuan, Wen

    2016-01-01

    Ossification of the posterior longitudinal ligament (OPLL) is a genetic disorder which involves pathological heterotopic ossification of the spinal ligaments. Although studies have identified several genes that correlated with OPLL, the underlying regulation network is far from clear. Through small RNA sequencing, we compared the microRNA expressions of primary posterior longitudinal ligament cells from OPLL patients with normal patients (PLL) and identified 218 dysregulated miRNAs (FDR < 0.01). Furthermore, assessing the miRNA profiling data of multiple cell types, we found these dysregulated miRNAs were mostly OPLL specific. In order to decipher the regulation network of these OPLL specific miRNAs, we integrated mRNA expression profiling data with miRNA sequencing data. Through computational approaches, we showed the pivotal roles of these OPLL specific miRNAs in heterotopic ossification of longitudinal ligament by discovering highly correlated miRNA/mRNA pairs that associated with skeletal system development, collagen fibril organization, and extracellular matrix organization. These results provide strong evidence that the miRNA regulatory networks we established may indeed play vital roles in OPLL onset and progression. To date, this is the first systematic analysis of the micronome in OPLL, and thus may provide valuable resources in finding novel treatment and diagnostic targets of OPLL. PMID:26868491

  6. LmSmdB: an integrated database for metabolic and gene regulatory network in Leishmania major and Schistosoma mansoni

    PubMed Central

    Patel, Priyanka; Mandlik, Vineetha; Singh, Shailza

    2015-01-01

    It is essential that a database integrating all the information required for biological processing be stored in one platform. We have attempted to create one such integrated database that can serve as a one-stop shop for the essential features required to obtain valuable results. LmSmdB (L. major and S. mansoni database) is an integrated database that accounts for the biological networks and regulatory pathways computationally determined by integrating knowledge of the genome sequences of these organisms. It is the first database of its kind that shows, together with the network design, the simulation pattern of the product. This database is intended to provide comprehensive coverage of the regulation of lipid metabolism reactions in the parasite by integrating the transcription factors, regulatory genes, and the protein products controlled by the transcription factors, and hence operating the metabolism at the genetic level. PMID:26981382

  7. Complete sequence and gene organization of the Nosema spodopterae rRNA gene.

    PubMed

    Tsai, Shu-Jen; Huang, Wei-Fone; Wang, Chung-Hsiung

    2005-01-01

    By sequencing the entire ribosomal RNA (rRNA) gene of Nosema spodopterae, we show here that its gene organization follows a pattern similar to the Nosema type species, Nosema bombycis, i.e. 5'-large subunit rRNA (2,497 bp)-internal transcribed spacer (185 bp)-small subunit rRNA (1,232 bp)-intergenic spacer (277 bp)-5S rRNA (114 bp)-3'. Gene sequences and the secondary structures of large subunit rRNA, small subunit rRNA, and 5S rRNA are compared with the known corresponding sequences and structures of closely related microsporidia. The results suggest that the Nosema genus may be heterogeneous and that the rRNA gene organization may be a useful characteristic for determining which species are closely related to the type species. PMID:15702980

  8. Expressed sequence tag analysis of functional genes associated with adventitious rooting in Liriodendron hybrids.

    PubMed

    Zhong, Y D; Sun, X Y; Liu, E Y; Li, Y Q; Gao, Z; Yu, F X

    2016-01-01

    Liriodendron hybrids (Liriodendron chinense x L. tulipifera) are important landscaping and afforestation hardwood trees. To date, little genomic research on adventitious rooting has been reported in these hybrids, as well as in the genus Liriodendron. In the present study, we used adventitious roots to construct the first cDNA library for Liriodendron hybrids. A total of 5176 expressed sequence tags (ESTs) were generated and clustered into 2921 unigenes. Among these unigenes, 2547 had significant homology to the non-redundant protein database representing a wide variety of putative functions. Homologs of these genes regulated many aspects of adventitious rooting, including those for auxin signal transduction and root hair development. Results of quantitative real-time polymerase chain reaction showed that AUX1, IRE, and FB1 were highly expressed in adventitious roots and the expression of AUX1, ARF1, NAC1, RHD1, and IRE increased during the development of adventitious roots. Additionally, 181 simple sequence repeats were identified from 166 ESTs and more than 91.16% of these were dinucleotide and trinucleotide repeats. To the best of our knowledge, the present study reports the identification of the genes associated with adventitious rooting in the genus Liriodendron for the first time and provides a valuable resource for future genomic studies. Expression analysis of selected genes could allow us to identify regulatory genes that may be essential for adventitious rooting. PMID:27420958

  9. Gene Regulatory Network Inference of Immunoresponsive Gene 1 (IRG1) Identifies Interferon Regulatory Factor 1 (IRF1) as Its Transcriptional Regulator in Mammalian Macrophages

    PubMed Central

    Tallam, Aravind; Perumal, Thaneer M.; Antony, Paul M.; Jäger, Christian; Fritz, Joëlle V.; Vallar, Laurent; Balling, Rudi; del Sol, Antonio; Michelucci, Alessandro

    2016-01-01

    Immunoresponsive gene 1 (IRG1) is one of the highest induced genes in macrophages under pro-inflammatory conditions. Its function has been recently described: it codes for immune-responsive gene 1 protein/cis-aconitic acid decarboxylase (IRG1/CAD), an enzyme catalysing the production of itaconic acid from cis-aconitic acid, a tricarboxylic acid (TCA) cycle intermediate. Itaconic acid possesses specific antimicrobial properties inhibiting isocitrate lyase, the first enzyme of the glyoxylate shunt, an anaplerotic pathway that bypasses the TCA cycle and enables bacteria to survive on limited carbon conditions. To elucidate the mechanisms underlying itaconic acid production through IRG1 induction in macrophages, we examined the transcriptional regulation of IRG1. To this end, we studied IRG1 expression in human immune cells under different inflammatory stimuli, such as TNFα and IFNγ, in addition to lipopolysaccharides. Under these conditions, as previously shown in mouse macrophages, IRG1/CAD accumulates in mitochondria. Furthermore, using literature information and transcription factor prediction models, we re-constructed raw gene regulatory networks (GRNs) for IRG1 in mouse and human macrophages. We further implemented a contextualization algorithm that relies on genome-wide gene expression data to infer putative cell type-specific gene regulatory interactions in mouse and human macrophages, which allowed us to predict potential transcriptional regulators of IRG1. Among the computationally identified regulators, siRNA-mediated gene silencing of interferon regulatory factor 1 (IRF1) in macrophages significantly decreased the expression of IRG1/CAD at the gene and protein level, which correlated with a reduced production of itaconic acid. Using a synergistic approach of both computational and experimental methods, we here shed more light on the transcriptional machinery of IRG1 expression and could pave the way to therapeutic approaches targeting itaconic acid levels

  10. Gene Regulatory Network Inference of Immunoresponsive Gene 1 (IRG1) Identifies Interferon Regulatory Factor 1 (IRF1) as Its Transcriptional Regulator in Mammalian Macrophages.

    PubMed

    Tallam, Aravind; Perumal, Thaneer M; Antony, Paul M; Jäger, Christian; Fritz, Joëlle V; Vallar, Laurent; Balling, Rudi; Del Sol, Antonio; Michelucci, Alessandro

    2016-01-01

    Immunoresponsive gene 1 (IRG1) is one of the highest induced genes in macrophages under pro-inflammatory conditions. Its function has been recently described: it codes for immune-responsive gene 1 protein/cis-aconitic acid decarboxylase (IRG1/CAD), an enzyme catalysing the production of itaconic acid from cis-aconitic acid, a tricarboxylic acid (TCA) cycle intermediate. Itaconic acid possesses specific antimicrobial properties inhibiting isocitrate lyase, the first enzyme of the glyoxylate shunt, an anaplerotic pathway that bypasses the TCA cycle and enables bacteria to survive on limited carbon conditions. To elucidate the mechanisms underlying itaconic acid production through IRG1 induction in macrophages, we examined the transcriptional regulation of IRG1. To this end, we studied IRG1 expression in human immune cells under different inflammatory stimuli, such as TNFα and IFNγ, in addition to lipopolysaccharides. Under these conditions, as previously shown in mouse macrophages, IRG1/CAD accumulates in mitochondria. Furthermore, using literature information and transcription factor prediction models, we re-constructed raw gene regulatory networks (GRNs) for IRG1 in mouse and human macrophages. We further implemented a contextualization algorithm that relies on genome-wide gene expression data to infer putative cell type-specific gene regulatory interactions in mouse and human macrophages, which allowed us to predict potential transcriptional regulators of IRG1. Among the computationally identified regulators, siRNA-mediated gene silencing of interferon regulatory factor 1 (IRF1) in macrophages significantly decreased the expression of IRG1/CAD at the gene and protein level, which correlated with a reduced production of itaconic acid. Using a synergistic approach of both computational and experimental methods, we here shed more light on the transcriptional machinery of IRG1 expression and could pave the way to therapeutic approaches targeting itaconic acid levels

  11. Nucleotide sequence of Bacillus phage Nf terminal protein gene.

    PubMed Central

    Leavitt, M C; Ito, J

    1987-01-01

    The nucleotide sequence of Bacillus phage Nf gene E has been determined. Gene E codes for phage terminal protein which is the primer necessary for the initiation of DNA replication. The deduced amino acid sequence of Nf terminal protein is approximately 66% homologous with the terminal proteins of Bacillus phages PZA and φ29, and shows similar hydropathy and secondary structure predictions. A serine which has been identified as the residue which covalently links the protein to the 5' end of the genome in φ29, is conserved in all three phages. The hydropathic and secondary structural environment of this serine is similar in these phage terminal proteins and also similar to the linking serine of adenovirus terminal protein. PMID:3601672

  12. The nucleotide sequence of the gB glycoprotein gene of HSV-2 and comparison with the corresponding gene of HSV-1.

    PubMed

    Bzik, D J; D