Science.gov

Sample records for gene sequences regulatory

  1. Modeling DNA sequence-based cis-regulatory gene networks.

    PubMed

    Bolouri, Hamid; Davidson, Eric H

    2002-06-01

    Gene network analysis requires computationally based models which represent the functional architecture of regulatory interactions, and which provide directly testable predictions. The type of model that is useful is constrained by the particular features of developmentally active cis-regulatory systems. These systems function by processing diverse regulatory inputs, generating novel regulatory outputs. A computational model which explicitly accommodates this basic concept was developed earlier for the cis-regulatory system of the endo16 gene of the sea urchin. This model represents the genetically mandated logic functions that the system executes, but also shows how time-varying kinetic inputs are processed in different circumstances into particular kinetic outputs. The same basic design features can be utilized to construct models that connect the large number of cis-regulatory elements constituting developmental gene networks. The ultimate aim of the network models discussed here is to represent the regulatory relationships among the genomic control systems of the genes in the network, and to state their functional meaning. The target site sequences of the cis-regulatory elements of these genes constitute the physical basis of the network architecture. Useful models for developmental regulatory networks must represent the genetic logic by which the system operates, but must also be capable of explaining the real time dynamics of cis-regulatory response as kinetic input and output data become available. Most importantly, however, such models must display in a direct and transparent manner fundamental network design features such as intra- and intercellular feedback circuitry; the sources of parallel inputs into each cis-regulatory element; gene battery organization; and use of repressive spatial inputs in specification and boundary formation. Successful network models lead to direct tests of key architectural features by targeted cis-regulatory analysis. PMID

  2. Nucleotide sequence and temporal expression of a baculovirus regulatory gene.

    PubMed

    Guarino, L A; Summers, M D

    1987-07-01

    The nucleotide sequence of a trans-activating regulatory gene (IE-1) of the baculovirus Autographa californica nuclear polyhedrosis virus has been determined. This gene encodes a protein of 581 amino acids with a predicted molecular weight of 66,856. A DNA fragment containing the entire coding sequence of IE-1 was inserted downstream of an RNA promoter. Subsequent cell-free transcription and translation directed the synthesis of a single peptide with an apparent molecular weight of 70,000. Quantitative S1 nuclease analysis indicated that IE-1 was maximally synthesized during a 1-h virus adsorption period and that steady-state levels of IE-1 message were maintained during the first 24 h of infection. Northern blot hybridization indicated that several late transcripts which overlap the IE-1 gene were transcribed from both strands. The precise locations of the 5' and 3' ends of these overlapping transcripts were mapped using S1 nuclease. The overlapping transcripts were grouped in two transcriptional units. One unit was composed of IE-1 and overlapping gamma transcripts which initiated upstream of IE-1 and terminated downstream of IE-1. The other unit, transcribed from the opposite strand, consisted of gamma transcripts with coterminal 5' ends and extended 3' ends. The shorter, more abundant transcripts in this unit overlapped 30 to 40 bases of IE-1 at the 3' end, while the longer transcripts overlapped the entire IE-1 gene. Transcription of several early A. californica nuclear polyhedrosis virus genes, in addition to 39K, was shown to be trans-activated by IE-1, indicating that IE-1 may have a central role in the regulation of beta-gene expression. PMID:16789264

  3. Sequence-based model of gap gene regulatory network

    PubMed Central

    2014-01-01

    Background The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts great interest among researchers studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of the genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer the underlying regulatory mechanism and prove its sufficiency. Results We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used a four-fold cross-validation test and fitting to a random dataset to validate the model and prove its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. Conclusions The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from the literature. We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are
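
    The site regulatory weight described in this abstract can be illustrated with a minimal sketch. The abstract does not state how the difference is normalized, so dividing by the residual sum of squares of the full site set is an assumption here, and the additive toy model, the site names and the expression values are all hypothetical stand-ins for the authors' dynamical model.

```python
from typing import Callable, Dict, List, Sequence


def rss(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Residual sum of squares between observed and predicted expression."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def site_regulatory_weights(
    sites: List[str],
    observed: Sequence[float],
    predict: Callable[[List[str]], Sequence[float]],
) -> Dict[str, float]:
    """Leave-one-site-out weights, normalized by the full-model RSS (an assumption)."""
    full_rss = rss(observed, predict(sites))
    if full_rss == 0:
        raise ValueError("full-model RSS is zero; the normalization is undefined")
    weights = {}
    for site in sites:
        # Re-evaluate the model with the site of interest excluded.
        reduced = [s for s in sites if s != site]
        weights[site] = (rss(observed, predict(reduced)) - full_rss) / full_rss
    return weights


# Hypothetical toy model: each site adds a fixed contribution at three positions.
TOY_CONTRIBUTIONS = {
    "siteA": [1.0, 0.0, 0.0],
    "siteB": [0.0, 2.0, 0.0],
    "siteC": [0.0, 0.0, 0.1],
}


def toy_predict(sites: List[str]) -> List[float]:
    """Sum the contributions of the sites that are present."""
    prediction = [0.0, 0.0, 0.0]
    for site in sites:
        for i, value in enumerate(TOY_CONTRIBUTIONS[site]):
            prediction[i] += value
    return prediction


observed_expression = [1.1, 2.0, 0.0]  # made-up data
weights = site_regulatory_weights(
    list(TOY_CONTRIBUTIONS), observed_expression, toy_predict
)
for site, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{site}: {weight:.2f}")
```

    A large positive weight means the fit degrades sharply when the site is removed; a weight near zero or below means the site contributes little or even worsens the fit in this toy setting.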

  4. Analysis of mammalian gene batteries reveals both stable ancestral cores and highly dynamic regulatory sequences

    PubMed Central

    Ettwiller, Laurence; Budd, Aidan; Spitz, François; Wittbrodt, Joachim

    2008-01-01

    Background Changes in gene regulation are suspected to comprise one of the driving forces for evolution. To address the extent of cis-regulatory changes and how they impact on gene regulatory networks across eukaryotes, we systematically analyzed the evolutionary dynamics of target gene batteries controlled by 16 different transcription factors. Results We found that gene batteries show variable conservation within vertebrates, with slow and fast evolving modules. Hence, while a key gene battery associated with the cell cycle is conserved throughout metazoans, the POU5F1 (Oct4) and SOX2 batteries in embryonic stem cells show strong conservation within mammals, with the striking exception of rodents. Within the genes composing a given gene battery, we could identify a conserved core that likely reflects the ancestral function of the corresponding transcription factor. Interestingly, we show that the association between a transcription factor and its target genes is conserved even when we exclude conserved sequence similarities of their promoter regions from our analysis. This supports the idea that turnover, either of the transcription factor binding site or its direct neighboring sequence, is a pervasive feature of proximal regulatory sequences. Conclusions Our study reveals the dynamics of evolutionary changes within metazoan gene networks, including both the composition of gene batteries and the architecture of target gene promoters. This variation provides the playground required for evolutionary innovation around conserved ancestral core functions. PMID:19087242

  5. Multiple Cis-Acting Sequences Contribute to Evolved Regulatory Variation for Drosophila Adh Genes

    PubMed Central

    Fang, X. M.; Brennan, M. D.

    1992-01-01

    Drosophila affinidisjuncta and Drosophila hawaiiensis are closely related species that display distinct tissue-specific expression patterns for their homologous alcohol dehydrogenase genes (Adh genes). In Drosophila melanogaster transformants, both genes are expressed at high levels in the larval and adult fat bodies, but the D. affinidisjuncta gene is expressed 10-50-fold more strongly in the larval and adult midguts and Malpighian tubules. The present study reports the mapping of cis-acting sequences contributing to the regulatory differences between these two genes in transformants. Chimeric genes were constructed and introduced into the germ line of D. melanogaster. Stage- and tissue-specific expression patterns were determined by measuring steady-state RNA levels in larvae and adults. Three portions of the promoter region make distinct contributions to the tissue-specific regulatory differences between the native genes. Sequences immediately upstream of the distal promoter have a strong effect in the adult Malpighian tubules, while sequences between the two promoters are relatively important in the larval Malpighian tubules. A third gene segment, immediately upstream of the proximal promoter, influences levels of the proximal Adh transcript in all tissues and developmental stages examined, and largely accounts for the regulatory difference in the larval and adult midguts. However, these as well as other sequences make smaller contributions to various aspects of the tissue-specific regulatory differences. In addition, some chimeric genes display aberrant RNA levels for the whole organism, suggesting close physical association between sequences involved in tissue-specific regulatory differences and those important for Adh expression in the larval and adult fat bodies. PMID:1644276

  6. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity

    PubMed Central

    Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H.; Allen, Andrew S.; Goldstein, David B.

    2015-01-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene’s proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene’s regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen’s Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance
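
    The idea of comparing observed and predicted levels of standing variation can be sketched as a regression residual. The abstract does not give the regression details, so the form below (common variant counts regressed on total variant counts, with residuals scaled by their standard deviation) is an assumption modeled loosely on residual-based intolerance scores, not the authors' ncRVIS procedure. Gene names and counts are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, Tuple


def residual_intolerance_scores(counts: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Map gene -> scaled residual of common variants given total variants.

    counts maps each gene to (total_variants, common_variants) observed in its
    regulatory sequence. Negative scores mean fewer common variants than the
    simple linear fit predicts, i.e. the sequence looks less tolerant.
    """
    genes = list(counts)
    totals = [counts[g][0] for g in genes]
    commons = [counts[g][1] for g in genes]

    # Ordinary least squares fit: common = intercept + slope * total.
    x_mean, y_mean = mean(totals), mean(commons)
    sxx = sum((x - x_mean) ** 2 for x in totals)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(totals, commons))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    residuals = [y - (intercept + slope * x) for x, y in zip(totals, commons)]
    scale = pstdev(residuals)  # simplified scaling, not true studentization
    return {g: r / scale for g, r in zip(genes, residuals)}


# Hypothetical counts: (total variants, common variants) per gene.
toy_counts = {
    "GENE_A": (10, 2),
    "GENE_B": (20, 4),
    "GENE_C": (30, 3),
    "GENE_D": (40, 8),
}
scores = residual_intolerance_scores(toy_counts)
for gene, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{gene}: {score:+.2f}")
```

    In this toy data GENE_C carries fewer common variants than its total variant count would predict, so it receives the most negative score.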

  7. Transposable elements: an abundant and natural source of regulatory sequences for host genes.

    PubMed

    Rebollo, Rita; Romanish, Mark T; Mager, Dixie L

    2012-01-01

    The fact that transposable elements (TEs) can influence host gene expression was first recognized more than 50 years ago. However, since that time, TEs have been widely regarded as harmful genetic parasites, selfish elements that are rarely co-opted by the genome to serve a beneficial role. Here, we survey recent findings that relate to TE impact on host genes and remind the reader that TEs, in contrast to other noncoding parts of the genome, are uniquely suited to gene regulatory functions. We review recent studies that demonstrate the role of TEs in establishing and rewiring gene regulatory networks and discuss the overall ubiquity of exaptation. We suggest that although individuals within a population can be harmed by the deleterious effects of new TE insertions, the presence of TE sequences in a genome is of overall benefit to the population. PMID:22905872

  8. Cloning and nucleotide sequence of luxR, a regulatory gene controlling bioluminescence in Vibrio harveyi.

    PubMed Central

    Showalter, R E; Martin, M O; Silverman, M R

    1990-01-01

    Mutagenesis with transposon mini-Mulac was used previously to identify a regulatory locus necessary for expression of bioluminescence genes, lux, in Vibrio harveyi (M. Martin, R. Showalter, and M. Silverman, J. Bacteriol. 171:2406-2414, 1989). Mutants with transposon insertions in this regulatory locus were used to construct a hybridization probe which was used in this study to detect recombinants in a cosmid library containing the homologous DNA. Recombinant cosmids with this DNA stimulated expression of the genes encoding enzymes for luminescence, i.e., the luxCDABE operon, which were positioned in trans on a compatible replicon in Escherichia coli. Transposon mutagenesis and analysis of the DNA sequence of the cloned DNA indicated that regulatory function resided in a single gene of about 0.6-kilobases named luxR. Expression of bioluminescence in V. harveyi and in the fish light-organ symbiont Vibrio fischeri is controlled by density-sensing mechanisms involving the accumulation of small signal molecules called autoinducers, but similarity of the two luminescence systems at the molecular level was not apparent in this study. The amino acid sequence of the LuxR product of V. harveyi, which indicates a structural relationship to some DNA-binding proteins, is not similar to the sequence of the protein that regulates expression of luminescence in V. fischeri. In addition, reconstitution of autoinducer-controlled luminescence in recombinant E. coli, already achieved with lux genes cloned from V. fischeri, was not accomplished with the isolation of luxR from V. harveyi, suggesting a requirement for an additional regulatory component. PMID:2160932

  9. The PAZAR database of gene regulatory information coupled to the ORCA toolkit for the study of regulatory sequences

    PubMed Central

    Portales-Casamar, Elodie; Arenillas, David; Lim, Jonathan; Swanson, Magdalena I.; Jiang, Steven; McCallum, Anthony; Kirov, Stefan; Wasserman, Wyeth W.

    2009-01-01

    The PAZAR database unites independently created and maintained data collections of transcription factor and regulatory sequence annotation. The flexible PAZAR schema permits the representation of diverse information derived from experiments ranging from biochemical protein–DNA binding to cellular reporter gene assays. Data collections can be made available to the public, or restricted to specific system users. The data ‘boutiques’ within the shopping-mall-inspired system facilitate the analysis of genomics data and the creation of predictive models of gene regulation. Since its initial release, PAZAR has grown in terms of data, features and through the addition of an associated package of software tools called the ORCA toolkit (ORCAtk). ORCAtk allows users to rapidly develop analyses based on the information stored in the PAZAR system. PAZAR is available at http://www.pazar.info. ORCAtk can be accessed through convenient buttons located in the PAZAR pages or via our website at http://www.cisreg.ca/ORCAtk. PMID:18971253

  10. Regulatory sequence analysis tools.

    PubMed

    van Helden, Jacques

    2003-07-01

    The web resource Regulatory Sequence Analysis Tools (RSAT) (http://rsat.ulb.ac.be/rsat) offers a collection of software tools dedicated to the prediction of regulatory sites in non-coding DNA sequences. These tools include sequence retrieval, pattern discovery, pattern matching, genome-scale pattern matching, feature-map drawing, random sequence generation and other utilities. Alternative formats are supported for the representation of regulatory motifs (strings or position-specific scoring matrices) and several algorithms are proposed for pattern discovery. RSAT currently holds >100 fully sequenced genomes and these data are regularly updated from GenBank.
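
    Pattern matching with a position-specific scoring matrix, one of the motif representations mentioned above, can be sketched as follows. This is a generic log-odds scan written for illustration and is not RSAT's implementation; the count matrix, the uniform background, the pseudocount and the threshold are all assumed values.

```python
import math
from typing import Dict, List, Tuple

BASES = "ACGT"


def build_log_odds(
    counts: List[Dict[str, int]], background: float = 0.25, pseudocount: float = 1.0
) -> List[Dict[str, float]]:
    """Convert per-position base counts into log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + len(BASES) * pseudocount
        matrix.append(
            {
                b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
                for b in BASES
            }
        )
    return matrix


def scan(
    sequence: str, matrix: List[Dict[str, float]], threshold: float
) -> List[Tuple[int, str, float]]:
    """Return (start, window, score) for forward-strand windows scoring >= threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous symbols such as N
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Hypothetical count matrix for a 4-bp TATA-like motif built from 8 sites.
toy_counts = [{"T": 8}, {"A": 8}, {"T": 8}, {"A": 8}]
toy_matrix = build_log_odds(toy_counts)
hits = scan("GGTATACC", toy_matrix, threshold=4.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

    The sketch scans only the given strand; a fuller version would also score the reverse complement and derive the threshold from a background score distribution.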

  11. Conserved regulatory elements of the promoter sequence of the gene rpoH of enteric bacteria

    PubMed Central

    Ramírez-Santos, Jesús; Collado-Vides, Julio; García-Varela, Martin; Gómez-Eichelmann, M. Carmen

    2001-01-01

    The rpoH regulatory region of different members of the enteric bacteria family was sequenced or downloaded from GenBank and compared. In addition, the transcriptional start sites of rpoH of Yersinia frederiksenii and Proteus mirabilis, two distant members of this family, were determined. Sequences similar to the σ70 promoters P1, P4 and P5, to the σE promoter P3 and to boxes DnaA1, DnaA2, cAMP receptor protein (CRP) boxes CRP1, CRP2 and box CytR present in Escherichia coli K12, were identified in sequences of closely related bacteria such as E. coli, Shigella flexneri, Salmonella enterica serovar Typhimurium, Citrobacter freundii, Enterobacter cloacae and Klebsiella pneumoniae. In more distant bacteria, Y. frederiksenii and P. mirabilis, the rpoH regulatory region has a distal P1-like σ70 promoter and two proximal promoters: a heat-induced σE-like promoter and a σ70 promoter. Sequences similar to the regulatory boxes were not identified in these bacteria. This study suggests that the general pattern of transcription of the rpoH gene in enteric bacteria includes a distal σ70 promoter, >200 nt upstream of the initiation codon, and two proximal promoters: a heat-induced σE-like promoter and a σ70 promoter. A second proximal σ70 promoter under catabolite-regulation is probably present only in bacteria closely related to E. coli. PMID:11139607

  12. Coordinate cytokine regulatory sequences

    DOEpatents

    Frazer, Kelly A.; Rubin, Edward M.; Loots, Gabriela G.

    2005-05-10

    The present invention provides CNS sequences that regulate the cytokine gene expression, expression cassettes and vectors comprising or lacking the CNS sequences, host cells and non-human transgenic animals comprising the CNS sequences or lacking the CNS sequences. The present invention also provides methods for identifying compounds that modulate the functions of CNS sequences as well as methods for diagnosing defects in the CNS sequences of patients.

  13. Sequence analysis of the myosin regulatory light chain gene of the vestimentiferan Riftia pachyptila.

    PubMed

    Ravaux, J; Hassanin, A; Deutsch, J; Gaill, F; Markmann-Mulisch, U

    2001-01-24

    We have isolated and characterized a cDNA (DNA complementary to RNA) clone (Rf69) from the vestimentiferan Riftia pachyptila. The cDNA insert consists of 1169 base pairs. The amino acid sequence deduced from the longest reading frame is 193 residues in length, and clearly characterizes it as a myosin regulatory light chain (RLC). The RLC primary structure is described in relation to its function in muscle contraction. The comparison with other RLCs suggested that Riftia myosin is probably regulated through its RLC either by phosphorylation like the vertebrate smooth muscle myosins, and/or by Ca2+-binding like the mollusk myosins. Riftia RLC possesses an N-terminal extension lacking in all other species besides the earthworm Lumbricus terrestris. Amino acid sequence comparisons with a number of RLCs from vertebrates and invertebrates revealed a relatively high identity score (64%) between Riftia RLC and the homologous gene from Lumbricus. The relationships between the members of the myosin RLCs were examined by two phylogenetic methods, i.e. distance matrix and maximum parsimony. The resulting trees depict the grouping of the RLCs according to their role in myosin activity regulation. In all trees, Riftia RLC groups with RLCs that depend on Ca2+-binding for myosin activity regulation. PMID:11223252

  14. MoD Tools: regulatory motif discovery in nucleotide sequences from co-regulated or homologous genes.

    PubMed

    Pavesi, Giulio; Mereghetti, Paolo; Zambelli, Federico; Stefani, Marco; Mauri, Giancarlo; Pesole, Graziano

    2006-07-01

    Understanding the complex mechanisms regulating gene expression at the transcriptional and post-transcriptional levels is one of the greatest challenges of the post-genomic era. The MoD (MOtif Discovery) Tools web server comprises a set of tools for the discovery of novel conserved sequence and structure motifs in nucleotide sequences, motifs that in turn are good candidates for regulatory activity. The server includes the following programs: Weeder, for the discovery of conserved transcription factor binding sites (TFBSs) in nucleotide sequences from co-regulated genes; WeederH, for the discovery of conserved TFBSs and distal regulatory modules in sequences from homologous genes; RNAProfile, for the discovery of conserved secondary structure motifs in unaligned RNA sequences whose secondary structure is not known. In this way, a given gene can be compared with other co-regulated genes or with its homologs, or its mRNA can be analyzed for conserved motifs regulating its post-transcriptional fate. The web server thus provides researchers with different strategies and methods to investigate the regulation of gene expression, at both the transcriptional and post-transcriptional levels. Available at http://www.pesolelab.it/modtools/ and http://www.beacon.unimi.it/modtools/.

  15. Different regulatory sequences control creatine kinase-M gene expression in directly injected skeletal and cardiac muscle.

    PubMed Central

    Vincent, C K; Gualberto, A; Patel, C V; Walsh, K

    1993-01-01

    Regulatory sequences of the M isozyme of the creatine kinase (MCK) gene have been extensively mapped in skeletal muscle, but little is known about the sequences that control cardiac-specific expression. The promoter and enhancer sequences required for MCK gene expression were assayed by the direct injection of plasmid DNA constructs into adult rat cardiac and skeletal muscle. A 700-nucleotide fragment containing the enhancer and promoter of the rabbit MCK gene activated the expression of a downstream reporter gene in both muscle tissues. Deletion of the enhancer significantly decreased expression in skeletal muscle but had no detectable effect on expression in cardiac muscle. Further deletions revealed a CArG sequence motif at position -179 within the promoter that was essential for cardiac-specific expression. The CArG element of the MCK promoter bound to the recombinant serum response factor and YY1, transcription factors which control expression from structurally similar elements in the skeletal actin and c-fos promoters. MCK-CArG-binding activities that were similar or identical to serum response factor and YY1 were also detected in extracts from adult cardiac muscle. These data suggest that the MCK gene is controlled by different regulatory programs in adult cardiac and skeletal muscle. PMID:8423791

  16. Using BAC transgenesis in zebrafish to identify regulatory sequences of the amyloid precursor protein gene in humans

    PubMed Central

    2012-01-01

    Background Non-coding DNA in and around the human Amyloid Precursor Protein (APP) gene that is central to Alzheimer’s disease (AD) shares little sequence similarity with that of appb in zebrafish. Identifying DNA domains regulating expression of the gene in such situations becomes a challenge. Taking advantage of the zebrafish system that allows rapid functional analyses of gene regulatory sequences, we previously showed that two discontinuous DNA domains in zebrafish appb are important for expression of the gene in neurons: an enhancer in intron 1 and sequences 28–31 kb upstream of the gene. Here we identify the putative transcription factor binding sites responsible for this distal cis-acting regulation, and use that information to identify a regulatory region of the human APP gene. Results Functional analyses of intron 1 enhancer mutations in enhancer-trap BACs expressed as transgenes in zebrafish identified putative binding sites of two known transcription factor proteins, E4BP4/NFIL3 and Forkhead, to be required for expression of appb. A cluster of three E4BP4 sites at −31 kb is also shown to be essential for neuron-specific expression, suggesting that the dependence of expression on upstream sequences is mediated by these E4BP4 sites. E4BP4/NFIL3 and XFD1 sites in the intron enhancer and E4BP4/NFIL3 sites at −31 kb specifically and efficiently bind the corresponding zebrafish proteins in vitro. These sites are statistically over-represented in both the zebrafish appb and the human APP genes, although their locations are different. Remarkably, a cluster of four E4BP4 sites in intron 4 of human APP exists in actively transcribing chromatin in a human neuroblastoma cell line, SHSY5Y, expressing APP as shown using chromatin immunoprecipitation (ChIP) experiments. Thus although the two genes share little sequence conservation, they appear to share the same regulatory logic and are regulated by a similar set of transcription factors. Conclusion The

  17. A Catalog of Regulatory Sequences for Trait Gene for the Genome Editing of Wheat

    PubMed Central

    Makai, Szabolcs; Tamás, László; Juhász, Angéla

    2016-01-01

    Wheat has been cultivated for 10,000 years and ever since the origin of hexaploid wheat it has been exempt from natural selection. Instead, it was under the constant selective pressure of human agriculture from harvest to sowing during every year, producing a vast array of varieties. Wheat has been adopted globally, accumulating variation for genes involved in yield traits, environmental adaptation and resistance. However, one small but important part of the wheat genome has hardly changed: the regulatory regions of both the x- and y-type high molecular weight glutenin subunit (HMW-GS) genes, which are alone responsible for approximately 12% of the grain protein content. The phylogeny of the HMW-GS regulatory regions of the Triticeae demonstrates that a genetic bottleneck may have led to its decreased diversity during domestication and the subsequent cultivation. It has also highlighted the fact that the wild relatives of wheat may offer an unexploited genetic resource for the regulatory region of these genes. Significant research efforts have been made in the public sector and by international agencies, using wild crosses to exploit the available genetic variation, and as a result synthetic hexaploids are now being utilized by a number of breeding companies. However, a newly emerging tool of genome editing provides significantly improved efficiency in exploiting the natural variation in HMW-GS genes and incorporating this into elite cultivars and breeding lines. Recent advancement in the understanding of the regulation of these genes underlines the need for an overview of the regulatory elements for genome editing purposes. PMID:27766102

  18. Nucleotide sequence of the regulatory locus controlling expression of bacterial genes for bioluminescence.

    PubMed Central

    Engebrecht, J; Silverman, M

    1987-01-01

    Production of light by the marine bacterium Vibrio fischeri and by recombinant hosts containing cloned lux genes is controlled by the density of the culture. Density-dependent regulation of lux gene expression has been shown to require a locus consisting of the luxR and luxI genes and two closely linked divergent promoters. As part of a genetic analysis to understand the regulation of bioluminescence, we have sequenced the region of DNA containing this control circuit. Open reading frames corresponding to luxR and luxI were identified; transcription start sites were defined by S1 nuclease mapping and sequences resembling promoter elements were located. PMID:3697093

  19. Targeted genomic sequencing of follicular dendritic cell sarcoma reveals recurrent alterations in NF-κB regulatory genes.

    PubMed

    Griffin, Gabriel K; Sholl, Lynette M; Lindeman, Neal I; Fletcher, Christopher D M; Hornick, Jason L

    2016-01-01

    Follicular dendritic cell sarcoma is a rare mesenchymal neoplasm with a variable and unpredictable clinical course. The genetic alterations that drive tumorigenesis in follicular dendritic cell sarcoma are largely unknown. One recent study performed BRAF sequencing and found V600E mutations in 5 of 27 (19%) cases. No other recurrent genetic alterations have been reported. The aim of the present study was to identify somatic alterations in follicular dendritic cell sarcoma by targeted sequencing of a panel of 309 known cancer-associated genes. DNA was isolated from formalin-fixed paraffin-embedded tissue from 13 cases of follicular dendritic cell sarcoma and submitted for hybrid capture-based enrichment and massively parallel sequencing with the Illumina HiSeq 2500 platform. Recurrent loss-of-function alterations were observed in tumor suppressor genes involved in the negative regulation of NF-κB activation (5 of 13 cases, 38%) and cell cycle progression (4 of 13 cases, 31%). Loss-of-function alterations in the NF-κB regulatory pathway included three cases with frameshift mutations in NFKBIA and two cases with bi-allelic loss of CYLD. Both cases with CYLD loss were metastases and carried concurrent alterations in at least one cell cycle regulatory gene. Alterations in cell cycle regulatory genes included two cases with bi-allelic loss of CDKN2A, one case with bi-allelic loss of RB1, and one case with a nonsense mutation in RB1. Last, focal copy-number gain of chromosome 9p24 including the genes CD274 (PD-L1) and PDCD1LG2 (PD-L2) was noted in three cases, which represents a well-described mechanism of immune evasion in cancer. These findings provide the first insight into the unique genomic landscape of follicular dendritic cell sarcoma and suggest shared mechanisms of tumorigenesis with a subset of other tumor types, notably B-cell lymphomas.

  20. The Effects of Sequence Variation on Genome-wide NRF2 Binding—New Target Genes and Regulatory SNPs

    PubMed Central

    Kuosmanen, Suvi M.; Viitala, Sari; Laitinen, Tuomo; Peräkylä, Mikael; Pölönen, Petri; Kansanen, Emilia; Leinonen, Hanna; Raju, Suresh; Wienecke-Baldacchino, Anke; Närvänen, Ale; Poso, Antti; Heinäniemi, Merja; Heikkinen, Sami; Levonen, Anna-Liisa

    2016-01-01

    Transcription factor binding specificity is crucial for proper target gene regulation. Motif discovery algorithms identify the main features of the binding patterns, but the accuracy on the lower affinity sites is often poor. Nuclear factor E2-related factor 2 (NRF2) is a ubiquitous redox-activated transcription factor having a key protective role against endogenous and exogenous oxidant and electrophile stress. Herein, we decipher the effects of sequence variation on the DNA binding sequence of NRF2, in order to identify both genome-wide binding sites for NRF2 and disease-associated regulatory SNPs (rSNPs) with drastic effects on NRF2 binding. Interactions between NRF2 and DNA were studied using molecular modelling, and NRF2 chromatin immunoprecipitation-sequence datasets together with protein binding microarray measurements were utilized to study binding sequence variation in detail. The binding model thus generated was used to identify genome-wide binding sites for NRF2, and genomic binding sites with rSNPs that have strong effects on NRF2 binding and reside on active regulatory elements in human cells. As a proof of concept, miR-126–3p and -5p were identified as NRF2 target microRNAs, and an rSNP (rs113067944) residing on the NRF2 target gene (Ferritin, light polypeptide, FTL) promoter was experimentally verified to decrease NRF2 binding and result in decreased transcriptional activity. PMID:26826707

  1. Cloning and Characterization of 5′ Flanking Regulatory Sequences of AhLEC1B Gene from Arachis Hypogaea L.

    PubMed Central

    Tang, Guiying; Xu, Pingli; Liu, Wei; Liu, Zhanji; Shan, Lei

    2015-01-01

    LEAFY COTYLEDON1 (LEC1) is a B subunit of Nuclear Factor Y (NF-YB) transcription factor that mainly accumulates during embryo development. We cloned the 5′ flanking regulatory sequence of AhLEC1B gene, a homolog of Arabidopsis LEC1, and analyzed its regulatory elements using online software. To identify the crucial regulatory region, we generated a series of GUS expression frameworks driven by different length promoters with 5′ terminal and/or 3′ terminal deletion. We further characterized the GUS expression patterns in the transgenic Arabidopsis lines. Our results show that both the 65bp proximal promoter region and the 52bp 5′ UTR of AhLEC1B contain the key motifs required for the essential promoting activity. Moreover, AhLEC1B is preferentially expressed in the embryo and is co-regulated by binding of its upstream genes with both positive and negative corresponding cis-regulatory elements. PMID:26426444

  2. Population sequencing of two endocannabinoid metabolic genes identifies rare and common regulatory variants associated with extreme obesity and metabolite level

    PubMed Central

    2010-01-01

    Background: Targeted re-sequencing of candidate genes in individuals at the extremes of a quantitative phenotype distribution is a method of choice to gain information on the contribution of rare variants to disease susceptibility. The endocannabinoid system mediates signaling in the brain and peripheral tissues involved in the regulation of energy balance, is highly active in obese patients, and represents a strong candidate pathway to examine for genetic association with body mass index (BMI). Results: We sequenced two intervals (covering 188 kb) encoding the endocannabinoid metabolic enzymes fatty-acid amide hydrolase (FAAH) and monoglyceride lipase (MGLL) in 147 normal controls and 142 extremely obese cases. After applying quality filters, we called 1,393 high quality single nucleotide variants, 55% of which are rare, and 143 indels. Using single marker tests and collapsed marker tests, we identified four intervals associated with BMI: the FAAH promoter, the MGLL promoter, MGLL intron 2, and MGLL intron 3. Two of these intervals are composed of rare variants and the majority of the associated variants are located in promoter sequences or in predicted transcriptional enhancers, suggesting a regulatory role. The set of rare variants in the FAAH promoter associated with BMI is also associated with increased level of FAAH substrate anandamide, further implicating a functional role in obesity. Conclusions: Our study, which is one of the first reports of a sequence-based association study using next-generation sequencing of candidate genes, provides insights into study design and analysis approaches and demonstrates the importance of examining regulatory elements rather than exclusively focusing on exon sequences. PMID:21118518
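
    The idea behind a collapsed marker test can be sketched as follows: rare variants in an interval are pooled into a single carrier/non-carrier indicator per individual, and carrier counts are then compared between cases and controls. The example below uses a two-sided Fisher exact test on a 2x2 table. The genotypes, site names and function names are hypothetical, and the abstract does not state which collapsing statistic the authors used, so this is one common variant of the approach rather than their method.

```python
from math import comb


def collapse_carriers(genotypes, rare_sites):
    """Count individuals carrying at least one alternate allele at any rare site.

    genotypes: list of dicts mapping site name -> alternate allele count.
    """
    return sum(
        1 for genotype in genotypes
        if any(genotype.get(site, 0) > 0 for site in rare_sites)
    )


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    return sum(p for p in (prob(x) for x in range(low, high + 1))
               if p <= observed * (1 + 1e-9))


# Hypothetical toy data: s1-s3 are rare sites, c1 is a common site that is ignored.
RARE_SITES = {"s1", "s2", "s3"}
CASES = [{"s1": 1}, {"s2": 1}, {"s1": 1, "s3": 1}, {"s3": 2}, {}, {"c1": 1}]
CONTROLS = [{}, {"c1": 1}, {}, {"c1": 2}, {}, {}]

case_carriers = collapse_carriers(CASES, RARE_SITES)
control_carriers = collapse_carriers(CONTROLS, RARE_SITES)
P_VALUE = fisher_exact_two_sided(
    case_carriers, len(CASES) - case_carriers,
    control_carriers, len(CONTROLS) - control_carriers,
)
print(case_carriers, control_carriers, round(P_VALUE, 4))
```

    Collapsing trades resolution for power: no single rare variant needs to reach significance, but variants with opposite effects in the same interval can cancel each other out.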

  3. RSAT: regulatory sequence analysis tools.

    PubMed

    Thomas-Chollier, Morgane; Sand, Olivier; Turatsinze, Jean-Valéry; Janky, Rekin's; Defrance, Matthieu; Vervisch, Eric; Brohée, Sylvain; van Helden, Jacques

    2008-07-01

    The regulatory sequence analysis tools (RSAT, http://rsat.ulb.ac.be/rsat/) is a software suite that integrates a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. The suite includes programs for sequence retrieval, pattern discovery, phylogenetic footprint detection, pattern matching, genome scanning and feature map drawing. Random controls can be performed with random gene selections or by generating random sequences according to a variety of background models (Bernoulli, Markov). Beyond the original word-based pattern-discovery tools (oligo-analysis and dyad-analysis), we recently added a battery of tools for matrix-based detection of cis-acting elements, with some original features (adaptive background models, Markov-chain estimation of P-values) that do not exist in other matrix-based scanning tools. The web server offers an intuitive interface, where each program can be accessed either separately or connected to the other tools. In addition, the tools are now available as web services, enabling their integration in programmatic workflows. Genomes are regularly updated from various genome repositories (NCBI and EnsEMBL) and 682 organisms are currently supported. Since 1998, the tools have been used by several hundreds of researchers from all over the world. Several predictions made with RSAT were validated experimentally and published.
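
    The random controls mentioned above can be illustrated with a small sketch that estimates a Bernoulli (independent bases) model and a first-order Markov model from a training sequence, then draws a random sequence from them. This is not RSAT code and does not use the RSAT web services; the function names and the training sequence are hypothetical, and the model order is fixed at 1 for brevity.

```python
import random
from collections import Counter, defaultdict

ALPHABET = "ACGT"


def estimate_bernoulli(sequence):
    """Base frequencies, treating every position as independent."""
    counts = Counter(sequence)
    total = sum(counts[base] for base in ALPHABET)
    return {base: counts[base] / total for base in ALPHABET}


def estimate_markov1(sequence, pseudocount=1.0):
    """First-order transition probabilities P(next base | previous base)."""
    pair_counts = defaultdict(Counter)
    for previous, following in zip(sequence, sequence[1:]):
        pair_counts[previous][following] += 1
    transitions = {}
    for previous in ALPHABET:
        total = sum(pair_counts[previous][b] for b in ALPHABET) + pseudocount * len(ALPHABET)
        transitions[previous] = {
            b: (pair_counts[previous][b] + pseudocount) / total for b in ALPHABET
        }
    return transitions


def random_sequence(length, transitions, start_freqs, rng):
    """Draw the first base from start_freqs, then follow the Markov chain."""
    if length <= 0:
        return ""
    bases = list(ALPHABET)
    sequence = rng.choices(bases, weights=[start_freqs[b] for b in bases])
    while len(sequence) < length:
        row = transitions[sequence[-1]]
        sequence += rng.choices(bases, weights=[row[b] for b in bases])
    return "".join(sequence)


# Hypothetical training sequence standing in for a set of upstream regions.
TRAINING = "ATATATTTTAAACGCGATATTTAAATATCGATTTTAAATATA"
FREQS = estimate_bernoulli(TRAINING)
TRANSITIONS = estimate_markov1(TRAINING)
SAMPLE = random_sequence(50, TRANSITIONS, FREQS, random.Random(0))
print(SAMPLE)
```

    A Markov background preserves dinucleotide composition, so a motif that stands out against it is less likely to be an artefact of local base composition than one scored against a Bernoulli model.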

  4. Sox2 regulatory region 2 sequence works as a DNA nuclear targeting sequence enhancing the efficiency of an exogenous gene expression in ES cells

    SciTech Connect

    Funabashi, Hisakage; Takatsu, Makoto; Saito, Mikako; Matsuoka, Hideaki

    2010-10-01

    Research highlights: SV40-DTS worked as a DTS in ES cells as well as in other types of cells. Sox2 regulatory region 2 worked as a DTS in ES cells and thus was termed SRR2-DTS. SRR2-DTS was suggested as an ES cell-specific DTS. Abstract: In this report, the effects of two DNA nuclear targeting sequence (DTS) candidates on the gene expression efficiency in ES cells were investigated. Reporter plasmids containing the simian virus 40 (SV40) promoter/enhancer sequence (SV40-DTS), a DTS for various types of cells that has not yet been reported for ES cells, and the 81 base pairs of Sox2 regulatory region 2 (SRR2) where two transcriptional factors in ES cells, Oct3/4 and Sox2, are bound (SRR2-DTS), were introduced into the cytoplasm of living cells by femtoinjection. The gene expression efficiencies of each plasmid in mouse insulinoma cell line MIN6 cells and mouse ES cells were then evaluated. Plasmids including SV40-DTS and SRR2-DTS exhibited higher gene expression efficiency compared with plasmids without these DTSs, and thus it was concluded that both sequences work as a DTS in ES cells. In addition, it was suggested that SRR2-DTS works as an ES cell-specific DTS. To the best of our knowledge, this is the first report to confirm the function of DTSs in ES cells.

  5. Evolutionary origin of a novel gene expression pattern through co-option of the latent activities of existing regulatory sequences.

    PubMed

    Rebeiz, Mark; Jikomes, Nick; Kassner, Victoria A; Carroll, Sean B

    2011-06-21

    Spatiotemporal changes in gene expression underlie many evolutionary novelties in nature. However, the evolutionary origins of novel expression patterns, and the transcriptional control elements ("enhancers") that govern them, remain unclear. Here, we sought to explore the molecular genetic mechanisms by which new enhancers arise. We undertook a survey of closely related Drosophila species to identify recently evolved novel gene expression patterns and traced their evolutionary history. Analyses of gene expression in a variety of developing tissues of the Drosophila melanogaster species subgroup revealed high rates of expression pattern divergence, including numerous evolutionary losses, heterochronic shifts, and expansions or contractions of expression domains. However, gains of novel expression patterns were much less frequent. One gain was observed for the Neprilysin-1 (Nep1) gene, which has evolved a unique expression pattern in optic lobe neuroblasts of Drosophila santomea. Dissection of the Nep1 cis-regulatory region localized a newly derived optic lobe enhancer activity to a region of an intron that has accumulated a small number of mutations. The Nep1 optic lobe enhancer overlaps with other enhancer activities, from which the novel activity was co-opted. We suggest that the novel optic lobe enhancer evolved by exploiting the cryptic activity of extant regulatory sequences, and this may reflect a general mechanism whereby new enhancers evolve.

  6. Nucleotide sequence and cloning in Bacillus subtilis of the Bacillus stearothermophilus pleiotropic regulatory gene degT.

    PubMed Central

    Takagi, M; Takada, H; Imanaka, T

    1990-01-01

    The regulatory gene (degT) from Bacillus stearothermophilus NCA1503 which enhanced production of extracellular alkaline protease (Apr) was cloned in Bacillus subtilis with pTB53 as a vector. When B. subtilis MT-2 (Npr- [deficiency of neutral protease] Apr+) was transformed with the recombinant plasmid, pDT145, the plasmid carrier produced about three times more alkaline protease than did the wild-type strain. In contrast, when B. subtilis DB104 (Npr- Apr-) was used as a host, the transformant with pDT145 could not exhibit any protease activity. After construction of the deletion plasmids, DNA sequencing was done. A large open reading frame was found, and nucleotide sequence analysis showed that the degT gene was composed of 1,116 bases (372 amino acid residues, molecular weight of 41,244). A Shine-Dalgarno sequence was found nine bases upstream from the open reading frame. A B. subtilis strain carrying degT showed the following pleiotropic phenomena: (i) enhancement of production of extracellular enzymes such as alkaline protease and levansucrase, (ii) repression of autolysin activity, (iii) decrease of transformation efficiency for B. subtilis (competent cell procedure), (iv) altered control of sporulation, (v) loss of flagella, and (vi) abnormal cell division. When B. stearothermophilus SIC1 was transformed with the recombinant plasmid carrying degT, the transformants exhibited abnormal cell division. These phenomena are similar to those of the phenotypes of degSU(Hy) (hyperproduction), degQ(Hy), and degR mutants of B. subtilis. However, the amino acid sequence of the degT product (DegT) is different from those of the reported gene products. Furthermore, DegT includes a hydrophobic core region in the N-terminal portion (amino acid numbers 50 to 160), a consensus sequence for a DNA binding region (amino acid numbers 160 to 179), and a region homologous to transcription activator proteins (amino acid numbers 351 to 366). We discuss the possibility that the membrane

  7. Regulation of the germinal center gene program by interferon (IFN) regulatory factor 8/IFN consensus sequence-binding protein

    PubMed Central

    Lee, Chang Hoon; Melchers, Mark; Wang, Hongsheng; Torrey, Ted A.; Slota, Rebecca; Qi, Chen-Feng; Kim, Ji Young; Lugar, Patricia; Kong, Hee Jeong; Farrington, Lila; van der Zouwen, Boris; Zhou, Jeff X.; Lougaris, Vassilios; Lipsky, Peter E.; Grammer, Amrie C.; Morse, Herbert C.

    2006-01-01

    Interferon (IFN) consensus sequence-binding protein/IFN regulatory factor 8 (IRF8) is a transcription factor that regulates the differentiation and function of macrophages, granulocytes, and dendritic cells through activation or repression of target genes. Although IRF8 is also expressed in lymphocytes, its roles in B cell and T cell maturation or function are ill defined, and few transcriptional targets are known. Gene expression profiling of human tonsillar B cells and mouse B cell lymphomas showed that IRF8 transcripts were expressed at highest levels in centroblasts, either from secondary lymphoid tissue or transformed cells. In addition, staining for IRF8 was most intense in tonsillar germinal center (GC) dark-zone centroblasts. To discover B cell genes regulated by IRF8, we transfected purified primary tonsillar B cells with enhanced green fluorescent protein–tagged IRF8, generated small interfering RNA knockdowns of IRF8 expression in a mouse B cell lymphoma cell line, and examined the effects of a null mutation of IRF8 on B cells. Each approach identified activation-induced cytidine deaminase (AICDA) and BCL6 as targets of transcriptional activation. Chromatin immunoprecipitation studies demonstrated in vivo occupancy of 5′ sequences of both genes by IRF8 protein. These results suggest previously unappreciated roles for IRF8 in the transcriptional regulation of B cell GC reactions that include direct regulation of AICDA and BCL6. PMID:16380510

  8. Regulation of the germinal center gene program by interferon (IFN) regulatory factor 8/IFN consensus sequence-binding protein.

    PubMed

    Lee, Chang Hoon; Melchers, Mark; Wang, Hongsheng; Torrey, Ted A; Slota, Rebecca; Qi, Chen-Feng; Kim, Ji Young; Lugar, Patricia; Kong, Hee Jeong; Farrington, Lila; van der Zouwen, Boris; Zhou, Jeff X; Lougaris, Vassilios; Lipsky, Peter E; Grammer, Amrie C; Morse, Herbert C

    2006-01-23

    Interferon (IFN) consensus sequence-binding protein/IFN regulatory factor 8 (IRF8) is a transcription factor that regulates the differentiation and function of macrophages, granulocytes, and dendritic cells through activation or repression of target genes. Although IRF8 is also expressed in lymphocytes, its roles in B cell and T cell maturation or function are ill defined, and few transcriptional targets are known. Gene expression profiling of human tonsillar B cells and mouse B cell lymphomas showed that IRF8 transcripts were expressed at highest levels in centroblasts, either from secondary lymphoid tissue or transformed cells. In addition, staining for IRF8 was most intense in tonsillar germinal center (GC) dark-zone centroblasts. To discover B cell genes regulated by IRF8, we transfected purified primary tonsillar B cells with enhanced green fluorescent protein-tagged IRF8, generated small interfering RNA knockdowns of IRF8 expression in a mouse B cell lymphoma cell line, and examined the effects of a null mutation of IRF8 on B cells. Each approach identified activation-induced cytidine deaminase (AICDA) and BCL6 as targets of transcriptional activation. Chromatin immunoprecipitation studies demonstrated in vivo occupancy of 5' sequences of both genes by IRF8 protein. These results suggest previously unappreciated roles for IRF8 in the transcriptional regulation of B cell GC reactions that include direct regulation of AICDA and BCL6.

  9. Regulatory function of conserved sequences upstream of the long-wave sensitive opsin genes in teleost fishes.

    PubMed

    Tam, Kevin J; Watson, Corey T; Massah, Shabnam; Kolybaba, Addie M; Breden, Felix; Prefontaine, Gratien G; Beischlag, Timothy V

    2011-11-01

    Vertebrate opsin genes often occur in sets of tandem duplicates, and their expression varies developmentally and in response to environmental cues. We previously identified two highly conserved regions upstream of the long-wave sensitive opsin (LWS) gene cluster in teleosts. This region has since been shown in zebrafish to drive expression of LWS genes in vivo. In order to further investigate how elements in this region control opsin gene expression, we tested constructs encompassing the highly conserved regions and the less conserved portions upstream of the coding sequences in a promoter-less luciferase expression system. A ∼4500 bp construct of the upstream region, including the highly-conserved regions Reg I and Reg II, increased expression 100-fold, and successive 5' deletions reduced expression relative to the full 4.5 Kb region. Gene expression was highest when the transcription factor RORα was co-transfected with the proposed regulatory regions. Because these regions were tested in a promoter-less expression system, they include elements able to initiate and drive transcription. Teleosts exhibit complex color-mediated adaptive behavior and their adaptive significance has been well documented in several species. Therefore these upstream regions of LWS represent a model system for understanding the molecular basis of adaptive variation in gene regulation of color vision.

  10. Oxytocin receptor gene sequences in owl monkeys and other primates show remarkable interspecific regulatory and protein coding variation.

    PubMed

    Babb, Paul L; Fernandez-Duque, Eduardo; Schurr, Theodore G

    2015-10-01

    The oxytocin (OT) hormone pathway is involved in numerous physiological processes, and one of its receptor genes (OXTR) has been implicated in pair bonding behavior in mammalian lineages. This observation is important for understanding social monogamy in primates, which occurs in only a small subset of taxa, including Azara's owl monkey (Aotus azarae). To examine the potential relationship between social monogamy and OXTR variation, we sequenced its 5' regulatory (4936bp) and coding (1167bp) regions in 25 owl monkeys from the Argentinean Gran Chaco, and examined OXTR sequences from 1092 humans from the 1000 Genomes Project. We also assessed interspecific variation of OXTR in 25 primate and rodent species that represent a set of phylogenetically and behaviorally disparate taxa. Our analysis revealed substantial variation in the putative 5' regulatory region of OXTR, with marked structural differences across primate taxa, particularly for humans and chimpanzees, which exhibited unique patterns of large motifs of dinucleotide A+T repeats upstream of the OXTR 5' UTR. In addition, we observed a large number of amino acid substitutions in the OXTR CDS region among New World primate taxa that distinguish them from Old World primates. Furthermore, primate taxa traditionally defined as socially monogamous (e.g., gibbons, owl monkeys, titi monkeys, and saki monkeys) all exhibited different amino acid motifs for their respective OXTR protein coding sequences. These findings support the notion that monogamy has evolved independently in Old World and New World primates, and that it has done so through different molecular mechanisms, not exclusively through the oxytocin pathway. PMID:26025428

  12. Definition of regulatory sequence elements in the promoter region and the first intron of the myotonic dystrophy protein kinase gene.

    PubMed

    Storbeck, C J; Sabourin, L A; Waring, J D; Korneluk, R G

    1998-04-10

    Myotonic dystrophy is the most common inherited adult neuromuscular disorder with a global frequency of 1/8000. The genetic defect is an expanding CTG trinucleotide repeat in the 3'-untranslated region of the myotonic dystrophy protein kinase gene. We present the in vitro characterization of cis-regulatory elements controlling transcription of the myotonic dystrophy protein kinase gene in myoblasts and fibroblasts. The region 5' to the initiating ATG contains no consensus TATA or CCAAT box. We have mapped two transcriptional start sites by primer extension. Deletion constructs from this region fused to the bacterial chloramphenicol acetyltransferase reporter gene revealed only subtle muscle-specific cis elements. The strongest promoter activity mapped to a 189-base pair fragment. This sequence contains a conserved GC box to which the transcription factor Sp1 binds. Reporter gene constructs containing a 2-kilobase pair first intron fragment of the myotonic dystrophy protein kinase gene enhanced reporter activity up to 6-fold in the human rhabdomyosarcoma myoblast cell line TE32 but not in NIH 3T3 fibroblasts. Co-transfection of a MyoD expression vector with reporter constructs containing the first intron into 10 T1/2 fibroblasts resulted in a 10-20-fold enhancement of expression. Deletion analysis of four E-box elements within the first intron reveals that these elements contribute to enhancer activity similarly in TE32 myoblasts and 10 T1/2 fibroblasts. These data suggest that E-boxes within the myotonic dystrophy protein kinase first intron mediate interactions with upstream promoter elements to up-regulate transcription of this gene in myoblasts.

  13. Identification of co-expression gene networks, regulatory genes and pathways for obesity based on adipose tissue RNA Sequencing in a porcine model

    PubMed Central

    2014-01-01

    Background: Obesity is a complex metabolic condition in strong association with various diseases, like type 2 diabetes, resulting in major public health and economic implications. Obesity is the result of environmental and genetic factors and their interactions, including genome-wide genetic interactions. Identification of co-expressed and regulatory genes in RNA extracted from relevant tissues representing lean and obese individuals provides an entry point for the identification of genes and pathways of importance to the development of obesity. The pig, an omnivorous animal, is an excellent model for human obesity, offering the possibility to study in-depth organ-level transcriptomic regulations of obesity, unfeasible in humans. Our aim was to reveal adipose tissue co-expression networks, pathways and transcriptional regulations of obesity using RNA Sequencing based systems biology approaches in a porcine model. Methods: We selected 36 animals for RNA Sequencing from a previously created F2 pig population representing three extreme groups based on their predicted genetic risks for obesity. We applied Weighted Gene Co-expression Network Analysis (WGCNA) to detect clusters of highly co-expressed genes (modules). Additionally, regulator genes were detected using Lemon-Tree algorithms. Results: WGCNA revealed five modules which were strongly correlated with at least one obesity-related phenotype (correlations ranging from -0.54 to 0.72, P < 0.001). Functional annotation identified pathways illuminating the association between obesity and other diseases, like osteoporosis (osteoclast differentiation, P = 1.4E-7), and immune-related complications (e.g. Natural killer cell mediated cytotoxicity, P = 3.8E-5; B cell receptor signaling pathway, P = 7.2E-5). Lemon-Tree identified three potential regulator genes, using confident scores, for the WGCNA module which was associated with osteoclast differentiation: CCR1, MSR1 and SI1 (probability scores respectively 95.30, 62.28, and

  14. Plant nitrogen regulatory P-PII genes

    DOEpatents

    Coruzzi, Gloria M.; Lam, Hon-Ming; Hsieh, Ming-Hsiun

    2001-01-01

    The present invention generally relates to plant nitrogen regulatory PII gene (hereinafter P-PII gene), a gene involved in regulating plant nitrogen metabolism. The invention provides P-PII nucleotide sequences, expression constructs comprising said nucleotide sequences, and host cells and plants having said constructs and, optionally expressing the P-PII gene from said constructs. The invention also provides substantially pure P-PII proteins. The P-PII nucleotide sequences and constructs of the

  15. Ubiquitous and gene-specific regulatory 5' sequences in a sea urchin histone DNA clone coding for histone protein variants.

    PubMed Central

    Busslinger, M; Portmann, R; Irminger, J C; Birnstiel, M L

    1980-01-01

    The DNA sequences of the entire structural H4, H3, H2A and H2B genes and of their 5' flanking regions have been determined in the histone DNA clone h19 of the sea urchin Psammechinus miliaris. In clone h19 the polarity of transcription and the relative arrangement of the histone genes is identical to that in clone h22 of the same species. The histone proteins encoded by h19 DNA differ in their primary structure from those encoded by clone h22 and have been compared to histone protein sequences of other sea urchin species as well as other eukaryotes. A comparative analysis of the 5' flanking DNA sequences of the structural histone genes in both clones revealed four ubiquitous sequence motifs; a pentameric element GATCC, followed at short distance by the Hogness box GTATAAATAG, a conserved sequence PyCATTCPu, in or near which the 5' ends of the mRNAs map in h22 DNA and lastly a sequence A, containing the initiation codon. These sequences are also found, sometimes in modified version, in front of other eukaryotic genes transcribed by polymerase II. When prelude sequences of isocoding histone genes in clone h19 and h22 are compared areas of homology are seen to extend beyond the ubiquitous sequence motifs towards the divergent AT-rich spacer and terminate between approximately 140 and 240 nucleotides away from the structural gene. These prelude regions contain quite large conservative sequence blocks which are specific for each type of histone genes. PMID:7443547
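
    A search for the three explicitly spelled-out motifs (GATCC, GTATAAATAG and PyCATTCPu) can be sketched by translating each motif into a regular expression, with Py written as the IUPAC code Y (C or T) and Pu as R (A or G). The test sequence below is a made-up string, not h19 or h22 sequence, and the function names are hypothetical.

```python
import re

# Subset of IUPAC nucleotide codes needed for these motifs.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine (Pu)
    "Y": "[CT]",    # pyrimidine (Py)
    "N": "[ACGT]",
}

MOTIFS = {
    "GATCC pentamer": "GATCC",
    "Hogness box": "GTATAAATAG",
    "PyCATTCPu": "YCATTCR",
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif string into a regular expression."""
    return "".join(IUPAC[symbol] for symbol in motif.upper())


def find_motifs(sequence, motifs):
    """Return {name: [(1-based start, matched text), ...]}, overlaps included."""
    sequence = sequence.upper()
    hits = {}
    for name, motif in motifs.items():
        # The lookahead lets overlapping occurrences be reported as well.
        pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]
    return hits


# Hypothetical sequence with one copy of each motif, in the order described.
TOY_SEQUENCE = "AAGATCCTTTGTATAAATAGCCCCTCATTCGAA"
HITS = find_motifs(TOY_SEQUENCE, MOTIFS)
for name, positions in HITS.items():
    print(name, positions)
```

    Exact matching will miss the "modified versions" the abstract mentions; allowing mismatches or scoring against a weight matrix would be the usual next step.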

  16. Evolution of cis-regulatory sequence and function in Diptera.

    PubMed

    Wittkopp, P J

    2006-09-01

    Cis-regulatory sequences direct patterns of gene expression essential for development and physiology. Evolutionary changes in these sequences contribute to phenotypic divergence. Despite their importance, cis-regulatory regions remain one of the most enigmatic features of the genome. Patterns of sequence evolution can be used to identify cis-regulatory elements, but the power of this approach depends upon the relationship between sequence and function. Comparative studies of gene regulation among Diptera reveal that divergent sequences can underlie conserved expression, and that expression differences can evolve despite largely similar sequences. This complex structure-function relationship is the primary impediment for computational identification and interpretation of cis-regulatory sequences. Biochemical characterization and in vivo assays of cis-regulatory sequences on a genomic-scale will relieve this barrier.

  17. Co-ordinate expression of the two threonyl-tRNA synthetase genes in Bacillus subtilis: control by transcriptional antitermination involving a conserved regulatory sequence.

    PubMed Central

    Putzer, H; Gendron, N; Grunberg-Manago, M

    1992-01-01

    In Bacillus subtilis, two genes, thrS and thrZ, encode distinct threonyl-tRNA synthetase enzymes. Normally, only the thrS gene is expressed. Here we show that either gene, thrS or thrZ, is sufficient for normal cell growth and sporulation. Reducing the intracellular ThrS protein concentration induces thrZ expression in a dose-compensatory manner. Starvation for threonine simultaneously induces thrZ and stimulates thrS expression. The 5'-leader sequences of thrS and thrZ contain, respectively, one and three transcription terminators preceded by a conserved sequence. We show that this sequence is essential for the regulation of thrS via a transcriptional antitermination mechanism. We propose that both genes, thrS and thrZ, are regulated by the same mechanism such that the additional regulatory domains present before thrZ account for its non-expression. In contrast to Escherichia coli, structurally similar regulatory domains, i.e. the consensus sequence preceding a terminator structure, are found in the leader regions of most aminoacyl-tRNA synthetase genes of Gram-positive bacteria. This suggests that they are regulated by a common mechanism. PMID:1379177

  18. Creation of cis-regulatory elements during sea urchin evolution by co-option and optimization of a repetitive sequence adjacent to the spec2a gene.

    PubMed

    Dayal, Sandeep; Kiyama, Takae; Villinski, Jeffrey T; Zhang, Ning; Liang, Shuguang; Klein, William H

    2004-09-15

    The creation, preservation, and degeneration of cis-regulatory elements controlling developmental gene expression are fundamental genome-level evolutionary processes about which little is known. Here, we identify critical differences in cis-regulatory elements controlling the expression of the sea urchin aboral ectoderm-specific spec genes. We found multiple copies of a repetitive sequence element termed RSR in genomes of species within the Strongylocentrotidae family, but RSRs were not detected in genomes of species outside Strongylocentrotidae. spec genes in Strongylocentrotus purpuratus are invariably associated with RSRs, and the spec2a RSR functioned as a transcriptional enhancer and displayed greater activity than did spec1 or spec2c RSRs. Single-base pair differences at two cis-regulatory elements within the spec2a RSR increased the binding affinities of four transcription factors, SpCCAAT-binding factor at one element and SpOtx, SpGoosecoid, and SpGATA-E at another. The cis-regulatory elements to which these four factors bound were recent evolutionary acquisitions that acted to either activate or repress transcription, depending on the cell type. These elements were found in the spec2a RSR ortholog in Strongylocentrotus pallidus but not in RSR orthologs of Strongylocentrotus droebachiensis or Hemicentrotus pulcherrimus. Our results indicated that a dynamic pattern of cis-regulatory element evolution exists for spec genes despite their conserved aboral ectoderm expression.

  19. Epithelial and endothelial expression of the green fluorescent protein reporter gene under the control of bovine prion protein (PrP) gene regulatory sequences in transgenic mice

    NASA Astrophysics Data System (ADS)

    Lemaire-Vieille, Catherine; Schulze, Tobias; Podevin-Dimster, Valérie; Follet, Jérome; Bailly, Yannick; Blanquet-Grossard, Françoise; Decavel, Jean-Pierre; Heinen, Ernst; Cesbron, Jean-Yves

    2000-05-01

    The expression of the cellular form of the prion protein (PrPc) gene is required for prion replication and neuroinvasion in transmissible spongiform encephalopathies. The identification of the cell types expressing PrPc is necessary to understanding how the agent replicates and spreads from peripheral sites to the central nervous system. To determine the nature of the cell types expressing PrPc, a green fluorescent protein reporter gene was expressed in transgenic mice under the control of 6.9 kb of the bovine PrP gene regulatory sequences. It was shown that the bovine PrP gene is expressed as two populations of mRNA differing by alternative splicing of one 115-bp 5' untranslated exon in 17 different bovine tissues. The analysis of transgenic mice showed reporter gene expression in some cells that have been identified as expressing PrP, such as cerebellar Purkinje cells, lymphocytes, and keratinocytes. In addition, expression of green fluorescent protein was observed in the plexus of the enteric nervous system and in a restricted subset of cells not yet clearly identified as expressing PrP: the epithelial cells of the thymic medullary and the endothelial cells of both the mucosal capillaries of the intestine and the renal capillaries. These data provide valuable information on the distribution of PrPc at the cellular level and argue for roles of the epithelial and endothelial cells in the spread of infection from the periphery to the brain. Moreover, the transgenic mice described in this paper provide a model that will allow for the study of the transcriptional activity of the PrP gene promoter in response to scrapie infection.

  20. [Mosaic expression of the lacZ reporter-gene under control of 5'-regulatory sequences of the alpha-S1-casein gene in transgenic mice].

    PubMed

    Serova, I A; Andreeva, L E; Khaĭdarova, N V; Dias, L P; Dvorianchikov, G A; Burkov, I A; Baginskaia, N V

    2009-01-01

    Mosaic expression at the cellular level is widespread in the tissues and organs of transgenic animals. This communication reports a study of mosaicism in transgenic mice carrying the lacZ reporter gene under control of the bovine and goat alpha-S1-casein genes with 5'-flanking sequences of various extent: the pCLZ1 (721 bp), pCLZ2 (2001 bp), and pCLZ3 (3409 bp) constructs. Five transgenic founders were generated by injection of the recombinant DNA into zygotes: pCLZ1, N 16; pCLZ2, N 37; and pCLZ3, N 7, N 36, and N 48. Cells positive for β-galactosidase activity were detected in lactating mammary glands of all transgenic females; however, the distribution of the positive cells was variable. We observed two types of mosaics: a clonal or "lobule" type, with positive cells filling the whole of the lobule, and a stochastic type, with single positive cells scattered over one or several lobules. Both types of mosaics were characteristic of all the transgenic animals, although females carrying the pCLZ2 transgene showed the "lobule" type more often than transgenic animals with the pCLZ1 and pCLZ3 transgenes. It is suggested that the stochastic type of mosaic occurs in cells at the terminal stage of differentiation, whereas the "lobule" type arises from β-galactosidase-positive proliferating precursors. Analysis of the inheritance of the transgenes in different lines demonstrated that the pCLZ1 transgene was inserted in the X chromosome of the founder, whereas the other two were localized in autosomes. Localization of the pCLZ1 transgene in the X chromosome did not influence the mosaicism; it was similar to that of transgenic animals carrying the transgenes in autosomes. Ectopic expression of the reporter gene was detected in mandibular glands of the offspring of founders N 16 and N 37 only, as well as in atretic follicles in N 37. The weak ectopic expression suggests that the 5'-flanking regulatory sequences used in the constructs are able to provide perfect tissue

  1. Sequence variations in the transcriptional regulatory region and intron1 of HLA-DQB1 gene and their linkage in southern Chinese ethnic groups.

    PubMed

    Xu, Yunping; Hu, Qingsong; Liu, Zehuan; Shen, Yang; Liu, Xiaoyi; Lin, Bin; Wu, Yuping; Chen, Shangwu; Xu, Anlong

    2005-08-01

    Sequence polymorphisms in the transcriptional regulatory region extending to intron1 (DQRRI1) of the HLA-DQB1 gene, and their haplotypic distributions, were investigated in southern Chinese populations. We cloned and sequenced a 1.1-kb segment containing the transcriptional regulatory region, exon1, and partial intron1 of the HLA-DQB1 gene from 37 individuals from nine different ethnic groups in southern China. A high-density map of 162 polymorphisms, including 152 single nucleotide polymorphisms (SNPs) and 10 insertion-deletion polymorphisms, was revealed. Comparison of these data with SNPs deposited in the dbSNP database of the National Center for Biotechnology Information and with previously reported polymorphisms showed that 66 polymorphisms were reported here for the first time. A total of 16 different haplotypes were detected based on these 162 polymorphisms. The distribution of the 16 alleles of DQRRI1, as well as their linkage with DQB1 exon2 alleles, was also investigated based on the population study and phylogenetic analysis. Tight linkage between these two regions was discovered, as each of the DQB1*02, DQB1*03, DQB1*04, DQB1*05, and DQB1*06 alleles was seen to be linked with a specific DQRRI1 allele. Our study showed different patterns of the transcriptional regulatory region, exon1, and intron1 among different DQB1 alleles.

  2. Structural analysis of the regulatory elements of the type-II procollagen gene. Conservation of promoter and first intron sequences between human and mouse.

    PubMed Central

    Vikkula, M; Metsäranta, M; Syvänen, A C; Ala-Kokko, L; Vuorio, E; Peltonen, L

    1992-01-01

    Transcription of the type-II procollagen gene (COL2A1) is very specifically restricted to a limited number of tissues, particularly cartilages. In order to identify transcription-control motifs we have sequenced the promoter region and the first intron of the human and mouse COL2A1 genes. With the assumption that these motifs should be well conserved during evolution, we have searched for potential elements important for the tissue-specific transcription of the COL2A1 gene by aligning the two sequences with each other and with the available rat type-II procollagen sequence for the promoter. With this approach we could identify specific evolutionarily well-conserved motifs in the promoter area. On the other hand, several suggested regulatory elements in the promoter region did not show evolutionary conservation. In the middle of the first intron we found a cluster of well-conserved transcription-control elements and we conclude that these conserved motifs most probably possess a significant function in the control of the tissue-specific transcription of the COL2A1 gene. We also describe locations of additional, highly conserved nucleotide stretches, which are good candidate regions in the search for binding sites of yet-uncharacterized cartilage-specific transcription regulators of the COL2A1 gene. PMID:1637314

  3. Comparisons of Ribosomal Protein Gene Promoters Indicate Superiority of Heterologous Regulatory Sequences for Expressing Transgenes in Phytophthora infestans

    PubMed Central

    Khachatoorian, Careen; Judelson, Howard S.

    2015-01-01

    Molecular genetics approaches in Phytophthora research can be hampered by the limited number of known constitutive promoters for expressing transgenes and the instability of transgene activity. We have therefore characterized genes encoding the cytoplasmic ribosomal proteins of Phytophthora and studied their suitability for expressing transgenes in P. infestans. Phytophthora spp. encode a standard complement of 79 cytoplasmic ribosomal proteins. Several genes are duplicated, and two appear to be pseudogenes. Half of the genes are expressed at similar levels during all stages of asexual development, and we discovered that the majority share a novel promoter motif named the PhRiboBox. This sequence is enriched in genes associated with transcription, translation, and DNA replication, including tRNA and rRNA biogenesis. Promoters from the three P. infestans genes encoding ribosomal proteins S9, L10, and L23 and their orthologs from P. capsici were tested for their ability to drive transgenes in stable transformants of P. infestans. Five of the six promoters yielded strong expression of a GUS reporter, but the stability of expression was higher using the P. capsici promoters. With the RPS9 and RPL10 promoters of P. infestans, about half of transformants stopped making GUS over two years of culture, while their P. capsici orthologs conferred stable expression. Since cross-talk between native and transgene loci may trigger gene silencing, we encourage the use of heterologous promoters in transformation studies. PMID:26716454

  4. Comparisons of Ribosomal Protein Gene Promoters Indicate Superiority of Heterologous Regulatory Sequences for Expressing Transgenes in Phytophthora infestans.

    PubMed

    Poidevin, Laetitia; Andreeva, Kalina; Khachatoorian, Careen; Judelson, Howard S

    2015-01-01

    Molecular genetics approaches in Phytophthora research can be hampered by the limited number of known constitutive promoters for expressing transgenes and the instability of transgene activity. We have therefore characterized genes encoding the cytoplasmic ribosomal proteins of Phytophthora and studied their suitability for expressing transgenes in P. infestans. Phytophthora spp. encode a standard complement of 79 cytoplasmic ribosomal proteins. Several genes are duplicated, and two appear to be pseudogenes. Half of the genes are expressed at similar levels during all stages of asexual development, and we discovered that the majority share a novel promoter motif named the PhRiboBox. This sequence is enriched in genes associated with transcription, translation, and DNA replication, including tRNA and rRNA biogenesis. Promoters from the three P. infestans genes encoding ribosomal proteins S9, L10, and L23 and their orthologs from P. capsici were tested for their ability to drive transgenes in stable transformants of P. infestans. Five of the six promoters yielded strong expression of a GUS reporter, but the stability of expression was higher using the P. capsici promoters. With the RPS9 and RPL10 promoters of P. infestans, about half of transformants stopped making GUS over two years of culture, while their P. capsici orthologs conferred stable expression. Since cross-talk between native and transgene loci may trigger gene silencing, we encourage the use of heterologous promoters in transformation studies. PMID:26716454

  5. Identification of cis-regulatory sequence variations in individual genome sequences.

    PubMed

    Worsley-Hunt, Rebecca; Bernard, Virginie; Wasserman, Wyeth W

    2011-01-01

    Functional contributions of cis-regulatory sequence variations to human genetic disease are numerous. For instance, disrupting variations in a HNF4A transcription factor binding site upstream of the Factor IX gene contributes causally to hemophilia B Leyden. Although clinical genome sequence analysis currently focuses on the identification of protein-altering variation, the impact of cis-regulatory mutations can be similarly strong. New technologies are now enabling genome sequencing beyond exomes, revealing variation across the non-coding 98% of the genome responsible for developmental and physiological patterns of gene activity. The capacity to identify causal regulatory mutations is improving, but predicting functional changes in regulatory DNA sequences remains a great challenge. Here we explore the existing methods and software for prediction of functional variation situated in the cis-regulatory sequences governing gene transcription and RNA processing.

  6. Identification of cis-regulatory sequence variations in individual genome sequences

    PubMed Central

    2011-01-01

    Functional contributions of cis-regulatory sequence variations to human genetic disease are numerous. For instance, disrupting variations in a HNF4A transcription factor binding site upstream of the Factor IX gene contributes causally to hemophilia B Leyden. Although clinical genome sequence analysis currently focuses on the identification of protein-altering variation, the impact of cis-regulatory mutations can be similarly strong. New technologies are now enabling genome sequencing beyond exomes, revealing variation across the non-coding 98% of the genome responsible for developmental and physiological patterns of gene activity. The capacity to identify causal regulatory mutations is improving, but predicting functional changes in regulatory DNA sequences remains a great challenge. Here we explore the existing methods and software for prediction of functional variation situated in the cis-regulatory sequences governing gene transcription and RNA processing. PMID:21989199

  7. Functional relevance of specific interactions between herpes simplex virus type 1 ICP4 and sequences from the promoter-regulatory domain of the viral thymidine kinase gene.

    PubMed Central

    Imbalzano, A N; Shepard, A A; DeLuca, N A

    1990-01-01

    The herpes simplex virus (HSV) type 1 immediate-early regulatory protein ICP4 is required for induced expression of HSV early and late genes, yet the mechanism by which this occurs is not known. We examined the promoter and flanking sequences of the HSV early gene that encodes thymidine kinase for the ability to interact specifically with ICP4 in gel retardation assays. Protein-DNA complexes containing ICP4 were observed with several distinct regions flanking the tk promoter. cis-Acting elements that interact with cellular transcription factors were apparently not required for these interactions to form. Purified ICP4 formed protein-DNA complexes with fragments from these regions, and Southwestern (DNA-protein blot) analysis indicated that the interaction between ICP4 and these sequences can be direct. None of the tk sequences that interact with ICP4 contains a consensus binding site for ICP4 (S. W. Faber and K. W. Wilcox, Nucleic Acids Res. 14:6067-6083, 1986), reflecting the ability of ICP4 to interact with more than one DNA sequence. A mutated ICP4 protein expressed from the viral genome that retains the ability to bind to a consensus binding site but does not bind specifically to the identified sites flanking the tk promoter results in induced transcription of the tk gene. These data support hypotheses for ICP4-mediated transactivation of the tk promoter in Vero cells that do not require the intrinsic ability of ICP4 to bind specifically in or near the promoter of the tk gene. Images PMID:2159535

  8. Vision from next generation sequencing: multi-dimensional genome-wide analysis for producing gene regulatory networks underlying retinal development, aging and disease.

    PubMed

    Yang, Hyun-Jin; Ratnapriya, Rinki; Cogliati, Tiziana; Kim, Jung-Woong; Swaroop, Anand

    2015-05-01

    Genomics and genetics have invaded all aspects of biology and medicine, opening uncharted territory for scientific exploration. The definition of "gene" itself has become ambiguous, and the central dogma is continuously being revised and expanded. Computational biology and computational medicine are no longer intellectual domains of the chosen few. Next generation sequencing (NGS) technology, together with novel methods of pattern recognition and network analyses, has revolutionized the way we think about fundamental biological mechanisms and cellular pathways. In this review, we discuss NGS-based genome-wide approaches that can provide deeper insights into retinal development, aging and disease pathogenesis. We first focus on gene regulatory networks (GRNs) that govern the differentiation of retinal photoreceptors and modulate adaptive response during aging. Then, we discuss NGS technology in the context of retinal disease and develop a vision for therapies based on network biology. We should emphasize that basic strategies for network construction and analyses can be transported to any tissue or cell type. We believe that specific and uniform guidelines are required for generation of genome, transcriptome and epigenome data to facilitate comparative analysis and integration of multi-dimensional data sets, and for constructing networks underlying complex biological processes. As cellular homeostasis and organismal survival are dependent on gene-gene and gene-environment interactions, we believe that network-based biology will provide the foundation for deciphering disease mechanisms and discovering novel drug targets for retinal neurodegenerative diseases. PMID:25668385

  9. Vision from next generation sequencing: Multi-dimensional genome-wide analysis for producing gene regulatory networks underlying retinal development, aging and disease

    PubMed Central

    Yang, Hyun-Jin; Ratnapriya, Rinki; Cogliati, Tiziana; Kim, Jung-Woong; Swaroop, Anand

    2015-01-01

    Genomics and genetics have invaded all aspects of biology and medicine, opening uncharted territory for scientific exploration. The definition of “gene” itself has become ambiguous, and the central dogma is continuously being revised and expanded. Computational biology and computational medicine are no longer intellectual domains of the chosen few. Next generation sequencing (NGS) technology, together with novel methods of pattern recognition and network analyses, has revolutionized the way we think about fundamental biological mechanisms and cellular pathways. In this review, we discuss NGS-based genome-wide approaches that can provide deeper insights into retinal development, aging and disease pathogenesis. We first focus on gene regulatory networks (GRNs) that govern the differentiation of retinal photoreceptors and modulate adaptive response during aging. Then, we discuss NGS technology in the context of retinal disease and develop a vision for therapies based on network biology. We should emphasize that basic strategies for network construction and analyses can be transported to any tissue or cell type. We believe that specific and uniform guidelines are required for generation of genome, transcriptome and epigenome data to facilitate comparative analysis and integration of multi-dimensional data sets, and for constructing networks underlying complex biological processes. As cellular homeostasis and organismal survival are dependent on gene-gene and gene-environment interactions, we believe that network-based biology will provide the foundation for deciphering disease mechanisms and discovering novel drug targets for retinal neurodegenerative diseases. PMID:25668385

  10. A site-specific, single-copy transgenesis strategy to identify 5' regulatory sequences of the mouse testis-determining gene Sry.

    PubMed

    Quinn, Alexander; Kashimada, Kenichi; Davidson, Tara-Lynne; Ng, Ee Ting; Chawengsaksophak, Kallayanee; Bowles, Josephine; Koopman, Peter

    2014-01-01

    The Y-chromosomal gene SRY acts as the primary trigger for male sex determination in mammalian embryos. Correct regulation of SRY is critical: aberrant timing or level of Sry expression is known to disrupt testis development in mice and we hypothesize that mutations that affect regulation of human SRY may account for some of the many cases of XY gonadal dysgenesis that currently remain unexplained. However, the cis-sequences involved in regulation of Sry have not been identified, precluding a test of this hypothesis. Here, we used a transgenic mouse approach aimed at identifying mouse Sry 5' flanking regulatory sequences within 8 kb of the Sry transcription start site (TSS). To avoid problems associated with conventional pronuclear injection of transgenes, we used a published strategy designed to yield single-copy transgene integration at a defined, transcriptionally open, autosomal locus, Col1a1. None of the Sry transgenes tested was expressed at levels compatible with activation of Sox9 or XX sex reversal. Our findings indicate either that the Col1a1 locus does not provide an appropriate context for the correct expression of Sry transgenes, or that the cis-sequences required for Sry expression in the developing gonads lie beyond 8 kb 5' of the TSS.

  11. Comparative inter-strain sequence analysis of the putative regulatory region of murine psychostimulant-regulated gene GNB1 (G protein beta 1 subunit gene).

    PubMed

    Kitanaka, Nobue; Kitanaka, Junichi; Walther, Donna; Wang, Xiao-Bing; Uhl, George R

    2003-08-01

    We isolated a cDNA clone from a murine genomic library of C57BL/6 strain, carrying 13.8 kb of nucleotides including exon 1 of heterotrimeric GTP-binding protein beta 1 subunit gene (genetic symbol, GNB1) and 10.6 kb of the 5' flanking region. Sequence comparison with GNB1 gene locus from 129Sv strain revealed a 0.2% divergence in a 13.2 kb common region between these two strains. The divergence consisted of eight single nucleotide polymorphisms, three insertions and one deletion, with 129Sv used as the reference. The exon 1 and the putative regulation elements, such as cyclic AMP response element, AP1, AP2, Sp1 and nuclear factor-kappa B recognition sites, were perfectly conserved. The expression of GNB1 mRNA was significantly increased in mouse striatum 2 h after single methamphetamine administration with an approximately 150% expression level compared with the basal level. In contrast, no change in the expression level was observed in the cerebral cortex. After the chronic methamphetamine treatment regimen, the expression level of GNB1 mRNA did not change in any brain regions examined. These results suggest (1) that the 5' flanking nucleotide sequence of GNB1 gene was strictly conserved for its possible contribution to the same change in the expression level between the mouse strains in response to psychostimulants and (2) that the initial process of development of behavioral sensitization appeared to occur parallel to the significant increase in the expression level of GNB1 gene in the mouse striatum. PMID:14631649

  12. Comparison of loline alkaloid gene clusters across fungal endophytes: predicting the co-regulatory sequence motifs and the evolutionary history.

    PubMed

    Kutil, Brandi L; Greenwald, Charles; Liu, Gang; Spiering, Martin J; Schardl, Christopher L; Wilkinson, Heather H

    2007-10-01

    LOL, a fungal secondary metabolite gene cluster found in Epichloë and Neotyphodium species, is responsible for production of insecticidal loline alkaloids. To analyze the genetic architecture and to predict the evolutionary history of LOL, we compared five clusters from four fungal species (single clusters from Epichloë festucae, Neotyphodium sp. PauTG-1, Neotyphodium coenophialum, and two clusters we previously characterized in Neotyphodium uncinatum). Using PhyloCon to compare putative lol gene promoter regions, we have identified four motifs conserved across the lol genes in all five clusters. Each motif has significant similarity to known fungal transcription factor binding sites in the TRANSFAC database. Conservation of these motifs is further support for the hypothesis that the lol genes are co-regulated. Interestingly, the history of asexual Neotyphodium spp. includes multiple interspecific hybridization events. Comparing clusters from three Neotyphodium species and E. festucae allowed us to determine which Epichloë ancestors are the most likely contributors of LOL in these asexual species. For example, while no present day Epichloë typhina isolates are known to produce lolines, our data support the hypothesis that the E. typhina ancestor(s) of three asexual endophyte species contained a LOL gene cluster. Thus, these data support a model of evolution in which the polymorphism in loline alkaloid production phenotypes among endophyte species is likely due to the loss of the trait over time.

  13. RSAT 2015: Regulatory Sequence Analysis Tools.

    PubMed

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-07-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/.

  14. RSAT 2015: Regulatory Sequence Analysis Tools

    PubMed Central

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A.; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M.; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-01-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632

  16. Plant Evolution: Evolving Antagonistic Gene Regulatory Networks.

    PubMed

    Cooper, Endymion D

    2016-06-20

    Developing a structurally complex phenotype requires a complex regulatory network. A new study shows how gene duplication provides a potential source of antagonistic interactions, an important component of gene regulatory networks. PMID:27326708

  18. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    NASA Astrophysics Data System (ADS)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  19. The complete sequence of the human CD79b (Igβ/B29) gene: Identification of a conserved exon/intron organization, immunoglobulin-like regulatory regions, and allelic polymorphism

    SciTech Connect

    Hashimoto, S.; Chiorazzi, N.; Gregersen, P.K.

    1994-12-31

    We determined the complete genomic sequence of the human CD79b (Igβ/B29) gene. The CD79b gene product is associated with the membrane immunoglobulin signaling complex which is composed of immunoglobulin (Ig) itself, associated in a noncovalent fashion with CD79b and a second polypeptide chain, CD79a (Igα/mb1). The sequence and exon/intron organization of the human and mouse CD79b genes are highly similar. The gene organization suggests that some variant forms of CD79b may arise by virtue of alternative splicing of mRNA. In addition, a number of conserved regulatory sequences commonly found in Ig genes are present in sequences which flank the human CD79b gene. Some of these sequences are distinct from those found in the CD79a promoter. These differences may explain why transcription of CD79b, but not CD79a, is observed in plasma cells. A new Taq 1 restriction fragment length polymorphism is described that is not associated with any structural polymorphisms of the expressed CD79b polypeptide. 13 refs., 3 figs., 1 tab.

  20. Discovery of posttranscriptional regulatory RNAs using next generation sequencing technologies.

    PubMed

    Gelderman, Grant; Contreras, Lydia M

    2013-01-01

    Next generation sequencing (NGS) has revolutionized the way by which we engineer metabolism by radically altering the path to genome-wide inquiries. This is due to the fact that NGS approaches offer several powerful advantages over traditional methods that include the ability to fully sequence hundreds to thousands of genes in a single experiment and simultaneously detect homozygous and heterozygous deletions, alterations in gene copy number, insertions, translocations, and exome-wide substitutions that include "hot-spot mutations." This chapter describes the use of these technologies as a sequencing technique for transcriptome analysis and discovery of regulatory RNA elements in the context of three main platforms: Illumina HiSeq, 454 pyrosequencing, and SOLiD sequencing. Specifically, this chapter focuses on the use of Illumina HiSeq, since it is the most widely used platform for RNA discovery and transcriptome analysis. Regulatory RNAs have now been found in all branches of life. In bacteria, noncoding small RNAs (sRNAs) are involved in highly sophisticated regulatory circuits that include quorum sensing, carbon metabolism, stress responses, and virulence (Gorke and Vogel, Gene Dev 22:2914-2925, 2008; Gottesman, Trends Genet 21:399-404, 2005; Romby et al., Curr Opin Microbiol 9:229-236, 2006). Further characterization of the underlying regulation of gene expression remains poorly understood given that it is estimated that over 60% of all predicted genes remain hypothetical and the 5' and 3' untranslated regions are unknown for more than 90% of the genes (Siegel et al., Trends Parasitol 27:434-441, 2011). Importantly, manipulation of the posttranscriptional regulation that occurs at the level of RNA stability and export, trans-splicing, polyadenylation, protein translation, and protein stability via untranslated regions (Clayton, EMBO J 21:1881-1888, 2002; Haile and Papadopoulou, Curr Opin Microbiol 10:569-577, 2007) could be highly beneficial to metabolic

  1. Expression of the human granulocyte-macrophage colony stimulating factor (hGM-CSF) gene under control of the 5'-regulatory sequence of the goat alpha-S1-casein gene with and without a MAR element in transgenic mice.

    PubMed

    Burkov, I A; Serova, I A; Battulin, N R; Smirnov, A V; Babkin, I V; Andreeva, L E; Dvoryanchikov, G A; Serov, O L

    2013-10-01

    Expression of the human granulocyte-macrophage colony-stimulating factor (hGM-CSF) gene under the control of the 5'-regulatory sequence of the goat alpha-S1-casein gene with and without a matrix attachment region (MAR) element from the Drosophila histone 1 gene was studied in four and eight transgenic mouse lines, respectively. Of the four transgenic lines carrying the transgene without MAR, three had correct tissue-specific expression of the hGM-CSF gene in the mammary gland only and no signs of cell mosaicism. The concentration of hGM-CSF in the milk of transgenic females varied from 1.9 to 14 μg/ml. One line presented hGM-CSF in the blood serum, indicating ectopic expression. The values of secretion of hGM-CSF in milk of 6 transgenic lines carrying the transgene with MAR varied from 0.05 to 0.7 μg/ml, and two of these did not express hGM-CSF. Three of the four examined animals from lines of this group showed ectopic expression of the hGM-CSF gene, as determined by RT-PCR and immunofluorescence analyses, as well as the presence of hGM-CSF in the blood serum. Mosaic expression of the hGM-CSF gene in mammary epithelial cells was specific to all examined transgenic mice carrying the transgene with MAR but was never observed in the transgenic mice without MAR. The mosaic expression was not dependent on transgene copy number. Thus, the expected "protective or enhancer effect" from the MAR element on the hGM-CSF gene expression was not observed.

  2. Transcription factor trapping by RNA in gene regulatory elements.

    PubMed

    Sigova, Alla A; Abraham, Brian J; Ji, Xiong; Molinie, Benoit; Hannett, Nancy M; Guo, Yang Eric; Jangi, Mohini; Giallourakis, Cosmas C; Sharp, Phillip A; Young, Richard A

    2015-11-20

    Transcription factors (TFs) bind specific sequences in promoter-proximal and -distal DNA elements to regulate gene transcription. RNA is transcribed from both of these DNA elements, and some DNA binding TFs bind RNA. Hence, RNA transcribed from regulatory elements may contribute to stable TF occupancy at these sites. We show that the ubiquitously expressed TF Yin-Yang 1 (YY1) binds to both gene regulatory elements and their associated RNA species across the entire genome. Reduced transcription of regulatory elements diminishes YY1 occupancy, whereas artificial tethering of RNA enhances YY1 occupancy at these elements. We propose that RNA makes a modest but important contribution to the maintenance of certain TFs at gene regulatory elements and suggest that transcription of regulatory elements produces a positive-feedback loop that contributes to the stability of gene expression programs.

  3. Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

    SciTech Connect

    Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

    2003-12-31

    Because functional regions are highly conserved, comparisons of DNA sequences among evolutionarily distantly related genomes permit the identification of functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.

  4. Transcriptome Sequencing from Diverse Human Populations Reveals Differentiated Regulatory Architecture

    PubMed Central

    Lappalainen, Tuuli; Henn, Brenna M.; Kidd, Jeffrey M.; Yee, Muh-Ching; Grubert, Fabian; Cann, Howard M.; Snyder, Michael; Montgomery, Stephen B.; Bustamante, Carlos D.

    2014-01-01

    Large-scale sequencing efforts have documented extensive genetic variation within the human genome. However, our understanding of the origins, global distribution, and functional consequences of this variation is far from complete. While regulatory variation influencing gene expression has been studied within a handful of populations, the breadth of transcriptome differences across diverse human populations has not been systematically analyzed. To better understand the spectrum of gene expression variation, alternative splicing, and the population genetics of regulatory variation in humans, we have sequenced the genomes, exomes, and transcriptomes of EBV transformed lymphoblastoid cell lines derived from 45 individuals in the Human Genome Diversity Panel (HGDP). The populations sampled span the geographic breadth of human migration history and include Namibian San, Mbuti Pygmies of the Democratic Republic of Congo, Algerian Mozabites, Pathan of Pakistan, Cambodians of East Asia, Yakut of Siberia, and Mayans of Mexico. We discover that approximately 25.0% of the variation in gene expression found amongst individuals can be attributed to population differences. However, we find few genes that are systematically differentially expressed among populations. Of this population-specific variation, 75.5% is due to expression rather than splicing variability, and we find few genes with strong evidence for differential splicing across populations. Allelic expression analyses indicate that previously mapped common regulatory variants identified in eight populations from the International Haplotype Map Phase 3 project have similar effects in our seven sampled HGDP populations, suggesting that the cellular effects of common variants are shared across diverse populations. Together, these results provide a resource for studies analyzing functional differences across populations by estimating the degree of shared gene expression, alternative splicing, and regulatory genetics

  5. A downstream regulatory element located within the coding sequence mediates autoregulated expression of the yeast fatty acid synthase gene FAS2 by the FAS1 gene product.

    PubMed

    Wenz, P; Schwank, S; Hoja, U; Schüller, H J

    2001-11-15

    The fatty acid synthase genes FAS1 and FAS2 of the yeast Saccharomyces cerevisiae are transcriptionally co-regulated by general transcription factors (such as Reb1, Rap1 and Abf1) and by the phospholipid-specific heterodimeric activator Ino2/Ino4, acting via their corresponding upstream binding sites. Here we provide evidence for a positive autoregulatory influence of FAS1 on FAS2 expression. Even with a constant FAS2 copy number, a 10-fold increase of FAS2 transcript amount was observed in the presence of FAS1 in multi-copy, compared to a fas1 null mutant. Surprisingly, the first 66 nt of the FAS2 coding region turned out to be necessary and sufficient for FAS1-dependent gene expression. FAS2-lacZ fusion constructs deleted for this region showed high reporter gene expression even in the absence of FAS1, arguing for a negatively-acting downstream repression site (DRS) responsible for FAS1-dependent expression of FAS2. Our data suggest that the FAS1 gene product, in addition to its catalytic function, is also required for the coordinate biosynthetic control of the yeast FAS complex. An excess of uncomplexed Fas1 may be responsible for the deactivation of an FAS2-specific repressor, acting via the DRS. PMID:11713312

  6. Characterization of DNA sequences that mediate nuclear protein binding to the regulatory region of the Pisum sativum (pea) chlorophyl a/b binding protein gene AB80: identification of a repeated heptamer motif.

    PubMed

    Argüello, G; García-Hernández, E; Sánchez, M; Gariglio, P; Herrera-Estrella, L; Simpson, J

    1992-05-01

    Two protein factors binding to the regulatory region of the pea chlorophyll a/b binding protein gene AB80 have been identified. One of these factors is found only in green tissue but not in etiolated or root tissue. The second factor (denominated ABF-2) binds to a DNA sequence element that contains a direct heptamer repeat TCTCAAA. It was found that the presence of both repeats is essential for binding. ABF-2 is present in both green and etiolated tissue and in roots, and factors analogous to ABF-2 are present in several plant species. Computer analysis showed that the TCTCAAA motif is present in the regulatory region of several plant genes. PMID:1303797
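
    The kind of computer search mentioned in this abstract can be illustrated with a minimal Python sketch that scans a sequence for the TCTCAAA heptamer and reports pairs of copies occurring as a direct repeat. The function names, the example sequence, and the max_gap cutoff are illustrative assumptions; the abstract does not state the spacing between the two repeats or the software that was used.

```python
import re

MOTIF = "TCTCAAA"  # heptamer reported in the abstract


def find_motif(seq, motif=MOTIF):
    """Return 0-based start positions of every motif occurrence."""
    seq = seq.upper()
    # A lookahead pattern also reports overlapping occurrences, if any.
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]


def find_direct_repeats(seq, motif=MOTIF, max_gap=10):
    """Return (first_start, second_start) for consecutive same-strand copies
    separated by at most max_gap bases (max_gap is a hypothetical cutoff)."""
    hits = find_motif(seq, motif)
    pairs = []
    for first, second in zip(hits, hits[1:]):
        gap = second - (first + len(motif))
        if 0 <= gap <= max_gap:
            pairs.append((first, second))
    return pairs


# Made-up sequence, for illustration only.
example = "GGATCTCAAATTTCTCAAACCGGTCTCAAAG"
print(find_motif(example))                      # [3, 12, 23]
print(find_direct_repeats(example, max_gap=3))  # [(3, 12)]
```

    Only the forward strand is scanned here; a fuller search would also examine the reverse complement.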

  7. Definition of a GC-rich motif as regulatory sequence of the human IL-3 gene: coordinate regulation of the IL-3 gene by CLE2/GC box of the GM-CSF gene in T cell activation.

    PubMed

    Nishida, J; Yoshida, M; Arai, K; Yokota, T

    1991-03-01

    The human IL-3 gene, located on chromosome 5, contains several cis-acting DNA sequences, i.e. CLE (conserved lymphokine element) and a GC-rich region, similar to the GM-CSF gene. To investigate the role of these elements, the 5' flanking region of the IL-3 gene was attached to a bacterial chloramphenicol acetyltransferase (CAT) gene. The fusion plasmids were analyzed by an in vitro transcription system using Jurkat cell nuclear extract prepared from cells stimulated with phorbol-12-myristate-13-acetate and calcium ionophore (PMA/A23187), introduced into Jurkat cells, expressed transiently, and stimulated by co-transfection of human T cell leukemia virus type I (HTLV-I) encoded transactivator, p40tax. The GC-rich region enhanced TATA-dependent transcription in the in vitro transcription system and also strongly responded to p40tax stimulation in the in vivo cotransfection assay. Using this GC-rich region as a probe, we identified a constitutive DNA-protein complex, alpha, whose binding specificity correlates with transcription activity. However, this element is not sufficient for the expression of the IL-3 gene in response to T cell activation signals (PMA/A23187) and no sequence was found within the IL-3 gene which mediates the response to PMA/A23187. The enhancer sequence which responds to T cell activation signals may be located outside the IL-3 gene and may be shared by other lymphokines, possibly by GM-CSF. We propose that the GM-CSF enhancer (CLE2/GC box) which mediates the response to T cell activation signals may stimulate the expression of the IL-3 gene. PMID:2049340

  8. Evolving Robust Gene Regulatory Networks

    PubMed Central

    Noman, Nasimul; Monjo, Taku; Moscato, Pablo; Iba, Hitoshi

    2015-01-01

    Design and implementation of robust network modules is essential for construction of complex biological systems through hierarchical assembly of ‘parts’ and ‘devices’. The robustness of gene regulatory networks (GRNs) is ascribed chiefly to the underlying topology. The automatic designing capability of GRN topology that can exhibit robust behavior can dramatically change the current practice in synthetic biology. A recent study shows that Darwinian evolution can gradually develop higher topological robustness. Subsequently, this work presents an evolutionary algorithm that simulates natural evolution in silico, for identifying network topologies that are robust to perturbations. We present a Monte Carlo based method for quantifying topological robustness and designed a fitness approximation approach for efficient calculation of topological robustness which is computationally very intensive. The proposed framework was verified using two classic GRN behaviors: oscillation and bistability, although the framework is generalized for evolving other types of responses. The algorithm identified robust GRN architectures which were verified using different analysis and comparison. Analysis of the results also shed light on the relationship among robustness, cooperativity and complexity. This study also shows that nature has already evolved very robust architectures for its crucial systems; hence simulation of this natural process can be very valuable for designing robust biological systems. PMID:25616055
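
    The general idea of a Monte Carlo robustness estimate can be sketched as follows: perturb a model at random many times and report the fraction of perturbed versions that still show the target behavior (here, bistability). This is not the authors' method. The two-gene mutual-repression model, the log-normal perturbation of a single parameter, and every name and default value below are assumptions chosen to keep the example short.

```python
import math
import random


def simulate_toggle(a, u0, v0, hill=2.0, dt=0.1, steps=1000):
    """Euler-integrate a symmetric two-gene mutual-repression model (assumed form)."""
    u, v = u0, v0
    for _ in range(steps):
        du = a / (1.0 + v ** hill) - u
        dv = a / (1.0 + u ** hill) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


def is_bistable(a, tol=0.1):
    """Call the circuit bistable if two opposite starts settle in different states."""
    u_hi, _ = simulate_toggle(a, u0=a, v0=0.0)
    u_lo, _ = simulate_toggle(a, u0=0.0, v0=a)
    return abs(u_hi - u_lo) > tol


def monte_carlo_robustness(a_nominal, sigma=0.5, n_samples=200, seed=0):
    """Fraction of randomly perturbed parameter values that keep the behavior."""
    rng = random.Random(seed)
    kept = 0
    for _ in range(n_samples):
        a = a_nominal * math.exp(rng.gauss(0.0, sigma))  # log-normal perturbation
        kept += is_bistable(a)
    return kept / n_samples


print(monte_carlo_robustness(4.0))
```

    The estimate is a value between 0 and 1; because each evaluation requires a simulation, the cost grows with the number of samples, which is consistent with the abstract's remark that the calculation is computationally intensive.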

  9. Learning About Gene Regulatory Networks From Gene Deletion Experiments

    PubMed Central

    Brazma, Alvis

    2002-01-01

    Gene regulatory networks are a major focus of interest in molecular biology. A crucial question is how complex regulatory systems are encoded and controlled by the genome. Three recent publications have raised the question of what can be learned about gene regulatory networks from microarray experiments on gene deletion mutants. Using this indirect approach, topological features such as connectivity and modularity have been studied. PMID:18629255

  10. The mouse gene for vascular endothelial growth factor. Genomic structure, definition of the transcriptional unit, and characterization of transcriptional and post-transcriptional regulatory sequences.

    PubMed

    Shima, D T; Kuroki, M; Deutsch, U; Ng, Y S; Adamis, A P; D'Amore, P A

    1996-02-16

    We describe the genomic organization and functional characterization of the mouse gene encoding vascular endothelial growth factor (VEGF), a polypeptide implicated in embryonic vascular development and postnatal angiogenesis. The coding region for mouse VEGF is interrupted by seven introns and encompasses approximately 14 kilobases. Organization of exons suggests that, similar to the human VEGF gene, alternative splicing generates the 120-, 164-, and 188-amino acid isoforms, but does not predict a fourth VEGF isoform corresponding to human VEGF206. Approximately 1.2 kilobases of 5'-flanking region have been sequenced, and primer extension analysis identified a single major transcription initiation site, notably lacking TATA or CCAT consensus sequences. The 5'-flanking region is sufficient to promote a 7-fold induction of basal transcription. The genomic region encoding the 3'-untranslated region was determined by Northern and nuclease mapping analysis. Investigation of mRNA sequences responsible for the rapid turnover of VEGF mRNA (mRNA half-life, <1 h) (Shima, D. T., Deutsch, U., and D'Amore, P. A. (1995) FEBS Lett. 370, 203-208) revealed that the 3'-untranslated region was sufficient to trigger the rapid turnover of a normally long-lived reporter mRNA in vitro. These data and reagents will allow the molecular and genetic analysis of mechanisms that control the developmental and pathological expression of VEGF.

  11. On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

    NASA Astrophysics Data System (ADS)

    Tarpine, Ryan; Istrail, Sorin

    The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.
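
    The notion of a cis-module that "reads" a regulatory state and "processes" it into an output can be pictured as a Boolean function over the set of active transcription factors. The Python sketch below is only an illustration of that idea: the factor names and the AND-NOT logic are hypothetical and are not taken from the CYRENE project or the cis-Lexicon.

```python
def make_and_not_module(activators, repressor):
    """Build a hypothetical cis-module that is ON only when every activator
    is present in the regulatory state and the repressor is absent."""
    def module(state):
        return all(tf in state for tf in activators) and repressor not in state
    return module


# A regulatory state is modeled as the set of factors active at the same time.
# The factor names are made up for illustration.
module = make_and_not_module(activators={"TF_A", "TF_B"}, repressor="TF_R")

print(module(frozenset({"TF_A", "TF_B"})))          # True
print(module(frozenset({"TF_A"})))                  # False
print(module(frozenset({"TF_A", "TF_B", "TF_R"})))  # False
```

    Inferring which such function a real module implements from experimental data is the harder task that the abstract calls the Logic Function Inference Problem.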

  12. A CRE/ATF-like site in the upstream regulatory sequence of the human interleukin 1 beta gene is necessary for induction in U937 and THP-1 monocytic cell lines.

    PubMed Central

    Gray, J G; Chandra, G; Clay, W C; Stinnett, S W; Haneline, S A; Lorenz, J J; Patel, I R; Wisely, G B; Furdon, P J; Taylor, J D

    1993-01-01

    Transfection of U937 and THP-1 cells with a recombinant plasmid, pIL1(4.0kb)-CAT, containing 4 kb of the interleukin 1 beta (IL-1 beta) gene upstream regulatory sequence resulted in inducer-dependent expression of chloramphenicol acetyltransferase activity. Treatment of the transfected cells with various combinations of the inducers lipopolysaccharide, phorbol myristate acetate, and dibutyryl cyclic AMP upregulated the IL-1 beta promoter. In U937 and THP-1 cells, maximum stimulation of both the endogenous IL-1 beta gene and pIL1(4.0kb)-CAT transfectants was observed following treatment with the combination of inducing agents lipopolysaccharide-phorbol myristate acetate-dibutyryl cyclic AMP. This combination of inducing agents was used to identify and study, at the molecular level, some of the regulatory elements necessary for induction of the IL-1 beta gene. A series of 5' deletion derivatives of the upstream regulatory sequence were used in transient transfection assays to identify an 80-bp fragment located between -2720 and -2800 bp upstream of the mRNA start site that was required for induction. Exonuclease III mapping, electrophoretic mobility shift assays (EMSA), and DNA sequence analysis of this region were used to identify a transcription factor binding sequence which contained a potential cyclic AMP response element (CRE/ATF)- and NF-kappa B-like binding site. Site-directed mutagenesis of the CRE/ATF-like site resulted in the loss of binding of a specific factor or factors as determined by EMSA. The loss of binding activity directly correlated with a loss of approximately 75% of promoter activity as determined in transient transfection assays. As determined by EMSA, the factor binding to the CRE/ATF-like site was present in nuclear extracts prepared from both uninduced and induced THP-1 and U937 cells. However, the intensity of the band appeared to be increased when nuclear extracts from induced cells were used. In contrast to the CRE/ATF mutation, which

  13. Multiple regulatory mechanisms of hepatocyte growth factor expression in malignant cells with a short poly(dA) sequence in the HGF gene promoter.

    PubMed

    Sakai, Kazuko; Takeda, Masayuki; Okamoto, Isamu; Nakagawa, Kazuhiko; Nishio, Kazuto

    2015-01-01

    Hepatocyte growth factor (HGF) expression is a poor prognostic factor in various types of cancer. Expression levels of HGF have been reported to be regulated by shorter poly(dA) sequences in the promoter region. In the present study, the poly(dA) mononucleotide tract in various types of human cancer cell lines was examined and compared with the HGF expression levels in those cells. Short deoxyadenosine repeat sequences were detected in five of the 55 cell lines used in the present study. The H69, IM95, CCK-81, Sui73 and H28 cells exhibited a truncated poly(dA) sequence in which the number of poly(dA) repeats was reduced by ≥5 bp. Two of the cell lines exhibited high HGF expression, determined by reverse transcription quantitative polymerase chain reaction and enzyme-linked immunosorbent assay. The CCK-81, Sui73 and H28 cells with shorter poly(dA) sequences exhibited low HGF expression. The cause of the suppression of HGF expression in the CCK-81, Sui73 and H28 cells was clarified by two approaches, suppression by methylation and single nucleotide polymorphisms in the HGF gene. Exposure to 5-Aza-dC, an inhibitor of DNA methyltransferase 1, induced an increased expression of HGF in the CCK-81 cells, but not in the other cells. Single-nucleotide polymorphism (SNP) rs72525097 in intron 1 was detected in the Sui73 and H28 cells. Taken together, it was found that the defect of poly(dA) in the HGF promoter was present in various types of cancer, including lung, stomach, colorectal, pancreas and mesothelioma. The present study proposes the negative regulation mechanisms by methylation and SNP in intron 1 of HGF for HGF expression in cancer cells with short poly(dA).
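
    The truncation criterion in this abstract (a poly(dA) tract shortened by at least 5 bp) can be expressed as a short Python sketch. The helper names, the example sequences, and the 30-base reference length are assumptions for illustration; the abstract does not give the reference tract length or the method used to measure it.

```python
import re


def longest_poly_a(seq):
    """Length of the longest uninterrupted run of A in a sequence (0 if none)."""
    runs = re.findall(r"A+", seq.upper())
    return max((len(run) for run in runs), default=0)


def is_truncated(seq, reference_length, min_loss=5):
    """Flag a tract shortened by at least min_loss bases relative to a reference."""
    return reference_length - longest_poly_a(seq) >= min_loss


# Made-up sequences; the 30-base reference tract is an assumed value.
reference = "GC" + "A" * 30 + "TG"
sample = "GC" + "A" * 24 + "TG"
print(longest_poly_a(sample))                            # 24
print(is_truncated(sample, longest_poly_a(reference)))   # True
```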

  14. Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome

    PubMed Central

    Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

    2016-01-01

    Accelerated evolution of regulatory sequence can alter the expression pattern of target genes, and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes. PMID:27401230

  15. Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome.

    PubMed

    Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

    2016-10-01

    Accelerated evolution of regulatory sequence can alter the expression pattern of target genes, and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes.

  16. Toucan: deciphering the cis-regulatory logic of coregulated genes

    PubMed Central

    Aerts, Stein; Thijs, Gert; Coessens, Bert; Staes, Mik; Moreau, Yves; De Moor, Bart

    2003-01-01

    TOUCAN is a Java application for the rapid discovery of significant cis-regulatory elements from sets of coexpressed or coregulated genes. Biologists can automatically (i) retrieve genes and intergenic regions, (ii) identify putative regulatory regions, (iii) score sequences for known transcription factor binding sites, (iv) identify candidate motifs for unknown binding sites, and (v) detect those statistically over-represented sites that are characteristic for a gene set. Genes or intergenic regions are retrieved from Ensembl or EMBL, together with orthologs and supporting information. Orthologs are aligned and syntenic regions are selected as candidate regulatory regions. Putative sites for known transcription factors are detected using our MotifScanner, which scores position weight matrices using a probabilistic model. New motifs are detected using our MotifSampler based on Gibbs sampling. Binding sites characteristic for a gene set, and thus statistically over-represented with respect to a reference sequence set, are found using a binomial test. We have validated Toucan by analyzing muscle-specific genes, liver-specific genes and E2F target genes; we have easily detected many known binding sites within intergenic DNA and identified new biologically plausible sites for known and unknown transcription factors. Software available at http://www.esat.kuleuven.ac.be/~dna/BioI/Software.html. PMID:12626717
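
    The over-representation step described in this abstract relies on a binomial test. The Python sketch below shows one plausible form of such a test; it is not Toucan's implementation (Toucan is a Java application), and the function names, the one-sided formulation, and the counts are assumptions made for illustration.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def overrepresentation_pvalue(hits_in_set, set_size, hits_in_reference, reference_size):
    """One-sided binomial p-value for a site being enriched in a gene set,
    using the reference-set frequency as the null probability (assumed setup)."""
    p_background = hits_in_reference / reference_size
    return binomial_upper_tail(hits_in_set, set_size, p_background)


# Made-up counts: 8 of 10 sequences in the set carry the site,
# versus 100 of 1000 sequences in the reference set.
print(overrepresentation_pvalue(8, 10, 100, 1000))
```

    A small p-value indicates that the site occurs in the gene set more often than the reference frequency would predict; when many sites are tested, some correction for multiple testing would normally be applied as well.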

  17. A Xenopus laevis gene encoding EF-1 alpha S, the somatic form of elongation factor 1 alpha: sequence, structure, and identification of regulatory elements required for embryonic transcription.

    PubMed

    Johnson, A D; Krieg, P A

    1995-01-01

    Transcription of the Xenopus laevis EF-1 alpha S gene commences at the mid-blastula stage of embryonic development and then continues constitutively in all somatic tissues. The EF-1 alpha S promoter is extremely active in the early Xenopus embryo where EF-1 alpha S transcripts account for as much as 40% of all new polyadenylated transcripts. We have isolated the Xenopus EF-1 alpha S gene and used microinjection techniques to identify promoter elements responsible for embryonic transcription. These in vivo expression studies have identified an enhancer fragment, located approximately 4.4 kb upstream of the transcription start site, that is required for maximum expression from the EF-1 alpha S promoter. The enhancer fragment contains both an octamer and a G/C box sequence, but mutation studies indicate that the octamer plays no significant role in regulation of EF-1 alpha S expression in the embryo. The presence of a G/C element in the enhancer and of multiple G/C boxes in the proximal promoter region suggests that the G/C box binding protein, Sp1, plays a major role in the developmental regulation of EF-1 alpha S promoter activity. PMID:8565334

  18. Formation of Regulatory Modules by Local Sequence Duplication

    PubMed Central

    Nourmohammad, Armita; Lässig, Michael

    2011-01-01

    Turnover of regulatory sequence and function is an important part of molecular evolution. But what are the modes of sequence evolution leading to rapid formation and loss of regulatory sites? Here we show that a large fraction of neighboring transcription factor binding sites in the fly genome have formed from a common sequence origin by local duplications. This mode of evolution is found to produce regulatory information: duplications can seed new sites in the neighborhood of existing sites. Duplicate seeds evolve subsequently by point mutations, often towards binding a different factor than their ancestral neighbor sites. These results are based on a statistical analysis of 346 cis-regulatory modules in the Drosophila melanogaster genome, and a comparison set of intergenic regulatory sequence in Saccharomyces cerevisiae. In fly regulatory modules, pairs of binding sites show significantly enhanced sequence similarity up to distances of about 50 bp. We analyze these data in terms of an evolutionary model with two distinct modes of site formation: (i) evolution from independent sequence origin and (ii) divergent evolution following duplication of a common ancestor sequence. Our results suggest that pervasive formation of binding sites by local sequence duplications distinguishes the complex regulatory architecture of higher eukaryotes from the simpler architecture of unicellular organisms. PMID:21998564
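
    The distance-dependent similarity signal described in this abstract can be illustrated with a minimal Python sketch that compares pairs of equal-length sites and averages their similarity separately for nearby and distant pairs. This is not the authors' statistical analysis: the Hamming-based similarity measure, the function names, the example sites, and the use of 50 bp as a hard cutoff are all simplifying assumptions.

```python
from itertools import combinations


def hamming_similarity(a, b):
    """Fraction of identical positions between two equal-length sites."""
    if len(a) != len(b):
        raise ValueError("sites must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_by_distance(sites, max_distance=50):
    """Mean pairwise similarity for site pairs within and beyond max_distance.

    sites is a list of (start_position, sequence) tuples.
    """
    near, far = [], []
    for (pos_a, seq_a), (pos_b, seq_b) in combinations(sites, 2):
        bucket = near if abs(pos_a - pos_b) <= max_distance else far
        bucket.append(hamming_similarity(seq_a, seq_b))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(near), mean(far)


# Made-up sites, for illustration only.
sites = [(100, "ACGTACGT"), (120, "ACGTACGA"), (400, "TTGTCCAA")]
print(similarity_by_distance(sites))  # (0.875, 0.4375)
```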

  19. Formation of regulatory modules by local sequence duplication.

    PubMed

    Nourmohammad, Armita; Lässig, Michael

    2011-10-01

    Turnover of regulatory sequence and function is an important part of molecular evolution. But what are the modes of sequence evolution leading to rapid formation and loss of regulatory sites? Here we show that a large fraction of neighboring transcription factor binding sites in the fly genome have formed from a common sequence origin by local duplications. This mode of evolution is found to produce regulatory information: duplications can seed new sites in the neighborhood of existing sites. Duplicate seeds evolve subsequently by point mutations, often towards binding a different factor than their ancestral neighbor sites. These results are based on a statistical analysis of 346 cis-regulatory modules in the Drosophila melanogaster genome, and a comparison set of intergenic regulatory sequence in Saccharomyces cerevisiae. In fly regulatory modules, pairs of binding sites show significantly enhanced sequence similarity up to distances of about 50 bp. We analyze these data in terms of an evolutionary model with two distinct modes of site formation: (i) evolution from independent sequence origin and (ii) divergent evolution following duplication of a common ancestor sequence. Our results suggest that pervasive formation of binding sites by local sequence duplications distinguishes the complex regulatory architecture of higher eukaryotes from the simpler architecture of unicellular organisms.

  20. Genetic analysis of bristle loss in hybrids between Drosophila melanogaster and D. simulans provides evidence for divergence of cis-regulatory sequences in the achaete-scute gene complex.

    PubMed

    Skaer, N; Simpson, P

    2000-05-01

    The two closely related species of Drosophila, D. melanogaster and D. simulans, display an identical bristle pattern on the notum, but hybrids between the two are lacking a variable number of bristles. We show that the loss is temperature-dependent and provide evidence for two periods of temperature sensitivity. A first period of heat sensitivity occurs during larval development and corresponds to the time when the prepattern of expression of genes whose products activate achaete-scute in the proneural clusters preceding bristle precursor formation is established. A second period of cold sensitivity corresponds to the time of emergence of the bristle precursor cells and the maintenance of their neural fate, a process requiring high levels of Achaete-Scute. Expression of achaete-scute at these two critical periods depends on cis-regulatory elements of the achaete-scute complex (AS-C). The differences between males, which have only one copy of the X-linked AS-C from D. simulans, and females, which have copies from both parental species, are compared, together with the effects of crossing in different rearrangements of the D. melanogaster AS-C that delete regulatory and/or coding sequences. We provide evidence that bristle loss in the hybrids may result from a decrease in the level of transcription at the AS-C and argue that interaction between trans-acting factors and cis-regulatory elements within the AS-C has diverged between the two species.

  1. Dynamic chromatin: the regulatory domain organization of eukaryotic gene loci.

    PubMed

    Bonifer, C; Hecht, A; Saueressig, H; Winter, D M; Sippel, A E

    1991-10-01

    It is hypothesized that nuclear DNA is organized in topologically constrained loop domains defining basic units of higher order chromatin structure. Our studies are performed in order to investigate the functional relevance of this structural subdivision of eukaryotic chromatin for the control of gene expression. We used the chicken lysozyme gene locus as a model to examine the relation between chromatin structure and gene function. Several structural features of the lysozyme locus are known: the extension of the region of general DNAase I sensitivity of the active gene, the location of DNA-sequences with high affinity for the nuclear matrix in vitro, and the position of DNAase I hypersensitive chromatin sites (DHSs). The pattern of DHSs changes depending on the transcriptional status of the gene. Functional studies demonstrated that DHSs mark the position of cis-acting regulatory elements. Additionally, we discovered a novel cis-activity of the border regions of the DNAase I sensitive domain (A-elements). By eliminating the position effect on gene expression usually observed when genes are randomly integrated into the genome after transfection, A-elements possibly serve as punctuation marks for a regulatory chromatin domain. Experiments using transgenic mice confirmed that the complete structurally defined lysozyme gene domain behaves as an independent regulatory unit, expressing the gene in a tissue specific and position independent manner. These expression features were lost in transgenic mice carrying a construct, in which the A-elements as well as an upstream enhancer region were deleted, indicating the lack of a locus activation function on this construct. Experiments are designed in order to uncover possible hierarchical relationships between the different cis-acting regulatory elements for stepwise gene activation during cell differentiation. We are aiming at the definition of the basic structural and functional requirements for position independent and high

  3. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    PubMed Central

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  4. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence.

    PubMed

    Gordon, Kacy L; Arthur, Robert K; Ruvinsky, Ilya

    2015-05-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.

  5. Combinatorial Gene Regulatory Functions Underlie Ultraconserved Elements in Drosophila

    PubMed Central

    Warnefors, Maria; Hartmann, Britta; Thomsen, Stefan; Alonso, Claudio R.

    2016-01-01

    Ultraconserved elements (UCEs) are discrete genomic elements conserved across large evolutionary distances. Although UCEs have been linked to multiple facets of mammalian gene regulation their extreme evolutionary conservation remains largely unexplained. Here, we apply a computational approach to investigate this question in Drosophila, exploring the molecular functions of more than 1,500 UCEs shared across the genomes of 12 Drosophila species. Our data indicate that Drosophila UCEs are hubs for gene regulatory functions and suggest that UCE sequence invariance originates from their combinatorial roles in gene control. We also note that the gene regulatory roles of intronic and intergenic UCEs (iUCEs) are distinct from those found in exonic UCEs (eUCEs). In iUCEs, transcription factor (TF) and epigenetic factor binding data strongly support iUCE roles in transcriptional and epigenetic regulation. In contrast, analyses of eUCEs indicate that they are two orders of magnitude more likely than the expected to simultaneously include protein-coding sequence, TF-binding sites, splice sites, and RNA editing sites but have reduced roles in transcriptional or epigenetic regulation. Furthermore, we use a Drosophila cell culture system and transgenic Drosophila embryos to validate the notion of UCE combinatorial regulatory roles using an eUCE within the Hox gene Ultrabithorax and show that its protein-coding region also contains alternative splicing regulatory information. Taken together our experiments indicate that UCEs emerge as a result of combinatorial gene regulatory roles and highlight common features in mammalian and insect UCEs implying that similar processes might underlie ultraconservation in diverse animal taxa. PMID:27247329

  6. Combinatorial Gene Regulatory Functions Underlie Ultraconserved Elements in Drosophila.

    PubMed

    Warnefors, Maria; Hartmann, Britta; Thomsen, Stefan; Alonso, Claudio R

    2016-09-01

    Ultraconserved elements (UCEs) are discrete genomic elements conserved across large evolutionary distances. Although UCEs have been linked to multiple facets of mammalian gene regulation their extreme evolutionary conservation remains largely unexplained. Here, we apply a computational approach to investigate this question in Drosophila, exploring the molecular functions of more than 1,500 UCEs shared across the genomes of 12 Drosophila species. Our data indicate that Drosophila UCEs are hubs for gene regulatory functions and suggest that UCE sequence invariance originates from their combinatorial roles in gene control. We also note that the gene regulatory roles of intronic and intergenic UCEs (iUCEs) are distinct from those found in exonic UCEs (eUCEs). In iUCEs, transcription factor (TF) and epigenetic factor binding data strongly support iUCE roles in transcriptional and epigenetic regulation. In contrast, analyses of eUCEs indicate that they are two orders of magnitude more likely than the expected to simultaneously include protein-coding sequence, TF-binding sites, splice sites, and RNA editing sites but have reduced roles in transcriptional or epigenetic regulation. Furthermore, we use a Drosophila cell culture system and transgenic Drosophila embryos to validate the notion of UCE combinatorial regulatory roles using an eUCE within the Hox gene Ultrabithorax and show that its protein-coding region also contains alternative splicing regulatory information. Taken together our experiments indicate that UCEs emerge as a result of combinatorial gene regulatory roles and highlight common features in mammalian and insect UCEs implying that similar processes might underlie ultraconservation in diverse animal taxa. PMID:27247329

  7. Differences in regulatory sequences of naturally occurring JC virus variants.

    PubMed Central

    Martin, J D; King, D M; Slauch, J M; Frisque, R J

    1985-01-01

    The regulatory region was sequenced for DNAs representative of seven independent isolates of JC virus, the probable agent of progressive multifocal leukoencephalopathy. The isolates included an oncogenic variant (MAD-4), an antigenic variant (MAD-11), and two different isolates derived from the urine (MAD-7) and from the brain (MAD-8) of the same patient. The representative DNAs were molecularly cloned directly from diseased brain tissue and from human fetal glial cells infected with the corresponding isolated viruses. The regulatory sequences of these DNAs were compared with those of the prototype isolate, MAD-1, sequenced previously (R. J. Frisque, J. Virol. 46:170-176, 1983). We found that the regulatory region of JC viral DNA is highly variable due to complex alterations of the previously described 98-base-pair repeat of MAD-1 DNA. On the basis of these alterations, there are two general types of JC virus. There were no consistent alterations in regulatory sequences which could distinguish brain tissue DNAs from tissue culture DNAs. Furthermore, for each isolate except MAD-1 (R. J. Frisque, J. Virol. 46:170-176, 1983), the regulatory regions of brain tissue and tissue culture DNAs were not identical. The arrangement, sequence, or both of potential regulatory elements (TATA sequence, GGGXGGPuPu, tandem repeats) of JC viral DNAs are sufficiently different from those in other viral and eucaryotic systems that they may effect the unique properties of this slow virus. PMID:2981353

  8. Massive contribution of transposable elements to mammalian regulatory sequences.

    PubMed

    Rayan, Nirmala Arul; Del Rosario, Ricardo C H; Prabhakar, Shyam

    2016-09-01

    Barbara McClintock discovered the existence of transposable elements (TEs) in the late 1940s and initially proposed that they contributed to the gene regulatory program of higher organisms. This controversial idea gained acceptance only much later in the 1990s, when the first examples of TE-derived promoter sequences were uncovered. It is now known that half of the human genome is recognizably derived from TEs. It is thus important to understand the scope and nature of their contribution to gene regulation. Here, we provide a timeline of major discoveries in this area and discuss how transposons have revolutionized our understanding of mammalian genomes, with a special emphasis on the massive contribution of TEs to primate evolution. Our analysis of primate-specific functional elements supports a simple model for the rate at which new functional elements arise in unique and TE-derived DNA. Finally, we discuss some of the challenges and unresolved questions in the field, which need to be addressed in order to fully characterize the impact of TEs on gene regulation, evolution and disease processes. PMID:27174439

  10. RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

    PubMed

    Brule, C E; Dean, K M; Grayhack, E J

    2016-01-01

    The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences.
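    As a rough illustration of the readout described above, the sketch below computes per-cell GFP/RFP ratios and compares a variant with a control reporter. The abstract does not specify the summary statistic or normalization; the use of the median, the function names, and the toy intensities are assumptions made for this example.

```python
from statistics import median


def gfp_rfp_ratio(cells):
    """Per-cell GFP/RFP ratios; cells with non-positive RFP are skipped."""
    return [gfp / rfp for gfp, rfp in cells if rfp > 0]


def relative_expression(variant_cells, control_cells):
    """Median GFP/RFP of a variant divided by that of a control reporter."""
    return median(gfp_rfp_ratio(variant_cells)) / median(gfp_rfp_ratio(control_cells))


# Toy (GFP, RFP) intensities per cell; values are invented for illustration.
control = [(1000.0, 500.0), (1200.0, 600.0), (900.0, 450.0)]
variant = [(400.0, 500.0), (500.0, 500.0), (300.0, 600.0)]
rel = relative_expression(variant, control)
```

    Dividing by RFP in each cell is what lets the RFP control absorb cell-to-cell differences in expression from the shared promoter, so that the remaining difference can be attributed to the inserted sequence.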

  11. Phenotypic switching in gene regulatory networks.

    PubMed

    Thomas, Philipp; Popović, Nikola; Grima, Ramon

    2014-05-13

    Noise in gene expression can lead to reversible phenotypic switching. Several experimental studies have shown that the abundance distributions of proteins in a population of isogenic cells may display multiple distinct maxima. Each of these maxima may be associated with a subpopulation of a particular phenotype, the quantification of which is important for understanding cellular decision-making. Here, we devise a methodology which allows us to quantify multimodal gene expression distributions and single-cell power spectra in gene regulatory networks. Extending the commonly used linear noise approximation, we rigorously show that, in the limit of slow promoter dynamics, these distributions can be systematically approximated as a mixture of Gaussian components in a wide class of networks. The resulting closed-form approximation provides a practical tool for studying complex nonlinear gene regulatory networks that have thus far been amenable only to stochastic simulation. We demonstrate the applicability of our approach in a number of genetic networks, uncovering previously unidentified dynamical characteristics associated with phenotypic switching. Specifically, we elucidate how the interplay of transcriptional and translational regulation can be exploited to control the multimodality of gene expression distributions in two-promoter networks. We demonstrate how phenotypic switching leads to birhythmical expression in a genetic oscillator, and to hysteresis in phenotypic induction, thus highlighting the ability of regulatory networks to retain memory. PMID:24782538
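    The "mixture of Gaussian components" mentioned above can be written schematically as follows. The abstract gives no formula, so the notation here is an assumption: n is the protein abundance, M the number of components, and the weights, means, and variances are placeholders for quantities the paper derives.

```latex
P(n) \;\approx\; \sum_{i=1}^{M} \pi_i \, \mathcal{N}\!\left(n;\, \mu_i,\, \sigma_i^{2}\right),
\qquad \pi_i \ge 0, \qquad \sum_{i=1}^{M} \pi_i = 1 .
```

    Under this reading, each maximum of a multimodal distribution corresponds to one component with mean \mu_i, and the weight \pi_i gives the fraction of the population in that phenotype.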

  12. Latent phenotypes pervade gene regulatory circuits

    PubMed Central

    2014-01-01

    Background Latent phenotypes are non-adaptive byproducts of adaptive phenotypes. They exist in biological systems as different as promiscuous enzymes and genome-scale metabolic reaction networks, and can give rise to evolutionary adaptations and innovations. We know little about their prevalence in the gene expression phenotypes of regulatory circuits, important sources of evolutionary innovations. Results Here, we study a space of more than sixteen million three-gene model regulatory circuits, where each circuit is represented by a genotype, and has one or more functions embodied in one or more gene expression phenotypes. We find that the majority of circuits with single functions have latent expression phenotypes. Moreover, the set of circuits with a given spectrum of functions has a repertoire of latent phenotypes that is much larger than that of any one circuit. Most of this latent repertoire can be easily accessed through a series of small genetic changes that preserve a circuit’s main functions. Both circuits and gene expression phenotypes that are robust to genetic change are associated with a greater number of latent phenotypes. Conclusions Our observations suggest that latent phenotypes are pervasive in regulatory circuits, and may thus be an important source of evolutionary adaptations and innovations involving gene regulation. PMID:24884746

  13. TOUCAN 2: the all-inclusive open source workbench for regulatory sequence analysis

    PubMed Central

    Aerts, Stein; Van Loo, Peter; Thijs, Gert; Mayer, Herbert; de Martin, Rainer; Moreau, Yves; De Moor, Bart

    2005-01-01

    We present the second and improved release of the TOUCAN workbench for cis-regulatory sequence analysis. TOUCAN implements and integrates fast state-of-the-art methods and strategies in gene regulation bioinformatics, including algorithms for comparative genomics and for the detection of cis-regulatory modules. This second release of TOUCAN has become open source and thereby carries the potential to evolve rapidly. The main goal of TOUCAN is to allow a user to come to testable hypotheses regarding the regulation of a gene or of a set of co-regulated genes. TOUCAN can be launched from this location: . PMID:15980497

  14. Computational Genomics: From Genome Sequence To Global Gene Regulation

    NASA Astrophysics Data System (ADS)

    Li, Hao

    2000-03-01

    As various genome projects are shifting to the post-sequencing phase, it becomes a big challenge to analyze the sequence data and extract biological information using computational tools. In the past, computational genomics has mainly focused on finding new genes and mapping out their biological functions. With the rapid accumulation of experimental data on genome-wide gene activities, it is now possible to understand how genes are regulated on a genomic scale. A major mechanism for gene regulation is to control the level of transcription, which is achieved by regulatory proteins that bind to short DNA sequences - the regulatory elements. We have developed a new approach to identifying regulatory elements in genomes. The approach formalizes how one would proceed to decipher a "text" consisting of a long string of letters written in an unknown language that did not delineate words. The algorithm is based on a statistical mechanics model in which the sequence is segmented probabilistically into "words" and a "dictionary" of "words" is built concurrently. For the control regions in the yeast genome, we built a "dictionary" of about one thousand words which includes many known as well as putative regulatory elements. I will discuss how we can use this dictionary to search for genes that are likely to be regulated in a similar fashion and to analyze gene expression data generated from DNA micro-array experiments.
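    The probabilistic segmentation step can be sketched with a small dynamic program. This is a minimal illustration, not the author's algorithm: it takes a fixed dictionary with given word probabilities (the toy dictionary and function names are hypothetical) and omits the part of the method that builds the dictionary concurrently.

```python
import math


def segment_likelihood(seq, dictionary):
    """Total probability of `seq`, summed over all segmentations into words.

    `dictionary` maps each word to its probability. z[end] accumulates the
    probability of all ways to segment the prefix seq[:end].
    """
    max_len = max(len(w) for w in dictionary)
    z = [0.0] * (len(seq) + 1)
    z[0] = 1.0
    for end in range(1, len(seq) + 1):
        for k in range(1, min(max_len, end) + 1):
            p = dictionary.get(seq[end - k:end])
            if p:
                z[end] += z[end - k] * p
    return z[-1]


def best_segmentation(seq, dictionary):
    """Single most probable segmentation (Viterbi-style), or [] if none exists."""
    max_len = max(len(w) for w in dictionary)
    best = [-math.inf] * (len(seq) + 1)
    back = [0] * (len(seq) + 1)
    best[0] = 0.0
    for end in range(1, len(seq) + 1):
        for k in range(1, min(max_len, end) + 1):
            p = dictionary.get(seq[end - k:end])
            if p and best[end - k] + math.log(p) > best[end]:
                best[end] = best[end - k] + math.log(p)
                back[end] = k
    if best[-1] == -math.inf:
        return []
    words, end = [], len(seq)
    while end > 0:  # walk the back-pointers from the end of the sequence
        k = back[end]
        words.append(seq[end - k:end])
        end -= k
    return words[::-1]


# Toy dictionary: four single letters plus one longer "word".
toy_dictionary = {"A": 0.2, "C": 0.2, "G": 0.2, "T": 0.2, "TATA": 0.2}
total = segment_likelihood("GTATAC", toy_dictionary)
words = best_segmentation("GTATAC", toy_dictionary)
```

    In a full method of this kind, the word probabilities would themselves be re-estimated from the segmentation likelihood, which is presumably how the dictionary is built alongside the segmentation.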

  15. Gene regulatory networks and the underlying biology of developmental toxicity

    EPA Science Inventory

    Embryonic cells are specified by large-scale networks of functionally linked regulatory genes. Knowledge of the relevant gene regulatory networks is essential for understanding phenotypic heterogeneity that emerges from disruption of molecular functions, cellular processes or sig...

  16. Autonomous Boolean modeling of gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Socolar, Joshua; Sun, Mengyang; Cheng, Xianrui

    2014-03-01

    In cases where the dynamical properties of gene regulatory networks are important, a faithful model must include three key features: a network topology; a functional response of each element to its inputs; and timing information about the transmission of signals across network links. Autonomous Boolean network (ABN) models are efficient representations of these elements and are amenable to analysis. We present an ABN model of the gene regulatory network governing cell fate specification in the early sea urchin embryo, which must generate three bands of distinct tissue types after several cell divisions, beginning from an initial condition with only two distinct cell types. Analysis of the spatial patterning problem and the dynamics of a network constructed from available experimental results reveals that a simple mechanism is at work in this case. Supported by NSF Grant DMS-10-68602
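    The three ingredients listed above (topology, a logic function per element, and link timing) can be illustrated with a toy simulation. The sketch below is a discrete-time simplification with integer link delays, not the ABN formalism or the sea urchin model itself; the network encoding, function names, and the single self-repressing gene are hypothetical.

```python
def simulate(network, initial, steps):
    """Simulate a Boolean network whose links carry integer delays (>= 1).

    `network` maps node -> (update_fn, [(input_node, delay), ...]).
    Inputs requested from before t = 0 fall back to the initial state.
    Returns the list of states for t = 0 .. steps.
    """
    history = [dict(initial)]
    for t in range(1, steps + 1):
        state = {}
        for node, (update_fn, inputs) in network.items():
            args = [history[max(t - delay, 0)][src] for src, delay in inputs]
            state[node] = update_fn(*args)
        history.append(state)
    return history


# Toy circuit: gene x represses itself through a link with delay 2.
network = {"x": (lambda x: not x, [("x", 2)])}
history = simulate(network, {"x": False}, steps=7)
trajectory = [state["x"] for state in history]
```

    With these settings the delayed negative feedback makes x oscillate with a period of four steps, which shows why timing information on the links matters in addition to the wiring and the logic.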

  17. Thermodynamics-based models of transcriptional regulation with gene sequence.

    PubMed

    Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing

    2015-12-01

    Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous time, differential equation description of transcriptional dynamics. The sequence features of the promoter are exploited to derive the binding affinity, based on statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model yields results that make more biological sense.
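    A minimal sketch of how a thermodynamic binding term can feed a differential equation for transcription is given below. It is not the authors' model: the single activator site, the equilibrium occupancy formula, the parameter names (k_assoc, alpha, delta), and the Euler integration are all assumptions chosen for illustration.

```python
def occupancy(tf_conc, k_assoc):
    """Equilibrium probability that a single site is bound by the factor."""
    return k_assoc * tf_conc / (1.0 + k_assoc * tf_conc)


def simulate_mrna(tf_conc, k_assoc, alpha, delta, m0=0.0, dt=0.01, steps=5000):
    """Euler integration of dm/dt = alpha * occupancy - delta * m."""
    p_bound = occupancy(tf_conc, k_assoc)
    m = m0
    for _ in range(steps):
        m += dt * (alpha * p_bound - delta * m)
    return m


# Toy parameters: half-occupied site, production rate 2, degradation rate 1.
m_final = simulate_mrna(tf_conc=1.0, k_assoc=1.0, alpha=2.0, delta=1.0)
```

    For constant inputs the mRNA level relaxes to alpha * occupancy / delta, so fitting such a model to expression data amounts to estimating the binding and rate parameters.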

  18. LFG: a candidate apoptosis regulatory gene family.

    PubMed

    Hu, Lan; Smith, Temple F; Goldberger, Gabriel

    2009-11-01

    The expanding wealth of genomic data from human, model and other organisms has allowed the identification of a distinct gene family of apoptosis-related genes. Most of these genes are currently unannotated or have been subsumed under two questionably related gene families in the past. For example, the transmembrane Bax inhibitor 1 (BI1) motif family has been reported to play a role in apoptosis and to consist of at least seven mammalian protein genes, GRINA, BI1, Lfg/FAIM2, Ghitm, RESC1/Tmbim1, GAAP/Tmbim4, and Tmbm1b. However, a detailed sequence and phylogenetic analysis shows that only five of these form a clear and unique protein family. This now provides information for understanding and investigating the biological roles of these proteins across a wide range of tissues in model organisms. The evolutionary relationships among these genes provide a powerful perspective for extrapolating to human conditions.

  19. INCLUSive: a web portal and service registry for microarray and regulatory sequence analysis

    PubMed Central

    Coessens, Bert; Thijs, Gert; Aerts, Stein; Marchal, Kathleen; De Smet, Frank; Engelen, Kristof; Glenisson, Patrick; Moreau, Yves; Mathys, Janick; De Moor, Bart

    2003-01-01

    INCLUSive is a suite of algorithms and tools for the analysis of gene expression data and the discovery of cis-regulatory sequence elements. The tools allow normalization, filtering and clustering of microarray data, functional scoring of gene clusters, sequence retrieval, and detection of known and unknown regulatory elements using probabilistic sequence models and Gibbs sampling. All tools are available via different web pages and as web services. The web pages are connected and integrated to reflect a methodology and facilitate complex analysis using different tools. The web services can be invoked using standard SOAP messaging. Example clients are available for download to invoke the services from a remote computer or to be integrated with other applications. All services are catalogued and described in a web service registry. The INCLUSive web portal is available for academic purposes at http://www.esat.kuleuven.ac.be/inclusive. PMID:12824346

  20. Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes

    SciTech Connect

    Wang, Xuting; Tomso, Daniel J.; Liu, Xuemei; Bell, Douglas A. (E-mail: BELL1@niehs.nih.gov)

    2005-09-01

    Single nucleotide polymorphisms (SNPs) in the human genome are DNA sequence variations that can alter an individual's response to environmental exposure. SNPs in gene coding regions can lead to changes in the biological properties of the encoded protein. In contrast, SNPs in non-coding gene regulatory regions may affect gene expression levels in an allele-specific manner, and these functional polymorphisms represent an important but relatively unexplored class of genetic variation. The main challenge in analyzing these SNPs is a lack of robust computational and experimental methods. Here, we first outline mechanisms by which genetic variation can impact gene regulation, and review recent findings in this area; then, we describe a methodology for bioinformatic discovery and functional analysis of regulatory SNPs in cis-regulatory regions using the assembled human genome sequence and databases on sequence polymorphism and gene expression. Our method integrates SNP and gene databases and uses a set of computer programs that allow us to: (1) select SNPs, from among the >9 million human SNPs in the NCBI dbSNP database, that are similar to cis-regulatory element (RE) consensus sequences; (2) map the selected dbSNP entries to the human genome assembly in order to identify polymorphic REs near gene start sites; (3) prioritize the candidate polymorphic RE containing genes by searching the existing genotype and gene expression data sets. The applicability of this system has been demonstrated through studies on p53 responsive elements and is being extended to additional pathways and environmentally responsive genes.
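    Step (1) of the method above, selecting SNPs that resemble a regulatory element consensus, can be sketched as an allele-by-allele mismatch count. This is an illustration under stated assumptions, not the authors' programs: the function names and toy flanking sequences are hypothetical, and the consensus used is the commonly cited p53 half-site RRRCWWGYYY.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def mismatches(site, consensus):
    """Number of positions where `site` violates the IUPAC consensus."""
    return sum(base not in IUPAC[code] for base, code in zip(site, consensus))


def allele_effect(flank5, alleles, flank3, consensus):
    """For each allele, the fewest mismatches over windows covering the SNP.

    Returns {allele: best_mismatch_count}; the value is None when the
    sequence is shorter than the consensus.
    """
    n = len(consensus)
    snp = len(flank5)  # index of the polymorphic base
    result = {}
    for allele in alleles:
        seq = flank5 + allele + flank3
        best = None
        for start in range(max(0, snp - n + 1), min(snp, len(seq) - n) + 1):
            mm = mismatches(seq[start:start + n], consensus)
            best = mm if best is None else min(best, mm)
        result[allele] = best
    return result


# Toy SNP (C/T) inside a sequence resembling a p53 half-site.
effect = allele_effect("TTGGA", ["C", "T"], "ATGCCCAA", "RRRCWWGYYY")
```

    A SNP whose alleles differ in mismatch count, as in this toy case, would be a candidate polymorphic element for the later mapping and prioritization steps.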

  1. Generation of oscillating gene regulatory network motifs

    NASA Astrophysics Data System (ADS)

    van Dorp, M.; Lannoo, B.; Carlon, E.

    2013-07-01

    Using an improved version of an evolutionary algorithm originally proposed by François and Hakim [Proc. Natl. Acad. Sci. USA 101, 580 (2004), doi:10.1073/pnas.0304532101], we generated small gene regulatory networks in which the concentration of a target protein oscillates in time. These networks may serve as candidates for oscillatory modules to be found in larger regulatory networks and protein interaction networks. The algorithm was run 10^5 times to produce a large set of oscillating modules, which were systematically classified and analyzed. The robustness of the oscillations against variations of the kinetic rates was also determined, to filter out the least robust cases. Furthermore, we show that the set of evolved networks can serve as a database of models whose behavior can be compared to experimentally observed oscillations. The algorithm found three smallest (core) oscillators in which nonlinearities and number of components are minimal. Two of those are two-gene modules: the mixed feedback loop, already discussed in the literature, and an autorepressed gene coupled with a heterodimer. The third one is a single gene module which is competitively regulated by a monomer and a dimer. The evolutionary algorithm also generated larger oscillating networks, which are in part extensions of the three core modules and in part genuinely new modules. The latter includes oscillators which do not rely on feedback induced by transcription factors, but are purely of post-transcriptional type. Analysis of post-transcriptional mechanisms of oscillation may provide useful information for circadian clock research, as recent experiments showed that circadian rhythms are maintained even in the absence of transcription.

  2. [Identification and mapping of cis-regulatory elements within long genomic sequences].

    PubMed

    Akopov, S B; Chernov, I P; Vetchinova, A S; Bulanenkova, S S; Nikolaev, L G

    2007-01-01

    The publication of the human and other metazoan genome sequences opened up the possibility for mapping and analysis of genomic regulatory elements. Unfortunately, experimental data on genomic positions of such sequences as enhancers, silencers, insulators, transcription terminators, and replication origins are very limited, especially at the whole genome level. As most genomic regulatory elements (e.g., enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements in silico is often ambiguous. Therefore, the development of high-throughput experimental approaches for identification and mapping of genomic functional elements is highly desirable. In this review we discuss novel approaches to high-throughput experimental identification of cis-regulatory elements in mammalian genomes, which is a necessary step toward complete genome annotation. PMID:18240562

  3. Synthetic muscle promoters: activities exceeding naturally occurring regulatory sequences

    NASA Technical Reports Server (NTRS)

    Li, X.; Eastman, E. M.; Schwartz, R. J.; Draghia-Akli, R.

    1999-01-01

    Relatively low levels of expression from naturally occurring promoters have limited the use of muscle as a gene therapy target. Myogenic restricted gene promoters display complex organization usually involving combinations of several myogenic regulatory elements. By random assembly of E-box, MEF-2, TEF-1, and SRE sites into synthetic promoter recombinant libraries, and screening of hundreds of individual clones for transcriptional activity in vitro and in vivo, several artificial promoters were isolated whose transcriptional potencies greatly exceed those of natural myogenic and viral gene promoters.

  4. Using qualitative probability in reverse-engineering gene regulatory networks.

    PubMed

    Ibrahim, Zina M; Ngom, Alioune; Tawfik, Ahmed Y

    2011-01-01

    This paper demonstrates the use of qualitative probabilistic networks (QPNs) to aid Dynamic Bayesian Networks (DBNs) in the process of learning the structure of gene regulatory networks from microarray gene expression data. We present a study which shows that QPNs define monotonic relations that are capable of identifying regulatory interactions in a manner that is less susceptible to the many sources of uncertainty that surround gene expression data. Moreover, we construct a model that maps the regulatory interactions of genetic networks to QPN constructs and show its capability in providing a set of candidate regulators for target genes, which is subsequently used to establish a prior structure that the DBN learning algorithm can use and which 1) distinguishes spurious correlations from true regulations, 2) enables the discovery of sets of coregulators of target genes, and 3) results in a more efficient construction of gene regulatory networks. The model is compared to the existing literature using the known gene regulatory interactions of Drosophila melanogaster.

  5. Gene Regulatory Networks Elucidating Huanglongbing Disease Mechanisms

    PubMed Central

    Martinelli, Federico; Reagan, Russell L.; Uratsu, Sandra L.; Phu, My L.; Albrecht, Ute; Zhao, Weixiang; Davis, Cristina E.; Bowman, Kim D.; Dandekar, Abhaya M.

    2013-01-01

    Next-generation sequencing was exploited to gain deeper insight into the response to infection by Candidatus Liberibacter asiaticus (CaLas), especially the immune dysregulation and metabolic dysfunction caused by source-sink disruption. Previous fruit transcriptome data were compared with additional RNA-Seq data in three tissues: immature fruit, and young and mature leaves. Four categories of orchard trees were studied: symptomatic, asymptomatic, apparently healthy, and healthy. Principal component analysis found distinct expression patterns between immature and mature fruits and leaf samples for all four categories of trees. A predicted protein-protein interaction network identified HLB-regulated genes for sugar transporters playing key roles in the overall plant responses. Gene set and pathway enrichment analyses highlight the role of sucrose and starch metabolism in disease symptom development in all tissues. HLB-regulated genes (glucose-phosphate-transporter, invertase, starch-related genes) would likely determine the source-sink relationship disruption. In infected leaves, transcriptomic changes were observed for light reactions genes (downregulation), sucrose metabolism (upregulation), and starch biosynthesis (upregulation). In parallel, symptomatic fruits over-expressed genes involved in photosynthesis, sucrose and raffinose metabolism, and downregulated starch biosynthesis. We visualized gene networks between tissues inducing a source-sink shift. CaLas alters the hormone crosstalk, resulting in weak and ineffective tissue-specific plant immune responses necessary for bacterial clearance. Accordingly, expression of WRKYs (including WRKY70) was higher in fruits than in leaves. Systemic acquired responses were inadequately activated in young leaves, generally considered the sites where most new infections occur. PMID:24086326

  6. Using RSAT to scan genome sequences for transcription factor binding sites and cis-regulatory modules.

    PubMed

    Turatsinze, Jean-Valery; Thomas-Chollier, Morgane; Defrance, Matthieu; van Helden, Jacques

    2008-01-01

    This protocol shows how to detect putative cis-regulatory elements and regions enriched in such elements with the regulatory sequence analysis tools (RSAT) web server (http://rsat.ulb.ac.be/rsat/). The approach applies to known transcription factors, whose binding specificity is represented by position-specific scoring matrices, using the program matrix-scan. The detection of individual binding sites is known to return many false predictions. However, results can be strongly improved by estimating P value, and by searching for combinations of sites (homotypic and heterotypic models). We illustrate the detection of sites and enriched regions with a study case, the upstream sequence of the Drosophila melanogaster gene even-skipped. This protocol is also tested on random control sequences to evaluate the reliability of the predictions. Each task requires a few minutes of computation time on the server. The complete protocol can be executed in about one hour.
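
    The matrix-based scanning idea in this abstract can be sketched in a few lines of Python. This is only an illustration of scoring sequence windows against a position-specific scoring matrix; it is not the RSAT matrix-scan program and it does not estimate P values. The count matrix, the uniform background, the pseudocount, and the threshold are all made-up assumptions, and only the forward strand is scanned.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, background=None, pseudocount=1.0):
    """Convert per-position base counts into log-odds weights (natural log)."""
    background = background or {b: 0.25 for b in BASES}
    pssm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + pseudocount
        pssm.append({
            # Pseudocount is distributed according to the background frequencies.
            b: math.log((column[b] + pseudocount * background[b]) / total / background[b])
            for b in BASES
        })
    return pssm

def scan(sequence, pssm, threshold):
    """Return (start, site, score) for every window scoring at least threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        if any(base not in BASES for base in site):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pssm[i][base] for i, base in enumerate(site))
        if score >= threshold:
            hits.append((start, site, score))
    return hits

# Hypothetical count matrix for a made-up 4-bp motif "TGAC" (not a real factor).
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]
toy_pssm = log_odds_matrix(toy_counts)
hits = scan("AATGACGGTGACTT", toy_pssm, threshold=4.0)
for start, site, score in hits:
    print(start, site, round(score, 2))
```

    With this toy matrix, only exact matches to the motif clear the threshold. A real analysis would use curated matrices and a calibrated score threshold, which is where P-value estimation becomes important.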

  7. Effects of gene regulatory reprogramming on gene expression in human and mouse developing hearts.

    PubMed

    Hsu, Chih-Hao; Ovcharenko, Ivan

    2013-01-01

    Lineage-specific regulatory elements underlie adaptation of species and play a role in disease susceptibility. We compared functionally conserved and lineage-specific enhancers by cross-mapping 5042 human and 6564 mouse heart enhancers. Of these, 79 per cent are lineage-specific, lacking a functional orthologue. Heart enhancers tend to cluster and, commonly, there are multiple heart enhancers in a heart locus providing a regulatory stability to the locus. We observed little cross-clustering, however, between lineage-specific and functionally conserved heart enhancers suggesting regulatory function acquisition and development in loci previously lacking heart activity. We also identified 862 human-specific heart enhancers: 417 featuring sequence conservation with mouse (class II) and 445 with neither sequence nor function conservation (class III). Ninety-eight per cent of class III enhancers were deleted from the mouse genome, and we estimated a similar-sized enhancer gain in the human lineage. Human-specific enhancers display no detectable decrease in the negative selection pressure and are strongly associated with genes partaking in the heart regulatory programmes. The loss of a heart enhancer could be compensated by activity of a redundant heart enhancer; however, we observed redundancy in only 15 per cent of class II and III enhancer loci indicating a large-scale reprogramming of the heart regulatory programme in mammals.

  8. SNPs in putative regulatory regions identified by human mouse comparative sequencing and transcription factor binding site data

    SciTech Connect

    Banerjee, Poulabi; Bahlo, Melanie; Schwartz, Jody R.; Loots, Gabriela G.; Houston, Kathryn A.; Dubchak, Inna; Speed, Terence P.; Rubin, Edward M.

    2002-01-01

    Genome wide disease association analysis using SNPs is being explored as a method for dissecting complex genetic traits and a vast number of SNPs have been generated for this purpose. As there are cost and throughput limitations of genotyping large numbers of SNPs and statistical issues regarding the large number of dependent tests on the same data set, to make association analysis practical it has been proposed that SNPs should be prioritized based on likely functional importance. The most easily identifiable functional SNPs are coding SNPs (cSNPs) and accordingly cSNPs have been screened in a number of studies. SNPs in gene regulatory sequences embedded in noncoding DNA are another class of SNPs suggested for prioritization due to their predicted quantitative impact on gene expression. The main challenge in evaluating these SNPs, in contrast to cSNPs is a lack of robust algorithms and databases for recognizing regulatory sequences in noncoding DNA. Approaches that have been previously used to delineate noncoding sequences with gene regulatory activity include cross-species sequence comparisons and the search for sequences recognized by transcription factors. We combined these two methods to sift through mouse human genomic sequences to identify putative gene regulatory elements and subsequently localized SNPs within these sequences in a 1 Megabase (Mb) region of human chromosome 5q31, orthologous to mouse chromosome 11 containing the Interleukin cluster.

  9. Human and mouse ABCA1 comparative sequencing and transgenesis studies identify regulatory elements

    SciTech Connect

    Qiu, Yang; Cavelier, L.; Chiu, Sally; Rubin, Edward; Cheng, Jan-Fang

    2000-08-01

    The expression of ABCA1, a major participant in apolipoprotein-mediated cholesterol efflux, is highly regulated by a variety of factors including intracellular cholesterol concentration. To analyze its genomic organization and identify those sequences involved in its regulation, we sequenced and compared approximately 200 Kb of orthologous DNA from mice and humans containing the ABCA1 gene and significant flanking DNA. The comparison revealed a variety of mouse human conserved sequences including 50 conserved ABCA1 exons over 147 Kb of human and 124 Kb of mouse genomic DNA as well as multiple mouse human conserved noncoding sequences. As a criterion for identifying putative regulatory elements in noncoding sequence, we screened for human and mouse sequences that were >75% identical for over 120 bp, resulting in the identification of 34 elements. The two most highly conserved human mouse noncoding elements (CNS1: 88% identity over 498 bp, CNS2: 81% identity over 214 bp) were also highly conserved in the ABCA1 genes of rats, dogs, cows, rabbits and pigs. Two independent studies have demonstrated that the DNA segments containing CNS2 function in vitro as a sterol response promoter. Support for the inclusion of major ABCA1 regulatory elements in the human genomic sequence examined was the demonstration that mice containing a human BAC transgene containing sequences exclusively from the analyzed interval expressed human ABCA1 in a tissue distribution mimicking expression of endogenous mouse ABC1. These studies, using a comparative genomic approach, have characterized the structure of the human and mouse ABCA1 genes and have helped identify sequences participating in their expression.
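
    The conservation criterion described here (more than 75% identity across a 120 bp stretch) can be sketched as a sliding-window scan over a pairwise alignment. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name is hypothetical, the input is assumed to be two already-aligned, equal-length strings with "-" for gaps, gap columns count as mismatches, and the returned coordinates are alignment columns rather than genomic positions.

```python
def conserved_windows(aligned_a, aligned_b, window=120, min_identity=0.75):
    """Return merged (start, end) alignment intervals whose windows exceed min_identity."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # 1 where the two sequences carry the same base, 0 for mismatches and gaps.
    matches = [
        1 if a == b and a != "-" else 0
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    ]
    intervals = []
    if len(matches) < window:
        return intervals
    running = sum(matches[:window])
    for start in range(len(matches) - window + 1):
        if start > 0:
            # Slide the window one column: add the new column, drop the old one.
            running += matches[start + window - 1] - matches[start - 1]
        if running / window > min_identity:
            end = start + window
            if intervals and start <= intervals[-1][1]:
                intervals[-1] = (intervals[-1][0], end)  # extend the previous interval
            else:
                intervals.append((start, end))
    return intervals

# Toy example with a 10-column window so the behaviour is easy to follow.
human = "A" * 10 + "C" * 10
mouse = "A" * 10 + "G" * 10
print(conserved_windows(human, mouse, window=10))
```

    Overlapping qualifying windows are merged into one interval, so each reported element can be longer than the window itself.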

  10. Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA

    PubMed Central

    Turner, Tychele N.; Hormozdiari, Fereydoun; Duyzend, Michael H.; McClymont, Sarah A.; Hook, Paul W.; Iossifov, Ivan; Raja, Archana; Baker, Carl; Hoekzema, Kendra; Stessman, Holly A.; Zody, Michael C.; Nelson, Bradley J.; Huddleston, John; Sandstrom, Richard; Smith, Joshua D.; Hanna, David; Swanson, James M.; Faustman, Elaine M.; Bamshad, Michael J.; Stamatoyannopoulos, John; Nickerson, Deborah A.; McCallion, Andrew S.; Darnell, Robert; Eichler, Evan E.

    2016-01-01

    We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism. PMID:26749308

  11. Regulatory gene networks and the properties of the developmental process

    NASA Technical Reports Server (NTRS)

    Davidson, Eric H.; McClay, David R.; Hood, Leroy

    2003-01-01

    Genomic instructions for development are encoded in arrays of regulatory DNA. These specify large networks of interactions among genes producing transcription factors and signaling components. The architecture of such networks both explains and predicts developmental phenomenology. Although network analysis is yet in its early stages, some fundamental commonalities are already emerging. Two such are the use of multigenic feedback loops to ensure the progressivity of developmental regulatory states and the prevalence of repressive regulatory interactions in spatial control processes. Gene regulatory networks make it possible to explain the process of development in causal terms and eventually will enable the redesign of developmental regulatory circuitry to achieve different outcomes.

  12. Patterns of sequence conservation in presynaptic neural genes

    PubMed Central

    Hadley, Dexter; Murphy, Tara; Valladares, Otto; Hannenhalli, Sridhar; Ungar, Lyle; Kim, Junhyong; Bućan, Maja

    2006-01-01

    Background: The neuronal synapse is a fundamental functional unit in the central nervous system of animals. Because synaptic function is evolutionarily conserved, we reasoned that functional sequences of genes and related genomic elements known to play important roles in neurotransmitter release would also be conserved. Results: Evolutionary rate analysis revealed that presynaptic proteins evolve slowly, although some members of large gene families exhibit accelerated evolutionary rates relative to other family members. Comparative sequence analysis of 46 megabases spanning 150 presynaptic genes identified more than 26,000 elements that are highly conserved in eight vertebrate species, as well as a small subset of sequences (6%) that are shared among unrelated presynaptic genes. Analysis of large gene families revealed that upstream and intronic regions of closely related family members are extremely divergent. We also identified 504 exceptionally long conserved elements (≥360 base pairs, ≥80% pair-wise identity between human and other mammals) in intergenic and intronic regions of presynaptic genes. Many of these elements form a highly stable stem-loop RNA structure and consequently are candidates for novel regulatory elements, whereas some conserved noncoding elements are shown to correlate with specific gene expression profiles. The SynapseDB online database integrates these findings and other functional genomic resources for synaptic genes. Conclusion: Highly conserved elements in nonprotein coding regions of 150 presynaptic genes represent sequences that may be involved in the transcriptional or post-transcriptional regulation of these genes. Furthermore, comparative sequence analysis will facilitate selection of genes and noncoding sequences for future functional studies and analysis of variation studies in neurodevelopmental and psychiatric disorders. PMID:17096848

  13. A provisional regulatory gene network for specification of endomesoderm in the sea urchin embryo

    NASA Technical Reports Server (NTRS)

    Davidson, Eric H.; Rast, Jonathan P.; Oliveri, Paola; Ransick, Andrew; Calestani, Cristina; Yuh, Chiou-Hwa; Minokawa, Takuya; Amore, Gabriele; Hinman, Veronica; Arenas-Mena, Cesar; Otim, Ochan; Brown, C. Titus; Livi, Carolina B.; Lee, Pei Yun; Revilla, Roger; Schilstra, Maria J.; Clarke, Peter J C.; Rust, Alistair G.; Pan, Zhengjun; Arnone, Maria I.; Rowen, Lee; Cameron, R. Andrew; McClay, David R.; Hood, Leroy; Bolouri, Hamid

    2002-01-01

    We present the current form of a provisional DNA sequence-based regulatory gene network that explains in outline how endomesodermal specification in the sea urchin embryo is controlled. The model of the network is in a continuous process of revision and growth as new genes are added and new experimental results become available; see http://www.its.caltech.edu/mirsky/endomeso.htm (End-mes Gene Network Update) for the latest version. The network contains over 40 genes at present, many newly uncovered in the course of this work, and most encoding DNA-binding transcriptional regulatory factors. The architecture of the network was approached initially by construction of a logic model that integrated the extensive experimental evidence now available on endomesoderm specification. The internal linkages between genes in the network have been determined functionally, by measurement of the effects of regulatory perturbations on the expression of all relevant genes in the network. Five kinds of perturbation have been applied: (1) use of morpholino antisense oligonucleotides targeted to many of the key regulatory genes in the network; (2) transformation of other regulatory factors into dominant repressors by construction of Engrailed repressor domain fusions; (3) ectopic expression of given regulatory factors, from genetic expression constructs and from injected mRNAs; (4) blockade of the beta-catenin/Tcf pathway by introduction of mRNA encoding the intracellular domain of cadherin; and (5) blockade of the Notch signaling pathway by introduction of mRNA encoding the extracellular domain of the Notch receptor. The network model predicts the cis-regulatory inputs that link each gene into the network. Therefore, its architecture is testable by cis-regulatory analysis. Strongylocentrotus purpuratus and Lytechinus variegatus genomic BAC recombinants that include a large number of the genes in the network have been sequenced and annotated. Tests of the cis-regulatory predictions of

  14. Phylogenetic structure and evolution of regulatory genes and integrases of P2-like phages.

    PubMed

    Nilsson, Hanna; Cardoso-Palacios, Carlos; Haggård-Ljungquist, Elisabeth; Nilsson, Anders S

    2011-07-01

    The phylogenetic relationships and structural similarities of the proteins encoded within the regulatory region (containing the integrase gene and the lytic-lysogenic transcriptional switch genes) of P2-like phages were analyzed, and compared with the phylogenetic relationship of P2-like phages inferred from four structural genes. P2-like phages are thought to be one of the most genetically homogenous phage groups but the regulatory region nevertheless varies extensively between different phage genomes. The analyses showed that there are many types of regulatory regions, but two types can be clearly distinguished: regions similar either to the phage P2 or to the phage 186 regulatory regions. These regions were also found to be most frequent among the sequenced P2-like phage or prophage genomes, and common in phages using Escherichia coli as a host. Both the phylogenetic and the structural analyses showed that these two regions are related. The integrases as well as the cox/apl genes show a common monophyletic origin but the immunity repressor genes, the type P2 C gene and the type 186 cI gene, are likely of different origin. There was no indication of recombination between the P2-186 types of regulatory genes but the comparison of the phylogenies of the regulatory region with the phylogeny based on four structural genes revealed recombinational events between the regulatory region and the structural genes. Less common regulatory regions were phylogenetically heterogeneous and typically contained a fusion of genes from distantly related or unknown phages and P2-like genes.

  15. Phylogenetic structure and evolution of regulatory genes and integrases of P2-like phages

    PubMed Central

    Nilsson, Hanna; Cardoso-Palacios, Carlos; Haggård-Ljungquist, Elisabeth; Nilsson, Anders S.

    2011-01-01

    The phylogenetic relationships and structural similarities of the proteins encoded within the regulatory region (containing the integrase gene and the lytic–lysogenic transcriptional switch genes) of P2-like phages were analyzed, and compared with the phylogenetic relationship of P2-like phages inferred from four structural genes. P2-like phages are thought to be one of the most genetically homogenous phage groups but the regulatory region nevertheless varies extensively between different phage genomes. The analyses showed that there are many types of regulatory regions, but two types can be clearly distinguished: regions similar either to the phage P2 or to the phage 186 regulatory regions. These regions were also found to be most frequent among the sequenced P2-like phage or prophage genomes, and common in phages using Escherichia coli as a host. Both the phylogenetic and the structural analyses showed that these two regions are related. The integrases as well as the cox/apl genes show a common monophyletic origin but the immunity repressor genes, the type P2 C gene and the type 186 cI gene, are likely of different origin. There was no indication of recombination between the P2–186 types of regulatory genes but the comparison of the phylogenies of the regulatory region with the phylogeny based on four structural genes revealed recombinational events between the regulatory region and the structural genes. Less common regulatory regions were phylogenetically heterogeneous and typically contained a fusion of genes from distantly related or unknown phages and P2-like genes. PMID:23050214

  16. Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks

    PubMed Central

    Sîrbu, Alina; Crane, Martin; Ruskin, Heather J.

    2015-01-01

    Microarray technologies have been the basis of numerous important findings regarding gene expression in the last few decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.

  17. Close Sequence Comparisons are Sufficient to Identify Human cis-Regulatory Elements

    SciTech Connect

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.

  18. Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

    SciTech Connect

    Hong, R. L.; Hamaguchi, L.; Busch, M. A.; Weigel, D.

    2003-06-01

    In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.

  19. Predicting differences in gene regulatory systems by state space models.

    PubMed

    Yamaguchi, Rui; Imoto, Seiya; Yamauchi, Mai; Nagasaki, Masao; Yoshida, Ryo; Shimamura, Teppei; Hatanaka, Yosuke; Ueno, Kazuko; Higuchi, Tomoyuki; Gotoh, Noriko; Miyano, Satoru

    2008-01-01

    We propose a statistical strategy to predict differentially regulated genes of case and control samples from time-course gene expression data by leveraging unpredictability of the expression patterns from the underlying regulatory system inferred by a state space model. The proposed method can screen out genes that show different patterns but are generated by the same regulations in both samples, since these patterns can be predicted by the same model. Our strategy consists of three steps. First, a gene regulatory system is inferred from the control data by a state space model. Then the obtained model for the underlying regulatory system of the control sample is used to predict the case data. Finally, by assessing the significance of the difference between case and predicted-case time-course data of each gene, we are able to detect the unpredictable genes, which are candidates for the key differences between the regulatory systems of case and control cells. We illustrate the whole process of the strategy by an actual example, where human small airway epithelial cell gene regulatory systems were generated from novel time courses of gene expression following treatment with (case) or without (control) the drug gefitinib, an inhibitor of the epidermal growth factor receptor tyrosine kinase. In the gefitinib response data we succeeded in finding unpredictable genes that are candidates for the specific targets of gefitinib. We also discuss differences in regulatory systems for the unpredictable genes. The proposed method would be a promising tool for identifying biomarkers and drug target genes.

  20. Enhanced Regulatory Sequence Prediction Using Gapped k-mer Features

    PubMed Central

    Mohammad-Noori, Morteza; Beer, Michael A.

    2014-01-01

    Oligomers of length k, or k-mers, are convenient and widely used features for modeling the properties and functions of DNA and protein sequences. However, k-mers suffer from the inherent limitation that if the parameter k is increased to resolve longer features, the probability of observing any specific k-mer becomes very small, and k-mer counts approach a binary variable, with most k-mers absent and a few present once. Thus, any statistical learning approach using k-mers as features becomes susceptible to noisy training set k-mer frequencies once k becomes large. To address this problem, we introduce alternative feature sets using gapped k-mers, a new classifier, gkm-SVM, and a general method for robust estimation of k-mer frequencies. To make the method applicable to large-scale genome wide applications, we develop an efficient tree data structure for computing the kernel matrix. We show that compared to our original kmer-SVM and alternative approaches, our gkm-SVM predicts functional genomic regulatory elements and tissue specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two. We then show that gkm-SVM consistently outperforms kmer-SVM on human ENCODE ChIP-seq datasets, and further demonstrate the general utility of our method using a Naïve-Bayes classifier. Although developed for regulatory sequence analysis, these methods can be applied to any sequence classification problem. PMID:25033408
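
    To make the notion of a gapped k-mer feature concrete, the sketch below enumerates them naively: every word of a fixed length is reduced to a chosen number of informative positions, and the remaining positions are masked. This is an illustrative assumption about the feature definition only; it is not the gkm-SVM implementation, it does not use the tree data structure mentioned in the abstract, and the function name, parameter names, and "-" gap symbol are hypothetical.

```python
from collections import Counter
from itertools import combinations

def gapped_kmer_counts(sequence, word_length=4, informative=2, gap="-"):
    """Count gapped k-mers: words of word_length with `informative` unmasked positions."""
    counts = Counter()
    # Every way of choosing which positions of the word stay informative.
    masks = [set(kept) for kept in combinations(range(word_length), informative)]
    for start in range(len(sequence) - word_length + 1):
        word = sequence[start:start + word_length]
        for kept in masks:
            feature = "".join(ch if i in kept else gap for i, ch in enumerate(word))
            counts[feature] += 1
    return counts

counts = gapped_kmer_counts("ACGT", word_length=3, informative=2)
for feature, n in sorted(counts.items()):
    print(feature, n)
```

    Because several distinct full-length words map to the same gapped feature, counts are spread over fewer, better-populated features than with ungapped k-mers of the same length, which is the sparsity problem the abstract describes.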

  1. Evolution of the mammalian embryonic pluripotency gene regulatory network.

    PubMed

    Fernandez-Tresguerres, Beatriz; Cañon, Susana; Rayon, Teresa; Pernaute, Barbara; Crespo, Miguel; Torroja, Carlos; Manzanares, Miguel

    2010-11-16

    Embryonic pluripotency in the mouse is established and maintained by a gene-regulatory network under the control of a core set of transcription factors that include octamer-binding protein 4 (Oct4; official name POU domain, class 5, transcription factor 1, Pou5f1), sex-determining region Y (SRY)-box containing gene 2 (Sox2), and homeobox protein Nanog. Although this network is largely conserved in eutherian mammals, very little information is available regarding its evolutionary conservation in other vertebrates. We have compared the embryonic pluripotency networks in mouse and chick by means of expression analysis in the pregastrulation chicken embryo, genomic comparisons, and functional assays of pluripotency-related regulatory elements in ES cells and blastocysts. We find that multiple components of the network are either novel to mammals or have acquired novel expression domains in early developmental stages of the mouse. We also find that the downstream action of the mouse core pluripotency factors is mediated largely by genomic sequence elements nonconserved with chick. In the case of Sox2 and Fgf4, we find that elements driving expression in embryonic pluripotent cells have evolved by a small number of nucleotide changes that create novel binding sites for core factors. Our results show that the network in charge of embryonic pluripotency is an evolutionary novelty of mammals that is related to the comparatively extended period during which mammalian embryonic cells need to be maintained in an undetermined state before engaging in early differentiation events.

  2. Rapid evolution of cis-regulatory sequences via local point mutations

    NASA Technical Reports Server (NTRS)

    Stone, J. R.; Wray, G. A.

    2001-01-01

    Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
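
    The idea of a binding site arising by local point mutations can be sketched with a single-lineage toy simulation. This is not the authors' simulation: it ignores population size and fixation, and the function name, the uniform mutation model, and all parameters are assumptions chosen for illustration.

```python
import random


def generations_until_site(seq, motif, mu, rng, max_gen=100_000):
    """Mutate one lineage until `motif` appears somewhere in the region.

    seq: starting sequence; mu: per-base, per-generation mutation probability.
    Returns the generation at which the motif is first present, or None if it
    does not appear within max_gen generations.
    """
    bases = list(seq)
    for gen in range(max_gen + 1):
        if motif in "".join(bases):
            return gen
        for i, b in enumerate(bases):
            if rng.random() < mu:
                # Replace the base with one of the three other nucleotides.
                bases[i] = rng.choice([x for x in "ACGT" if x != b])
    return None
```

    Calling it repeatedly with different seeds, for example generations_until_site(start_seq, "TGACTCA", 1e-4, random.Random(seed)), gives a rough distribution of waiting times for one hypothetical motif in one lineage.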

  3. Deduced products of C4-dicarboxylate transport regulatory genes of Rhizobium leguminosarum are homologous to nitrogen regulatory gene products.

    PubMed Central

    Ronson, C W; Astwood, P M; Nixon, B T; Ausubel, F M

    1987-01-01

    We have sequenced two genes, dctB and dctD, required for the activation of the C4-dicarboxylate transport structural gene dctA in free-living Rhizobium leguminosarum. The hydropathic profile of the dctB gene product (DctB) suggested that its N-terminal region may be located in the periplasm and its C-terminal region in the cytoplasm. The C-terminal region of DctB was strongly conserved with similar regions of the products of several regulatory genes that may act as environmental sensors, including ntrB, envZ, virA, phoR, cpxA, and phoM. The N-terminal region of DctD was strongly conserved with the N-terminal domains of the products of several regulatory genes thought to be transcriptional activators, including ntrC, ompR, virG, phoB and sfrA. In addition, the central and C-terminal regions of DctD were strongly conserved with the products of ntrC and nifA, transcriptional activators that require the alternate sigma factor rpoN (ntrA) as co-activator. The central region of DctD also contained a potential ATP-binding domain. These results are consistent with recent results that show that rpoN product is required for dctA activation, and suggest that DctB plus DctD-mediated transcriptional activation of dctA may be mechanistically similar to NtrB plus NtrC-mediated activation of glnA in E. coli. PMID:3671068

  4. Regulatory regions of two transport operons under nitrogen control: nucleotide sequences.

    PubMed Central

    Higgins, C F; Ames, G F

    1982-01-01

    We have determined the nucleotide sequences of the regulatory regions from two amino acid transport operons from Salmonella typhimurium: dhuA, which regulates the histidine transport operon, and argTr, which regulates argT, the gene encoding the lysine-arginine-ornithine-binding protein, LAO. The promoter for the histidine transport operon has been identified from the sequence change in the promoter-up mutation dhuA1. Neither regulatory region has any of the features typical of the regulatory regions of the amino acid biosynthetic operons, indicating that regulation of at least these transport genes does not involve a transcription attenuation mechanism. We have identified three interesting features, present in both of these sequences, which may be of importance in the regulation of these and other operons: a "stem-loop-foot" structure, a region of specific homology, and a mirror symmetry. The region of mirror symmetry may be a protein recognition site important in regulating expression of these and other operons in response to nitrogen availability. Mirror symmetry as a structure for DNA-protein interaction sites has not been proposed previously. PMID:7041112

  5. Interplay between gene expression noise and regulatory network architecture

    PubMed Central

    Chalancon, Guilhem; Ravarani, Charles; Balaji, S.; Martinez-Arias, Alfonso; Aravind, L.; Jothi, Raja; Babu, M. Madan

    2012-01-01

    Complex regulatory networks orchestrate most cellular processes in biological systems. Genes in such networks are subject to expression noise, resulting in isogenic cell populations exhibiting cell-to-cell variation in protein levels. Increasing evidence suggests that cells have evolved regulatory strategies to limit, tolerate, or amplify expression noise. In this context, fundamental questions arise: how can the architecture of gene regulatory networks generate, make use of, or be constrained by expression noise? Here, we discuss the interplay between expression noise and gene regulatory network at different levels of organization, ranging from a single regulatory interaction to entire regulatory networks. We then consider how this interplay impacts a variety of phenomena such as pathogenicity, disease, adaptation to changing environments, differential cell-fate outcome and incomplete or partial penetrance effects. Finally, we highlight recent technological developments that permit measurements at the single-cell level, and discuss directions for future research. PMID:22365642

  6. Preservation of Gene Duplication Increases the Regulatory Spectrum of Ribosomal Protein Genes and Enhances Growth under Stress.

    PubMed

    Parenteau, Julie; Lavoie, Mathieu; Catala, Mathieu; Malik-Ghulam, Mustafa; Gagnon, Jules; Abou Elela, Sherif

    2015-12-22

    In baker's yeast, the majority of ribosomal protein genes (RPGs) are duplicated, and it was recently proposed that such duplications are preserved via the functional specialization of the duplicated genes. However, the origin and nature of duplicated RPGs' (dRPGs) functional specificity remain unclear. In this study, we show that differences in dRPG functions are generated by variations in the modality of gene expression and, to a lesser extent, by protein sequence. Analysis of the sequence and expression patterns of non-intron-containing RPGs indicates that each dRPG is controlled by specific regulatory sequences modulating its expression levels in response to changing growth conditions. Homogenization of dRPG sequences reduces cell tolerance to growth under stress without changing the number of expressed genes. Together, the data reveal a model where duplicated genes provide a means for modulating the expression of ribosomal proteins in response to stress. PMID:26686636

  7. Chaotic motifs in gene regulatory networks.

    PubMed

    Zhang, Zhaoyang; Ye, Weiming; Qian, Yu; Zheng, Zhigang; Huang, Xuhui; Hu, Gang

    2012-01-01

    Chaos should occur often in gene regulatory networks (GRNs) which have been widely described by nonlinear coupled ordinary differential equations, if their dimensions are no less than 3. It is therefore puzzling that chaos has never been reported in GRNs in nature and is also extremely rare in models of GRNs. On the other hand, the topic of motifs has attracted great attention in studying biological networks, and network motifs are suggested to be elementary building blocks that carry out some key functions in the network. In this paper, chaotic motifs (subnetworks with chaos) in GRNs are systematically investigated. The conclusions are as follows. (i) Chaos can only appear through competitions between different oscillatory modes with rivaling intensities. Conditions required for chaotic GRNs are found to be very strict, which makes chaotic GRNs extremely rare. (ii) Chaotic motifs are explored as the simplest few-node structures capable of producing chaos, and serve as the intrinsic source of chaos of random few-node GRNs. Several optimal motifs causing chaos with atypically high probability are identified. (iii) Moreover, we discovered that a number of special oscillators can never produce chaos. These structures confer some advantages for rhythmic functions and may help us understand the robustness of diverse biological rhythms. (iv) The methods of dominant phase-advanced driving (DPAD) and DPAD time fraction are proposed to quantitatively identify chaotic motifs and to explain the origin of chaotic behaviors in GRNs.

  8. Genomic aberrations frequently alter chromatin regulatory genes in chordoma.

    PubMed

    Wang, Lu; Zehir, Ahmet; Nafa, Khedoudja; Zhou, Nengyi; Berger, Michael F; Casanova, Jacklyn; Sadowska, Justyna; Lu, Chao; Allis, C David; Gounder, Mrinal; Chandhanayingyong, Chandhanarat; Ladanyi, Marc; Boland, Patrick J; Hameed, Meera

    2016-07-01

    Chordoma is a rare primary bone neoplasm that is resistant to standard chemotherapies. Despite aggressive surgical management, local recurrence and metastasis is not uncommon. To identify the specific genetic aberrations that play key roles in chordoma pathogenesis, we utilized a genome-wide high-resolution SNP-array and next generation sequencing (NGS)-based molecular profiling platform to study 24 patient samples with typical histopathologic features of chordoma. Matching normal tissues were available for 16 samples. SNP-array analysis revealed nonrandom copy number losses across the genome, frequently involving 3, 9p, 1p, 14, 10, and 13. In contrast, copy number gain is uncommon in chordomas. Two minimum deleted regions were observed on 3p within a ∼8 Mb segment at 3p21.1-p21.31, which overlaps SETD2, BAP1 and PBRM1. The minimum deleted region on 9p was mapped to CDKN2A locus at 9p21.3, and homozygous deletion of CDKN2A was detected in 5/22 chordomas (∼23%). NGS-based molecular profiling demonstrated an extremely low level of mutation rate in chordomas, with an average of 0.5 mutations per sample for the 16 cases with matched normal. When the mutated genes were grouped based on molecular functions, many of the mutation events (∼40%) were found in chromatin regulatory genes. The combined copy number and mutation profiling revealed that SETD2 is the single gene affected most frequently in chordomas, either by deletion or by mutations. Our study demonstrated that chordoma belongs to the C-class (copy number changes) tumors whose oncogenic signature is non-random multiple copy number losses across the genome and genomic aberrations frequently alter chromatin regulatory genes. © 2016 Wiley Periodicals, Inc.

  9. Genomic Aberrations Frequently Alter Chromatin Regulatory Genes in Chordoma

    PubMed Central

    Wang, Lu; Zehir, Ahmet; Nafa, Khedoudja; Zhou, Nengyi; Berger, Michael F.; Casanova, Jacklyn; Sadowska, Justyna; Lu, Chao; Allis, C. David; Gounder, Mrinal; Chandhanayingyong, Chandhanarat; Ladanyi, Marc; Boland, Patrick J; Hameed, Meera

    2016-01-01

    Chordoma is a rare primary bone neoplasm that is resistant to standard chemotherapies. Despite aggressive surgical management, local recurrence and metastasis is not uncommon. To identify the specific genetic aberrations that play key roles in chordoma pathogenesis, we utilized a genome-wide high-resolution SNP-array and next generation sequencing (NGS)-based molecular profiling platform to study 24 patient samples with typical histopathologic features of chordoma. Matching normal tissues were available for 16 samples. SNP-array analysis revealed nonrandom copy number losses across the genome, frequently involving 3, 9p, 1p, 14, 10, and 13. In contrast, copy number gain is uncommon in chordomas. Two minimum deleted regions were observed on 3p within a ~8 Mb segment at 3p21.1–p21.31, which overlaps SETD2, BAP1 and PBRM1. The minimum deleted region on 9p was mapped to CDKN2A locus at 9p21.3, and homozygous deletion of CDKN2A was detected in 5/22 chordomas (~23%). NGS-based molecular profiling demonstrated an extremely low level of mutation rate in chordomas, with an average of 0.5 mutations per sample for the 16 cases with matched normal. When the mutated genes were grouped based on molecular functions, many of the mutation events (~40%) were found in chromatin regulatory genes. The combined copy number and mutation profiling revealed that SETD2 is the single gene affected most frequently in chordomas, either by deletion or by mutations. Our study demonstrated that chordoma belongs to the C-class (copy number changes) tumors whose oncogenic signature is non-random multiple copy number losses across the genome and genomic aberrations frequently alter chromatin regulatory genes. PMID:27072194

  10. Identification of Developmental Regulatory Genes in Aspergillus Nidulans by Overexpression

    PubMed Central

    Marhoul, J. F.; Adams, T. H.

    1995-01-01

    Overexpression of several Aspergillus nidulans developmental regulatory genes has been shown to cause growth inhibition and development at inappropriate times. We set out to identify previously unknown developmental regulators by constructing a nutritionally inducible A. nidulans expression library containing small, random genomic DNA fragments inserted next to the alcA promoter [alcA(p)] in an A. nidulans transformation vector. Among 20,000 transformants containing random alcA(p) genomic DNA fusion constructs, we identified 66 distinct mutant strains in which alcA(p) induction resulted in growth inhibition as well as causing other detectable phenotypic changes. These growth inhibited mutants were divided into 52 FIG (Forced expression Inhibition of Growth) and 14 FAB (Forced expression Activation of brlA) mutants based on whether or not alcA(p) induction resulted in accumulation of mRNA for the developmental regulatory gene brlA. In four FAB mutants, alcA(p) induction not only activated brlA expression but also caused hyphae to differentiate into reduced conidiophores that produced viable spores from the tips as is observed after alcA(p)::brlA induction. Sequence analyses of the DNA fragments under alcA(p) control in three of these four sporulating strains showed that in two cases developmental activation resulted from overexpression of previously uncharacterized genes, whereas in the third strain, the alcA(p) was fused to brlA. The potential uses for this strategy in identifying genes whose overexpression results in specific phenotypic changes like developmental induction are discussed. PMID:7713416

  11. Identification of developmental regulatory genes in Aspergillus nidulans by overexpression.

    PubMed

    Marhoul, J F; Adams, T H

    1995-02-01

    Overexpression of several Aspergillus nidulans developmental regulatory genes has been shown to cause growth inhibition and development at inappropriate times. We set out to identify previously unknown developmental regulators by constructing a nutritionally inducible A. nidulans expression library containing small, random genomic DNA fragments inserted next to the alcA promoter [alcA(p)] in an A. nidulans transformation vector. Among 20,000 transformants containing random alcA(p) genomic DNA fusion constructs, we identified 66 distinct mutant strains in which alcA(p) induction resulted in growth inhibition as well as causing other detectable phenotypic changes. These growth inhibited mutants were divided into 52 FIG (Forced expression Inhibition of Growth) and 14 FAB (Forced expression Activation of brlA) mutants based on whether or not alcA(p) induction resulted in accumulation of mRNA for the developmental regulatory gene brlA. In four FAB mutants, alcA(p) induction not only activated brlA expression but also caused hyphae to differentiate into reduced conidiophores that produced viable spores from the tips as is observed after alcA(p)::brlA induction. Sequence analyses of the DNA fragments under alcA(p) control in three of these four sporulating strains showed that in two cases developmental activation resulted from overexpression of previously uncharacterized genes, whereas in the third strain, the alcA(p) was fused to brlA. The potential uses for this strategy in identifying genes whose overexpression results in specific phenotypic changes like developmental induction are discussed.

  12. Computational discovery of gene modules and regulatory networks.

    PubMed

    Bar-Joseph, Ziv; Gerber, Georg K; Lee, Tong Ihn; Rinaldi, Nicola J; Yoo, Jane Y; Robert, François; Gordon, D Benjamin; Fraenkel, Ernest; Jaakkola, Tommi S; Young, Richard A; Gifford, David K

    2003-11-01

    We describe an algorithm for discovering regulatory networks of gene modules, GRAM (Genetic Regulatory Modules), that combines information from genome-wide location and expression data sets. A gene module is defined as a set of coexpressed genes to which the same set of transcription factors binds. Unlike previous approaches that relied primarily on functional information from expression data, the GRAM algorithm explicitly links genes to the factors that regulate them by incorporating DNA binding data, which provide direct physical evidence of regulatory interactions. We use the GRAM algorithm to describe a genome-wide regulatory network in Saccharomyces cerevisiae using binding information for 106 transcription factors profiled in rich medium conditions and data from over 500 expression experiments. We also present a genome-wide location analysis data set for regulators in yeast cells treated with rapamycin, and use the GRAM algorithm to provide biological insights into this regulatory network.

  13. Discovering Transcriptional Modules by Combined Analysis of Expression Profiles and Regulatory Sequences

    NASA Astrophysics Data System (ADS)

    Halperin, Yonit; Linhart, Chaim; Ulitsky, Igor; Shamir, Ron

    A key goal of gene expression analysis is the characterization of transcription factors (TFs) and micro-RNAs (miRNAs) regulating specific transcriptional programs. The most common approach to address this task is a two-step methodology: In the first step, a clustering procedure is executed to partition the genes into groups that are believed to be co-regulated, based on expression profile similarity. In the second step, a motif discovery tool is applied to search for over-represented cis-regulatory motifs within each group. In an effort to obtain better results by simultaneously utilizing all available information, several studies have suggested computational schemes for a single-step combined analysis of expression and sequence data. Despite extensive research, reverse engineering complex regulatory networks from microarray measurements remains a difficult challenge with limited success, especially in metazoans.
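
    The second step of the two-step methodology described above (testing for over-represented motifs within each cluster) can be sketched with a hypergeometric test. This is a generic illustration, not the authors' combined method; the function names and the use of an exact-substring motif are assumptions, and the clusters are taken as given from any clustering procedure.

```python
from math import comb


def hypergeom_tail(N, K, n, x):
    """P(X >= x) when n genes are drawn from N, of which K carry the motif."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(x, upper + 1))
    return hits / comb(N, n)


def motif_enrichment(clusters, promoters, motif):
    """Test one motif for over-representation in each expression cluster.

    clusters: dict mapping cluster id -> list of gene names (step 1 output)
    promoters: dict mapping gene name -> promoter sequence (all genes)
    Returns a dict mapping cluster id -> hypergeometric p-value.
    """
    has_motif = {g for g, s in promoters.items() if motif in s}
    N, K = len(promoters), len(has_motif)
    result = {}
    for cid, genes in clusters.items():
        x = sum(1 for g in genes if g in has_motif)
        result[cid] = hypergeom_tail(N, K, len(genes), x)
    return result
```

    In practice the p-values would need correction for the number of motifs and clusters tested, which this sketch leaves out.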

  14. Regulatory regions in the yeast FBP1 and PCK1 genes.

    PubMed

    Mercado, J J; Gancedo, J M

    1992-10-19

    By deletion analysis of the fusion genes FBP1-lacZ and PCK1-lacZ we have identified a number of strong regulatory regions in the genes FBP1 and PCK1 which encode fructose-1,6-bisphosphatase and phosphoenolpyruvate carboxykinase. Lack of expression of beta-galactosidase in fusions lacking sequences from the coding regions suggests the existence of downstream activating elements. Both promoters have several UAS and URS regions as well as sites implicated in catabolite repression. We have found in both genes consensus sequences for the binding of the same regulatory proteins, such as yAP1, MIG1 or the complex HAP2/HAP3/HAP4. Neither deletion nor overexpression of the MIG1 gene affected the regulated expression of the FBP1 or PCK1 genes.

  15. The structure of the human peripherin gene (PRPH) and identification of potential regulatory elements

    SciTech Connect

    Foley, J.; Ley, C.A.; Parysek, L.M.

    1994-07-15

    The authors determined the complete nucleotide sequence of the coding region of the human peripherin gene (PRPH), as well as 742 bp 5′ to the cap site and 584 bp 3′ to the stop codon, and compared its structure and sequence to the rat and mouse genes. The overall structure of 9 exons separated by 8 introns is conserved among these three mammalian species. The nucleotide sequences of the human peripherin gene exons were 90% identical to the rat gene sequences, and the predicted human peripherin protein differed from rat peripherin at only 18 of 475 amino acid residues. Comparison of the 5′ flanking regions of the human peripherin gene and rodent genes revealed extensive areas of high homology. Additional conserved segments were found in introns 1 and 2. Within the 5′ region, potential regulatory sequences, including a nerve growth factor negative regulatory element, a Hox protein binding site, and a heat shock element, were identified in all peripherin genes. The positional conservation of each element suggests that they may be important in the tissue-specific, developmental-specific, and injury-specific expression of the peripherin gene. 24 refs., 2 figs., 1 tab.

  16. BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations.

    PubMed

    Wang, Junbai; Batmanov, Kirill

    2015-12-01

    Sequence variations in regulatory DNA regions are known to cause functionally important consequences for gene expression. DNA sequence variations may have an essential role in determining phenotypes and may be linked to disease; however, their identification through analysis of massive genome-wide sequencing data is a great challenge. In this work, a new computational pipeline, a Bayesian method for protein-DNA interaction with binding affinity ranking (BayesPI-BAR), is proposed for quantifying the effect of sequence variations on protein binding. BayesPI-BAR uses biophysical modeling of protein-DNA interactions to predict single nucleotide polymorphisms (SNPs) that cause significant changes in the binding affinity of a regulatory region for transcription factors (TFs). The method includes two new parameters (TF chemical potentials or protein concentrations and direct TF binding targets) that are neglected by previous methods. The new method is verified on 67 known human regulatory SNPs, of which 47 (70%) have predicted true TFs ranked in the top 10. Importantly, the performance of BayesPI-BAR, which uses principal component analysis to integrate multiple predictions from various TF chemical potentials, is found to be better than that of existing programs, such as sTRAP and is-rSNP, when evaluated on the same SNPs. BayesPI-BAR is a publicly available tool and is able to carry out parallelized computation, which helps to investigate a large number of TFs or SNPs and to detect disease-associated regulatory sequence variations in the sea of genome-wide noncoding regions. PMID:26202972
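
    How a sequence variant and a TF chemical potential jointly affect predicted binding can be sketched with a common statistical-mechanics occupancy formula. This is not the BayesPI-BAR pipeline: the energy matrix is hypothetical, the Fermi-Dirac style occupancy is an assumed model form, and the principal component integration mentioned in the abstract is not reproduced.

```python
import math


def binding_energy(site, energy_matrix):
    """Sum position-specific energies (in kT units; lower means tighter binding)."""
    return sum(energy_matrix[i][base] for i, base in enumerate(site))


def occupancy(energy, mu):
    """Binding probability for a site with the given energy at chemical potential mu."""
    return 1.0 / (1.0 + math.exp(energy - mu))


def snp_effect(ref_site, alt_site, energy_matrix, mus):
    """Change in occupancy (alt minus ref) at each chemical potential in mus."""
    e_ref = binding_energy(ref_site, energy_matrix)
    e_alt = binding_energy(alt_site, energy_matrix)
    return [occupancy(e_alt, mu) - occupancy(e_ref, mu) for mu in mus]
```

    Evaluating the change over a range of chemical potentials, rather than a single value, reflects the abstract's point that protein concentration is a parameter that affects the predicted consequence of a variant.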

  17. An organ boundary-enriched gene regulatory network uncovers regulatory hierarchies underlying axillary meristem initiation

    PubMed Central

    Tian, Caihuan; Zhang, Xiaoni; He, Jun; Yu, Haopeng; Wang, Ying; Shi, Bihai; Han, Yingying; Wang, Guoxun; Feng, Xiaoming; Zhang, Cui; Wang, Jin; Qi, Jiyan; Yu, Rong; Jiao, Yuling

    2014-01-01

    Gene regulatory networks (GRNs) control development via cell type-specific gene expression and interactions between transcription factors (TFs) and regulatory promoter regions. Plant organ boundaries separate lateral organs from the apical meristem and harbor axillary meristems (AMs). AMs, as stem cell niches, make the shoot a ramifying system. Although AMs have important functions in plant development, our knowledge of organ boundary and AM formation remains rudimentary. Here, we generated a cellular-resolution genomewide gene expression map for low-abundance Arabidopsis thaliana organ boundary cells and constructed a genomewide protein–DNA interaction map focusing on genes affecting boundary and AM formation. The resulting GRN uncovers transcriptional signatures, predicts cellular functions, and identifies promoter hub regions that are bound by many TFs. Importantly, further experimental studies determined the regulatory effects of many TFs on their targets, identifying regulators and regulatory relationships in AM initiation. This systems biology approach thus enhances our understanding of a key developmental process. PMID:25358340

  18. Phenotype accessibility and noise in random threshold gene regulatory networks.

    PubMed

    Pinho, Ricardo; Garcia, Victor; Feldman, Marcus W

    2014-01-01

    Evolution requires phenotypic variation in a population of organisms for selection to function. Gene regulatory processes involved in organismal development affect the phenotypic diversity of organisms. Since only a fraction of all possible phenotypes are predicted to be accessed by the end of development, organisms may evolve strategies to use environmental cues and noise-like fluctuations to produce additional phenotypic diversity, and hence to enhance the speed of adaptation. We used a generic model of organismal development --gene regulatory networks-- to investigate how different levels of noise on gene expression states (i.e. phenotypes) may affect access to new, unique phenotypes, thereby affecting phenotypic diversity. We studied additional strategies that organisms might adopt to attain larger phenotypic diversity: either by augmenting their genome or the number of gene expression states. This was done for different types of gene regulatory networks that allow for distinct levels of regulatory influence on gene expression or are more likely to give rise to stable phenotypes. We found that if gene expression is binary, increasing noise levels generally decreases phenotype accessibility for all network types studied. If more gene expression states are considered, noise can moderately enhance the speed of discovery if three or four gene expression states are allowed, and if there are enough distinct regulatory networks in the population. These results were independent of the network types analyzed, and were robust to different implementations of noise. Hence, for noise to increase the number of accessible phenotypes in gene regulatory networks, very specific conditions need to be satisfied. If the number of distinct regulatory networks involved in organismal development is large enough, and the acquisition of more genes or fine tuning of their expression states proves costly to the organism, noise can be useful in allowing access to more unique phenotypes
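
    A threshold gene regulatory network with noise on binary expression states can be sketched as follows. This is a generic toy model, not the authors' implementation: the update rule (a gene is on when its summed regulatory input is positive), the treatment of ties, and the bit-flip noise are all assumptions made for illustration.

```python
import random


def step(state, weights, noise, rng):
    """One synchronous update of a threshold network with binary states (0 or 1).

    weights[i][j] is the regulatory influence of gene j on gene i.
    With probability `noise`, each updated gene state is flipped.
    """
    n = len(state)
    new_state = []
    for i in range(n):
        total = sum(weights[i][j] * state[j] for j in range(n))
        value = 1 if total > 0 else 0
        if rng.random() < noise:
            value = 1 - value
        new_state.append(value)
    return new_state


def develop(state, weights, noise, rng, steps=50):
    """Iterate the network for a fixed number of steps and return the end state."""
    for _ in range(steps):
        state = step(state, weights, noise, rng)
    return tuple(state)
```

    Running develop many times from the same initial state at different noise levels, and collecting the distinct end states, gives a feel for how noise changes which expression states a given network reaches; how the paper defines and counts accessible phenotypes is not captured here.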

  19. Phenotype Accessibility and Noise in Random Threshold Gene Regulatory Networks

    PubMed Central

    Feldman, Marcus W.

    2015-01-01

    Evolution requires phenotypic variation in a population of organisms for selection to function. Gene regulatory processes involved in organismal development affect the phenotypic diversity of organisms. Since only a fraction of all possible phenotypes are predicted to be accessed by the end of development, organisms may evolve strategies to use environmental cues and noise-like fluctuations to produce additional phenotypic diversity, and hence to enhance the speed of adaptation. We used a generic model of organismal development --gene regulatory networks-- to investigate how different levels of noise on gene expression states (i.e. phenotypes) may affect access to new, unique phenotypes, thereby affecting phenotypic diversity. We studied additional strategies that organisms might adopt to attain larger phenotypic diversity: either by augmenting their genome or the number of gene expression states. This was done for different types of gene regulatory networks that allow for distinct levels of regulatory influence on gene expression or are more likely to give rise to stable phenotypes. We found that if gene expression is binary, increasing noise levels generally decreases phenotype accessibility for all network types studied. If more gene expression states are considered, noise can moderately enhance the speed of discovery if three or four gene expression states are allowed, and if there are enough distinct regulatory networks in the population. These results were independent of the network types analyzed, and were robust to different implementations of noise. Hence, for noise to increase the number of accessible phenotypes in gene regulatory networks, very specific conditions need to be satisfied. If the number of distinct regulatory networks involved in organismal development is large enough, and the acquisition of more genes or fine tuning of their expression states proves costly to the organism, noise can be useful in allowing access to more unique phenotypes

  20. Gene regulatory network analysis in sea urchin embryos.

    PubMed

    Oliveri, Paola; Davidson, Eric H

    2004-01-01

    It may safely be predicted that GRN analysis will become increasingly important. It will come to underlie the causal study of development, the major effort underway to understand the regulatory code built into animal genomes and also the evolution of these genomes. Partly by serendipity, sea urchin embryos turn out to be a superb experimental material for GRN analysis. Their natural properties have, in turn, influenced the predilections of those who work on them, and between them and us, so to speak, this is now a developmental system of which we are rapidly gaining an unusually complete understanding. The causal linkages that control development of the whole embryo will be revealed, leading all the way from the heritable genomic regulatory code to the events of embryology. The fundamental experimental operation is the perturbation analysis: Here is where causality permeates the exploration. We have in this chapter summarized in some detail the requirements for perturbation GRN analysis in sea urchin embryos. But that is not all, nor is it enough to enable the assembly of a GRN: What is required is the combined application of elegant computational methods, of gene regulation molecular biology, of genomic sequence data, and of experimental embryology. As the results crystallize together, we can begin to see how far this powerful combination of methods and ideas is going to carry us. PMID:15575631

  1. Using SCOPE to identify potential regulatory motifs in coregulated genes.

    PubMed

    Martyanov, Viktor; Gross, Robert H

    2011-05-31

    SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data. In this article, we utilize a web version of SCOPE to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs and has been used in other studies. The three algorithms that comprise SCOPE are BEAM, which finds non-degenerate motifs (ACCGGT), PRISM, which finds degenerate motifs (ASCGWT), and SPACER, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well. Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor. Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site, which also includes a "Sample Search" button that allows the user to perform a trial run. SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from
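
    The three motif classes named above (non-degenerate, degenerate, and bipartite) can all be written in IUPAC nucleotide codes, and a short sketch shows what each would match in a sequence. This is not how BEAM, PRISM, or SPACER work internally (they score over-representation and position preference); the function names are assumptions, and the code only scans for occurrences of a given motif.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as ASCGWT into a compiled regex."""
    return re.compile("".join(IUPAC[ch] for ch in motif.upper()))


def find_motif(motif, sequence):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead makes finditer report overlapping occurrences.
    pattern = re.compile("(?=(" + motif_to_regex(motif).pattern + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]
```

    For example, find_motif("ASCGWT", promoter) reports every position where the degenerate motif from the abstract occurs in a promoter string, and the same call works unchanged for ACCGGT or ACCnnnnnnnnGGT.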

  2. Robustness and Accuracy in Sea Urchin Developmental Gene Regulatory Networks

    PubMed Central

    Ben-Tabou de-Leon, Smadar

    2016-01-01

    Developmental gene regulatory networks robustly control the timely activation of regulatory and differentiation genes. The structure of these networks underlies their capacity to buffer intrinsic and extrinsic noise and maintain embryonic morphology. Here I illustrate how the use of specific architectures by the sea urchin developmental regulatory networks enables the robust control of cell fate decisions. The Wnt-βcatenin signaling pathway patterns the primary embryonic axis while the BMP signaling pathway patterns the secondary embryonic axis in the sea urchin embryo and across bilateria. Interestingly, in the sea urchin in both cases, the signaling pathway that defines the axis controls directly the expression of a set of downstream regulatory genes. I propose that this direct activation of a set of regulatory genes enables a uniform regulatory response and a clear cut cell fate decision in the endoderm and in the dorsal ectoderm. The specification of the mesodermal pigment cell lineage is activated by Delta signaling that initiates a triple positive feedback loop that locks down the pigment specification state. I propose that the use of compound positive feedback circuitry provides the endodermal cells enough time to turn off mesodermal genes and ensures correct mesoderm vs. endoderm fate decision. Thus, I argue that understanding the control properties of repeatedly used regulatory architectures illuminates their role in embryogenesis and provides possible explanations to their resistance to evolutionary change. PMID:26913048

  3. Genome-wide identification of regulatory elements and reconstruction of gene regulatory networks of the green alga Chlamydomonas reinhardtii under carbon deprivation.

    PubMed

    Winck, Flavia Vischi; Arvidsson, Samuel; Riaño-Pachón, Diego Mauricio; Hempel, Sabrina; Koseska, Aneta; Nikoloski, Zoran; Urbina Gomez, David Alejandro; Rupprecht, Jens; Mueller-Roeber, Bernd

    2013-01-01

    The unicellular green alga Chlamydomonas reinhardtii is a long-established model organism for studies on photosynthesis and carbon metabolism-related physiology. Under conditions of air-level carbon dioxide concentration [CO2], a carbon concentrating mechanism (CCM) is induced to facilitate cellular carbon uptake. CCM increases the availability of carbon dioxide at the site of cellular carbon fixation. To improve our understanding of the transcriptional control of the CCM, we employed FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements, followed by deep sequencing) to determine nucleosome-depleted chromatin regions of algal cells subjected to carbon deprivation. Our FAIRE data recapitulated the positions of known regulatory elements in the promoter of the periplasmic carbonic anhydrase (Cah1) gene, which is upregulated during CCM induction, and revealed new candidate regulatory elements at a genome-wide scale. In addition, time series expression patterns of 130 transcription factor (TF) and transcription regulator (TR) genes were obtained for cells cultured under photoautotrophic condition and subjected to a shift from high to low [CO2]. Groups of co-expressed genes were identified and a putative directed gene-regulatory network underlying the CCM was reconstructed from the gene expression data using the recently developed IOTA (inner composition alignment) method. Among the candidate regulatory genes, two members of the MYB-related TF family, Lcr1 (Low-CO2 response regulator 1) and Lcr2 (Low-CO2 response regulator 2), may play an important role in down-regulating the expression of a particular set of TF and TR genes in response to low [CO2]. The results obtained provide new insights into the transcriptional control of the CCM and revealed more than 60 new candidate regulatory genes. Deep sequencing of nucleosome-depleted genomic regions indicated the presence of new, previously unknown regulatory elements in the C. reinhardtii genome.
Our work can

  4. Comparison of five major trichome regulatory genes in Brassica villosa with orthologues within the Brassicaceae.

    PubMed

    Nayidu, Naghabushana K; Kagale, Sateesh; Taheri, Ali; Withana-Gamage, Thushan S; Parkin, Isobel A P; Sharpe, Andrew G; Gruber, Margaret Y

    2014-01-01

    Coding sequences for major trichome regulatory genes, including the positive regulators GLABRA 1 (GL1), GLABRA 2 (GL2), ENHANCER OF GLABRA 3 (EGL3), and TRANSPARENT TESTA GLABRA 1 (TTG1) and the negative regulator TRIPTYCHON (TRY), were cloned from wild Brassica villosa, which is characterized by dense trichome coverage over most of the plant. Transcript (FPKM) levels from RNA sequencing indicated much higher expression of the GL2 and TTG1 regulatory genes in B. villosa leaves compared with expression levels of GL1 and EGL3 genes in either B. villosa or the reference genome species, glabrous B. oleracea; however, cotyledon TTG1 expression was high in both species. RNA sequencing and Q-PCR also revealed an unusual expression pattern for the negative regulators TRY and CPC, which were much more highly expressed in trichome-rich B. villosa leaves than in glabrous B. oleracea leaves and in glabrous cotyledons from both species. The B. villosa TRY expression pattern also contrasted with TRY expression patterns in two diploid Brassica species, and with the Arabidopsis model for expression of negative regulators of trichome development. Further unique sequence polymorphisms, protein characteristics, and gene evolution studies highlighted specific amino acids in GL1 and GL2 coding sequences that distinguished glabrous species from hairy species and several variants that were specific for each B. villosa gene. Positive selection was observed for GL1 between hairy and non-hairy plants, and as expected the origin of the four expressed positive trichome regulatory genes in B. villosa was predicted to be from B. oleracea. In particular the unpredicted expression patterns for TRY and CPC in B. villosa suggest additional characterization is needed to determine the function of the expanded families of trichome regulatory genes in more complex polyploid species within the Brassicaceae.

  5. Experimental approaches for gene regulatory network construction: the chick as a model system

    PubMed Central

    Streit, Andrea; Tambalo, Monica; Chen, Jingchen; Grocott, Timothy; Anwar, Maryam; Sosinsky, Alona; Stern, Claudio D.

    2012-01-01

    Setting up the body plan during embryonic development requires the coordinated action of many signals and transcriptional regulators in a precise temporal sequence and spatial pattern. The last decades have seen an explosion of information describing the molecular control of many developmental processes. The next challenge is to integrate this information into logic ‘wiring diagrams’ that visualise gene actions and outputs, have predictive power and point to key control nodes. Here we provide an experimental workflow on how to construct gene regulatory networks using the chick as a model system. Keywords: transcription factors, transcriptome analysis, conserved regulatory elements PMID:23174848

  6. Fungal Genes in Context: Genome Architecture Reflects Regulatory Complexity and Function

    PubMed Central

    Noble, Luke M.; Andrianopoulos, Alex

    2013-01-01

    Gene context determines gene expression, with local chromosomal environment most influential. Comparative genomic analysis is often limited in scope to conserved or divergent gene and protein families, and fungi are well suited to this approach with low functional redundancy and relatively streamlined genomes. We show here that one aspect of gene context, the amount of potential upstream regulatory sequence maintained through evolution, is highly predictive of both molecular function and biological process in diverse fungi. Orthologs with large upstream intergenic regions (UIRs) are strongly enriched in information processing functions, such as signal transduction and sequence-specific DNA binding, and, in the genus Aspergillus, include the majority of experimentally studied, high-level developmental and metabolic transcriptional regulators. Many uncharacterized genes are also present in this class and, by implication, may be of similar importance. Large intergenic regions also share two novel sequence characteristics, currently of unknown significance: they are enriched for plus-strand polypyrimidine tracts and an information-rich, putative regulatory motif that was present in the last common ancestor of the Pezizomycotina. Systematic consideration of gene UIR in comparative genomics, particularly for poorly characterized species, could help reveal organisms’ regulatory priorities. PMID:23699226
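
    As a rough illustration of the quantity discussed in this abstract, the Python sketch below computes an upstream intergenic distance for each gene on one chromosome. It is a minimal sketch under stated assumptions, not the authors' procedure: coordinates are taken to be 1-based and inclusive, genes are assumed not to overlap, the tuple layout `(name, start, end, strand)` and the function name `upstream_intergenic_lengths` are hypothetical, and genes with no neighbour on their upstream side are skipped.

```python
def upstream_intergenic_lengths(genes):
    """Return {gene name: upstream intergenic length in bp} for one chromosome.

    genes: iterable of (name, start, end, strand) with start <= end,
    1-based inclusive coordinates, strand '+' or '-'. Hypothetical layout.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    lengths = {}
    for i, (name, start, end, strand) in enumerate(ordered):
        if strand == "+":
            # Upstream of a plus-strand gene is the gap to the previous gene.
            if i == 0:
                continue
            gap = start - ordered[i - 1][2] - 1
        else:
            # Upstream of a minus-strand gene is the gap to the next gene.
            if i == len(ordered) - 1:
                continue
            gap = ordered[i + 1][1] - end - 1
        # Overlapping neighbours would give a negative gap; clamp to zero.
        lengths[name] = max(gap, 0)
    return lengths
```

    Ranking orthologs by such a length is one simple way to single out the "large UIR" class; the thresholds and ortholog mapping used in the study are not given here and are not assumed.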

  7. Nemertean toxin genes revealed through transcriptome sequencing.

    PubMed

    Whelan, Nathan V; Kocot, Kevin M; Santos, Scott R; Halanych, Kenneth M

    2014-11-27

    Nemerteans are one of few animal groups that have evolved the ability to utilize toxins for both defense and subduing prey, but little is known about specific nemertean toxins. In particular, no study has identified specific toxin genes even though peptide toxins are known from some nemertean species. Information about toxin genes is needed to better understand evolution of toxins across animals and possibly provide novel targets for pharmaceutical and industrial applications. We sequenced and annotated transcriptomes of two free-living and one commensal nemertean and annotated an additional six publicly available nemertean transcriptomes to identify putative toxin genes. Approximately 63-74% of predicted open reading frames in each transcriptome were annotated with gene names, and all species had similar percentages of transcripts annotated with each higher-level GO term. Every nemertean analyzed possessed genes with high sequence similarities to known animal toxins including those from stonefish, cephalopods, and sea anemones. One toxin-like gene found in all nemerteans analyzed had high sequence similarity to Plancitoxin-1, a DNase II hepatotoxin that may function well at low pH, which suggests that the acidic body walls of some nemerteans could work to enhance the efficacy of protein toxins. The highest number of toxin-like genes found in any one species was seven and the lowest was three. The diversity of toxin-like nemertean genes found here is greater than previously documented, and these animals are likely an ideal system for exploring toxin evolution and industrial applications of toxins.

  8. DNA sequence and translational product of a new nodulation-regulatory locus: syrM has sequence similarity to NodD proteins.

    PubMed Central

    Barnett, M J; Long, S R

    1990-01-01

    Rhizobium meliloti nodulation (nod) genes are expressed when activated by trans-acting proteins in the NodD family. The nodD1 and nodD2 gene products activate nod promoters when cells are exposed to plant-synthesized signal molecules. Alternatively, the same nod promoters are activated by the nodD3 gene when nodD3 is carried in trans along with a closely linked global regulatory locus, syrM (symbiotic regulator) (J. T. Mulligan and S. R. Long, Genetics 122:7-18, 1989). In this article we report the nucleotide sequence of a 2.6-kilobase SphI fragment from R. meliloti SU47 containing syrM. Expression from this locus was confirmed by using in vitro transcription-translation assays. The open reading frame encoded a protein of either 33 or 36 kilodaltons whose sequence shows similarity to NodD regulatory proteins. PMID:2361944

  9. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

    PubMed Central

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2005-01-01

    We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085

  10. The molecular and gene regulatory signature of a neuron

    PubMed Central

    Hobert, Oliver; Carrera, Inés; Stefanakis, Nikolaos

    2010-01-01

    Neuron-type specific gene batteries define the morphological and functional diversity of cell types in the nervous system. Here, we discuss the composition of neuron-type specific gene batteries and illustrate gene regulatory strategies employed by distinct organisms from C.elegans to higher vertebrates, which are instrumental in determining the unique gene expression profile and molecular composition of individual neuronal cell types. Based on principles learned from prokaryotic gene regulation, we argue that neuronal, terminal gene batteries are functionally grouped into parallel acting “regulons”. The theoretical concepts discussed here provide testable hypotheses for future experimental analysis into the exact gene regulatory mechanisms that are employed in the generation of neuronal diversity and identity. PMID:20663572

  11. Gene regulatory networks modelling using a dynamic evolutionary hybrid

    PubMed Central

    2010-01-01

    Background: Inference of gene regulatory networks is a key goal in the quest for understanding fundamental cellular processes and revealing underlying relations among genes. With the availability of gene expression data, computational methods aiming at regulatory networks reconstruction are facing challenges posed by the data's high dimensionality, temporal dynamics or measurement noise. We propose an approach based on a novel multi-layer evolutionary trained neuro-fuzzy recurrent network (ENFRN) that is able to select potential regulators of target genes and describe their regulation type. Results: The recurrent, self-organizing structure and evolutionary training of our network yield an optimized pool of regulatory relations, while its fuzzy nature avoids noise-related problems. Furthermore, we are able to assign scores for each regulation, highlighting the confidence in the retrieved relations. The approach was tested by applying it to several benchmark datasets of yeast, managing to acquire biologically validated relations among genes. Conclusions: The results demonstrate the effectiveness of the ENFRN in retrieving biologically valid regulatory relations and providing meaningful insights for better understanding the dynamics of gene regulatory networks. The algorithms and methods described in this paper have been implemented in a Matlab toolbox and are available from: http://bioserver-1.bioacademy.gr/DataRepository/Project_ENFRN_GRN/. PMID:20298548

  12. Time-Delayed Models of Gene Regulatory Networks

    PubMed Central

    Parmar, K.; Blyuss, K. B.; Kyrychko, Y. N.; Hogan, S. J.

    2015-01-01

    We discuss different mathematical models of gene regulatory networks as relevant to the onset and development of cancer. After discussion of alternative modelling approaches, we use a paradigmatic two-gene network to focus on the role played by time delays in the dynamics of gene regulatory networks. We contrast the dynamics of the reduced model arising in the limit of fast mRNA dynamics with that of the full model. The review concludes with the discussion of some open problems. PMID:26576197
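
    To make the role of a time delay concrete, the Python sketch below integrates a toy two-gene network in which each protein represses the other after a fixed delay. Everything in it is an assumption made for illustration and not taken from the review: the Hill-type repression term, the protein-only form (in the spirit of a reduced model with fast mRNA dynamics), the parameter values, the constant history before time zero, and the function name `simulate`. Forward Euler is used for brevity rather than accuracy.

```python
def simulate(alpha=2.0, hill=2, delay=1.0, dt=0.01, t_end=50.0,
             p1_0=1.0, p2_0=0.5):
    """Euler integration of a toy delayed mutual-repression model:

        dp1/dt = alpha / (1 + p2(t - delay)**hill) - p1(t)
        dp2/dt = alpha / (1 + p1(t - delay)**hill) - p2(t)

    Returns two lists with the values of p1 and p2 at t = 0, dt, ..., t_end.
    """
    lag = int(round(delay / dt))      # delay expressed in time steps
    steps = int(round(t_end / dt))
    # Constant history on [-delay, 0], an assumed initial condition.
    p1 = [p1_0] * (lag + 1)
    p2 = [p2_0] * (lag + 1)
    for _ in range(steps):
        p1_delayed = p1[-1 - lag]
        p2_delayed = p2[-1 - lag]
        dp1 = alpha / (1.0 + p2_delayed ** hill) - p1[-1]
        dp2 = alpha / (1.0 + p1_delayed ** hill) - p2[-1]
        p1.append(p1[-1] + dt * dp1)
        p2.append(p2[-1] + dt * dp2)
    # Drop the pre-history so index 0 corresponds to t = 0.
    return p1[lag:], p2[lag:]
```

    Calling `simulate(delay=0.0)` and `simulate(delay=1.0)` with the same initial values gives trajectories that can be compared directly to see how the delay changes the approach to steady state in this toy setting.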

  13. Systems Approaches to Identifying Gene Regulatory Networks in Plants

    PubMed Central

    Long, Terri A.; Brady, Siobhan M.; Benfey, Philip N.

    2009-01-01

    Complex gene regulatory networks are composed of genes, noncoding RNAs, proteins, metabolites, and signaling components. The availability of genome-wide mutagenesis libraries; large-scale transcriptome, proteome, and metabolome data sets; and new high-throughput methods that uncover protein interactions underscores the need for mathematical modeling techniques that better enable scientists to synthesize these large amounts of information and to understand the properties of these biological systems. Systems biology approaches can allow researchers to move beyond a reductionist approach and to both integrate and comprehend the interactions of multiple components within these systems. Descriptive and mathematical models for gene regulatory networks can reveal emergent properties of these plant systems. This review highlights methods that researchers are using to obtain large-scale data sets, and examples of gene regulatory networks modeled with these data. Emergent properties revealed by the use of these network models and perspectives on the future of systems biology are discussed. PMID:18616425

  14. Bayesian Nonlinear Model Selection for Gene Regulatory Networks

    PubMed Central

    Ni, Yang; Stingo, Francesco C.; Baladandayuthapani, Veerabhadran

    2015-01-01

    Summary: Gene regulatory networks represent the regulatory relationships between genes and their products and are important for exploring and defining the underlying biological processes of cellular systems. We develop a novel framework to recover the structure of nonlinear gene regulatory networks using semiparametric spline-based directed acyclic graphical models. Our use of splines allows the model to have both flexibility in capturing nonlinear dependencies as well as control of overfitting via shrinkage, using mixed model representations of penalized splines. We propose a novel discrete mixture prior on the smoothing parameter of the splines that allows for simultaneous selection of both linear and nonlinear functional relationships as well as inducing sparsity in the edge selection. Using simulation studies, we demonstrate the superior performance of our methods in comparison with several existing approaches in terms of network reconstruction and functional selection. We apply our methods to a gene expression dataset in glioblastoma multiforme, which reveals several interesting and biologically relevant nonlinear relationships. PMID:25854759

  15. Cloning and Sequencing the First HLA Gene

    PubMed Central

    Jordan, Bertrand R.

    2010-01-01

    This Perspectives article recounts the isolation and sequencing of the first human histocompatibility gene (HLA) in 1980–1981. At the time, general knowledge of the molecules of the immune system was already fairly extensive, and gene rearrangements in the immunoglobulin complex (discovered in 1976) had generated much excitement: HLA was quite obviously the next frontier. The author was able to use a homologous murine H-2 cDNA to identify putative human HLA genomic clones in a λ-phage library and thus to isolate and sequence the first human histocompatibility gene. This personal account relates the steps that led to this result, describes the highly competitive international environment, and highlights the role of location, connections, and sheer luck in such an achievement. It also puts this work in perspective with a short description of the current knowledge of histocompatibility genes and, finally, presents some reflections on the meaning of “discovery.” PMID:20457890

  16. Comparative genome sequencing of drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    SciTech Connect

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  17. In silico evolution of the hunchback gene indicates redundancy in cis-regulatory organization and spatial gene expression.

    PubMed

    Zagrijchuk, Elizaveta A; Sabirov, Marat A; Holloway, David M; Spirov, Alexander V

    2014-04-01

    Biological development depends on the coordinated expression of genes in time and space. Developmental genes have extensive cis-regulatory regions which control their expression. These regions are organized in a modular manner, with different modules controlling expression at different times and locations. Both how modularity evolved and what function it serves are open questions. We present a computational model for the cis-regulation of the hunchback (hb) gene in the fruit fly (Drosophila). We simulate evolution (using an evolutionary computation approach from computer science) to find the optimal cis-regulatory arrangements for fitting experimental hb expression patterns. We find that the cis-regulatory region tends to readily evolve modularity. These cis-regulatory modules (CRMs) do not tend to control single spatial domains, but show a multi-CRM/multi-domain correspondence. We find that the CRM-domain correspondence seen in Drosophila evolves with a high probability in our model, supporting the biological relevance of the approach. The partial redundancy resulting from multi-CRM control may confer some biological robustness against corruption of regulatory sequences. The technique developed on hb could readily be applied to other multi-CRM developmental genes.

  18. In silico evolution of the hunchback gene indicates redundancy in cis-regulatory organization and spatial gene expression

    PubMed Central

    Zagrijchuk, Elizaveta A.; Sabirov, Marat A.; Holloway, David M.; Spirov, Alexander V.

    2014-01-01

    Biological development depends on the coordinated expression of genes in time and space. Developmental genes have extensive cis-regulatory regions which control their expression. These regions are organized in a modular manner, with different modules controlling expression at different times and locations. Both how modularity evolved and what function it serves are open questions. We present a computational model for the cis-regulation of the hunchback (hb) gene in the fruit fly (Drosophila). We simulate evolution (using an evolutionary computation approach from computer science) to find the optimal cis-regulatory arrangements for fitting experimental hb expression patterns. We find that the cis-regulatory region tends to readily evolve modularity. These cis-regulatory modules (CRMs) do not tend to control single spatial domains, but show a multi-CRM/multi-domain correspondence. We find that the CRM-domain correspondence seen in Drosophila evolves with a high probability in our model, supporting the biological relevance of the approach. The partial redundancy resulting from multi-CRM control may confer some biological robustness against corruption of regulatory sequences. The technique developed on hb could readily be applied to other multi-CRM developmental genes. PMID:24712536

  19. Role of Conserved Non-Coding Regulatory Elements in LMW Glutenin Gene Expression

    PubMed Central

    Juhász, Angéla; Makai, Szabolcs; Sebestyén, Endre; Tamás, László; Balázs, Ervin

    2011-01-01

    Transcriptional regulation of LMW glutenin genes was investigated in silico, using publicly available gene sequences and expression data. Genes were grouped into different LMW glutenin types and their promoter profiles were determined using cis-acting regulatory elements databases and published results. The various cis-acting elements belong to some conserved non-coding regulatory regions (CREs) and might act in two different ways. There are elements, such as GCN4 motifs found in the long endosperm box, that could serve as key factors in tissue-specific expression. Some other elements, such as the AACA/TA motifs or the individual prolamin box variants, might modulate the level of expression. Based on the promoter sequences and expression characteristics, LMW glutenin genes might be transcribed following two different mechanisms. Most of the s- and i-type genes show a continuously increasing expression pattern. The m-type genes, however, demonstrate normal distribution in their expression profiles. Differences observed in their expression could be related to the differences found in their promoter sequences. Polymorphisms in the number and combination of cis-acting elements in their promoter regions can be of crucial importance in the diverse levels of production of single LMW glutenin gene types. PMID:22242127

  20. ‘In silico expression analysis’, a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences

    PubMed Central

    Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated ‘in silico expression analysis’ was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the ‘in silico expression analysis’ resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the ‘in silico expression analysis’ predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. Database URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366
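
    The workflow described in this abstract (find the genes that harbour a submitted sequence in their promoter, then rank conditions by how those genes respond) can be sketched in a few lines of Python. This is a simplified illustration and not the PathoPlant implementation: searching both strands, the mean log2 fold change used as the ranking score, the input layouts, and the function names are all assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def genes_with_motif(promoters, motif):
    """List genes whose promoter contains the motif on either strand.

    promoters: {gene id: promoter sequence} (hypothetical layout).
    """
    motif = motif.upper()
    rc = reverse_complement(motif)
    return sorted(
        gene for gene, seq in promoters.items()
        if motif in seq.upper() or rc in seq.upper()
    )


def rank_conditions(expression, genes):
    """Rank conditions by mean log2 fold change of the given genes.

    expression: {condition: {gene id: log2 fold change}} (hypothetical layout).
    Conditions with no data for any of the genes are left out.
    """
    scores = {}
    for condition, values in expression.items():
        hits = [values[g] for g in genes if g in values]
        if hits:
            scores[condition] = sum(hits) / len(hits)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

    With the WT-box from the abstract, `genes_with_motif(promoters, "CGACTTTT")` would select the candidate genes, and `rank_conditions` would then order whatever conditions are present in the supplied expression table.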

  1. 'In silico expression analysis', a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences.

    PubMed

    Bolívar, Julio C; Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated 'in silico expression analysis' was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the 'in silico expression analysis' resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the 'in silico expression analysis' predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. DATABASE URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366

  2. Cardiac gene regulatory networks in Drosophila

    PubMed Central

    Bryantsev, Anton L.; Cripps, Richard M.

    2009-01-01

    The Drosophila system has proven a powerful tool to help unlock the regulatory processes that occur during specification and differentiation of the embryonic heart. In this review, we focus upon a temporal analysis of the molecular events that result in heart formation in Drosophila, with a particular emphasis upon how genomic and other cutting-edge approaches are being brought to bear upon the subject. We anticipate that systems-level approaches will contribute greatly to our comprehension of heart development and disease in the animal kingdom. PMID:18849017

  3. Regulatory links between imprinted genes: evolutionary predictions and consequences

    PubMed Central

    Patten, Manus M.; Cowley, Michael; Oakey, Rebecca J.; Feil, Robert

    2016-01-01

    Genomic imprinting is essential for development and growth and plays diverse roles in physiology and behaviour. Imprinted genes have traditionally been studied in isolation or in clusters with respect to cis-acting modes of gene regulation, both from a mechanistic and evolutionary point of view. Recent studies in mammals, however, reveal that imprinted genes are often co-regulated and are part of a gene network involved in the control of cellular proliferation and differentiation. Moreover, a subset of imprinted genes acts in trans on the expression of other imprinted genes. Numerous studies have modulated levels of imprinted gene expression to explore phenotypic and gene regulatory consequences. Increasingly, the applied genome-wide approaches highlight how perturbation of one imprinted gene may affect other maternally or paternally expressed genes. Here, we discuss these novel findings and consider evolutionary theories that offer a rationale for such intricate interactions among imprinted genes. An evolutionary view of these trans-regulatory effects provides a novel interpretation of the logic of gene networks within species and has implications for the origin of reproductive isolation between species. PMID:26842569

  4. Regulatory links between imprinted genes: evolutionary predictions and consequences.

    PubMed

    Patten, Manus M; Cowley, Michael; Oakey, Rebecca J; Feil, Robert

    2016-02-10

    Genomic imprinting is essential for development and growth and plays diverse roles in physiology and behaviour. Imprinted genes have traditionally been studied in isolation or in clusters with respect to cis-acting modes of gene regulation, both from a mechanistic and evolutionary point of view. Recent studies in mammals, however, reveal that imprinted genes are often co-regulated and are part of a gene network involved in the control of cellular proliferation and differentiation. Moreover, a subset of imprinted genes acts in trans on the expression of other imprinted genes. Numerous studies have modulated levels of imprinted gene expression to explore phenotypic and gene regulatory consequences. Increasingly, the applied genome-wide approaches highlight how perturbation of one imprinted gene may affect other maternally or paternally expressed genes. Here, we discuss these novel findings and consider evolutionary theories that offer a rationale for such intricate interactions among imprinted genes. An evolutionary view of these trans-regulatory effects provides a novel interpretation of the logic of gene networks within species and has implications for the origin of reproductive isolation between species.

  5. Nemertean Toxin Genes Revealed through Transcriptome Sequencing

    PubMed Central

    Whelan, Nathan V.; Kocot, Kevin M.; Santos, Scott R.; Halanych, Kenneth M.

    2014-01-01

    Nemerteans are one of few animal groups that have evolved the ability to utilize toxins for both defense and subduing prey, but little is known about specific nemertean toxins. In particular, no study has identified specific toxin genes even though peptide toxins are known from some nemertean species. Information about toxin genes is needed to better understand evolution of toxins across animals and possibly provide novel targets for pharmaceutical and industrial applications. We sequenced and annotated transcriptomes of two free-living and one commensal nemertean and annotated an additional six publicly available nemertean transcriptomes to identify putative toxin genes. Approximately 63–74% of predicted open reading frames in each transcriptome were annotated with gene names, and all species had similar percentages of transcripts annotated with each higher-level GO term. Every nemertean analyzed possessed genes with high sequence similarities to known animal toxins including those from stonefish, cephalopods, and sea anemones. One toxin-like gene found in all nemerteans analyzed had high sequence similarity to Plancitoxin-1, a DNase II hepatotoxin that may function well at low pH, which suggests that the acidic body walls of some nemerteans could work to enhance the efficacy of protein toxins. The highest number of toxin-like genes found in any one species was seven and the lowest was three. The diversity of toxin-like nemertean genes found here is greater than previously documented, and these animals are likely an ideal system for exploring toxin evolution and industrial applications of toxins. PMID:25432940

  6. Gene Regulatory Evolution During Speciation in a Songbird

    PubMed Central

    Davidson, John H.; Balakrishnan, Christopher N.

    2016-01-01

    Over the last decade, tremendous progress has been made toward a comparative understanding of gene regulatory evolution. However, we know little about how gene regulation evolves in birds, and how divergent genomes interact in their hybrids. Because of the unique features of birds – female heterogamety, a highly conserved karyotype, and the slow evolution of reproductive incompatibilities – an understanding of regulatory evolution in birds is critical to a comprehensive understanding of regulatory evolution and its implications for speciation. Using a novel complement of analyses of replicated RNA-seq libraries, we demonstrate abundant divergence in brain gene expression between zebra finch (Taeniopygia guttata) subspecies. By comparing parental populations and their F1 hybrids, we also show that gene misexpression is relatively rare among brain-expressed transcripts in male birds. If this pattern is consistent across tissues and sexes, it may partially explain the slow buildup of postzygotic reproductive isolation observed in birds relative to other taxa. Although we expected that the action of genetic drift on the island-dwelling zebra finch subspecies would be manifested in a higher rate of trans regulatory divergence, we found that most divergence was in cis regulation, following a pattern commonly observed in other taxa. Thus, our study highlights both unique and shared features of avian regulatory evolution. PMID:26976438

  7. Novel genes dramatically alter regulatory network topology in amphioxus

    PubMed Central

    Zhang, Qing; Zmasek, Christian M; Dishaw, Larry J; Mueller, M Gail; Ye, Yuzhen; Litman, Gary W; Godzik, Adam

    2008-01-01

    Background Regulation in protein networks often utilizes specialized domains that 'join' (or 'connect') the network through specific protein-protein interactions. The innate immune system, which provides a first and, in many species, the only line of defense against microbial and viral pathogens, is regulated in this way. Amphioxus (Branchiostoma floridae), whose genome was recently sequenced, occupies a unique position in the evolution of innate immunity, having diverged within the chordate lineage prior to the emergence of the adaptive immune system in vertebrates. Results The repertoire of several families of innate immunity proteins is expanded in amphioxus compared to both vertebrates and protostome invertebrates. Part of this expansion consists of genes encoding proteins with unusual domain architectures, which often contain both upstream receptor and downstream activator domains, suggesting a potential role for direct connections (shortcuts) that bypass usual signal transduction pathways. Conclusion Domain rearrangements can potentially alter the topology of protein-protein interaction (and regulatory) networks. The extent of such arrangements in the innate immune network of amphioxus suggests that domain shuffling, which is an important mechanism in the evolution of multidomain proteins, has also shaped the development of immune systems. PMID:18680598

  8. Prediction of regulatory gene pairs using dynamic time warping and gene ontology.

    PubMed

    Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K

    2014-01-01

    Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
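
    For readers unfamiliar with Dynamic Time Warping, the core distance can be sketched as below. This is plain DTW with an absolute-difference local cost, written as an assumption-laden illustration; it is not the authors' DTW-GO measure and does not include the KNN imputation or the Gene Ontology term. The function name and the example profiles are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two numeric series.

    Local cost is the absolute difference; each step may advance one
    series, the other, or both.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance a only
                cost[i][j - 1],      # advance b only
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


# Two invented expression profiles; the second is a time-delayed copy of the first.
regulator = [0, 1, 2, 1, 0]
target = [0, 0, 1, 2, 1, 0]
delayed_distance = dtw_distance(regulator, target)
```

    Because warping absorbs the delay, the two toy profiles have distance 0, which is why DTW is attractive for pairing a regulator with a target whose response lags in time.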

  9. Motifs emerge from function in model gene regulatory networks

    PubMed Central

    Burda, Z.; Krzywicki, A.; Martin, O. C.; Zagorski, M.

    2011-01-01

    Gene regulatory networks allow the control of gene expression patterns in living cells. The study of network topology has revealed that certain subgraphs of interactions or “motifs” appear at anomalously high frequencies. We ask here whether this phenomenon may emerge because of the functions carried out by these networks. Given a framework for describing regulatory interactions and dynamics, we consider in the space of all regulatory networks those that have prescribed functional capabilities. Markov Chain Monte Carlo sampling is then used to determine how these functional networks lead to specific motif statistics in the interactions. In the case where the regulatory networks are constrained to exhibit multistability, we find a high frequency of gene pairs that are mutually inhibitory and self-activating. In contrast, networks constrained to have periodic gene expression patterns (mimicking for instance the cell cycle) have a high frequency of bifan-like motifs involving four genes with at least one activating and one inhibitory interaction. PMID:21960444
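
    The link between a mutually inhibitory, self-activating gene pair and multistability can be illustrated with a deliberately simplified Boolean model. The Boolean update rule below is an assumption made for illustration; it is not the regulatory framework or the Markov Chain Monte Carlo sampling used in the paper.

```python
from itertools import product


def step(state):
    """One synchronous update of a two-gene Boolean toggle.

    Each gene stays on only if it is already on (self-activation)
    and its partner is off (mutual inhibition).
    """
    x, y = state
    return (x and not y, y and not x)


# Enumerate all four states and keep those that map onto themselves.
fixed_points = [s for s in product([False, True], repeat=2) if step(s) == s]
```

    In this toy model three of the four states are fixed points (both genes off, or exactly one gene on), so the circuit can hold more than one stable expression pattern.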

  10. Full-Length Minor Ampullate Spidroin Gene Sequence

    PubMed Central

    Chen, Gefei; Liu, Xiangqin; Zhang, Yunlong; Lin, Senzhu; Yang, Zijiang; Johansson, Jan; Rising, Anna; Meng, Qing

    2012-01-01

    Spider silk includes seven protein-based fibers and glue-like substances produced by glands in the spider's abdomen. Minor ampullate silk is used to make the auxiliary spiral of the orb-web and also for wrapping prey; it has a high tensile strength and does not supercontract in water. So far, only partial cDNA sequences have been obtained for minor ampullate spidroins (MiSps). Here we describe the first MiSp full-length gene sequence from the spider species Araneus ventricosus, using a multidimensional PCR approach. Comparative analysis of the sequence reveals regulatory elements, as well as unique spidroin gene and protein architecture including the presence of an unusually large intron. The spliced full-length transcript of the MiSp gene is 5440 bp in size and encodes 1766 amino acid residues organized into conserved nonrepetitive N- and C-terminal domains and a central predominantly repetitive region composed of four units that are iterated in a non-regular manner. The repeats are more conserved within A. ventricosus MiSp than compared to repeats from homologous proteins, and are interrupted by two nonrepetitive spacer regions, which have 100% identity even at the nucleotide level. PMID:23251707

  11. Full-length minor ampullate spidroin gene sequence.

    PubMed

    Chen, Gefei; Liu, Xiangqin; Zhang, Yunlong; Lin, Senzhu; Yang, Zijiang; Johansson, Jan; Rising, Anna; Meng, Qing

    2012-01-01

    Spider silk includes seven protein-based fibers and glue-like substances produced by glands in the spider's abdomen. Minor ampullate silk is used to make the auxiliary spiral of the orb-web and also for wrapping prey; it has a high tensile strength and does not supercontract in water. So far, only partial cDNA sequences have been obtained for minor ampullate spidroins (MiSps). Here we describe the first MiSp full-length gene sequence from the spider species Araneus ventricosus, using a multidimensional PCR approach. Comparative analysis of the sequence reveals regulatory elements, as well as unique spidroin gene and protein architecture including the presence of an unusually large intron. The spliced full-length transcript of the MiSp gene is 5440 bp in size and encodes 1766 amino acid residues organized into conserved nonrepetitive N- and C-terminal domains and a central predominantly repetitive region composed of four units that are iterated in a non-regular manner. The repeats are more conserved within A. ventricosus MiSp than compared to repeats from homologous proteins, and are interrupted by two nonrepetitive spacer regions, which have 100% identity even at the nucleotide level. PMID:23251707

  12. Evolution of DNA specificity in a transcription factor family produced a new gene regulatory module.

    PubMed

    McKeown, Alesia N; Bridgham, Jamie T; Anderson, Dave W; Murphy, Michael N; Ortlund, Eric A; Thornton, Joseph W

    2014-09-25

    Complex gene regulatory networks require transcription factors (TFs) to bind distinct DNA sequences. To understand how novel TF specificity evolves, we combined phylogenetic, biochemical, and biophysical approaches to interrogate how DNA recognition diversified in the steroid hormone receptor (SR) family. After duplication of the ancestral SR, three mutations in one copy radically weakened binding to the ancestral estrogen response element (ERE) and improved binding to a new set of DNA sequences (steroid response elements, SREs). They did so by establishing unfavorable interactions with ERE and abolishing unfavorable interactions with SRE; also required were numerous permissive substitutions, which nonspecifically improved cooperativity and affinity of DNA binding. Our findings indicate that negative determinants of binding play key roles in TFs' DNA selectivity and, together with our prior work on the evolution of SR ligand specificity during the same interval, show how a specific new gene regulatory module evolved without interfering with the integrity of the ancestral module. PMID:25259920

  13. Functional evolution of cis-regulatory modules at a homeotic gene in Drosophila.

    PubMed

    Ho, Margaret C W; Johnsen, Holly; Goetz, Sara E; Schiller, Benjamin J; Bae, Esther; Tran, Diana A; Shur, Andrey S; Allen, John M; Rau, Christoph; Bender, Welcome; Fisher, William W; Celniker, Susan E; Drewell, Robert A

    2009-11-01

    It is a long-held belief in evolutionary biology that the rate of molecular evolution for a given DNA sequence is inversely related to the level of functional constraint. This belief holds true for the protein-coding homeotic (Hox) genes originally discovered in Drosophila melanogaster. Expression of the Hox genes in Drosophila embryos is essential for body patterning and is controlled by an extensive array of cis-regulatory modules (CRMs). How the regulatory modules functionally evolve in different species is not clear. A comparison of the CRMs for the Abdominal-B gene from different Drosophila species reveals relatively low levels of overall sequence conservation. However, embryonic enhancer CRMs from other Drosophila species direct transgenic reporter gene expression in the same spatial and temporal patterns during development as their D. melanogaster orthologs. Bioinformatic analysis reveals the presence of short conserved sequences within defined CRMs, representing gap and pair-rule transcription factor binding sites. One predicted binding site for the gap transcription factor KRUPPEL in the IAB5 CRM was found to be altered in Superabdominal (Sab) mutations. In Sab mutant flies, the third abdominal segment is transformed into a copy of the fifth abdominal segment. A model for KRUPPEL-mediated repression at this binding site is presented. These findings challenge our current understanding of the relationship between sequence evolution at the molecular level and functional activity of a CRM. While the overall sequence conservation at Drosophila CRMs is not distinctive from neighboring genomic regions, functionally critical transcription factor binding sites within embryonic enhancer CRMs are highly conserved. These results have implications for understanding mechanisms of gene expression during embryonic development, enhancer function, and the molecular evolution of eukaryotic regulatory modules.

  14. A cis-regulatory sequence from a short intergenic region gives rise to a strong microbe-associated molecular pattern-responsive synthetic promoter.

    PubMed

    Lehmeyer, Mona; Hanko, Erik K R; Roling, Lena; Gonzalez, Lilian; Wehrs, Maren; Hehl, Reinhard

    2016-06-01

    The high gene density in Arabidopsis thaliana leaves only relatively short intergenic regions for potential cis-regulatory sequences. To learn more about the regulation of genes harbouring only very short upstream intergenic regions, this study investigates a recently identified novel microbe-associated molecular pattern (MAMP)-responsive cis-sequence located within the 101 bp long intergenic region upstream of the At1g13990 gene. It is shown that the cis-regulatory sequence is sufficient for MAMP-responsive reporter gene activity in the context of its native promoter. The 3' UTR of the upstream gene has a quantitative effect on gene expression. In context of a synthetic promoter, the cis-sequence is shown to achieve a strong increase in reporter gene activity as a monomer, dimer and tetramer. Mutation analysis of the cis-sequence determined the specific nucleotides required for gene expression activation. In transgenic A. thaliana the synthetic promoter harbouring a tetramer of the cis-sequence not only drives strong pathogen-responsive reporter gene expression but also shows a high background activity. The results of this study contribute to our understanding how genes with very short upstream intergenic regions are regulated and how these regions can serve as a source for MAMP-responsive cis-sequences for synthetic promoter design.

  15. A cis-regulatory sequence from a short intergenic region gives rise to a strong microbe-associated molecular pattern-responsive synthetic promoter.

    PubMed

    Lehmeyer, Mona; Hanko, Erik K R; Roling, Lena; Gonzalez, Lilian; Wehrs, Maren; Hehl, Reinhard

    2016-06-01

    The high gene density in Arabidopsis thaliana leaves only relatively short intergenic regions for potential cis-regulatory sequences. To learn more about the regulation of genes harbouring only very short upstream intergenic regions, this study investigates a recently identified novel microbe-associated molecular pattern (MAMP)-responsive cis-sequence located within the 101 bp long intergenic region upstream of the At1g13990 gene. It is shown that the cis-regulatory sequence is sufficient for MAMP-responsive reporter gene activity in the context of its native promoter. The 3' UTR of the upstream gene has a quantitative effect on gene expression. In context of a synthetic promoter, the cis-sequence is shown to achieve a strong increase in reporter gene activity as a monomer, dimer and tetramer. Mutation analysis of the cis-sequence determined the specific nucleotides required for gene expression activation. In transgenic A. thaliana the synthetic promoter harbouring a tetramer of the cis-sequence not only drives strong pathogen-responsive reporter gene expression but also shows a high background activity. The results of this study contribute to our understanding how genes with very short upstream intergenic regions are regulated and how these regions can serve as a source for MAMP-responsive cis-sequences for synthetic promoter design. PMID:26833485

  16. Nucleotide sequence of the alpha-amylase-pullulanase gene from Clostridium thermohydrosulfuricum.

    PubMed

    Melasniemi, H; Paloheimo, M; Hemiö, L

    1990-03-01

    The nucleotide sequence of the gene (apu) encoding the thermostable alpha-amylase-pullulanase of Clostridium thermohydrosulfuricum was determined. An open reading frame of 4425 bp was present. The deduced polypeptide (Mr 165,600), including a 31 amino acid putative signal sequence, comprised 1475 amino acids, with no cysteine residues. The structural gene was preceded by the consensus promoter sequence TTGACA TATAAT, a putative regulatory sequence and a putative ribosome-binding sequence AAAGGGGG. The codon usage resembled that of Bacillus genes. The deduced sequence of the mature apu product showed similarities to various amylolytic enzymes, especially the neopullulanase of Bacillus stearothermophilus, whereas the signal sequence showed similarity to those of the alpha-amylases of B. stearothermophilus and B. subtilis. Three regions thought to be highly conserved in the primary structure of alpha-amylases could also be distinguished in the apu product, two being partly 'duplicated' in this alpha-1,4/alpha-1,6-active enzyme.

  17. Approaches to modeling gene regulatory networks: a gentle introduction.

    PubMed

    Schlitt, Thomas

    2013-01-01

    This chapter is split into two main sections; first, I will present an introduction to gene networks. Second, I will discuss various approaches to gene network modeling which will include some examples for using different data sources. Computational modeling has been used for many different biological systems and many approaches have been developed addressing the different needs posed by the different application fields. The modeling approaches presented here are not limited to gene regulatory networks and occasionally I will present other examples. The material covered here is an update based on several previous publications by Thomas Schlitt and Alvis Brazma (FEBS Lett 579(8),1859-1866, 2005; Philos Trans R Soc Lond B Biol Sci 361(1467), 483-494, 2006; BMC Bioinformatics 8(suppl 6), S9, 2007) that formed the foundation for a lecture on gene regulatory networks at the In Silico Systems Biology workshop series at the European Bioinformatics Institute in Hinxton. PMID:23715978

  18. Identification and characterization of the afsR homologue regulatory gene from Streptomyces peucetius ATCC 27952.

    PubMed

    Parajuli, Niranjan; Viet, Hung Trinh; Ishida, Kenji; Tong, Hang Thi; Lee, Hei Chan; Liou, Kwangkyoung; Sohng, Jae Kyung

    2005-01-01

    We have isolated an afsR homologue, called afsR-p, through genome analysis of Streptomyces peucetius ATCC 27952. AfsR-p shares 60% sequence identity with AfsR from Streptomyces coelicolor A3(2). afsR-p was expressed under the control of the ermE* promoter in its hosts S. peucetius, Streptomyces lividans TK24, Streptomyces clavuligerus and Streptomyces griseus. We observed overproduction of doxorubicin (4-fold) in S. peucetius, gamma-actinorhodin (2.6-fold) in S. lividans, clavulanic acid (1.5-fold) in S. clavuligerus and streptomycin (slight) in S. griseus. Overproduction was due to expression of the gene in these strains as compared to the wild-type strains harboring the vector only. Comparative study of the expression of afsR-p revealed that regulatory networking in Streptomyces is not uniform. We speculate that phosphorylated AfsR-p becomes bound to the promoter region of afsS. The latter activates other regulatory genes, including pathway regulatory genes, and induces the production of secondary metabolites including antibiotics. We identified specific conserved amino acids and exploited them for the isolation of the partial sequence of the afsR homologue from S. clavuligerus and Streptomyces achromogenes (rubradirin producer). Such findings provide additional evidence for the presence of a serine/threonine and tyrosine kinase-dependent global regulatory network in Streptomyces.

  19. Identification and characterization of the afsR homologue regulatory gene from Streptomyces peucetius ATCC 27952.

    PubMed

    Parajuli, Niranjan; Viet, Hung Trinh; Ishida, Kenji; Tong, Hang Thi; Lee, Hei Chan; Liou, Kwangkyoung; Sohng, Jae Kyung

    2005-01-01

    We have isolated an afsR homologue, called afsR-p, through genome analysis of Streptomyces peucetius ATCC 27952. AfsR-p shares 60% sequence identity with AfsR from Streptomyces coelicolor A3(2). afsR-p was expressed under the control of the ermE* promoter in its hosts S. peucetius, Streptomyces lividans TK24, Streptomyces clavuligerus and Streptomyces griseus. We observed overproduction of doxorubicin (4-fold) in S. peucetius, gamma-actinorhodin (2.6-fold) in S. lividans, clavulanic acid (1.5-fold) in S. clavuligerus and streptomycin (slight) in S. griseus. Overproduction was due to expression of the gene in these strains as compared to the wild-type strains harboring the vector only. Comparative study of the expression of afsR-p revealed that regulatory networking in Streptomyces is not uniform. We speculate that phosphorylated AfsR-p becomes bound to the promoter region of afsS. The latter activates other regulatory genes, including pathway regulatory genes, and induces the production of secondary metabolites including antibiotics. We identified specific conserved amino acids and exploited them for the isolation of the partial sequence of the afsR homologue from S. clavuligerus and Streptomyces achromogenes (rubradirin producer). Such findings provide additional evidence for the presence of a serine/threonine and tyrosine kinase-dependent global regulatory network in Streptomyces. PMID:15921897

  20. A novel regulatory element between the human FGA and FGG genes.

    PubMed

    Fish, Richard J; Neerman-Arbez, Marguerite

    2012-09-01

    High circulating fibrinogen levels correlate with cardiovascular disease (CVD) risk. Fibrinogen levels vary between people and also change in response to physiological and environmental stimuli. A modest proportion of the variation in fibrinogen levels can be explained by genotype, inferring that variation in genomic sequences that regulate the fibrinogen genes (FGA, FGB and FGG) may affect hepatic fibrinogen production and perhaps CVD risk. We previously identified a conserved liver enhancer in the fibrinogen gene cluster (CNC12), between FGB and FGA. Genome-wide chromatin immunoprecipitation-sequencing (ChIP-seq) demonstrated that transcription factors which bind fibrinogen gene promoters also interact with CNC12, as well as two potential fibrinogen enhancers (PFE), between FGA and FGG. Here we show that one of the PFE sequences has potent hepatocyte enhancer activity. Using a luciferase reporter gene system, we found that PFE2 enhances minimal promoter- and FGA promoter-driven gene expression in hepatoma cells, regardless of its orientation with respect to the promoters. A region within PFE2 bears a short series of conserved nucleotides which maintain enhancer activity without flanking sequence. We also demonstrate that PFE2 is a liver enhancer in vivo, driving enhanced green fluorescent protein expression in transgenic zebrafish larval livers. Our study shows that combining public domain ChIP-seq data with in vitro and in vivo functional tests can identify novel fibrinogen gene cluster regulatory sequences. Variation in such elements could affect fibrinogen production and influence CVD risk.

  1. A gene regulatory network armature for T-lymphocyte specification

    SciTech Connect

    Fung, Elizabeth-sharon

    2008-01-01

    Choice of a T-lymphoid fate by hematopoietic progenitor cells depends on sustained Notch-Delta signaling combined with tightly-regulated activities of multiple transcription factors. To dissect the regulatory network connections that mediate this process, we have used high-resolution analysis of regulatory gene expression trajectories from the beginning to the end of specification; tests of the short-term Notch-dependence of these gene expression changes; and perturbation analyses of the effects of overexpression of two essential transcription factors, namely PU.1 and GATA-3. Quantitative expression measurements of >50 transcription factor and marker genes have been used to derive the principal components of regulatory change through which T-cell precursors progress from primitive multipotency to T-lineage commitment. Distinct parts of the path reveal separate contributions of Notch signaling, GATA-3 activity, and downregulation of PU.1. Using BioTapestry, the results have been assembled into a draft gene regulatory network for the specification of T-cell precursors and the choice of T as opposed to myeloid dendritic or mast-cell fates. This network also accommodates effects of E proteins and mutual repression circuits of Gfi1 against Egr-2 and of TCF-1 against PU.1 as proposed elsewhere, but requires additional functions that remain unidentified. Distinctive features of this network structure include the intense dose-dependence of GATA-3 effects; the gene-specific modulation of PU.1 activity based on Notch activity; the lack of direct opposition between PU.1 and GATA-3; and the need for a distinct, late-acting repressive function or functions to extinguish stem and progenitor-derived regulatory gene expression.

  2. Asymmetric Regulation of Peripheral Genes by Two Transcriptional Regulatory Networks

    PubMed Central

    Li, Jing-Ru; Suzuki, Takahiro; Nishimura, Hajime; Kishima, Mami; Maeda, Shiori; Suzuki, Harukazu

    2016-01-01

    Transcriptional regulatory network (TRN) reconstitution and deconstruction occur simultaneously during reprogramming; however, it remains unclear how the starting and targeting TRNs regulate the induction and suppression of peripheral genes. Here we analyzed the regulation using direct cell reprogramming from human dermal fibroblasts to monocytes as the platform. We simultaneously deconstructed fibroblastic TRN and reconstituted monocytic TRN; monocytic and fibroblastic gene expression were analyzed in comparison with that of fibroblastic TRN deconstruction only or monocytic TRN reconstitution only. Global gene expression analysis showed cross-regulation of TRNs. Detailed analysis revealed that knocking down fibroblastic TRN positively affected half of the upregulated monocytic genes, indicating that intrinsic fibroblastic TRN interfered with the expression of induced genes. In contrast, reconstitution of monocytic TRN showed neutral effects on the majority of fibroblastic gene downregulation. This study provides an explicit example that demonstrates how two networks together regulate gene expression during cell reprogramming processes and contributes to the elaborate exploration of TRNs. PMID:27483142

  3. Efficient Reverse-Engineering of a Developmental Gene Regulatory Network

    PubMed Central

    Cicin-Sain, Damjan; Ashyraliyev, Maksat; Jaeger, Johannes

    2012-01-01

    Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to

  4. Noise Control in Gene Regulatory Networks with Negative Feedback.

    PubMed

    Hinczewski, Michael; Thirumalai, D

    2016-07-01

    Genes and proteins regulate cellular functions through complex circuits of biochemical reactions. Fluctuations in the components of these regulatory networks result in noise that invariably corrupts the signal, possibly compromising function. Here, we create a practical formalism based on ideas introduced by Wiener and Kolmogorov (WK) for filtering noise in engineered communications systems to quantitatively assess the extent to which noise can be controlled in biological processes involving negative feedback. Application of the theory, which reproduces the previously proven scaling of the lower bound for noise suppression in terms of the number of signaling events, shows that a tetracycline repressor-based negative-regulatory gene circuit behaves as a WK filter. For the class of Hill-like nonlinear regulatory functions, this type of filter provides the optimal reduction in noise. Our theoretical approach can be readily combined with experimental measurements of response functions in a wide variety of genetic circuits, to elucidate the general principles by which biological networks minimize noise.

  5. Function does not follow form in gene regulatory circuits

    PubMed Central

    Payne, Joshua L.; Wagner, Andreas

    2015-01-01

    Gene regulatory circuits are to the cell what arithmetic logic units are to the chip: fundamental components of information processing that map an input onto an output. Gene regulatory circuits come in many different forms, distinct structural configurations that determine who regulates whom. Studies that have focused on the gene expression patterns (functions) of circuits with a given structure (form) have examined just a few structures or gene expression patterns. Here, we use a computational model to exhaustively characterize the gene expression patterns of nearly 17 million three-gene circuits in order to systematically explore the relationship between circuit form and function. Three main conclusions emerge. First, function does not follow form. A circuit of any one structure can have between twelve and nearly thirty thousand distinct gene expression patterns. Second, and conversely, form does not follow function. Most gene expression patterns can be realized by more than one circuit structure. And third, multifunctionality severely constrains circuit form. The number of circuit structures able to drive multiple gene expression patterns decreases rapidly with the number of these patterns. These results indicate that it is generally not possible to infer circuit function from circuit form, or vice versa. PMID:26290154

  6. A Genome-Wide Regulatory Framework Identifies Maize Pericarp Color1 Controlled Genes

    PubMed Central

    Morohashi, Kengo; Casas, María Isabel; Ferreyra, Lorena Falcone; Mejía-Guerra, María Katherine; Pourcel, Lucille; Yilmaz, Alper; Feller, Antje; Carvalho, Bruna; Emiliani, Julia; Rodriguez, Eduardo; Pellegrinet, Silvina; McMullen, Michael; Casati, Paula; Grotewold, Erich

    2012-01-01

    Pericarp Color1 (P1) encodes an R2R3-MYB transcription factor responsible for the accumulation of insecticidal flavones in maize (Zea mays) silks and red phlobaphene pigments in pericarps and other floral tissues, which makes P1 an important visual marker. Using genome-wide expression analyses (RNA sequencing) in pericarps and silks of plants with contrasting P1 alleles combined with chromatin immunoprecipitation coupled with high-throughput sequencing, we show here that the regulatory functions of P1 are much broader than the activation of genes corresponding to enzymes in a branch of flavonoid biosynthesis. P1 modulates the expression of several thousand genes, and ∼1500 of them were identified as putative direct targets of P1. Among them, we identified F2H1, corresponding to a P450 enzyme that converts naringenin into 2-hydroxynaringenin, a key branch point in the P1-controlled pathway and the first step in the formation of insecticidal C-glycosyl flavones. Unexpectedly, the binding of P1 to gene regulatory regions can result in both gene activation and repression. Our results indicate that P1 is the major regulator for a set of genes involved in flavonoid biosynthesis and a minor modulator of the expression of a much larger gene set that includes genes involved in primary metabolism and production of other specialized compounds. PMID:22822204

  7. Compartmentalized gene regulatory network of the pathogenic fungus Fusarium graminearum

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Head blight caused by Fusarium graminearum (Fg) is a major limiting factor of wheat production with both yield loss and mycotoxin contamination. Here we report a model for global Fg gene regulatory networks (GRNs) inferred from a large collection of transcriptomic data using a machine-learning appro...

  8. Molecular characterization of a maize regulatory gene

    SciTech Connect

    Wessler, S.R.

    1991-12-01

    Based on initial bombardment studies we previously concluded that promoter diversity was responsible for the diversity of naturally occurring R alleles. During this period we have found that R is controlled at the level of translation initiation and that intron 1 is alternatively spliced. The experiments described in Sections 1 and 2 sought to quantify these effects and to determine whether they contribute to the tissue-specific expression of select R alleles. This study was done because very little is understood about the post-transcriptional regulation of plant genes. Sections 3 and 4 describe experiments designed to identify important structural components of the R protein.

  9. Boosting heterologous protein production in transgenic dicotyledonous seeds using Phaseolus vulgaris regulatory sequences.

    PubMed

    De Jaeger, Geert; Scheffer, Stanley; Jacobs, Anni; Zambre, Mukund; Zobell, Oliver; Goossens, Alain; Depicker, Ann; Angenon, Geert

    2002-12-01

    Over the past decade, several high value proteins have been produced in different transgenic plant tissues such as leaves, tubers, and seeds. Despite recent advances, many heterologous proteins accumulate to low concentrations, and the optimization of expression cassettes to make in planta production and purification economically feasible remains critical. Here, the regulatory sequences of the seed storage protein gene arcelin 5-I (arc5-I) of common bean (Phaseolus vulgaris) were evaluated for producing heterologous proteins in dicotyledonous seeds. The murine single chain variable fragment (scFv) G4 (ref. 4) was chosen as model protein because of the current industrial interest in producing antibodies and derived fragments in crops. In transgenic Arabidopsis thaliana seed stocks, the scFv under control of the 35S promoter of the cauliflower mosaic virus (CaMV) accumulated to approximately 1% of total soluble protein (TSP). However, a set of seed storage promoter constructs boosted the scFv accumulation to exceptionally high concentrations, reaching no less than 36.5% of TSP in homozygous seeds. Even at these high concentrations, the scFv proteins had antigen-binding activity and affinity similar to those produced in Escherichia coli. The feasibility of heterologous protein production under control of arc5-I regulatory sequences was also demonstrated in Phaseolus acutifolius, a promising crop for large scale production.

  10. Third-Generation Sequencing and Analysis of Four Complete Pig Liver Esterase Gene Sequences in Clones Identified by Screening BAC Library

    PubMed Central

    Zhou, Qiongqiong; Sun, Wenjuan; Liu, Xiyan; Wang, Xiliang; Xiao, Yuncai; Bi, Dingren; Yin, Jingdong; Shi, Deshi

    2016-01-01

    Aim: Pig liver carboxylesterase (PLE) gene sequences in GenBank are incomplete, which has led to difficulties in studying the genetic structure and regulation mechanisms of gene expression of PLE family genes. The aim of this study was to obtain and analyse complete gene sequences of the PLE family by screening a Rongchang pig BAC library and by third-generation PacBio gene sequencing. Methods: After a number of existing incomplete PLE isoform gene sequences were analysed, primers were designed based on conserved regions in PLE exons, and the whole pig genome was used as a template for polymerase chain reaction (PCR) amplification. Specific primers were then selected based on the PCR amplification results. A three-step PCR screening method was used to identify PLE-positive clones by screening a Rongchang pig BAC library, and PacBio third-generation sequencing was performed. BLAST comparisons and other bioinformatics methods were applied for sequence analysis. Results: Five PLE-positive BAC clones, designated BAC-10, BAC-70, BAC-75, BAC-119 and BAC-206, were identified. Sequence analysis yielded the complete sequences of four PLE genes, PLE1, PLE-B9, PLE-C4, and PLE-G2. Complete PLE gene sequences were defined as those containing regulatory sequences, exons, and introns. Not only did the PLE exon sequences of the four genes show a high degree of homology, but the intron sequences were also highly similar. Additionally, the regulatory region of the genes contained two 720-bp reverse-complement sequences that may have an important function in the regulation of PLE gene expression. Significance: This is the first report to confirm the complete sequences of four PLE genes. In addition, the study demonstrates that each PLE isoform is encoded by a single gene and that the various genes exhibit a high degree of sequence homology, suggesting that the PLE family evolved from a single ancestral gene. Obtaining the complete sequences of these PLE genes

  11. Neuronal precursor-specific activity of a human doublecortin regulatory sequence.

    PubMed

    Karl, Claudia; Couillard-Despres, Sebastien; Prang, Peter; Munding, Matthias; Kilb, Werner; Brigadski, Tanja; Plötz, Sonja; Mages, Wolfgang; Luhmann, Heiko; Winkler, Jürgen; Bogdahn, Ulrich; Aigner, Ludwig

    2005-01-01

    The doublecortin (DCX) gene encodes a 40-kDa microtubule-associated protein specifically expressed in neuronal precursors of the developing and adult CNS. Due to its specific expression pattern, attention was drawn to DCX as a marker for neuronal precursors and neurogenesis, thereby underscoring the importance of its promoter identification and promoter analysis. Here, we analysed the human DCX regulatory sequence and confined it to a 3.5-kb fragment upstream of the ATG start codon. We demonstrate by transient transfection experiments that this fragment is sufficient and specific to drive expression of reporter genes in embryonic and adult neuronal precursors. The activity of this regulatory fragment overlapped with the expression of endogenous DCX and with the young neuronal markers class III beta-tubulin isotype and microtubule-associated protein Map2ab but not with glial or oligodendroglial markers. Electrophysiological data further confirmed the immature neuronal nature of these cells. Deletions within the 3.5-kb region demonstrated the relevance of specific regions containing transcription factor-binding sites. Moreover, application of neurogenesis-related growth factors in the neuronal precursor cultures suggested the lack of direct signalling of these factors on the DCX promoter construct. PMID:15663475

  12. Inferring slowly-changing dynamic gene-regulatory networks.

    PubMed

    Wit, Ernst C; Abbruzzo, Antonino

    2015-01-01

    Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in isolation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a class of models that connect the network with conditional independence relationships between random variables. By interpreting these random variables as gene activities and the conditional independence relationships as functional non-relatedness, graphical models have been used to describe gene-regulatory networks. Whereas the literature has focused on static networks, most time-course experiments are designed to tease out temporal changes in the underlying network. It is typically reasonable to assume that changes in genomic networks are few, because biological systems tend to be stable. We introduce a new model for estimating slow changes in dynamic gene-regulatory networks, which is suitable for high-dimensional data, e.g. time-course microarray data. Our aim is to estimate a dynamically changing genomic network based on temporal activity measurements of the genes in the network. Our method is based on a penalized likelihood with an ℓ1-norm that penalizes conditional dependencies between genes as well as differences between conditional independence elements across time points. We also present a heuristic search strategy to find optimal tuning parameters. We rewrite the penalized maximum likelihood problem as a standard convex optimization problem subject to linear equality constraints. We show that our method performs well in simulation studies. Finally, we apply the proposed model to a time-course T-cell dataset.
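
    As a rough illustration of the kind of objective this abstract describes, a penalized Gaussian log-likelihood with ℓ1 penalties on both the conditional dependencies and their changes over time could be written as below. This is a generic sketch, not the authors' exact formulation: the symbols (precision matrix \Theta_t and sample covariance S_t at time point t, tuning parameters \lambda_1 and \lambda_2) are assumptions introduced here, and the abstract does not state which elements are penalized or how the linear equality constraints enter.

```latex
\hat{\Theta}_1,\dots,\hat{\Theta}_T
  = \arg\max_{\Theta_1,\dots,\Theta_T \succ 0}
    \sum_{t=1}^{T} \Bigl[ \log\det\Theta_t - \operatorname{tr}(S_t \Theta_t) \Bigr]
    - \lambda_1 \sum_{t=1}^{T} \lVert \Theta_t \rVert_1
    - \lambda_2 \sum_{t=2}^{T} \lVert \Theta_t - \Theta_{t-1} \rVert_1
```

    Under these assumptions, a larger \lambda_1 would yield sparser networks at each time point, while a larger \lambda_2 would force consecutive networks to differ in fewer edges, matching the stated premise that changes in the network are few.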

  13. Inferring slowly-changing dynamic gene-regulatory networks

    PubMed Central

    2015-01-01

    Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in isolation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a class of models that connect the network with conditional independence relationships between random variables. By interpreting these random variables as gene activities and the conditional independence relationships as functional non-relatedness, graphical models have been used to describe gene-regulatory networks. Whereas the literature has focused on static networks, most time-course experiments are designed to tease out temporal changes in the underlying network. It is typically reasonable to assume that changes in genomic networks are few, because biological systems tend to be stable. We introduce a new model for estimating slow changes in dynamic gene-regulatory networks, which is suitable for high-dimensional data, e.g. time-course microarray data. Our aim is to estimate a dynamically changing genomic network based on temporal activity measurements of the genes in the network. Our method is based on a penalized likelihood with an ℓ1-norm that penalizes conditional dependencies between genes as well as differences between conditional independence elements across time points. We also present a heuristic search strategy to find optimal tuning parameters. We rewrite the penalized maximum likelihood problem as a standard convex optimization problem subject to linear equality constraints. We show that our method performs well in simulation studies. Finally, we apply the proposed model to a time-course T-cell dataset. PMID:25917062

  14. Data- and knowledge-based modeling of gene regulatory networks: an update

    PubMed Central

    Linde, Jörg; Schulze, Sylvie; Henkel, Sebastian G.; Guthke, Reinhard

    2015-01-01

    Gene regulatory network inference is a systems biology approach which predicts interactions between genes with the help of high-throughput data. In this review, we present current and updated network inference methods focusing on novel techniques for data acquisition, network inference assessment, network inference for interacting species and the integration of prior knowledge. Following the advent of Next-Generation-Sequencing of cDNAs derived from RNA samples (RNA-Seq), we discuss in detail its application to network inference. Furthermore, we present progress for large-scale or even full-genomic network inference as well as for small-scale condensed network inference and review advances in the evaluation of network inference methods by crowdsourcing. Finally, we reflect on the current availability of data and prior knowledge sources and give an outlook for the inference of gene regulatory networks that reflect interacting species, in particular pathogen-host interactions. PMID:27047314

  15. Data- and knowledge-based modeling of gene regulatory networks: an update.

    PubMed

    Linde, Jörg; Schulze, Sylvie; Henkel, Sebastian G; Guthke, Reinhard

    2015-01-01

    Gene regulatory network inference is a systems biology approach which predicts interactions between genes with the help of high-throughput data. In this review, we present current and updated network inference methods focusing on novel techniques for data acquisition, network inference assessment, network inference for interacting species and the integration of prior knowledge. Following the advent of Next-Generation-Sequencing of cDNAs derived from RNA samples (RNA-Seq), we discuss in detail its application to network inference. Furthermore, we present progress for large-scale or even full-genomic network inference as well as for small-scale condensed network inference and review advances in the evaluation of network inference methods by crowdsourcing. Finally, we reflect on the current availability of data and prior knowledge sources and give an outlook for the inference of gene regulatory networks that reflect interacting species, in particular pathogen-host interactions.

  16. Detection and sequence analysis of accessory gene regulator genes of Staphylococcus pseudintermedius isolates

    PubMed Central

    Chitra, M. Ananda; Jayanthy, C.; Nagarajan, B.

    2015-01-01

    SP contains serine and produces lactone ring-structured AIP. Conclusion: The presence of AgrA, B, and D in all SP isolates implies the importance of this regulatory system in virulence gene expression in SP. SP isolates can be typed based on the AgrD auto-inducible protein sequences, as is done for typing S. aureus isolates. However, further studies are required to elucidate the mechanism by which the agr gene locus controls virulence genes in the pathogenesis of soft tissue infection by SP. PMID:27047173

  17. Establishing the Architecture of Plant Gene Regulatory Networks.

    PubMed

    Yang, F; Ouma, W Z; Li, W; Doseff, A I; Grotewold, E

    2016-01-01

    Gene regulatory grids (GRGs) encompass the space of all the possible transcription factor (TF)-target gene interactions that regulate gene expression, with gene regulatory networks (GRNs) representing a temporal and spatial manifestation of a portion of the GRG, essential for the specification of gene expression. Thus, understanding GRG architecture provides a valuable tool to explain how genes are expressed in an organism, an important aspect of synthetic biology and essential toward the development of the "in silico" cell. Progress has been made in some unicellular model systems (eg, yeast), but significant challenges remain in more complex multicellular organisms such as plants. Key to understanding the organization of GRGs is therefore identifying the genes that TFs bind to, and control. The application of sensitive and high-throughput methods to investigate genome-wide TF-target gene interactions is providing a wealth of information that can be linked to important agronomic traits. We describe here the methods and resources that have been developed to investigate the architecture of plant GRGs and GRNs. We also provide information regarding where to obtain clones or other resources necessary for synthetic biology or metabolic engineering. PMID:27480690

  18. Implicit methods for qualitative modeling of gene regulatory networks.

    PubMed

    Garg, Abhishek; Mohanram, Kartik; De Micheli, Giovanni; Xenarios, Ioannis

    2012-01-01

    Advancements in high-throughput technologies to measure increasingly complex biological phenomena at the genomic level are rapidly changing the face of biological research from the single-gene single-protein experimental approach to studying the behavior of a gene in the context of the entire genome (and proteome). This shift in research methodologies has resulted in a new field of network biology that deals with modeling cellular behavior in terms of network structures such as signaling pathways and gene regulatory networks. In these networks, different biological entities such as genes, proteins, and metabolites interact with each other, giving rise to a dynamical system. Even though there exists a mature field of dynamical systems theory to model such network structures, some technical challenges are unique to biology such as the inability to measure precise kinetic information on gene-gene or gene-protein interactions and the need to model increasingly large networks comprising thousands of nodes. These challenges have renewed interest in developing new computational techniques for modeling complex biological systems. This chapter presents a modeling framework based on Boolean algebra and finite-state machines that are reminiscent of the approach used for digital circuit synthesis and simulation in the field of very-large-scale integration (VLSI). The proposed formalism enables a common mathematical framework to develop computational techniques for modeling different aspects of the regulatory networks such as steady-state behavior, stochasticity, and gene perturbation experiments.
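
    The Boolean, finite-state view of a regulatory network mentioned in this abstract can be illustrated with a minimal sketch. The three-gene circuit below, its update rules, and the function names are hypothetical and chosen only for illustration; they are not taken from the chapter, and the explicit enumeration used here is the naive approach rather than the implicit methods the chapter proposes.

```python
from itertools import product

# Hypothetical circuit: A and B repress each other, and A activates C.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
    "C": lambda s: s["A"],
}
GENES = tuple(RULES)


def step(state):
    """Synchronously update every gene from the current state tuple."""
    current = dict(zip(GENES, state))
    return tuple(bool(RULES[g](current)) for g in GENES)


def steady_states():
    """Enumerate all 2^n states and keep those that map to themselves."""
    return [s for s in product((False, True), repeat=len(GENES)) if step(s) == s]


def attractor(state):
    """Follow the trajectory from `state` until a state repeats; return the cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return seen[seen.index(state):]


if __name__ == "__main__":
    print("steady states:", steady_states())
    print("attractor from all-off:", attractor((False, False, False)))
```

    Explicit enumeration visits all 2^n states, so it is only practical for very small circuits; that scaling limit is the kind of challenge that motivates more compact representations for networks with thousands of nodes.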

  19. Regulatory hotspots are associated with plant gene expression under varying soil phosphorus supply in Brassica rapa.

    PubMed

    Hammond, John P; Mayes, Sean; Bowen, Helen C; Graham, Neil S; Hayden, Rory M; Love, Christopher G; Spracklen, William P; Wang, Jun; Welham, Sue J; White, Philip J; King, Graham J; Broadley, Martin R

    2011-07-01

    Gene expression is a quantitative trait that can be mapped genetically in structured populations to identify expression quantitative trait loci (eQTL). Genes and regulatory networks underlying complex traits can subsequently be inferred. Using a recently released genome sequence, we have defined cis- and trans-eQTL and their environmental response to low phosphorus (P) availability within a complex plant genome and found hotspots of trans-eQTL within the genome. Interval mapping, using P supply as a covariate, revealed 18,876 eQTL. trans-eQTL hotspots occurred on chromosomes A06 and A01 within Brassica rapa; these were enriched with P metabolism-related Gene Ontology terms (A06) as well as chloroplast- and photosynthesis-related terms (A01). We have also attributed heritability components to measures of gene expression across environments, allowing the identification of novel gene expression markers and gene expression changes associated with low P availability. Informative gene expression markers were used to map eQTL and P use efficiency-related QTL. Genes responsive to P supply had large environmental and heritable variance components. Regulatory loci and genes associated with P use efficiency identified through eQTL analysis are potential targets for further characterization and may have potential for crop improvement.

  20. Regulatory hotspots are associated with plant gene expression under varying soil phosphorus supply in Brassica rapa.

    PubMed

    Hammond, John P; Mayes, Sean; Bowen, Helen C; Graham, Neil S; Hayden, Rory M; Love, Christopher G; Spracklen, William P; Wang, Jun; Welham, Sue J; White, Philip J; King, Graham J; Broadley, Martin R

    2011-07-01

    Gene expression is a quantitative trait that can be mapped genetically in structured populations to identify expression quantitative trait loci (eQTL). Genes and regulatory networks underlying complex traits can subsequently be inferred. Using a recently released genome sequence, we have defined cis- and trans-eQTL and their environmental response to low phosphorus (P) availability within a complex plant genome and found hotspots of trans-eQTL within the genome. Interval mapping, using P supply as a covariate, revealed 18,876 eQTL. trans-eQTL hotspots occurred on chromosomes A06 and A01 within Brassica rapa; these were enriched with P metabolism-related Gene Ontology terms (A06) as well as chloroplast- and photosynthesis-related terms (A01). We have also attributed heritability components to measures of gene expression across environments, allowing the identification of novel gene expression markers and gene expression changes associated with low P availability. Informative gene expression markers were used to map eQTL and P use efficiency-related QTL. Genes responsive to P supply had large environmental and heritable variance components. Regulatory loci and genes associated with P use efficiency identified through eQTL analysis are potential targets for further characterization and may have potential for crop improvement. PMID:21527424

  1. Analysis of gene regulatory networks in the mammalian circadian rhythm.

    PubMed

    Yan, Jun; Wang, Haifang; Liu, Yuting; Shao, Chunxuan

    2008-10-01

    Circadian rhythm is fundamental in regulating a wide range of cellular, metabolic, physiological, and behavioral activities in mammals. Although a small number of key circadian genes have been identified through extensive molecular and genetic studies in the past, the existence of other key circadian genes and how they drive the genome-wide circadian oscillation of gene expression in different tissues remain unknown. Here we try to address these questions by integrating all available circadian microarray data in mammals. We identified 41 common circadian genes that showed circadian oscillation in a wide range of mouse tissues with a remarkable consistency of circadian phases across tissues. Comparisons across mouse, rat, rhesus macaque, and human showed that the circadian phases of known key circadian genes were delayed by 4-5 hours in rat compared to mouse and by 8-12 hours in macaque and human compared to mouse. A systematic gene regulatory network for the mouse circadian rhythm was constructed after incorporating promoter analysis and transcription factor knockout or mutant microarray data. We observed a significant association of the cis-regulatory elements EBOX, DBOX, RRE, and HSE with the different phases of circadian oscillating genes. The analysis of the network structure revealed the paths through which light, food, and heat can entrain the circadian clock and identified that NR3C1 and FKBP/HSP90 complexes are central to the control of circadian genes through diverse environmental signals. Our study improves our understanding of the structure, design principle, and evolution of gene regulatory networks involved in the mammalian circadian rhythm.

  2. How difficult is inference of mammalian causal gene regulatory networks?

    PubMed

    Djordjevic, Djordje; Yang, Andrian; Zadoorian, Armella; Rungrugeecharoen, Kevin; Ho, Joshua W K

    2014-01-01

    Gene regulatory networks (GRNs) play a central role in systems biology, especially in the study of mammalian organ development. One key question remains largely unanswered: Is it possible to infer mammalian causal GRNs using observable gene co-expression patterns alone? We assembled two mouse GRN datasets (embryonic tooth and heart) and matching microarray gene expression profiles to systematically investigate the difficulties of mammalian causal GRN inference. The GRNs were assembled based on > 2,000 pieces of experimental genetic perturbation evidence from manually reading > 150 primary research articles. Each piece of perturbation evidence records the qualitative change of the expression of one gene following knock-down or over-expression of another gene. Our data have thorough annotation of tissue types and embryonic stages, as well as the type of regulation (activation, inhibition and no effect), which uniquely allows us to estimate both sensitivity and specificity of the inference of tissue-specific causal GRN edges. Using these unprecedented datasets, we found that gene co-expression does not reliably distinguish true positive from false positive interactions, making inference of GRNs in mammalian development very difficult. Nonetheless, if we have expression profiling data from genetic or molecular perturbation experiments, such as gene knock-out or signalling stimulation, it is possible to use the set of differentially expressed genes to recover causal regulatory relationships with good sensitivity and specificity. Our result supports the importance of using perturbation experimental data in causal network reconstruction. Furthermore, we showed that causal gene regulatory relationships can be highly cell type or developmental stage specific, suggesting the importance of employing expression profiles from homogeneous cell populations. This study provides essential datasets and empirical evidence to guide the development of new GRN inference methods for

  3. Gap Gene Regulatory Dynamics Evolve along a Genotype Network.

    PubMed

    Crombach, Anton; Wotton, Karl R; Jiménez-Guri, Eva; Jaeger, Johannes

    2016-05-01

    Developmental gene networks implement the dynamic regulatory mechanisms that pattern and shape the organism. Over evolutionary time, the wiring of these networks changes, yet the patterning outcome is often preserved, a phenomenon known as "system drift." System drift is illustrated by the gap gene network, involved in segmental patterning, in dipteran insects. In the classic model organism Drosophila melanogaster and the nonmodel scuttle fly Megaselia abdita, early activation and placement of gap gene expression domains show significant quantitative differences, yet the final patterning output of the system is essentially identical in both species. In this detailed modeling analysis of system drift, we use gene circuits which are fit to quantitative gap gene expression data in M. abdita and compare them with an equivalent set of models from D. melanogaster. The results of this comparative analysis show precisely how compensatory regulatory mechanisms achieve equivalent final patterns in both species. We discuss the larger implications of the work in terms of "genotype networks" and the ways in which the structure of regulatory networks can influence patterns of evolutionary change (evolvability).

  4. Gap Gene Regulatory Dynamics Evolve along a Genotype Network

    PubMed Central

    Crombach, Anton; Wotton, Karl R.; Jiménez-Guri, Eva; Jaeger, Johannes

    2016-01-01

    Developmental gene networks implement the dynamic regulatory mechanisms that pattern and shape the organism. Over evolutionary time, the wiring of these networks changes, yet the patterning outcome is often preserved, a phenomenon known as “system drift.” System drift is illustrated by the gap gene network—involved in segmental patterning—in dipteran insects. In the classic model organism Drosophila melanogaster and the nonmodel scuttle fly Megaselia abdita, early activation and placement of gap gene expression domains show significant quantitative differences, yet the final patterning output of the system is essentially identical in both species. In this detailed modeling analysis of system drift, we use gene circuits which are fit to quantitative gap gene expression data in M. abdita and compare them with an equivalent set of models from D. melanogaster. The results of this comparative analysis show precisely how compensatory regulatory mechanisms achieve equivalent final patterns in both species. We discuss the larger implications of the work in terms of “genotype networks” and the ways in which the structure of regulatory networks can influence patterns of evolutionary change (evolvability). PMID:26796549

  5. Dynamic Gene Regulatory Networks Drive Hematopoietic Specification and Differentiation.

    PubMed

    Goode, Debbie K; Obier, Nadine; Vijayabaskar, M S; Lie-A-Ling, Michael; Lilly, Andrew J; Hannah, Rebecca; Lichtinger, Monika; Batta, Kiran; Florkowska, Magdalena; Patel, Rahima; Challinor, Mairi; Wallace, Kirstie; Gilmour, Jane; Assi, Salam A; Cauchy, Pierre; Hoogenkamp, Maarten; Westhead, David R; Lacaud, Georges; Kouskoff, Valerie; Göttgens, Berthold; Bonifer, Constanze

    2016-03-01

    Metazoan development involves the successive activation and silencing of specific gene expression programs and is driven by tissue-specific transcription factors programming the chromatin landscape. To understand how this process executes an entire developmental pathway, we generated global gene expression, chromatin accessibility, histone modification, and transcription factor binding data from purified embryonic stem cell-derived cells representing six sequential stages of hematopoietic specification and differentiation. Our data reveal the nature of regulatory elements driving differential gene expression and inform how transcription factor binding impacts on promoter activity. We present a dynamic core regulatory network model for hematopoietic specification and demonstrate its utility for the design of reprogramming experiments. Functional studies motivated by our genome-wide data uncovered a stage-specific role for TEAD/YAP factors in mammalian hematopoietic specification. Our study presents a powerful resource for studying hematopoiesis and demonstrates how such data advance our understanding of mammalian development. PMID:26923725

  6. Dynamic Gene Regulatory Networks Drive Hematopoietic Specification and Differentiation

    PubMed Central

    Goode, Debbie K.; Obier, Nadine; Vijayabaskar, M.S.; Lie-A-Ling, Michael; Lilly, Andrew J.; Hannah, Rebecca; Lichtinger, Monika; Batta, Kiran; Florkowska, Magdalena; Patel, Rahima; Challinor, Mairi; Wallace, Kirstie; Gilmour, Jane; Assi, Salam A.; Cauchy, Pierre; Hoogenkamp, Maarten; Westhead, David R.; Lacaud, Georges; Kouskoff, Valerie; Göttgens, Berthold; Bonifer, Constanze

    2016-01-01

    Metazoan development involves the successive activation and silencing of specific gene expression programs and is driven by tissue-specific transcription factors programming the chromatin landscape. To understand how this process executes an entire developmental pathway, we generated global gene expression, chromatin accessibility, histone modification, and transcription factor binding data from purified embryonic stem cell-derived cells representing six sequential stages of hematopoietic specification and differentiation. Our data reveal the nature of regulatory elements driving differential gene expression and inform how transcription factor binding impacts on promoter activity. We present a dynamic core regulatory network model for hematopoietic specification and demonstrate its utility for the design of reprogramming experiments. Functional studies motivated by our genome-wide data uncovered a stage-specific role for TEAD/YAP factors in mammalian hematopoietic specification. Our study presents a powerful resource for studying hematopoiesis and demonstrates how such data advance our understanding of mammalian development. PMID:26923725

  7. Construction and Analysis of an Integrated Regulatory Network Derived from High-Throughput Sequencing Data

    PubMed Central

    Cheng, Chao; Yan, Koon-Kiu; Hwang, Woochang; Qian, Jiang; Bhardwaj, Nitin; Rozowsky, Joel; Lu, Zhi John; Niu, Wei; Alves, Pedro; Kato, Masaomi; Snyder, Michael; Gerstein, Mark

    2011-01-01

    We present a network framework for analyzing multi-level regulation in higher eukaryotes based on systematic integration of various high-throughput datasets. The network, namely the integrated regulatory network, consists of three major types of regulation: TF→gene, TF→miRNA and miRNA→gene. We identified the target genes and target miRNAs for a set of TFs based on the ChIP-Seq binding profiles, the predicted targets of miRNAs using annotated 3′UTR sequences and conservation information. Making use of the system-wide RNA-Seq profiles, we classified transcription factors into positive and negative regulators and assigned a sign for each regulatory interaction. Other types of edges such as protein-protein interactions and potential intra-regulations between miRNAs based on the embedding of miRNAs in their host genes were further incorporated. We examined the topological structures of the network, including its hierarchical organization and motif enrichment. We found that transcription factors downstream of the hierarchy distinguish themselves by being expressed more uniformly across tissues, having more interacting partners, and being more likely to be essential. We found an over-representation of notable network motifs, including a FFL in which a miRNA cost-effectively shuts down a transcription factor and its target. We used data of C. elegans from the modENCODE project as a primary model to illustrate our framework, but further verified the results using two other data sets. As more and more genome-wide ChIP-Seq and RNA-Seq data become available in the near future, our methods of data integration have various potential applications. PMID:22125477
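
    The feed-forward loop (FFL) motif mentioned in this abstract, in which a miRNA targets both a transcription factor and that factor's target gene, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name feed_forward_loops and the node names (mir-1, tf-1, gene-1, gene-2) are hypothetical, and the search is a brute-force scan suitable only for small edge lists.

```python
from itertools import permutations


def feed_forward_loops(edges):
    """Return every ordered triple (a, b, c) with edges a->b, b->c and a->c.

    `edges` is an iterable of (source, target) pairs. Edge types and signs
    are ignored here; this is an illustrative assumption, not the paper's method.
    """
    edge_set = set(edges)
    nodes = sorted({node for edge in edge_set for node in edge})
    return [
        (a, b, c)
        for a, b, c in permutations(nodes, 3)
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
    ]


# Hypothetical toy network: a miRNA represses a TF and the TF's target.
toy_edges = {
    ("mir-1", "tf-1"),    # miRNA -> TF
    ("tf-1", "gene-1"),   # TF -> gene
    ("mir-1", "gene-1"),  # miRNA -> gene (closes the FFL)
    ("tf-1", "gene-2"),   # extra edge that is not part of any FFL
}
print(feed_forward_loops(toy_edges))
```

    Motif enrichment would additionally require comparing such counts against randomized networks, which this sketch does not attempt.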

  8. Topological origin of global attractors in gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Zhang, YunJun; Ouyang, Qi; Geng, Zhi

    2015-02-01

    Fixed-point attractors with global stability manifest themselves in a number of gene regulatory networks. This property indicates the stability of regulatory networks against small state perturbations and is closely related to other complex dynamics. In this paper, we aim to reveal the core modules in regulatory networks that determine their global attractors and the relationship between these core modules and other motifs. This work has been done via three steps. Firstly, inspired by the signal transmission in the regulation process, we extract the model of chain-like network from regulation networks. We propose a module of "ideal transmission chain (ITC)", which is proved sufficient and necessary (under certain condition) to form a global fixed-point in the context of chain-like network. Secondly, by examining two well-studied regulatory networks (i.e., the cell-cycle regulatory networks of Budding yeast and Fission yeast), we identify the ideal modules in true regulation networks and demonstrate that the modules have a superior contribution to network stability (quantified by the relative size of the biggest attraction basin). Thirdly, in these two regulation networks, we find that the double negative feedback loops, which are the key motifs of forming bistability in regulation, are connected to these core modules with high network stability. These results have shed new light on the connection between the topological feature and the dynamic property of regulatory networks.
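
    The stability measure named in this abstract, the relative size of the biggest attraction basin, can be computed by exhaustive enumeration for small Boolean networks. The sketch below is an illustration under stated assumptions: the update rule (a node switches on for positive summed input, off for negative, and keeps its value at zero) and the three-node chain are hypothetical choices, not the paper's "ideal transmission chain" construction.

```python
from collections import Counter
from itertools import product


def step(state, weights):
    """One synchronous update; weights[i][j] is the effect of node j on node i
    (+1 activation, -1 repression, 0 no interaction)."""
    new_state = []
    for i, row in enumerate(weights):
        total = sum(w * s for w, s in zip(row, state))
        if total > 0:
            new_state.append(1)
        elif total < 0:
            new_state.append(0)
        else:
            new_state.append(state[i])  # assumed rule: no net input keeps the value
    return tuple(new_state)


def find_attractor(state, weights):
    """Follow the trajectory until a state repeats; return the cycle as a frozenset."""
    first_seen = {}
    trajectory = []
    while state not in first_seen:
        first_seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, weights)
    return frozenset(trajectory[first_seen[state]:])


def basin_sizes(weights):
    """Count how many of the 2**n initial states flow into each attractor."""
    n = len(weights)
    return Counter(find_attractor(s, weights) for s in product((0, 1), repeat=n))


def largest_basin_fraction(weights):
    """Relative size of the biggest basin of attraction."""
    return max(basin_sizes(weights).values()) / 2 ** len(weights)


# Hypothetical chain 0 -> 1 -> 2, with node 0 repressing itself.
chain = [
    [-1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]
print(largest_basin_fraction(chain))
```

    Enumeration scales as 2**n, so this approach is practical only for networks of roughly twenty nodes or fewer.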

  9. Additive functions in boolean models of gene regulatory network modules.

    PubMed

    Darabos, Christian; Di Cunto, Ferdinando; Tomassini, Marco; Moore, Jason H; Provero, Paolo; Giacobini, Mario

    2011-01-01

    Gene-on-gene regulations are key components of every living organism. Dynamical abstract models of genetic regulatory networks help explain the genome's evolvability and robustness. These properties can be attributed to the structural topology of the graph formed by genes, as vertices, and regulatory interactions, as edges. Moreover, the actual gene interaction of each gene is believed to play a key role in the stability of the structure. With advances in biology, some effort was deployed to develop update functions in Boolean models that include recent knowledge. We combine real-life gene interaction networks with novel update functions in a Boolean model. We use two sub-networks of biological organisms, the yeast cell-cycle and the mouse embryonic stem cell, as topological support for our system. On these structures, we substitute the original random update functions by a novel threshold-based dynamic function in which the promoting and repressing effect of each interaction is considered. We use a third real-life regulatory network, along with its inferred Boolean update functions, to validate the proposed update function. Results of this validation hint at increased biological plausibility of the threshold-based function. To investigate the dynamical behavior of this new model, we visualized the phase transition between order and chaos into the critical regime using Derrida plots. We complement the qualitative nature of Derrida plots with an alternative measure, the criticality distance, that also allows discrimination between regimes in a quantitative way. Simulations on both real-life genetic regulatory networks show that there exists a set of parameters that allows the systems to operate in the critical region. This new model includes experimentally derived biological information and recent discoveries, which makes it potentially useful to guide experimental research. The update function confers additional realism to the model, while reducing the complexity

  10. Additive Functions in Boolean Models of Gene Regulatory Network Modules

    PubMed Central

    Darabos, Christian; Di Cunto, Ferdinando; Tomassini, Marco; Moore, Jason H.; Provero, Paolo; Giacobini, Mario

    2011-01-01

    Gene-on-gene regulations are key components of every living organism. Dynamical abstract models of genetic regulatory networks help explain the genome's evolvability and robustness. These properties can be attributed to the structural topology of the graph formed by genes, as vertices, and regulatory interactions, as edges. Moreover, the actual gene interaction of each gene is believed to play a key role in the stability of the structure. With advances in biology, some effort was deployed to develop update functions in Boolean models that include recent knowledge. We combine real-life gene interaction networks with novel update functions in a Boolean model. We use two sub-networks of biological organisms, the yeast cell-cycle and the mouse embryonic stem cell, as topological support for our system. On these structures, we substitute the original random update functions by a novel threshold-based dynamic function in which the promoting and repressing effect of each interaction is considered. We use a third real-life regulatory network, along with its inferred Boolean update functions, to validate the proposed update function. Results of this validation hint at increased biological plausibility of the threshold-based function. To investigate the dynamical behavior of this new model, we visualized the phase transition between order and chaos into the critical regime using Derrida plots. We complement the qualitative nature of Derrida plots with an alternative measure, the criticality distance, that also allows discrimination between regimes in a quantitative way. Simulations on both real-life genetic regulatory networks show that there exists a set of parameters that allows the systems to operate in the critical region. This new model includes experimentally derived biological information and recent discoveries, which makes it potentially useful to guide experimental research. The update function confers additional realism to the model, while reducing the complexity
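
    A threshold-based update of the kind described in this abstract, together with one point of a Derrida plot, might look like the sketch below. The abstract does not give the exact function, so the details are assumptions: promoting edges count +1, repressing edges count -1, a node keeps its value when the summed input equals the threshold, and the names threshold_update and derrida_point are hypothetical.

```python
import random


def threshold_update(state, edges, threshold=0):
    """Synchronous additive update.

    `edges` is a list of (source, target, sign) with sign +1 (promoting)
    or -1 (repressing). Tie handling is an assumption of this sketch.
    """
    totals = [0] * len(state)
    for source, target, sign in edges:
        totals[target] += sign * state[source]
    return tuple(
        1 if total > threshold else 0 if total < threshold else current
        for total, current in zip(totals, state)
    )


def hamming(a, b):
    """Number of positions at which two states differ."""
    return sum(x != y for x, y in zip(a, b))


def derrida_point(edges, n, h, samples=200, seed=0):
    """Mean normalized Hamming distance after one update, for random pairs
    of states that start exactly h bits apart."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        a = tuple(rng.randint(0, 1) for _ in range(n))
        flipped = set(rng.sample(range(n), h))
        b = tuple(1 - x if i in flipped else x for i, x in enumerate(a))
        total += hamming(threshold_update(a, edges), threshold_update(b, edges))
    return total / (samples * n)


# Hypothetical three-gene loop: 0 promotes 1, 1 promotes 2, 2 represses 0.
loop = [(0, 1, 1), (1, 2, 1), (2, 0, -1)]
curve = [derrida_point(loop, n=3, h=h) for h in range(4)]
```

    Plotting the output distance against h / n gives the Derrida curve; a slope near 1 at the origin is the usual indication of the critical regime.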

  11. Maize anthocyanin regulatory gene pl is a duplicate of c1 that functions in the plant.

    PubMed

    Cone, K C; Cocciolone, S M; Burr, F A; Burr, B

    1993-12-01

    Genetic studies in maize have identified several regulatory genes that control the tissue-specific synthesis of purple anthocyanin pigments in the plant. c1 regulates pigmentation in the aleurone layer of the kernel, whereas pigmentation in the vegetative and floral tissues of the plant body depends on pl. c1 encodes a protein with the structural features of eukaryotic transcription factors and functions to control the accumulation of transcripts for the anthocyanin biosynthetic genes. Previous genetic and molecular observations have prompted the hypothesis that c1 and pl are functionally duplicate, in that they control the same set of anthocyanin structural genes but in distinct parts of the plant. Here, we show that this proposed functional similarity is reflected by DNA sequence homology between c1 and pl. Using a c1 DNA fragment as a hybridization probe, genomic and cDNA clones for pl were isolated. Comparison of pl and c1 cDNA sequences revealed that the genes encode proteins with 90% or more amino acid identity in the amino- and carboxyl-terminal domains that are known to be important for the regulatory function of the C1 protein. Consistent with the idea that the pl gene product also acts as a transcriptional activator is our finding that a functional pl allele is required for the transcription of at least three structural genes in the anthocyanin biosynthetic pathway. PMID:8305872

  12. Repressive BMP2 gene regulatory elements near the BMP2 promoter

    SciTech Connect

    Jiang, Shan; Chandler, Ronald L.; Fritz, David T.; Mortlock, Douglas P.; Rogers, Melissa B.

    2010-02-05

    The level of bone morphogenetic protein 2 (BMP2) profoundly influences essential cell behaviors such as proliferation, differentiation, apoptosis, and migration. The spatial and temporal pattern of BMP2 synthesis, particularly in diverse embryonic cells, is highly varied and dynamic. We have identified GC-rich sequences within the BMP2 promoter region that strongly repress gene expression. These elements block the activity of a highly conserved osteoblast enhancer in response to FGF2 treatment. Both positive and negative gene regulatory elements control BMP2 synthesis. Detecting and mapping the repressive motifs is essential because they impede the identification of developmentally regulated enhancers necessary for normal BMP2 patterns and concentration.

  13. Cis-regulatory sequence variation and association with Mycoplasma load in natural populations of the house finch (Carpodacus mexicanus)

    PubMed Central

    Backström, Niclas; Shipilina, Daria; Blom, Mozes P K; Edwards, Scott V

    2013-01-01

    Characterization of the genetic basis of fitness traits in natural populations is important for understanding how organisms adapt to the changing environment and to novel events, such as epizootics. However, candidate fitness-influencing loci, such as regulatory regions, are usually unavailable in nonmodel species. Here, we analyze sequence data from targeted resequencing of the cis-regulatory regions of three candidate genes for disease resistance (CD74, HSP90α, and LCP1) in populations of the house finch (Carpodacus mexicanus) historically exposed (Alabama) and naïve (Arizona) to Mycoplasma gallisepticum. Our study, the first to quantify variation in regulatory regions in wild birds, reveals that the upstream regions of CD74 and HSP90α are GC-rich, with the former exhibiting unusually low sequence variation for this species. We identified two SNPs, located in a GC-rich region immediately upstream of an inferred promoter site in the gene HSP90α, that were significantly associated with Mycoplasma pathogen load in the two populations. The SNPs are closely linked and situated in potential regulatory sequences: one in a binding site for the transcription factor nuclear NFYα and the other in a dinucleotide microsatellite ((GC)6). The genotype associated with pathogen load in the putative NFYα binding site was significantly overrepresented in the Alabama birds. However, we did not see strong effects of selection at this SNP, perhaps because selection has acted on standing genetic variation over an extremely short time in a highly recombining region. Our study is a useful starting point to explore functional relationships between sequence polymorphisms, gene expression, and phenotypic traits, such as pathogen resistance that affect fitness in the wild. PMID:23532859

  14. PreCisIon: PREdiction of CIS-regulatory elements improved by gene's positION.

    PubMed

    Elati, Mohamed; Nicolle, Rémy; Junier, Ivan; Fernández, David; Fekih, Rim; Font, Julio; Képès, François

    2013-02-01

    Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases.

  15. Short DNA sequences inserted for gene targeting can accidentally interfere with off-target gene expression.

    PubMed

    Meier, Ingo D; Bernreuther, Christian; Tilling, Thomas; Neidhardt, John; Wong, Yong Wee; Schulze, Christian; Streichert, Thomas; Schachner, Melitta

    2010-06-01

    Targeting of genes in mice, a key approach to study development and disease, often leaves a neo cassette, loxP, or FRT sites inserted in the mouse genome. Insertion of neo can influence the expression of neighboring genes, but similar effects have not been reported for loxP sites. We therefore performed microarray analyses of mice in which the Ncam or the Tnr gene were targeted either by insertion of neo or loxP/FRT sites. In the case of Ncam, neo, but not loxP/FRT insertion, led to a 2-fold reduction in mRNA levels of 3 genes located at distances between 0.2 and 3.1 Mb from the target. In contrast, after introduction of loxP/FRT sites into introns of Tnr, we observed a 2.5- to 4-fold reduction in the transcript level of the Gas5 gene, 1.1 Mb away from Tnr, most probably due to disruption of a conserved regulatory element in Tnr. Insertion of short DNA sequences such as loxP/FRT can thus influence off-target mRNA levels if these sites are accidentally placed into regulatory elements. Our results imply that conditional knockout mice should be analyzed for genomic positional side effects that may influence the animals' phenotypes. PMID:20110269

  16. Effects of Four Different Regulatory Mechanisms on the Dynamics of Gene Regulatory Cascades

    NASA Astrophysics Data System (ADS)

    Hansen, Sabine; Krishna, Sandeep; Semsey, Szabolcs; Lo Svenningsen, Sine

    2015-07-01

    Gene regulatory cascades (GRCs) are common motifs in cellular molecular networks. A given logical function in these cascades, such as the repression of the activity of a transcription factor, can be implemented by a number of different regulatory mechanisms. The potential consequences for the dynamic performance of the GRC of choosing one mechanism over another have not been analysed systematically. Here, we report the construction of a synthetic GRC in Escherichia coli, which allows us for the first time to directly compare and contrast the dynamics of four different regulatory mechanisms, affecting the transcription, translation, stability, or activity of a transcriptional repressor. We developed a biologically motivated mathematical model which is sufficient to reproduce the response dynamics determined by experimental measurements. Using the model, we explored the potential response dynamics that the constructed GRC can perform. We conclude that dynamic differences between regulatory mechanisms at an individual step in a GRC are often concealed in the overall performance of the GRC, and suggest that the presence of a given regulatory mechanism in a certain network environment does not necessarily mean that it represents a single optimal evolutionary solution.

  17. Transcriptomic sequencing reveals a set of unique genes activated by butyrate-induced histone modification

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Butyrate is a nutritional element with strong epigenetic regulatory activity as an inhibitor of histone deacetylases (HDACs). Based on the analysis of differentially expressed genes induced by butyrate in the bovine epithelial cell using deep RNA-sequencing technology (RNA-seq), a set of unique gen...

  18. Gene structure, regulatory control, and evolution of black widow venom latrotoxins

    PubMed Central

    Bhere, Kanaka Varun; Haney, Robert A.; Ayoub, Nadia A.; Garb, Jessica E.

    2014-01-01

    Black widow venom contains α-latrotoxin, infamous for causing intense pain. Combining 33 kb of Latrodectus hesperus genomic DNA with RNA-Seq, we characterized the α-latrotoxin gene and discovered a paralog, 4.5 kb downstream. Both paralogs exhibit venom gland specific transcription, and may be regulated post-transcriptionally via musashi-like proteins. A 4 kb intron interrupts the α-latrotoxin coding sequence, while a 10 kb intron in the 3′ UTR of the paralog may cause nonsense-mediated decay. Phylogenetic analysis confirms these divergent latrotoxins diversified through recent tandem gene duplications. Thus, latrotoxin genes have more complex structures, regulatory controls, and sequence diversity than previously proposed. PMID:25217831

  19. PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation

    PubMed Central

    Portales-Casamar, Elodie; Kirov, Stefan; Lim, Jonathan; Lithwick, Stuart; Swanson, Magdalena I; Ticoll, Amy; Snoddy, Jay; Wasserman, Wyeth W

    2007-01-01

    PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation with associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR, located at , is open for business. PMID:17916232

  20. Gene therapy for cancer: regulatory considerations for approval

    PubMed Central

    Husain, S R; Han, J; Au, P; Shannon, K; Puri, R K

    2015-01-01

    The rapidly changing field of gene therapy promises a number of innovative treatments for cancer patients. Advances in genetic modification of cancer and immune cells and the use of oncolytic viruses and bacteria have led to numerous clinical trials for cancer therapy, with several progressing to late-stage product development. At the time of this writing, no gene therapy product has been approved by the United States Food and Drug Administration (FDA). Some of the key scientific and regulatory issues include understanding of gene transfer vector biology, safety of vectors in vitro and in animal models, optimum gene transfer, long-term persistence or integration in the host, shedding of a virus and ability to maintain transgene expression in vivo for a desired period of time. Because of the biological complexity of these products, the FDA encourages a flexible, data-driven approach for preclinical safety testing programs. The clinical trial design should be based on the unique features of gene therapy products, and should ensure the safety of enrolled subjects. This article focuses on regulatory considerations for gene therapy product development and also discusses guidance documents that have been published by the FDA. PMID:26584531

  1. Gene therapy for cancer: regulatory considerations for approval.

    PubMed

    Husain, S R; Han, J; Au, P; Shannon, K; Puri, R K

    2015-12-01

    The rapidly changing field of gene therapy promises a number of innovative treatments for cancer patients. Advances in genetic modification of cancer and immune cells and the use of oncolytic viruses and bacteria have led to numerous clinical trials for cancer therapy, with several progressing to late-stage product development. At the time of this writing, no gene therapy product has been approved by the United States Food and Drug Administration (FDA). Some of the key scientific and regulatory issues include understanding of gene transfer vector biology, safety of vectors in vitro and in animal models, optimum gene transfer, long-term persistence or integration in the host, shedding of a virus and ability to maintain transgene expression in vivo for a desired period of time. Because of the biological complexity of these products, the FDA encourages a flexible, data-driven approach for preclinical safety testing programs. The clinical trial design should be based on the unique features of gene therapy products, and should ensure the safety of enrolled subjects. This article focuses on regulatory considerations for gene therapy product development and also discusses guidance documents that have been published by the FDA.

  2. From System-Wide Differential Gene Expression to Perturbed Regulatory Factors: A Combinatorial Approach.

    PubMed

    Mahajan, Gaurang; Mande, Shekhar C

    2015-01-01

    High-throughput experiments such as microarrays and deep sequencing provide large scale information on the pattern of gene expression, which undergoes extensive remodeling as the cell dynamically responds to varying environmental cues or has its function disrupted under pathological conditions. An important initial step in the systematic analysis and interpretation of genome-scale expression alteration involves identification of a set of perturbed transcriptional regulators whose differential activity can provide a proximate hypothesis to account for these transcriptomic changes. In the present work, we propose an unbiased and logically natural approach to transcription factor enrichment. It involves overlaying a list of experimentally determined differentially expressed genes on a background regulatory network coming from e.g. literature curation or computational motif scanning, and identifying that subset of regulators whose aggregated target set best discriminates between the altered and the unaffected genes. In other words, our methodology entails testing of all possible regulatory subnetworks, rather than just the target sets of individual regulators as is followed in most standard approaches. We have proposed an iterative search method to efficiently find such a combination, and benchmarked it on E. coli microarray and regulatory network data available in the public domain. Comparative analysis carried out on artificially generated differential expression profiles, as well as empirical factor overexpression data for M. tuberculosis, shows that our methodology provides marked improvement in accuracy of regulatory inference relative to the standard method that involves evaluating factor enrichment in an individual manner. PMID:26562430
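
    The combinatorial idea in this abstract, choosing the set of regulators whose pooled targets best separate altered from unaffected genes, can be sketched with a greedy forward search. This is an illustration only: the abstract does not specify the scoring function or the search procedure, so the Jaccard score, the greedy strategy, and the names greedy_regulator_subset, tfA, tfB and tfC are all assumptions.

```python
def jaccard(a, b):
    """Overlap between two gene sets, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_regulator_subset(network, de_genes):
    """Greedily add the regulator that most improves the match between the
    union of selected target sets and the differentially expressed genes.

    `network` maps each regulator to its set of target genes.
    Returns (selected regulators in order of selection, final score).
    """
    chosen = []
    covered = set()
    best = 0.0
    while True:
        candidate, candidate_score = None, best
        for regulator in sorted(network):
            if regulator in chosen:
                continue
            score = jaccard(covered | network[regulator], de_genes)
            if score > candidate_score:
                candidate, candidate_score = regulator, score
        if candidate is None:  # no remaining regulator improves the score
            return chosen, best
        chosen.append(candidate)
        covered |= network[candidate]
        best = candidate_score


# Hypothetical background network and differential expression list.
network = {
    "tfA": {"g1", "g2", "g3"},
    "tfB": {"g4", "g5"},
    "tfC": {"g1", "g6", "g7", "g8"},
}
de_genes = {"g1", "g2", "g3", "g4", "g5"}
print(greedy_regulator_subset(network, de_genes))
```

    A greedy search does not test all possible subnetworks and can miss the best combination; it is used here only to keep the example short.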

  4. Establishing neural crest identity: a gene regulatory recipe

    PubMed Central

    Simões-Costa, Marcos; Bronner, Marianne E.

    2015-01-01

    The neural crest is a stem/progenitor cell population that contributes to a wide variety of derivatives, including sensory and autonomic ganglia, cartilage and bone of the face and pigment cells of the skin. Unique to vertebrate embryos, it has served as an excellent model system for the study of cell behavior and identity owing to its multipotency, motility and ability to form a broad array of cell types. Neural crest development is thought to be controlled by a suite of transcriptional and epigenetic inputs arranged hierarchically in a gene regulatory network. Here, we examine neural crest development from a gene regulatory perspective and discuss how the underlying genetic circuitry results in the features that define this unique cell population. PMID:25564621

  5. Conserved cis-regulatory modules in promoters of genes encoding wheat high-molecular-weight glutenin subunits

    PubMed Central

    Ravel, Catherine; Fiquet, Samuel; Boudet, Julie; Dardevet, Mireille; Vincent, Jonathan; Merlino, Marielle; Michard, Robin; Martre, Pierre

    2014-01-01

    The concentration and composition of the gliadin and glutenin seed storage proteins (SSPs) in wheat flour are the most important determinants of its end-use value. In cereals, the synthesis of SSPs is predominantly regulated at the transcriptional level by a complex network involving at least five cis-elements in gene promoters. The high-molecular-weight glutenin subunits (HMW-GS) are encoded by two tightly linked genes located on the long arms of group 1 chromosomes. Here, we sequenced and annotated the HMW-GS gene promoters of 22 electrophoretic wheat alleles to identify putative cis-regulatory motifs. We focused on 24 motifs known to be involved in SSP gene regulation. Most of them were identified in at least one HMW-GS gene promoter sequence. A common regulatory framework was observed in all the HMW-GS gene promoters, as they shared conserved cis-regulatory modules (CCRMs) including all the five motifs known to regulate the transcription of SSP genes. This common regulatory framework comprises a composite box made of the GATA motifs and GCN4-like Motifs (GLMs) and was shown to be functional as the GLMs are able to bind a bZIP transcriptional factor SPA (Storage Protein Activator). In addition to this regulatory framework, each HMW-GS gene promoter had additional motifs organized differently. The promoters of most highly expressed x-type HMW-GS genes contain an additional box predicted to bind R2R3-MYB transcriptional factors. However, the differences in annotation between promoter alleles could not be related to their level of expression. In summary, we identified a common modular organization of HMW-GS gene promoters but the lack of correlation between the cis-motifs of each HMW-GS gene promoter and their level of expression suggests that other cis-elements or other mechanisms regulate HMW-GS gene expression. PMID:25429295

  6. Identification of a cis-regulatory element by transient analysis of co-ordinately regulated genes

    PubMed Central

    Dare, Andrew P; Schaffer, Robert J; Lin-Wang, Kui; Allan, Andrew C; Hellens, Roger P

    2008-01-01

    characterise cis-regulatory sequences that are necessary for transcription activation in a complex list of co-ordinately regulated genes. PMID:18601751

  7. Identification of C4 photosynthesis metabolism and regulatory-associated genes in Eleocharis vivipara by SSH.

    PubMed

    Chen, Taiyu; Ye, Rongjian; Fan, Xiaolei; Li, Xianghua; Lin, Yongjun

    2011-09-01

    This is the first effort to investigate the candidate genes involved in Kranz developmental regulation and C(4) metabolic fluxes in Eleocharis vivipara, a leafless freshwater amphibious plant that shows distinct culm anatomy and photosynthetic patterns in contrasting environments. A terrestrial-specific SSH library was constructed to investigate the genes involved in Kranz anatomy developmental regulation and C(4) metabolic fluxes. A total of 73 ESTs and 56 unigenes in 384 clones were identified by array hybridization and sequencing. In total, 50 unigenes had homologous genes in the databases of rice and Arabidopsis. The real-time quantitative PCR results showed that most of the genes were accumulated in terrestrial culms and ABA-induced culms. The C(4) marker genes were stably accumulated during culm development in terrestrial culms. Compared with C(3) culms, C(4) photosynthesis metabolism used many more transporters and translocators related to ion metabolism, organic acid and carbohydrate metabolism, phosphate metabolism, amino acid metabolism, and lipid metabolism. Additionally, ten regulatory genes including five transcription factors, four receptor-like proteins, and one BURP protein were identified. These regulatory genes, which co-accumulated with the culm developmental stages, may play important roles in the developmental regulation of culm structure, bundle sheath chloroplast maturation, and environmental response. These results shed new light on the C(4) metabolic fluxes, environmental response, and developmental regulation of anatomical structure in E. vivipara.

  8. The first determination of DNA sequence of a specific gene.

    PubMed

    Inouye, Masayori

    2016-05-10

    How and when was the first DNA sequence of a gene determined? In 1977, F. Sanger came up with an innovative technology to sequence DNA by using chain terminators, and determined the entire DNA sequence of the 5375-base genome of bacteriophage φX 174 (Sanger et al., 1977). While Sanger's achievement has been recognized as the first DNA sequencing of genes, we had determined the DNA sequence of a gene, albeit a partial sequence, 11 years before Sanger's DNA sequence (Okada et al., 1966).

  9. Phase transitions in the evolution of gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Skanata, Antun; Kussell, Edo

    The role of gene regulatory networks is to respond to environmental conditions and optimize growth of the cell. A typical example is found in bacteria, where metabolic genes are activated in response to nutrient availability, and are subsequently turned off to conserve energy when their specific substrates are depleted. However, in fluctuating environmental conditions, regulatory networks could experience strong evolutionary pressures not only to turn the right genes on and off, but also to respond optimally under a wide spectrum of fluctuation timescales. The outcome of evolution is predicted by the long-term growth rate, which differentiates between optimal strategies. Here we present an analytic computation of the long-term growth rate in randomly fluctuating environments, by using mean-field and higher order expansion in the environmental history. We find that optimal strategies correspond to distinct regions in the phase space of fluctuations, separated by first and second order phase transitions. The statistics of environmental randomness are shown to dictate the possible evolutionary modes, which either change the structure of the regulatory network abruptly, or gradually modify and tune the interactions between its components.

  10. Transcriptional Targeting in the Airway Using Novel Gene Regulatory Elements

    PubMed Central

    Burnight, Erin R.; Wang, Guoshun; McCray, Paul B.

    2012-01-01

    The delivery of cystic fibrosis transmembrane conductance regulator (CFTR) to airway epithelia is a goal of many gene therapy strategies to treat cystic fibrosis. Because the native regulatory elements of the CFTR are not well characterized, the development of vectors with heterologous promoters of varying strengths and specificity would aid in our selection of optimal reagents for the appropriate expression of the vector-delivered CFTR gene. Here we contrasted the performance of several novel gene-regulatory elements. Based on airway expression analysis, we selected putative regulatory elements from BPIFA1 and WDR65 to investigate. In addition, we selected a human CFTR promoter region (∼ 2 kb upstream of the human CFTR transcription start site) to study. Using feline immunodeficiency virus vectors containing the candidate elements driving firefly luciferase, we transduced murine nasal epithelia in vivo. Luciferase expression persisted for 30 weeks, which was the duration of the experiment. Furthermore, when the nasal epithelium was ablated using the detergent polidocanol, the mice showed a transient loss of luciferase expression that returned 2 weeks after administration, suggesting that our vectors transduced a progenitor cell population. Importantly, the hWDR65 element drove sufficient CFTR expression to correct the anion transport defect in CFTR-null epithelia. These results will guide the development of optimal vectors for sufficient, sustained CFTR expression in airway epithelia. PMID:22447971

  11. Duplication of floral regulatory genes in the Lamiales.

    PubMed

    Aagaard, Jan E; Olmstead, Richard G; Willis, John H; Phillips, Patrick C

    2005-08-01

    Duplication of some floral regulatory genes has occurred repeatedly in angiosperms, whereas others are thought to be single-copy in most lineages. We selected three genes that interact in a pathway regulating floral development conserved among higher tricolpates (LFY/FLO, UFO/FIM, and AP3/DEF) and screened for copy number among families of Lamiales that are closely related to the model species Antirrhinum majus. We show that two of three genes have duplicated at least twice in the Lamiales. Phylogenetic analyses of paralogs suggest that an ancient whole genome duplication shared among many families of Lamiales occurred after the ancestor of these families diverged from the lineage leading to Veronicaceae (including the single-copy species A. majus). Duplication is consistent with previous patterns among angiosperm lineages for AP3/DEF, but this is the first report of functional duplicate copies of LFY/FLO outside of tetraploid species. We propose Lamiales taxa will be good models for understanding mechanisms of duplicate gene preservation and how floral regulatory genes may contribute to morphological diversity. PMID:21646149

  12. Demystifying the secret mission of enhancers: linking distal regulatory elements to target genes

    PubMed Central

    Yao, Lijing; Berman, Benjamin P.; Farnham, Peggy J.

    2015-01-01

    Enhancers are short regulatory sequences bound by sequence-specific transcription factors and play a major role in the spatiotemporal specificity of gene expression patterns in development and disease. While it is now possible to identify enhancer regions genomewide in both cultured cells and primary tissues using epigenomic approaches, it has been more challenging to develop methods to understand the function of individual enhancers because enhancers are located far from the gene(s) that they regulate. However, it is essential to identify target genes of enhancers not only so that we can understand the role of enhancers in disease but also because this information will assist in the development of future therapeutic options. After reviewing models of enhancer function, we discuss recent methods for identifying target genes of enhancers. First, we describe chromatin structure-based approaches for directly mapping interactions between enhancers and promoters. Second, we describe the use of correlation-based approaches to link enhancer state with the activity of nearby promoters and/or gene expression. Third, we describe how to test the function of specific enhancers experimentally by perturbing enhancer–target relationships using high-throughput reporter assays and genome editing. Finally, we conclude by discussing as yet unanswered questions concerning how enhancers function, how target genes can be identified, and how to distinguish direct from indirect changes in gene expression mediated by individual enhancers. PMID:26446758

  13. Isolation of Sparus auratus prolactin gene and activity of the cis-acting regulatory elements.

    PubMed

    Astola, Antonio; Ortiz, Manuela; Calduch-Giner, Josep A; Pérez-Sánchez, Jaume; Valdivia, Manuel M

    2003-10-15

    A sea bream prolactin (sbPRL) gene was isolated using a prolactin cDNA fragment, generated by PCR, as a probe. The gene analyzed comprises 3.5 kb of DNA containing five exons, as described previously for other fish PRL genes. Analysis of 1.0 kb of the proximal promoter sequence reveals a consensus TATAA box, up to seven (A/T)3NCAT consensus motifs for binding of the pituitary-specific factor Pit-1, and putative CREB and GATA binding sites. CHO culture cells co-transfected with a sbPRL promoter sequence and a sea bream Pit-1 cDNA expression plasmid showed expression of a linked luciferase reporter gene. Transient expression experiments with 5'-deletion mutants reveal at least three regulatory regions in the sbPRL gene, two with a stimulatory effect on transcription and one with an apparent inhibitory effect. From a comparative point of view, this study of the PRL gene in Sparus auratus correlates well with those previously published on tilapia and rainbow trout. The molecular data reported will be useful for comparative analysis of gene regulation in the GH/PRL gene family in teleosts.
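
    The (A/T)3NCAT consensus named in this abstract can be read as three A-or-T bases, one arbitrary base, then CAT. The following is a minimal sketch of how such a consensus might be scanned for on one strand; the function name and the example sequence are hypothetical and are not taken from the study.

```python
import re

# Consensus as written in the abstract: (A/T)3NCAT,
# i.e. three A-or-T bases, any single base, then CAT.
# The lookahead lets finditer report overlapping occurrences.
PIT1_CONSENSUS = re.compile(r"(?=([AT]{3}[ACGT]CAT))")


def find_pit1_sites(sequence):
    """Return (0-based start, matched 7-mer) for each consensus hit on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in PIT1_CONSENSUS.finditer(seq)]


# Hypothetical promoter fragment, invented for illustration only.
example = "GGCATTTGCATACCTAAACATGG"
sites = find_pit1_sites(example)
for start, site in sites:
    print(start, site)
```

    A consensus match is only a candidate site. The sketch scans the forward strand only, and a real analysis would also scan the reverse complement and confirm binding experimentally.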

  14. Demystifying the secret mission of enhancers: linking distal regulatory elements to target genes.

    PubMed

    Yao, Lijing; Berman, Benjamin P; Farnham, Peggy J

    2015-01-01

    Enhancers are short regulatory sequences bound by sequence-specific transcription factors and play a major role in the spatiotemporal specificity of gene expression patterns in development and disease. While it is now possible to identify enhancer regions genomewide in both cultured cells and primary tissues using epigenomic approaches, it has been more challenging to develop methods to understand the function of individual enhancers because enhancers are located far from the gene(s) that they regulate. However, it is essential to identify target genes of enhancers not only so that we can understand the role of enhancers in disease but also because this information will assist in the development of future therapeutic options. After reviewing models of enhancer function, we discuss recent methods for identifying target genes of enhancers. First, we describe chromatin structure-based approaches for directly mapping interactions between enhancers and promoters. Second, we describe the use of correlation-based approaches to link enhancer state with the activity of nearby promoters and/or gene expression. Third, we describe how to test the function of specific enhancers experimentally by perturbing enhancer-target relationships using high-throughput reporter assays and genome editing. Finally, we conclude by discussing as yet unanswered questions concerning how enhancers function, how target genes can be identified, and how to distinguish direct from indirect changes in gene expression mediated by individual enhancers. PMID:26446758

  15. Direct regulation of knot gene expression by Ultrabithorax and the evolution of cis-regulatory elements in Drosophila.

    PubMed

    Hersh, Bradley M; Carroll, Sean B

    2005-04-01

    The regulation of development by Hox proteins is important in the evolution of animal morphology, but how the regulatory sequences of Hox-regulated target genes function and evolve is unclear. To understand the regulatory organization and evolution of a Hox target gene, we have identified a wing-specific cis-regulatory element controlling the knot gene, which is expressed in the developing Drosophila wing but not the haltere. This regulatory element contains a single binding site that is crucial for activation by the transcription factor Cubitus interruptus (Ci), and a cluster of binding sites for repression by the Hox protein Ultrabithorax (UBX). The negative and positive control regions are physically separable, demonstrating that UBX does not repress by competing for occupancy of Ci-binding sites. Although knot expression is conserved among Drosophila species, this cluster of UBX binding sites is not. We isolated the knot wing cis-regulatory element from D. pseudoobscura, which contains a cluster of UBX-binding sites that is not homologous to the functionally defined D. melanogaster cluster. It is, however, homologous to a second D. melanogaster region containing a cluster of UBX sites that can also function as a repressor element. Thus, the knot regulatory region in D. melanogaster has two apparently functionally redundant blocks of sequences for repression by UBX, both of which are widely separated from activator sequences. This redundancy suggests that the complete evolutionary unit of regulatory control is larger than the minimal experimentally defined control element. The span of regulatory sequences upon which selection acts may, in general, be more expansive and less modular than functional studies of these elements have previously indicated.

  16. An Arabidopsis Gene Regulatory Network for Secondary Cell Wall Synthesis

    PubMed Central

    Taylor-Teeples, M; Lin, L; de Lucas, M; Turco, G; Toal, TW; Gaudinier, A; Young, NF; Trabucco, GM; Veling, MT; Lamothe, R; Handakumbura, PP; Xiong, G; Wang, C; Corwin, J; Tsoukalas, A; Zhang, L; Ware, D; Pauly, M; Kliebenstein, DJ; Dehesh, K; Tagkopoulos, I; Breton, G; Pruneda-Paz, JL; Ahnert, SE; Kay, SA; Hazen, SP; Brady, SM

    2014-01-01

    The plant cell wall is an important factor for determining cell shape, function and response to the environment. Secondary cell walls, such as those found in xylem, are composed of cellulose, hemicelluloses and lignin and account for the bulk of plant biomass. The coordination between transcriptional regulation of synthesis for each polymer is complex and vital to cell function. A regulatory hierarchy of developmental switches has been proposed, although the full complement of regulators remains unknown. Here, we present a protein-DNA network between Arabidopsis transcription factors and secondary cell wall metabolic genes with gene expression regulated by a series of feed-forward loops. This model allowed us to develop and validate new hypotheses about secondary wall gene regulation under abiotic stress. Distinct stresses are able to perturb targeted genes to potentially promote functional adaptation. These interactions will serve as a foundation for understanding the regulation of a complex, integral plant component. PMID:25533953

  17. Inference of Gene Regulatory Network Based on Local Bayesian Networks.

    PubMed

    Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Wei, Ze-Gang; Chen, Luonan

    2016-08-01

    The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome these limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using a network decomposition strategy and a false-positive edge elimination scheme. Specifically, the LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, the BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
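
    The abstract relies on conditional mutual information (CMI) to remove redundant edges. As a rough illustration of why CMI can separate direct from indirect regulation, the sketch below computes CMI for a single conditioning gene, assuming jointly Gaussian expression values (an assumption made here for simplicity, not a detail stated in the abstract). Under that assumption, I(X;Y|Z) = -0.5 ln(1 - r^2), where r is the partial correlation of X and Y given Z. All function names and the simulated data are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def gaussian_mi(x, y):
    """Mutual information I(X;Y) in nats under a Gaussian assumption."""
    r = pearson(x, y)
    return -0.5 * math.log(1.0 - r * r)


def gaussian_cmi(x, y, z):
    """Conditional mutual information I(X;Y|Z) in nats under a Gaussian assumption."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    # First-order partial correlation of X and Y given Z.
    partial = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return -0.5 * math.log(1.0 - partial * partial)


# Simulated cascade X -> Z -> Y: X and Y are related only through Z.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
z = [xi + 0.5 * rng.gauss(0, 1) for xi in x]
y = [zi + 0.5 * rng.gauss(0, 1) for zi in z]

mi_xy = gaussian_mi(x, y)
cmi_xy_given_z = gaussian_cmi(x, y, z)
print(f"I(X;Y)   = {mi_xy:.3f}")
print(f"I(X;Y|Z) = {cmi_xy_given_z:.3f}")
```

    In this simulated cascade the plain mutual information between X and Y is large, while the CMI given Z is close to zero, which is the kind of signal a CMI-based filter could use to drop an indirect X-Y edge.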

  18. Sequence and gene expression evolution of paralogous genes in willows.

    PubMed

    Harikrishnan, Srilakshmy L; Pucholt, Pascal; Berlin, Sofia

    2015-12-22

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive and hence functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows.

  19. Sequence and gene expression evolution of paralogous genes in willows

    PubMed Central

    Harikrishnan, Srilakshmy L.; Pucholt, Pascal; Berlin, Sofia

    2015-01-01

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive and hence functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows. PMID:26689951

  20. iRegulon: From a Gene List to a Gene Regulatory Network Using Large Motif and Track Collections

    PubMed Central

    Imrichová, Hana; Van de Sande, Bram; Standaert, Laura; Christiaens, Valerie; Hulselmans, Gert; Herten, Koen; Naval Sanchez, Marina; Potier, Delphine; Svetlichnyy, Dmitry; Kalender Atak, Zeynep; Fiers, Mark; Marine, Jean-Christophe; Aerts, Stein

    2014-01-01

    Identifying master regulators of biological processes and mapping their downstream gene networks are key challenges in systems biology. We developed a computational method, called iRegulon, to reverse-engineer the transcriptional regulatory network underlying a co-expressed gene set using cis-regulatory sequence analysis. iRegulon implements a genome-wide ranking-and-recovery approach to detect enriched transcription factor motifs and their optimal sets of direct targets. We increase the accuracy of network inference by using very large motif collections of up to ten thousand position weight matrices collected from various species, and linking these to candidate human TFs via a motif2TF procedure. We validate iRegulon on gene sets derived from ENCODE ChIP-seq data with increasing levels of noise, and we compare iRegulon with existing motif discovery methods. Next, we use iRegulon on more challenging types of gene lists, including microRNA target sets, protein-protein interaction networks, and genetic perturbation data. In particular, we over-activate p53 in breast cancer cells, followed by RNA-seq and ChIP-seq, and could identify an extensive up-regulated network controlled directly by p53. Similarly we map a repressive network with no indication of direct p53 regulation but rather an indirect effect via E2F and NFY. Finally, we generalize our computational framework to include regulatory tracks such as ChIP-seq data and show how motif and track discovery can be combined to map functional regulatory interactions among co-expressed genes. iRegulon is available as a Cytoscape plugin from http://iregulon.aertslab.org. PMID:25058159
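
    The ranking-and-recovery idea described in this abstract can be illustrated with a toy calculation. For each motif, all genes are ranked by how well they match the motif; a recovery curve then counts how many genes of the input set appear within the top k ranks, and the area under the early part of that curve is compared across motifs. The sketch below is a simplified illustration under assumptions made here: the function names, the cutoff fraction, the normalization, and the toy rankings are all hypothetical and do not reproduce iRegulon's implementation.

```python
import statistics


def recovery_auc(ranking, gene_set, top_fraction=0.1):
    """Area under the recovery curve over the top of a ranking, scaled to [0, 1]."""
    cutoff = max(1, int(len(ranking) * top_fraction))
    targets = set(gene_set) & set(ranking)
    if not targets:
        return 0.0
    recovered = 0
    area = 0
    for gene in ranking[:cutoff]:
        if gene in targets:
            recovered += 1
        area += recovered  # curve height at this rank
    # Best possible area: every target sits at the very top of the ranking.
    best = sum(min(k, len(targets)) for k in range(1, cutoff + 1))
    return area / best


def normalized_enrichment(aucs):
    """Z-score each motif's AUC against the AUCs of all motifs."""
    values = list(aucs.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return {name: (v - mean) / sd if sd > 0 else 0.0 for name, v in aucs.items()}


# Toy data: 100 genes, a 5-gene input set, three hypothetical motif rankings.
genes = [f"gene{i}" for i in range(100)]
gene_set = genes[:5]
rankings = {
    "motif_A": genes,                    # input set ranked first
    "motif_B": list(reversed(genes)),    # input set ranked last
    "motif_C": genes[50:] + genes[:50],  # input set in the middle
}

aucs = {name: recovery_auc(r, gene_set) for name, r in rankings.items()}
scores = normalized_enrichment(aucs)
for name in rankings:
    print(name, round(aucs[name], 3), round(scores[name], 3))
```

    A motif whose ranking places the input genes near the top receives a high normalized score relative to the other motifs, which is the intuition behind calling it enriched for the gene set.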

  1. Analysis of Sequences Regulating Larval Expression of the Adh Gene of Drosophila Melanogaster

    PubMed Central

    Shen, NLL.; Hotaling, E. C.; Subrahmanyam, G.; Martin, P. F.; Sofer, W.

    1991-01-01

    The effects of a series of eight, 50 base pair internal deletions in the 5' region upstream of the proximal transcription start site of the Adh gene of Drosophila melanogaster were examined in a quantitative assay. Mixtures of two plasmids, one bearing a deleted gene, the other with an intact reference gene, were injected into alcohol dehydrogenase-negative embryos. Third instar larvae of the injected generation were assayed for relative alcohol dehydrogenase enzyme activity. Quantitative analysis of the eight deletions indicated that two regions were required for any detectable enzyme activity and one region was required for appropriate tissue specificity. The remaining five deletions significantly decreased, but did not eliminate activity. When the deleted genes were placed on a plasmid with an intact reference gene, activities of all but one deletion were restored to levels equivalent to that of the intact reference gene (regardless of orientation). This restoration of activity did not occur when the regulatory region of the intact gene was replaced with the Hsp70 heat shock promoter nor when the 50-base pair deletion encompassed the region that includes the TATA sequence. The fact that seven of the eight deleted genes express activity in the presence of a reference gene on the same plasmid suggests that the deleted gene is controlled by regulatory elements in the reference gene. Further, these regulatory elements exhibit no preference for their own, more proximate, promoter. PMID:1752419

  2. Identification of a conserved sequence in the non-coding regions of many human genes.

    PubMed Central

    Donehower, L A; Slagle, B L; Wilde, M; Darlington, G; Butel, J S

    1989-01-01

    We have analyzed a sequence of approximately 70 base pairs (bp) that shows a high degree of similarity to sequences present in the non-coding regions of a number of human and other mammalian genes. The sequence was discovered in a fragment of human genomic DNA adjacent to an integrated hepatitis B virus genome in cells derived from human hepatocellular carcinoma tissue. When one of the viral flanking sequences was compared to nucleotide sequences in GenBank, more than thirty human genes were identified that contained a similar sequence in their non-coding regions. The sequence element was usually found once or twice in a gene, either in an intron or in the 5' or 3' flanking regions. It did not share any similarities with known short interspersed nucleotide elements (SINEs) or presently known gene regulatory elements. This element was highly conserved at the same position within the corresponding human and mouse genes for myoglobin and N-myc, indicating evolutionary conservation and possible functional importance. Preliminary DNase I footprinting data suggested that the element or its adjacent sequences may bind nuclear factors to generate specific DNase I hypersensitive sites. The size, structure, and evolutionary conservation of this sequence indicate that it is distinct from other types of short interspersed repetitive elements. It is possible that the element may have a cis-acting functional role in the genome. PMID:2536922

  3. Detection of Weakly Conserved Ancestral Mammalian Regulatory Sequences by Primate Comparisons

    SciTech Connect

    Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng, Jan-Fang; Rubin, Edward M.; Boffelli, Dario

    2006-06-01

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect cryptic functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  4. Regulatory elements responsible for inducible expression of the granulocyte colony-stimulating factor gene in macrophages.

    PubMed Central

    Nishizawa, M; Nagata, S

    1990-01-01

    Granulocyte colony-stimulating factor (G-CSF) plays an essential role in granulopoiesis during bacterial infection. Macrophages produce G-CSF in response to bacterial endotoxins such as lipopolysaccharide (LPS). To elucidate the mechanism of the induction of G-CSF gene in macrophages or macrophage-monocytes, we have examined regulatory cis elements in the promoter of mouse G-CSF gene. Analyses of linker-scanning and internal deletion mutants of the G-CSF promoter by the chloramphenicol acetyltransferase assay have indicated that at least three regulatory elements are indispensable for the LPS-induced expression of the G-CSF gene in macrophages. When one of the three elements was reiterated and placed upstream of the TATA box of the G-CSF promoter, it mediated inducibility as a tissue-specific and orientation-independent enhancer. Although this element contains a conserved NF-kappa B-like binding site, the gel retardation assay and DNA footprint analysis with nuclear extracts from macrophage cell lines demonstrated that nuclear proteins bind to the DNA sequence downstream of the NF-kappa B-like element, but not to the conserved element itself. The DNA sequence of the binding site was found to have some similarities to the LPS-responsive element which was recently identified in the promoter of the mouse class II major histocompatibility gene. PMID:1691438

  5. Propagation of genetic variation in gene regulatory networks

    PubMed Central

    Plahte, Erik; Gjuvsland, Arne B.; Omholt, Stig W.

    2013-01-01

    A future quantitative genetics theory should link genetic variation to phenotypic variation in a causally cohesive way based on how genes actually work and interact. We provide a theoretical framework for predicting and understanding the manifestation of genetic variation in haploid and diploid regulatory networks with arbitrary feedback structures and intra-locus and inter-locus functional dependencies. Using results from network and graph theory, we define propagation functions describing how genetic variation in a locus is propagated through the network, and show how their derivatives are related to the network’s feedback structure. Similarly, feedback functions describe the effect of genotypic variation of a locus on itself, either directly or mediated by the network. A simple sign rule relates the sign of the derivative of the feedback function of any locus to the feedback loops involving that particular locus. We show that the sign of the phenotypically manifested interaction between alleles at a diploid locus is equal to the sign of the dominant feedback loop involving that particular locus, in accordance with recent results for a single locus system. Our results provide tools by which one can use observable equilibrium concentrations of gene products to disclose structural properties of the network architecture. Our work is a step towards a theory capable of explaining the pleiotropy and epistasis features of genetic variation in complex regulatory networks as functions of regulatory anatomy and functional location of the genetic variation. PMID:23997378

  6. Propagation of genetic variation in gene regulatory networks.

    PubMed

    Plahte, Erik; Gjuvsland, Arne B; Omholt, Stig W

    2013-08-01

    A future quantitative genetics theory should link genetic variation to phenotypic variation in a causally cohesive way based on how genes actually work and interact. We provide a theoretical framework for predicting and understanding the manifestation of genetic variation in haploid and diploid regulatory networks with arbitrary feedback structures and intra-locus and inter-locus functional dependencies. Using results from network and graph theory, we define propagation functions describing how genetic variation in a locus is propagated through the network, and show how their derivatives are related to the network's feedback structure. Similarly, feedback functions describe the effect of genotypic variation of a locus on itself, either directly or mediated by the network. A simple sign rule relates the sign of the derivative of the feedback function of any locus to the feedback loops involving that particular locus. We show that the sign of the phenotypically manifested interaction between alleles at a diploid locus is equal to the sign of the dominant feedback loop involving that particular locus, in accordance with recent results for a single locus system. Our results provide tools by which one can use observable equilibrium concentrations of gene products to disclose structural properties of the network architecture. Our work is a step towards a theory capable of explaining the pleiotropy and epistasis features of genetic variation in complex regulatory networks as functions of regulatory anatomy and functional location of the genetic variation.

  7. An ant colony optimization based algorithm for identifying gene regulatory elements.

    PubMed

    Liu, Wei; Chen, Hanwu; Chen, Ling

    2013-08-01

    Identifying the regulatory elements in gene sequences is one of the most important tasks in bioinformatics. Most existing algorithms for identifying regulatory elements tend to converge to a local optimum and have high time complexity. Ant Colony Optimization (ACO) is a meta-heuristic method based on swarm intelligence, derived from a model inspired by the collective foraging behavior of real ants. Taking advantage of ACO traits such as self-organization and robustness, this paper designs and implements an ACO-based algorithm named ACRI (ant-colony-regulatory-identification) for identifying all possible transcription factor binding sites in the upstream regions of co-expressed genes. To accelerate the ants' search, a local optimization strategy is presented that adjusts the ants' start positions on the searched sequences. By exploiting the optimization ability of ACO, ACRI not only improves the precision of the results but also achieves a very high speed. Experimental results on real-world datasets show that ACRI can outperform other traditional algorithms in both speed and solution quality. PMID:23746735
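
    To illustrate the general idea of an ant colony search for a shared motif, here is a minimal sketch. It is not ACRI itself: the consensus scoring, the pheromone update, and every parameter name and default are assumptions made for illustration, and the local optimization of start positions described in the abstract is omitted.

```python
import random


def consensus_score(sequences, starts, width):
    """Sum, over motif columns, of the count of the most common base."""
    score = 0
    for col in range(width):
        counts = {}
        for seq, start in zip(sequences, starts):
            base = seq[start + col]
            counts[base] = counts.get(base, 0) + 1
        score += max(counts.values())
    return score


def aco_motif_search(sequences, width, n_ants=20, n_iter=30,
                     evaporation=0.1, seed=0):
    """Toy ACO: each ant picks one start position per sequence."""
    rng = random.Random(seed)
    # pheromone[i][p]: desirability of start position p in sequence i
    pheromone = [[1.0] * (len(seq) - width + 1) for seq in sequences]
    best_starts, best_score = None, -1
    for _ in range(n_iter):
        for _ in range(n_ants):
            starts = [rng.choices(range(len(tau)), weights=tau)[0]
                      for tau in pheromone]
            score = consensus_score(sequences, starts, width)
            if score > best_score:
                best_starts, best_score = starts, score
        # Evaporate all trails, then reinforce the best path found so far.
        for i, tau in enumerate(pheromone):
            for p in range(len(tau)):
                tau[p] *= 1.0 - evaporation
            tau[best_starts[i]] += best_score / (width * len(sequences))
    return best_starts, best_score


# Hypothetical toy input: four short sequences sharing the word ACGTA.
toy_sequences = ["TTGACGTACC", "GGACGTATTC", "CACGTAGGTT", "TTTTGACGTA"]
toy_starts, toy_score = aco_motif_search(toy_sequences, width=5)
print(toy_starts, toy_score)
```

    Reinforcing only the best-so-far path is one of several common pheromone update schemes; the abstract does not say which one ACRI uses.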

  8. Identification and Analysis of Regulatory Elements in Porcine Bone Morphogenetic Protein 15 Gene Promoter.

    PubMed

    Wan, Qianhui; Wang, Yaxian; Wang, Huayan

    2015-10-27

    Bone morphogenetic protein 15 (BMP15) is secreted by the mammalian oocytes and is indispensable for ovarian follicular development, ovulation, and fertility. To determine the regulation mechanism of BMP15 gene, the regulatory sequence of porcine BMP15 was investigated in this study. The cloned BMP15 promoter retains the cell-type specificity, and is activated in cells derived from ovarian tissue. The luciferase assays in combination with a series of deletion of BMP15 promoter sequence show that the -427 to -376 bp region of BMP15 promoter is the primary regulatory element, in which there are a number of transcription factor binding sites, including LIM homeobox 8 (LHX8), newborn ovary homeobox gene (NOBOX), and paired-like homeodomain transcription factor 1 (PITX1). Determination of tissue-specific expression reveals that LHX8, but not PITX1 and NOBOX, is exclusively expressed in pig ovary tissue and is translocated into the cell nuclei. Overexpression of LHX8 in Chinese hamster ovary (CHO) cells could significantly promote BMP15 promoter activation. This study confirms a key regulatory element that is located in the proximal region of BMP15 promoter and is regulated by the LHX8 factor.

  9. Identification and Analysis of Regulatory Elements in Porcine Bone Morphogenetic Protein 15 Gene Promoter

    PubMed Central

    Wan, Qianhui; Wang, Yaxian; Wang, Huayan

    2015-01-01

    Bone morphogenetic protein 15 (BMP15) is secreted by the mammalian oocytes and is indispensable for ovarian follicular development, ovulation, and fertility. To determine the regulation mechanism of BMP15 gene, the regulatory sequence of porcine BMP15 was investigated in this study. The cloned BMP15 promoter retains the cell-type specificity, and is activated in cells derived from ovarian tissue. The luciferase assays in combination with a series of deletion of BMP15 promoter sequence show that the −427 to −376 bp region of BMP15 promoter is the primary regulatory element, in which there are a number of transcription factor binding sites, including LIM homeobox 8 (LHX8), newborn ovary homeobox gene (NOBOX), and paired-like homeodomain transcription factor 1 (PITX1). Determination of tissue-specific expression reveals that LHX8, but not PITX1 and NOBOX, is exclusively expressed in pig ovary tissue and is translocated into the cell nuclei. Overexpression of LHX8 in Chinese hamster ovary (CHO) cells could significantly promote BMP15 promoter activation. This study confirms a key regulatory element that is located in the proximal region of BMP15 promoter and is regulated by the LHX8 factor. PMID:26516845

  10. Cubozoan crystallins: evidence for convergent evolution of pax regulatory sequences.

    PubMed

    Kozmik, Zbynek; Swamynathan, Shivalingappa K; Ruzickova, Jana; Jonasova, Kristyna; Paces, Vaclav; Vlcek, Cestmir; Piatigorsky, Joram

    2008-01-01

    Cnidaria is the earliest-branching metazoan phylum containing a well-developed, lens-containing visual system located on specialized sensory structures called rhopalia. Each rhopalium in a cubozoan jellyfish Tripedalia cystophora has a large and a small complex, camera-type eye with a cellular lens containing distinct families of crystallins. Here, we have characterized J2-crystallin and its gene in T. cystophora. The J2-crystallin gene is composed of a single exon and encodes a 157-amino acid cytoplasmic protein with no apparent homology to known proteins from other species. The non-lens expression of J2-crystallin suggests nonoptical as well as crystallin functions consistent with the gene-sharing strategy that has been used during evolution of lens crystallins in other invertebrates and vertebrates. Although nonfunctional in transfected mammalian lens cells, the J2-crystallin promoter is activated by the jellyfish paired domain transcription factor PaxB in co-transfection tests via binding to three paired domain sites. PaxB paired domain-binding sites were also identified in the PaxB-regulated promoters of the J1A- and J1B-crystallin genes, which are not homologous to the J2-crystallin gene. Taken together with previous studies on the regulation of the diverse crystallin genes, the present report strongly supports the idea that crystallin recruitment of multifunctional proteins was driven by convergent changes involving Pax (as well as other transcription factors) in the promoters of nonhomologous genes within and between species as well as within gene families. PMID:18184357

  11. Strong early seed-specific gene regulatory region

    SciTech Connect

    Broun, Pierre; Somerville, Chris

    2002-01-01

    Nucleic acid sequences and methods for their use are described which provide for early seed-specific transcription, in order to modulate or modify expression of foreign or endogenous genes in seeds, particularly embryo cells. The method finds particular use in conjunction with modifying fatty acid production in seed tissue.

  12. Strong early seed-specific gene regulatory region

    DOEpatents

    Broun, Pierre; Somerville, Chris

    1999-01-01

    Nucleic acid sequences and methods for their use are described which provide for early seed-specific transcription, in order to modulate or modify expression of foreign or endogenous genes in seeds, particularly embryo cells. The method finds particular use in conjunction with modifying fatty acid production in seed tissue.

  13. Partitioning of genetic variation between regulatory and coding gene segments: the predominance of software variation in genes encoding introvert proteins.

    PubMed

    Mitchison, A

    1997-01-01

    In considering genetic variation in eukaryotes, a fundamental distinction can be made between variation in regulatory (software) and coding (hardware) gene segments. For quantitative traits the bulk of variation, particularly that near the population mean, appears to reside in regulatory segments. The main exceptions to this rule concern proteins which handle extrinsic substances, here termed extrovert proteins. The immune system includes an unusually large proportion of this exceptional category, but even so its chief source of variation may well be polymorphism in regulatory gene segments. The main evidence for this view emerges from genome scanning for quantitative trait loci (QTL), which in the case of the immune system points to a major contribution of pro-inflammatory cytokine genes. Further support comes from sequencing of major histocompatibility complex (Mhc) class II promoters, where a high level of polymorphism has been detected. These Mhc promoters appear to act, in part at least, by gating the back-signal from T cells into antigen-presenting cells. Both these forms of polymorphism are likely to be sustained by the need for flexibility in the immune response. Future work on promoter polymorphism is likely to benefit from the input from genome informatics. PMID:9148788

  14. Stable intronic sequence RNAs have possible regulatory roles in Drosophila melanogaster.

    PubMed

    Pek, Jun Wei; Osman, Ismail; Tay, Mandy Li-Ian; Zheng, Ruther Teo

    2015-10-26

    Stable intronic sequence RNAs (sisRNAs) have been found in Xenopus tropicalis, human cell lines, and Epstein-Barr virus; however, the biological significance of sisRNAs remains poorly understood. We identify sisRNAs in Drosophila melanogaster by deep sequencing, reverse transcription polymerase chain reaction, and Northern blotting. We characterize a sisRNA (sisR-1) from the regena (rga) locus and show that it can be processed from the precursor messenger RNA (pre-mRNA). We also document a cis-natural antisense transcript (ASTR) from the rga locus, which is highly expressed in early embryos. During embryogenesis, ASTR promotes robust rga pre-mRNA expression. Interestingly, sisR-1 represses ASTR, with consequential effects on rga pre-mRNA expression. Our results suggest a model in which sisR-1 modulates its host gene expression by repressing ASTR during embryogenesis. We propose that sisR-1 belongs to a class of sisRNAs with probable regulatory activities in Drosophila.

  15. Stable intronic sequence RNAs have possible regulatory roles in Drosophila melanogaster

    PubMed Central

    Osman, Ismail; Tay, Mandy Li-Ian; Zheng, Ruther Teo

    2015-01-01

    Stable intronic sequence RNAs (sisRNAs) have been found in Xenopus tropicalis, human cell lines, and Epstein-Barr virus; however, the biological significance of sisRNAs remains poorly understood. We identify sisRNAs in Drosophila melanogaster by deep sequencing, reverse transcription polymerase chain reaction, and Northern blotting. We characterize a sisRNA (sisR-1) from the regena (rga) locus and show that it can be processed from the precursor messenger RNA (pre-mRNA). We also document a cis-natural antisense transcript (ASTR) from the rga locus, which is highly expressed in early embryos. During embryogenesis, ASTR promotes robust rga pre-mRNA expression. Interestingly, sisR-1 represses ASTR, with consequential effects on rga pre-mRNA expression. Our results suggest a model in which sisR-1 modulates its host gene expression by repressing ASTR during embryogenesis. We propose that sisR-1 belongs to a class of sisRNAs with probable regulatory activities in Drosophila. PMID:26504165

  16. Regulatory aspects for translating gene therapy research into the clinic.

    PubMed

    Laurencot, Carolyn M; Ruppel, Sheryl

    2009-01-01

    Gene therapy products are highly regulated; therefore, moving a promising candidate from the laboratory into the clinic can present unique challenges. Success can only be achieved by proper planning and communication within the clinical development team, as well as consultation with the regulatory scientists who will eventually review the clinical plan. Regulators should not be considered as obstacles but rather as collaborators whose advice can significantly expedite product development. Sound scientific data is required and reviewed by the regulatory agencies to determine whether the potential benefit to the patient population outweighs the risk. Therefore, compliance with Good Manufacturing Practice (GMP) and Good Laboratory Practice (GLP) principles to ensure quality, safety, purity, and potency of the product, and to establish "proof of concept" for efficacy, and for safety information, respectively, is essential. The design and conduct of the clinical trial must adhere to Good Clinical Practice (GCP) principles. The clinical protocol should contain adequate rationale, supported by nonclinical data, to justify the starting dose and regimen, and adequate safety monitoring based on the patient population and the anticipated toxicities. Proper review and approval of gene therapy clinical studies by numerous committees, and regulatory agencies before and throughout the study allows for ongoing risk assessment of these novel and innovative products. The ethical conduct of clinical trials must be a priority for all clinical investigators and sponsors. As history has shown us, only a few fatal mistakes can dramatically alter the regulation of investigational products for all individuals involved in gene therapy clinical research, and further delay the advancement of gene therapy to licensed medicinal products.

  17. A Gene Regulatory Program in Human Breast Cancer.

    PubMed

    Li, Renhua; Campos, John; Iida, Joji

    2015-12-01

    Molecular heterogeneity in human breast cancer has challenged diagnosis, prognosis, and clinical treatment. It is well known that molecular subtypes of breast tumors are associated with significant differences in prognosis and survival. Assuming that the differences are attributed to subtype-specific pathways, we then suspect that there might be gene regulatory mechanisms that modulate the behavior of the pathways and their interactions. In this study, we proposed an integrated methodology, including machine learning and information theory, to explore the mechanisms. Using existing data from three large cohorts of human breast cancer populations, we have identified an ensemble of 16 master regulator genes (or MR16) that can discriminate breast tumor samples into four major subtypes. Evidence from gene expression across the three cohorts has consistently indicated that the MR16 can be divided into two groups that demonstrate subtype-specific gene expression patterns. For example, group 1 MRs, including ESR1, FOXA1, and GATA3, are overexpressed in luminal A and luminal B subtypes, but expressed at low levels in HER2-enriched and basal-like subtypes. In contrast, group 2 MRs, including FOXM1, EZH2, MYBL2, and ZNF695, display an opposite pattern. Furthermore, evidence from mutual information modeling has congruently indicated that the two groups of MRs either up- or down-regulate cancer driver-related genes in opposite directions. In addition, integration of somatic mutations with pathway changes leads to identification of canonical genomic alterations in a subtype-specific fashion. Taken together, these studies have implicated a gene regulatory program for breast tumor progression.

  18. Evo–Devo in the Era of Gene Regulatory Networks

    PubMed Central

    Fischer, Antje H. L.; Smith, Joel

    2012-01-01

    Advanced genomics tools enable powerful new strategies for understanding complex biological processes, including development. By extension, we should be able to use these methods in a comparative fashion to capture evolutionary mechanisms. This requires a capacity to go deep and broad, to analyze developmental gene regulatory networks in many organisms, especially nontraditional models. As we usher in a new era of next-generation GRN (gene regulatory network) analysis, it is important to ask how to evaluate the evolution of network interactions. Particularly problematic, as always, is defining “independence”: Are two character traits found together because they are functionally linked or because of historical accident? The same basic question applies to understanding developmental GRN evolution. However, the essential difference here is that a GRN defines a causal chain of events. An understanding of causal relations—how Genes A and B work in concert to drive expression of Genes C and D to create a new Territory E—gives hope for establishing “trait independence” in a way that purely correlative arguments—the association of the expression of Gene D in Territory E—never could. Insight into causality provides the key to interpretation, as seen in this simplified scenario. Real-world networks bring new degrees of complexity, but the elucidation of causal relations remains the same. Has the day arrived when a single laboratory has the wherewithal to conduct multiorganism gene network projects in parallel? No. However, we argue that day is closer than one might suppose. We describe how a speedboat GRN project in one’s favorite nonmodel organism(s) might look and provide a framework for comparative network analysis. PMID:22927135

  19. Mining expressed sequence tags of rapeseed (Brassica napus L.) to predict the drought responsive regulatory network.

    PubMed

    Shamloo-Dashtpagerdi, Roohollah; Razi, Hooman; Ebrahimie, Esmaeil

    2015-07-01

    It is of great significance to understand the regulatory mechanisms by which plants deal with drought stress. Two EST libraries derived from rapeseed (Brassica napus) leaves in non-stressed and drought stress conditions were analyzed in order to obtain the transcriptomic landscape of drought-exposed B. napus plants, and also to identify and characterize significant drought responsive regulatory genes and microRNAs. The functional ontology analysis revealed a substantial shift in the B. napus transcriptome to govern cellular drought responsiveness via different stress-activated mechanisms. The activity of transcription factor and protein kinase modules generally increased in response to drought stress. Twenty-six regulatory genes, comprising 17 transcription factor genes, eight protein kinase genes and one protein phosphatase gene, showed significant changes in expression in response to drought stress. We also found six microRNAs that were differentially expressed during drought stress, supporting the involvement of post-transcriptional regulation in the B. napus drought response. The drought responsive regulatory network shed light on the significance of some regulatory components involved in biosynthesis and signaling of various plant hormones (abscisic acid, auxin and brassinosteroids), ubiquitin proteasome system, and signaling through Reactive Oxygen Species (ROS). Our findings suggested a complex and multi-level regulatory system modulating response to drought stress in B. napus. PMID:26261397

  20. Reverse Engineering of Genome-wide Gene Regulatory Networks from Gene Expression Data.

    PubMed

    Liu, Zhi-Ping

    2015-02-01

    Transcriptional regulation plays vital roles in many fundamental biological processes. Reverse engineering of genome-wide regulatory networks from high-throughput transcriptomic data provides a promising way to characterize the global scenario of regulatory relationships between regulators and their targets. In this review, we summarize and categorize the main frameworks and methods currently available for inferring transcriptional regulatory networks from microarray gene expression profiling data. We give an overview of each strategy and introduce representative methods. Their assumptions, advantages, shortcomings, and possible improvements and extensions are also discussed.

  1. Colorectal cancer risk genes are functionally enriched in regulatory pathways.

    PubMed

    Lu, Xi; Cao, Mingming; Han, Su; Yang, Youlin; Zhou, Jin

    2016-01-01

    Colorectal cancer (CRC) is a common complex disease caused by the combination of genetic variants and environmental factors. Genome-wide association studies (GWAS) have reported some novel CRC susceptibility variants. However, the potential genetic mechanisms for newly identified CRC susceptibility variants are still unclear. Here, we selected 85 CRC susceptibility variants with suggestive association P < 1.00E-05 from the National Human Genome Research Institute GWAS catalog. To investigate the underlying genetic pathways where these newly identified CRC susceptibility genes are significantly enriched, we conducted a functional annotation. Using two SNP-to-gene mapping methods, the nearest upstream and downstream gene method and ProxyGeneLD, we obtained 128 unique CRC susceptibility genes. We then conducted a pathway analysis in the GO database using the corresponding 128 genes. We identified 44 GO categories, 17 of which are regulatory pathways. We believe that our results may provide further insight into the underlying genetic mechanisms for these newly identified CRC susceptibility variants. PMID:27146020
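
    Pathway analyses of a gene list against GO categories are commonly scored with a one-sided hypergeometric test. The abstract does not state which test was used, so the sketch below is an assumption. Apart from the 128-gene list size, all numbers in the usage example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) under the hypergeometric distribution.

    population: number of background genes
    annotated:  background genes carrying the GO term
    sample:     size of the gene list being tested
    hits:       genes in the list carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(comb(annotated, k) * comb(population - annotated, sample - k)
               for k in range(hits, upper + 1))
    return tail / total


# Hypothetical background of 20000 genes, 300 of them in the GO term,
# tested against a 128-gene list of which 10 carry the term.
p_value = hypergeom_enrichment_p(20000, 300, 128, 10)
print(f"enrichment p-value: {p_value:.3g}")
```

    In practice the p-values for all tested GO categories would also need a multiple-testing correction, which this sketch leaves out.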

  2. Identifying gene regulatory network rewiring using latent differential graphical models.

    PubMed

    Tian, Dechao; Gu, Quanquan; Ma, Jian

    2016-09-30

    Gene regulatory networks (GRNs) are highly dynamic among different tissue types. Identifying tissue-specific gene regulation is critically important to understand gene function in a particular cellular context. Graphical models have been used to estimate GRN from gene expression data to distinguish direct interactions from indirect associations. However, most existing methods estimate GRN for a specific cell/tissue type or in a tissue-naive way, or do not specifically focus on network rewiring between different tissues. Here, we describe a new method called Latent Differential Graphical Model (LDGM). The motivation of our method is to estimate the differential network between two tissue types directly without inferring the network for individual tissues, which has the advantage of utilizing much smaller sample size to achieve reliable differential network estimation. Our simulation results demonstrated that LDGM consistently outperforms other Gaussian graphical model based methods. We further evaluated LDGM by applying to the brain and blood gene expression data from the GTEx consortium. We also applied LDGM to identify network rewiring between cancer subtypes using the TCGA breast cancer samples. Our results suggest that LDGM is an effective method to infer differential network using high-throughput gene expression data to identify GRN dynamics among different cellular conditions.

  3. Identifying gene regulatory network rewiring using latent differential graphical models

    PubMed Central

    Tian, Dechao; Gu, Quanquan; Ma, Jian

    2016-01-01

    Gene regulatory networks (GRNs) are highly dynamic among different tissue types. Identifying tissue-specific gene regulation is critically important to understand gene function in a particular cellular context. Graphical models have been used to estimate GRN from gene expression data to distinguish direct interactions from indirect associations. However, most existing methods estimate GRN for a specific cell/tissue type or in a tissue-naive way, or do not specifically focus on network rewiring between different tissues. Here, we describe a new method called Latent Differential Graphical Model (LDGM). The motivation of our method is to estimate the differential network between two tissue types directly without inferring the network for individual tissues, which has the advantage of utilizing much smaller sample size to achieve reliable differential network estimation. Our simulation results demonstrated that LDGM consistently outperforms other Gaussian graphical model based methods. We further evaluated LDGM by applying to the brain and blood gene expression data from the GTEx consortium. We also applied LDGM to identify network rewiring between cancer subtypes using the TCGA breast cancer samples. Our results suggest that LDGM is an effective method to infer differential network using high-throughput gene expression data to identify GRN dynamics among different cellular conditions. PMID:27378774

  4. Colorectal cancer risk genes are functionally enriched in regulatory pathways

    PubMed Central

    Lu, Xi; Cao, Mingming; Han, Su; Yang, Youlin; Zhou, Jin

    2016-01-01

    Colorectal cancer (CRC) is a common complex disease caused by the combination of genetic variants and environmental factors. Genome-wide association studies (GWAS) have reported some novel CRC susceptibility variants. However, the potential genetic mechanisms for newly identified CRC susceptibility variants are still unclear. Here, we selected 85 CRC susceptibility variants with suggestive association P < 1.00E-05 from the National Human Genome Research Institute GWAS catalog. To investigate the underlying genetic pathways where these newly identified CRC susceptibility genes are significantly enriched, we conducted a functional annotation. Using two SNP-to-gene mapping methods, the nearest upstream and downstream gene method and ProxyGeneLD, we obtained 128 unique CRC susceptibility genes. We then conducted a pathway analysis in the GO database using the corresponding 128 genes. We identified 44 GO categories, 17 of which are regulatory pathways. We believe that our results may provide further insight into the underlying genetic mechanisms for these newly identified CRC susceptibility variants. PMID:27146020

  5. Modelling Human Regulatory Variation in Mouse: Finding the Function in Genome-Wide Association Studies and Whole-Genome Sequencing

    PubMed Central

    Schmouth, Jean-François; Bonaguro, Russell J.; Corso-Diaz, Ximena; Simpson, Elizabeth M.

    2012-01-01

    An increasing body of literature from genome-wide association studies and human whole-genome sequencing highlights the identification of large numbers of candidate regulatory variants of potential therapeutic interest in numerous diseases. Our relatively poor understanding of the functions of non-coding genomic sequence, and the slow and laborious process of experimental validation of the functional significance of human regulatory variants, limits our ability to fully benefit from this information in our efforts to comprehend human disease. Humanized mouse models (HuMMs), in which human genes are introduced into the mouse, suggest an approach to this problem. In the past, HuMMs have been used successfully to study human disease variants; e.g., the complex genetic condition arising from Down syndrome, common monogenic disorders such as Huntington disease and β-thalassemia, and cancer susceptibility genes such as BRCA1. In this commentary, we highlight a novel method for high-throughput single-copy site-specific generation of HuMMs entitled High-throughput Human Genes on the X Chromosome (HuGX). This method can be applied to most human genes for which a bacterial artificial chromosome (BAC) construct can be derived and a mouse-null allele exists. This strategy comprises (1) the use of recombineering technology to create a human variant–harbouring BAC, (2) knock-in of this BAC into the mouse genome using Hprt docking technology, and (3) allele comparison by interspecies complementation. We demonstrate the throughput of the HuGX method by generating a series of seven different alleles for the human NR2E1 gene at Hprt. In future challenges, we consider the current limitations of experimental approaches and call for a concerted effort by the genetics community, for both human and mouse, to solve the challenge of the functional analysis of human regulatory variation. PMID:22396661

  6. Innovation and robustness in complex regulatory gene networks

    PubMed Central

    Ciliberti, S.; Martin, O. C.; Wagner, A.

    2007-01-01

    The history of life involves countless evolutionary innovations, a steady stream of ingenuity that has been flowing for more than 3 billion years. Very little is known about the principles of biological organization that allow such innovation. Here, we examine these principles for evolutionary innovation in gene expression patterns. To this end, we study a model for the transcriptional regulation networks that are at the heart of embryonic development. A genotype corresponds to a regulatory network of a given topology, and a phenotype corresponds to a steady-state gene expression pattern. Networks with the same phenotype form a connected graph in genotype space, where two networks are immediate neighbors if they differ by one regulatory interaction. We show that an evolutionary search on this graph can reach genotypes that are as different from each other as if they were chosen at random in genotype space, allowing evolutionary access to different kinds of innovation while staying close to a viable phenotype. Thus, although robustness to mutations may hinder innovation in the short term, we conclude that long-term innovation in gene expression patterns can only emerge in the presence of the robustness caused by connected genotype graphs. PMID:17690244
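
    The genotype, phenotype, and neighbor definitions in this abstract can be made concrete with a small sketch. The abstract does not give the update rule, so the synchronous sign-threshold dynamics, the restriction of interaction values to -1, 0, and +1, and all names below are assumptions for illustration only.

```python
def step(weights, state):
    """One synchronous update: a gene is on (+1) if its net input is positive, else off (-1)."""
    return tuple(1 if sum(w * s for w, s in zip(row, state)) > 0 else -1
                 for row in weights)


def phenotype(weights, initial_state, max_steps=100):
    """Return the steady-state expression pattern, or None if none is reached."""
    state = tuple(initial_state)
    for _ in range(max_steps):
        nxt = step(weights, state)
        if nxt == state:
            return state
        state = nxt
    return None


def one_mutant_neighbors(weights):
    """Yield every network that differs from `weights` in exactly one interaction."""
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            for new in (-1, 0, 1):
                if new != w:
                    mutant = [list(r) for r in weights]
                    mutant[i][j] = new
                    yield mutant


def robustness(weights, initial_state):
    """Fraction of one-mutant neighbors that keep the same phenotype."""
    target = phenotype(weights, initial_state)
    neighbors = list(one_mutant_neighbors(weights))
    same = sum(phenotype(m, initial_state) == target for m in neighbors)
    return same / len(neighbors)


# Hypothetical three-gene network: weights[i][j] is the effect of gene j on gene i.
toy_network = [[1, 0, 0],
               [1, 0, 0],
               [0, -1, 0]]
toy_initial = (1, -1, -1)
print(phenotype(toy_network, toy_initial), robustness(toy_network, toy_initial))
```

    Repeatedly stepping to a neighbor with an unchanged phenotype gives a walk on the connected genotype graph that the abstract describes.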

  7. LOESS correction for length variation in gene set-based genomic sequence analysis

    PubMed Central

    Aboukhalil, Anton; Bulyk, Martha L.

    2012-01-01

    Motivation: Sequence analysis algorithms are often applied to sets of DNA, RNA or protein sequences to identify common or distinguishing features. Controlling for sequence length variation is critical to properly score sequence features and identify true biological signals rather than length-dependent artifacts. Results: Several cis-regulatory module discovery algorithms exhibit a substantial dependence between DNA sequence score and sequence length. Our newly developed LOESS method is flexible in capturing diverse score-length relationships and is more effective in correcting DNA sequence scores for length-dependent artifacts, compared with four other approaches. Application of this method to genes co-expressed during Drosophila melanogaster embryonic mesoderm development or neural development scored by the Lever motif analysis algorithm resulted in successful recovery of their biologically validated cis-regulatory codes. The LOESS length-correction method is broadly applicable, and may be useful not only for more accurate inference of cis-regulatory codes, but also for detection of other types of patterns in biological sequences. Availability: Source code and compiled code are available from http://thebrain.bwh.harvard.edu/LM_LOESS/ Contact: mlbulyk@receptor.med.harvard.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22492312
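
    A length correction of this kind can be sketched as a locally weighted linear regression of score on sequence length, with the fitted trend then subtracted from each score. This is a minimal illustration, not the authors' released code: the tricube kernel, the `frac` window parameter, and the choice to correct by subtraction are assumptions.

```python
def loess_fit(x, y, frac=0.5):
    """Locally weighted linear regression (tricube kernel), evaluated at each x[i]."""
    n = len(x)
    k = min(n, max(2, int(round(frac * n))))    # points per local window
    fitted = []
    for xi in x:
        dists = sorted(abs(xj - xi) for xj in x)
        h = dists[k - 1] or 1e-12               # bandwidth: k-th nearest distance
        w = [(1 - min(abs(xj - xi) / h, 1.0) ** 3) ** 3 for xj in x]
        sw = sum(w)
        mx = sum(wi * xj for wi, xj in zip(w, x)) / sw
        my = sum(wi * yj for wi, yj in zip(w, y)) / sw
        sxx = sum(wi * (xj - mx) ** 2 for wi, xj in zip(w, x))
        sxy = sum(wi * (xj - mx) * (yj - my) for wi, xj, yj in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (xi - mx))
    return fitted


def length_corrected_scores(lengths, scores, frac=0.5):
    """Subtract the LOESS-estimated length trend from each raw score."""
    trend = loess_fit(lengths, scores, frac)
    return [s - t for s, t in zip(scores, trend)]


# Hypothetical data: raw scores that grow with sequence length.
toy_lengths = [500, 800, 1200, 1500, 2100, 2600, 3000, 4000]
toy_scores = [2.1, 2.9, 4.2, 4.8, 6.5, 7.7, 9.1, 11.8]
print(length_corrected_scores(toy_lengths, toy_scores))
```

    Sequences whose corrected score remains well above zero score higher than other sequences of similar length, which is the comparison a length correction is meant to enable.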

  8. Cloning and characterization of nif structural and regulatory genes in the purple sulfur bacterium, Halorhodospira halophila.

    PubMed

    Tsuihiji, Hisayoshi; Yamazaki, Yoichi; Kamikubo, Hironari; Imamoto, Yasushi; Kataoka, Mikio

    2006-03-01

    Halorhodospira halophila is a halophilic photosynthetic bacterium classified as a purple sulfur bacterium. We found that H. halophila generates hydrogen gas during photoautotrophic growth as a byproduct of a nitrogenase reaction. In order to consider the applied possibilities of this photobiological hydrogen generation, we cloned and characterized the structural and regulatory genes encoding the nitrogenase, nifH, nifD and nifA, from H. halophila. This is the first description of the nif genes for a purple sulfur bacterium. The amino-acid sequences of NifH and NifD indicated that these proteins are an Fe protein and a part of a MoFe protein, respectively. The important residues are conserved completely. The sequence upstream from the nifH region and sequence similarities of nifH and nifD with those of the other organisms suggest that the regulatory system might be a NifL-NifA system; however, H. halophila lacks nifL. The amino-acid sequence of H. halophila NifA is closer to that of the NifA of the NifL-NifA system than to that of NifA without NifL. H. halophila NifA does not conserve either the residue that interacts with NifL or the important residues involved in NifL-independent regulation. These results suggest the existence of yet another regulatory system, and that the development of functional systems and their molecular counterparts are not necessarily correlated throughout evolution. All of these Nif proteins of H. halophila possess an excess of acidic residues, which acts as a salt-resistant mechanism.

  9. Gene and translation initiation site prediction in metagenomic sequences

    SciTech Connect

    Hyatt, Philip Douglas; LoCascio, Philip F; Hauser, Loren John; Uberbacher, Edward C

    2012-01-01

    Gene prediction in metagenomic sequences remains a difficult problem. Current sequencing technologies do not achieve sufficient coverage to assemble the individual genomes in a typical sample; consequently, sequencing runs produce a large number of short sequences whose exact origin is unknown. Since these sequences are usually smaller than the average length of a gene, algorithms must make predictions based on very little data. We present MetaProdigal, a metagenomic version of the gene prediction program Prodigal, that can identify genes in short, anonymous coding sequences with a high degree of accuracy. The novel value of the method consists of enhanced translation initiation site identification, ability to identify sequences that use alternate genetic codes and confidence values for each gene call. We compare the results of MetaProdigal with other methods and conclude with a discussion of future improvements.

  10. Complete Genome Sequence of the Filamentous Fungus Aspergillus westerdijkiae Reveals the Putative Biosynthetic Gene Cluster of Ochratoxin A

    PubMed Central

    Chakrabortti, Alolika; Li, Jinming

    2016-01-01

    Ochratoxin A (OTA) is a common mycotoxin that contaminates food and agricultural products. Sequencing of the complete genome of Aspergillus westerdijkiae, a major producer of OTA, reveals more than 50 biosynthetic gene clusters, including a putative OTA biosynthetic gene cluster that encodes a dozen enzymes, transporters, and regulatory proteins. PMID:27635003

  11. Complete Genome Sequence of the Filamentous Fungus Aspergillus westerdijkiae Reveals the Putative Biosynthetic Gene Cluster of Ochratoxin A.

    PubMed

    Chakrabortti, Alolika; Li, Jinming; Liang, Zhao-Xun

    2016-01-01

    Ochratoxin A (OTA) is a common mycotoxin that contaminates food and agricultural products. Sequencing of the complete genome of Aspergillus westerdijkiae, a major producer of OTA, reveals more than 50 biosynthetic gene clusters, including a putative OTA biosynthetic gene cluster that encodes a dozen enzymes, transporters, and regulatory proteins. PMID:27635003

  12. Engineering nucleases for gene targeting: safety and regulatory considerations.

    PubMed

    Pauwels, Katia; Podevin, Nancy; Breyer, Didier; Carroll, Dana; Herman, Philippe

    2014-01-25

    Nuclease-based gene targeting (NBGT) represents a significant breakthrough in targeted genome editing since it is applicable from single-celled protozoa to human, including several species of economic importance. Along with the fast progress in NBGT and the increasing availability of customized nucleases, more data are available about off-target effects associated with the use of this approach. We discuss how NBGT may offer a new perspective for genetic modification, we address some aspects crucial for a safety improvement of the corresponding techniques and we also briefly relate the use of NBGT applications and products to the regulatory oversight.

  13. Mutations of epigenetic regulatory genes are common in thymic carcinomas.

    PubMed

    Wang, Yisong; Thomas, Anish; Lau, Christopher; Rajan, Arun; Zhu, Yuelin; Killian, J Keith; Petrini, Iacopo; Pham, Trung; Morrow, Betsy; Zhong, Xiaogang; Meltzer, Paul S; Giaccone, Giuseppe

    2014-12-08

    Genetic alterations and etiology of thymic epithelial tumors (TETs) are largely unknown, hampering the development of effective targeted therapies for patients with TETs. Here TETs of advanced-stage patients enrolled in a clinical trial of molecularly-guided targeted therapies were employed for targeted sequencing of 197 cancer-associated genes. Comparative sequence analysis of 78 TET/blood paired samples obtained from 47 thymic carcinoma (TC) and 31 thymoma patients revealed a total of 86 somatic non-synonymous sequence variations across 39 different genes in 33 (42%) TETs. TCs (62%; 29/47) showed higher incidence of somatic non-synonymous mutations than thymomas (13%; 4/31; p < 0.0001). TP53 was the most frequently mutated gene in TETs (n = 13; 17%), especially in TCs (26%), and was associated with a poorer overall survival (p < 0.0001). Genes in histone modification [BAP1 (n = 6; 13%), SETD2 (n = 5; 11%), ASXL1 (n = 2; 4%)], chromatin remodeling [SMARCA4 (n = 2; 4%)], and DNA methylation [DNMT3A (n = 3; 7%), TET2 (n = 2; 4%), WT1 (n = 2; 4%)] pathways were recurrently mutated in TCs, but not in thymomas. Our results suggest a potential disruption of epigenetic homeostasis in TCs, and a substantial difference in genetic makeup between TCs and thymomas. Further investigation is warranted into the roles of epigenetic dysregulation in TC development and its potential for targeted therapy.
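
    The mutation-incidence comparison above (29/47 thymic carcinomas versus 4/31 thymomas) can be checked with a short calculation. The abstract does not say which statistical test the authors used; the sketch below assumes a two-sided Fisher's exact test on the 2x2 table, and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Counts from the abstract: mutated vs. non-mutated tumors in each group.
tc_mutated, tc_total = 29, 47
thymoma_mutated, thymoma_total = 4, 31
p_value = fisher_exact_two_sided(
    tc_mutated, tc_total - tc_mutated,
    thymoma_mutated, thymoma_total - thymoma_mutated,
)
print(f"two-sided p = {p_value:.2e}")
```

    Under this assumed test the p-value falls below 0.0001, which agrees with the significance level reported in the abstract.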

  14. Mutations of epigenetic regulatory genes are common in thymic carcinomas

    PubMed Central

    Wang, Yisong; Thomas, Anish; Lau, Christopher; Rajan, Arun; Zhu, Yuelin; Killian, J. Keith; Petrini, Iacopo; Pham, Trung; Morrow, Betsy; Zhong, Xiaogang; Meltzer, Paul S.; Giaccone, Giuseppe

    2014-01-01

    Genetic alterations and etiology of thymic epithelial tumors (TETs) are largely unknown, hampering the development of effective targeted therapies for patients with TETs. Here TETs of advanced-stage patients enrolled in a clinical trial of molecularly-guided targeted therapies were employed for targeted sequencing of 197 cancer-associated genes. Comparative sequence analysis of 78 TET/blood paired samples obtained from 47 thymic carcinoma (TC) and 31 thymoma patients revealed a total of 86 somatic non-synonymous sequence variations across 39 different genes in 33 (42%) TETs. TCs (62%; 29/47) showed higher incidence of somatic non-synonymous mutations than thymomas (13%; 4/31; p < 0.0001). TP53 was the most frequently mutated gene in TETs (n = 13; 17%), especially in TCs (26%), and was associated with a poorer overall survival (p < 0.0001). Genes in histone modification [BAP1 (n = 6; 13%), SETD2 (n = 5; 11%), ASXL1 (n = 2; 4%)], chromatin remodeling [SMARCA4 (n = 2; 4%)], and DNA methylation [DNMT3A (n = 3; 7%), TET2 (n = 2; 4%), WT1 (n = 2; 4%)] pathways were recurrently mutated in TCs, but not in thymomas. Our results suggest a potential disruption of epigenetic homeostasis in TCs, and a substantial difference in genetic makeup between TCs and thymomas. Further investigation is warranted into the roles of epigenetic dysregulation in TC development and its potential for targeted therapy. PMID:25482724

  15. Evolutionary and Topological Properties of Genes and Community Structures in Human Gene Regulatory Networks.

    PubMed

    Szedlak, Anthony; Smith, Nicholas; Liu, Li; Paternostro, Giovanni; Piermarocchi, Carlo

    2016-06-01

    The diverse, specialized genes present in today's lifeforms evolved from a common core of ancient, elementary genes. However, these genes did not evolve individually: gene expression is controlled by a complex network of interactions, and alterations in one gene may drive reciprocal changes in its proteins' binding partners. Like many complex networks, these gene regulatory networks (GRNs) are composed of communities, or clusters of genes with relatively high connectivity. A deep understanding of the relationship between the evolutionary history of single genes and the topological properties of the underlying GRN is integral to evolutionary genetics. Here, we show that the topological properties of an acute myeloid leukemia GRN and a general human GRN are strongly coupled with its genes' evolutionary properties. Slowly evolving ("cold"), old genes tend to interact with each other, as do rapidly evolving ("hot"), young genes. This naturally causes genes to segregate into community structures with relatively homogeneous evolutionary histories. We argue that gene duplication placed old, cold genes and communities at the center of the networks, and young, hot genes and communities at the periphery. We demonstrate this with single-node centrality measures and two new measures of efficiency, the set efficiency and the interset efficiency. We conclude that these methods for studying the relationships between a GRN's community structures and its genes' evolutionary properties provide new perspectives for understanding evolutionary genetics.

  16. Discovering transcription factor regulatory targets using gene expression and binding data

    PubMed Central

    Maienschein-Cline, Mark; Zhou, Jie; White, Kevin P.; Sciammas, Roger; Dinner, Aaron R.

    2012-01-01

    Motivation: Identifying the target genes regulated by transcription factors (TFs) is the most basic step in understanding gene regulation. Recent advances in high-throughput sequencing technology, together with chromatin immunoprecipitation (ChIP), enable mapping TF binding sites genome wide, but it is not possible to infer function from binding alone. This is especially true in mammalian systems, where regulation often occurs through long-range enhancers in gene-rich neighborhoods, rather than proximal promoters, preventing straightforward assignment of a binding site to a target gene. Results: We present EMBER (Expectation Maximization of Binding and Expression pRofiles), a method that integrates high-throughput binding data (e.g. ChIP-chip or ChIP-seq) with gene expression data (e.g. DNA microarray) via an unsupervised machine learning algorithm for inferring the gene targets of sets of TF binding sites. Genes selected are those that match overrepresented expression patterns, which can be used to provide information about multiple TF regulatory modes. We apply the method to genome-wide human breast cancer data and demonstrate that EMBER confirms a role for the TFs estrogen receptor alpha, retinoic acid receptors alpha and gamma in breast cancer development, whereas the conventional approach of assigning regulatory targets based on proximity does not. Additionally, we compare several predicted target genes from EMBER to interactions inferred previously, examine combinatorial effects of TFs on gene regulation and illustrate the ability of EMBER to discover multiple modes of regulation. Availability: All code used for this work is available at http://dinner-group.uchicago.edu/downloads.html Contact: dinner@uchicago.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:22084256

  18. Redeployment of a conserved gene regulatory network during Aedes aegypti development.

    PubMed

    Suryamohan, Kushal; Hanson, Casey; Andrews, Emily; Sinha, Saurabh; Scheel, Molly Duman; Halfon, Marc S

    2016-08-15

    Changes in gene regulatory networks (GRNs) underlie the evolution of morphological novelty and developmental system drift. The fruitfly Drosophila melanogaster and the dengue and Zika vector mosquito Aedes aegypti have substantially similar nervous system morphology. Nevertheless, they show significant divergence in a set of genes co-expressed in the midline of the Drosophila central nervous system, including the master regulator single minded and downstream genes including short gastrulation, Star, and NetrinA. In contrast to Drosophila, we find that midline expression of these genes is either absent or severely diminished in A. aegypti. Instead, they are co-expressed in the lateral nervous system. This suggests that in A. aegypti this "midline GRN" has been redeployed to a new location while lost from its previous site of activity. In order to characterize the relevant GRNs, we employed the SCRMshaw method we previously developed to identify transcriptional cis-regulatory modules in both species. Analysis of these regulatory sequences in transgenic Drosophila suggests that the altered gene expression observed in A. aegypti is the result of trans-dependent redeployment of the GRN, potentially stemming from cis-mediated changes in the expression of sim and other as-yet unidentified regulators. Our results illustrate a novel "repeal, replace, and redeploy" mode of evolution in which a conserved GRN acquires a different function at a new site while its original function is co-opted by a different GRN. This represents a striking example of developmental system drift in which the dramatic shift in gene expression does not result in gross morphological changes, but in more subtle differences in development and function of the late embryonic nervous system. PMID:27341759

  19. Diverse Gene Expression in Human Regulatory T Cell Subsets Uncovers Connection between Regulatory T Cell Genes and Suppressive Function.

    PubMed

    Hua, Jing; Davis, Scott P; Hill, Jonathan A; Yamagata, Tetsuya

    2015-10-15

    Regulatory T (Treg) cells have a critical role in the control of immunity, and their diverse subpopulations may allow adaptation to different types of immune responses. In this study, we analyzed human Treg cell subpopulations in the peripheral blood by performing genome-wide expression profiling of 40 Treg cell subsets from healthy donors. We found that the human peripheral blood Treg cell population is comprised of five major genomic subgroups, represented by 16 tractable subsets with a particular cell surface phenotype. These subsets possess a range of suppressive function and cytokine secretion and can exert a genomic footprint on target effector T (Teff) cells. Correlation analysis of variability in gene expression in the subsets identified several cell surface molecules associated with Treg suppressive function, and pharmacological interrogation revealed a set of genes having causative effect. The five genomic subgroups of Treg cells imposed a preserved pattern of gene expression on Teff cells, with a varying degree of genes being suppressed or induced. Notably, there was a cluster of genes induced by Treg cells that bolstered an autoinhibitory effect in Teff cells, and this induction appears to be governed by a different set of genes than ones involved in counteracting Teff activation. Our work shows an example of exploiting the diversity within human Treg cell subpopulations to dissect Treg cell biology. PMID:26371251

  20. Inference of Gene Regulatory Network Based on Local Bayesian Networks.

    PubMed

    Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Wei, Ze-Gang; Chen, Luonan

    2016-08-01

    The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, the LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, the BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance.
In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
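
    The role of conditional mutual information (CMI) in pruning indirect edges, as described in this record, can be sketched as follows. The abstract does not state how CMI is estimated; this example assumes jointly Gaussian expression values and a single conditioning gene, in which case CMI(X;Y|Z) = -0.5 ln(1 - r^2) with r the partial correlation of X and Y given Z. All function names are hypothetical and the expression vectors are made-up illustrative data.

```python
from math import log, sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def gaussian_mi(x, y):
    """Mutual information (nats) of x and y under a Gaussian assumption."""
    return -0.5 * log(1.0 - pearson(x, y) ** 2)


def gaussian_cmi(x, y, z):
    """CMI(X;Y|Z) in nats under a Gaussian assumption, one conditioning gene."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    partial = (rxy - rxz * ryz) / sqrt((1.0 - rxz ** 2) * (1.0 - ryz ** 2))
    return -0.5 * log(1.0 - partial ** 2)


# Illustrative data: gene z drives both x and y, which have no direct link.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x = [zi + 0.1 * e for zi, e in zip(z, [1, -1, 1, -1, 1, -1, 1, -1])]
y = [zi + 0.1 * e for zi, e in zip(z, [1, 1, -1, -1, 1, 1, -1, -1])]

mi_xy = gaussian_mi(x, y)        # large: x and y look strongly associated
cmi_xy_z = gaussian_cmi(x, y, z)  # small: association vanishes given z
```

    An edge x-y with high mutual information but near-zero CMI given z would be treated as an indirect (redundant) regulation and dropped; the threshold for "near zero" is a tuning choice not specified here.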

  1. Inference of Gene Regulatory Network Based on Local Bayesian Networks

    PubMed Central

    Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Chen, Luonan

    2016-01-01

    The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, the LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, the BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance.
In particular, the decomposition strategy with local Bayesian networks not only effectively reduce

  2. Using RSAT oligo-analysis and dyad-analysis tools to discover regulatory signals in nucleic sequences.

    PubMed

    Defrance, Matthieu; Janky, Rekin's; Sand, Olivier; van Helden, Jacques

    2008-01-01

    This protocol explains how to discover functional signals in genomic sequences by detecting over- or under-represented oligonucleotides (words) or spaced pairs thereof (dyads) with the Regulatory Sequence Analysis Tools (http://rsat.ulb.ac.be/rsat/). Two typical applications are presented: (i) predicting transcription factor-binding motifs in promoters of coregulated genes and (ii) discovering phylogenetic footprints in promoters of orthologous genes. The steps of this protocol include purging genomic sequences to discard redundant fragments, discovering over-represented patterns and assembling them to obtain degenerate motifs, scanning sequences and drawing feature maps. The main strength of the method is its statistical ground: the binomial significance provides an efficient control on the rate of false positives. In contrast with optimization-based pattern discovery algorithms, the method supports the detection of under- as well as over-represented motifs. Computation times vary from seconds (gene clusters) to minutes (whole genomes). The execution of the whole protocol should take approximately 1 h.
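
    The idea of scoring over-represented words with a binomial significance can be sketched in a few lines. This is a simplified illustration, not RSAT's implementation: it assumes a uniform, independent nucleotide background, counts overlapping occurrences on one strand only, and applies a simple multiple-testing correction by the number of possible words. The function names and the `alpha` threshold are hypothetical, and the example sequences are made up.

```python
from collections import Counter
from math import comb


def count_words(sequences, k):
    """Count every overlapping k-letter word across a list of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def binomial_upper_tail(n, x, p):
    """P(X >= x) for X ~ Binomial(n, p); adequate for small n only."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(x, n + 1))


def overrepresented_words(sequences, k, alpha=0.01):
    """Return (word, count, e_value) for words with e_value below alpha."""
    counts = count_words(sequences, k)
    n_positions = sum(counts.values())
    p_word = 0.25 ** k   # ASSUMPTION: uniform background, independent bases
    n_tests = 4 ** k     # number of distinct words that could be tested
    hits = []
    for word, observed in counts.items():
        p_val = binomial_upper_tail(n_positions, observed, p_word)
        e_val = p_val * n_tests  # expected number of false positives
        if e_val < alpha:
            hits.append((word, observed, e_val))
    return sorted(hits, key=lambda hit: hit[2])


# Made-up promoter fragments sharing the hexamer GATAAG.
promoters = ["CCTGATAAGTC", "AAGATAAGCA", "TGGATAAGGT", "GATAAGACCT"]
hits = overrepresented_words(promoters, k=6)
```

    Here only the shared hexamer passes the threshold, because every other word occurs once. Real promoter sets would call for a background model estimated from genomic sequences rather than the uniform one assumed in this sketch.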

  3. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    PubMed

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes because more and more gene expression and genomic sequence data are becoming available. Knowledge of the specific cis-sequences and of their enrichment and arrangement within promoters facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter presents an example of the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied to identify cis-sequences in any set of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and to install and run a software package that identifies overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771

  4. Graphlet Based Metrics for the Comparison of Gene Regulatory Networks

    PubMed Central

    Martin, Alberto J. M.; Dominguez, Calixto; Contreras-Riquelme, Sebastián; Holmes, David S.; Perez-Acle, Tomas

    2016-01-01

    Understanding the control of gene expression remains one of the main challenges in the post-genomic era. Accordingly, a plethora of methods exists to identify variations in gene expression levels. These variations underlie almost all relevant biological phenomena, including disease and adaptation to environmental conditions. However, computational tools to identify how regulation changes are scarce. Regulation of gene expression is usually depicted in the form of a gene regulatory network (GRN). Structural changes in a GRN over time and conditions represent variations in the regulation of gene expression. Like other biological networks, GRNs are composed of basic building blocks called graphlets. As a consequence, two new metrics based on graphlets are proposed in this work: REConstruction Rate (REC) and REC Graphlet Degree (RGD). REC determines the rate of graphlet similarity between different states of a network and RGD identifies the subset of nodes with the highest topological variation. In other words, RGD discerns how the GRN was rewired. REC and RGD were used to compare the local structure of nodes in condition-specific GRNs obtained from gene expression data of Escherichia coli, forming biofilms and cultured in suspension. According to our results, most of the network local structure remains unaltered in the two compared conditions. Nevertheless, changes reported by RGD necessarily imply that a different cohort of regulators (i.e. transcription factors (TFs)) appear on the scene, shedding light on how the regulation of gene expression occurs when E. coli transits from suspension to biofilm. Consequently, we propose that both metrics REC and RGD should be adopted as a quantitative approach to conduct differential analyses of GRNs. A tool that implements both metrics is available as an on-line web server (http://dlab.cl/loto). PMID:27695050

  6. Neurogenic gene regulatory pathways in the sea urchin embryo.

    PubMed

    Wei, Zheng; Angerer, Lynne M; Angerer, Robert C

    2016-01-15

    During embryogenesis the sea urchin early pluteus larva differentiates 40-50 neurons marked by expression of the pan-neural marker synaptotagmin B (SynB) that are distributed along the ciliary band, in the apical plate and pharyngeal endoderm, and 4-6 serotonergic neurons that are confined to the apical plate. Development of all neurons has been shown to depend on the function of Six3. Using a combination of molecular screens and tests of gene function by morpholino-mediated knockdown, we identified SoxC and Brn1/2/4, which function sequentially in the neurogenic regulatory pathway and are also required for the differentiation of all neurons. Misexpression of Brn1/2/4 at low dose caused an increase in the number of serotonin-expressing cells and at higher dose converted most of the embryo to a neurogenic epithelial sphere expressing the Hnf6 ciliary band marker. A third factor, Z167, was shown to work downstream of the Six3 and SoxC core factors and to define a branch specific for the differentiation of serotonergic neurons. These results provide a framework for building a gene regulatory network for neurogenesis in the sea urchin embryo. PMID:26657764

  7. Evolutionary and Topological Properties of Genes and Community Structures in Human Gene Regulatory Networks

    PubMed Central

    Szedlak, Anthony; Smith, Nicholas; Liu, Li; Paternostro, Giovanni; Piermarocchi, Carlo

    2016-01-01

    The diverse, specialized genes present in today’s lifeforms evolved from a common core of ancient, elementary genes. However, these genes did not evolve individually: gene expression is controlled by a complex network of interactions, and alterations in one gene may drive reciprocal changes in its proteins’ binding partners. Like many complex networks, these gene regulatory networks (GRNs) are composed of communities, or clusters of genes with relatively high connectivity. A deep understanding of the relationship between the evolutionary history of single genes and the topological properties of the underlying GRN is integral to evolutionary genetics. Here, we show that the topological properties of an acute myeloid leukemia GRN and a general human GRN are strongly coupled with its genes’ evolutionary properties. Slowly evolving (“cold”), old genes tend to interact with each other, as do rapidly evolving (“hot”), young genes. This naturally causes genes to segregate into community structures with relatively homogeneous evolutionary histories. We argue that gene duplication placed old, cold genes and communities at the center of the networks, and young, hot genes and communities at the periphery. We demonstrate this with single-node centrality measures and two new measures of efficiency, the set efficiency and the interset efficiency. We conclude that these methods for studying the relationships between a GRN’s community structures and its genes’ evolutionary properties provide new perspectives for understanding evolutionary genetics. PMID:27359334

  8. Coding sequences of functioning human genes derived entirely from mobile element sequences.

    PubMed

    Britten, Roy J

    2004-11-30

    Among all of the many examples of mobile elements or "parasitic sequences" that affect the function of the human genome, this paper describes several examples of functioning genes whose sequences have been almost completely derived from mobile elements. There are many examples where the synthetic coding sequences of observed mRNA sequences are made up of mobile element sequences, to an extent of 80% or more of the length of the coding sequences. In the examples described here, the genes have named functions, and some of these functions have been studied. It appears that each of the functioning genes was originally formed from mobile elements and that in some process of molecular evolution a coding sequence was derived that could be translated into a protein that is of some importance to human biology. In one case (AD7C), the coding sequence is 99% made up of a cluster of Alu sequences. In another example, the gene BNIP3 coding sequence is 97% made up of sequences from an apparent human endogenous retrovirus. The Syncytin gene coding sequence appears to be made from an endogenous retrovirus envelope gene. PMID:15546984

  9. The influence of assortativity on the robustness and evolvability of gene regulatory networks upon gene birth

    PubMed Central

    Pechenick, Dov A.; Moore, Jason H.; Payne, Joshua L.

    2013-01-01

    Gene regulatory networks (GRNs) represent the interactions between genes and gene products, which drive the gene expression patterns that produce cellular phenotypes. GRNs display a number of characteristics that are beneficial for the development and evolution of organisms. For example, they are often robust to genetic perturbation, such as mutations in regulatory regions or loss of gene function. Simultaneously, GRNs are often evolvable as these genetic perturbations are occasionally exploited to innovate novel regulatory programs. Several topological properties, such as degree distribution, are known to influence the robustness and evolvability of GRNs. Assortativity, which measures the propensity of nodes of similar connectivity to connect to one another, is a separate topological property that has recently been shown to influence the robustness of GRNs to point mutations in cis-regulatory regions. However, it remains to be seen how assortativity may influence the robustness and evolvability of GRNs to other forms of genetic perturbation, such as gene birth via duplication or de novo origination. Here, we employ a computational model of genetic regulation to investigate whether the assortativity of a GRN influences its robustness and evolvability upon gene birth. We find that the robustness of a GRN generally increases with increasing assortativity, while its evolvability generally decreases. However, the rate of change in robustness outpaces that of evolvability, resulting in an increased proportion of assortative GRNs that are simultaneously robust and evolvable. By providing a mechanistic explanation for these observations, this work extends our understanding of how the assortativity of a GRN influences its robustness and evolvability upon gene birth. PMID:23542384
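
    Assortativity, defined above as the propensity of nodes of similar connectivity to connect to one another, can be made concrete with a short sketch. It computes the standard degree assortativity coefficient (the Pearson correlation of the degrees found at the two ends of each edge) for an undirected graph. GRNs are directed, and the abstract does not say which in-degree/out-degree variant the authors used, so treating edges as undirected is a simplifying assumption; the function name is hypothetical.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each undirected edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Each undirected edge is counted in both orientations so that the
    # measure does not depend on how an edge happens to be written.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean = sum(xs) / n  # xs and ys hold the same multiset, so one mean suffices
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("undefined: all edge ends have the same degree")
    return cov / var


# A hub connected to four leaves is maximally disassortative.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
# A four-node chain mixes degree-1 ends with degree-2 interior nodes.
path = [(0, 1), (1, 2), (2, 3)]
r_star = degree_assortativity(star)   # -1.0
r_path = degree_assortativity(path)   # -0.5
```

    Positive values indicate that highly connected genes tend to link to other highly connected genes; negative values, as in the star example, indicate hubs that link mainly to sparsely connected genes.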

  10. Metatranscriptomic insights on gene expression and regulatory controls in Candidatus Accumulibacter phosphatis

    PubMed Central

    Oyserman, Ben O; Noguera, Daniel R; del Rio, Tijana Glavina; Tringe, Susannah G; McMahon, Katherine D

    2016-01-01

    Previous studies on enhanced biological phosphorus removal (EBPR) have focused on reconstructing genomic blueprints for the model polyphosphate-accumulating organism Candidatus Accumulibacter phosphatis. Here, a time series metatranscriptome generated from enrichment cultures of Accumulibacter was used to gain insight into anaerobic/aerobic metabolism and regulatory mechanisms within an EBPR cycle. Co-expressed gene clusters were identified displaying ecologically relevant trends consistent with batch cycle phases. Transcripts displaying increased abundance during anaerobic acetate contact were functionally enriched in energy production and conversion, including upregulation of both cytoplasmic and membrane-bound hydrogenases, demonstrating the importance of transcriptional regulation to manage energy and electron flux during anaerobic acetate contact. We hypothesized and demonstrated hydrogen production after anaerobic acetate contact, a previously unknown strategy for Accumulibacter to maintain redox balance. Genes involved in anaerobic glycine utilization were identified and phosphorus release after anaerobic glycine contact demonstrated, suggesting that Accumulibacter routes diverse carbon sources to acetyl-CoA formation via previously unrecognized pathways. A comparative genomics analysis of sequences upstream of co-expressed genes identified two statistically significant putative regulatory motifs. One palindromic motif was identified upstream of genes involved in PHA synthesis and acetate activation and is hypothesized to be a phaR binding site, hence representing a hypothetical PHA modulon. A second motif was identified ~35 base pairs (bp) upstream of a large and diverse array of genes and hence may represent a sigma factor binding site. This analysis provides a basis and framework for further investigations into Accumulibacter metabolism and the reconstruction of regulatory networks in uncultured organisms. PMID:26555245
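
    The abstract refers to a palindromic motif; in DNA terms this usually means a sequence that equals its own reverse complement. A minimal sketch of that check follows. The function names are hypothetical and the example sequence in the note below is a generic restriction site, not the motif reported by the authors.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(seq):
    """True if the sequence reads the same on both strands (5'->3')."""
    return seq.upper() == reverse_complement(seq)
```

    For instance, `is_palindromic("GAATTC")` is true, whereas `is_palindromic("GATTACA")` is false.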

  11. Metatranscriptomic insights on gene expression and regulatory controls in Candidatus Accumulibacter phosphatis.

    PubMed

    Oyserman, Ben O; Noguera, Daniel R; del Rio, Tijana Glavina; Tringe, Susannah G; McMahon, Katherine D

    2016-04-01

    Previous studies on enhanced biological phosphorus removal (EBPR) have focused on reconstructing genomic blueprints for the model polyphosphate-accumulating organism Candidatus Accumulibacter phosphatis. Here, a time series metatranscriptome generated from enrichment cultures of Accumulibacter was used to gain insight into anaerobic/aerobic metabolism and regulatory mechanisms within an EBPR cycle. Co-expressed gene clusters were identified displaying ecologically relevant trends consistent with batch cycle phases. Transcripts displaying increased abundance during anaerobic acetate contact were functionally enriched in energy production and conversion, including upregulation of both cytoplasmic and membrane-bound hydrogenases, demonstrating the importance of transcriptional regulation to manage energy and electron flux during anaerobic acetate contact. We hypothesized and demonstrated hydrogen production after anaerobic acetate contact, a previously unknown strategy for Accumulibacter to maintain redox balance. Genes involved in anaerobic glycine utilization were identified and phosphorus release after anaerobic glycine contact demonstrated, suggesting that Accumulibacter routes diverse carbon sources to acetyl-CoA formation via previously unrecognized pathways. A comparative genomics analysis of sequences upstream of co-expressed genes identified two statistically significant putative regulatory motifs. One palindromic motif was identified upstream of genes involved in PHA synthesis and acetate activation and is hypothesized to be a phaR binding site, hence representing a hypothetical PHA modulon. A second motif was identified ~35 base pairs (bp) upstream of a large and diverse array of genes and hence may represent a sigma factor binding site. This analysis provides a basis and framework for further investigations into Accumulibacter metabolism and the reconstruction of regulatory networks in uncultured organisms. PMID:26555245

  12. A set of structural features defines the cis-regulatory modules of antenna-expressed genes in Drosophila melanogaster.

    PubMed

    López, Yosvany; Vandenbon, Alexis; Nakai, Kenta

    2014-01-01

    Unraveling the biological information within the regulatory region (RR) of genes has become one of the major focuses of current genomic research. It has been hypothesized that RRs of co-expressed genes share similar architecture, but to the best of our knowledge, no studies have simultaneously examined multiple structural features, such as positioning of cis-regulatory elements relative to transcription start sites and to each other, and the order and orientation of regulatory motifs, to accurately describe overall cis-regulatory structure. In our work we present an improved computational method that builds a feature collection based on all of these structural features. We demonstrate the utility of this approach by modeling the cis-regulatory modules of antenna-expressed genes in Drosophila melanogaster. Six potential antenna-related motifs were predicted initially, including three that appeared to be novel. A feature set was created with the predicted motifs, where a correlation-based filter was used to remove irrelevant features, and a genetic algorithm was designed to optimize the feature set. Finally, a set of eight highly informative structural features was obtained for the RRs of antenna-expressed genes, achieving an area under the curve of 0.841. We used these features to score all D. melanogaster RRs for potentially unknown antenna-expressed genes sharing a similar regulatory structure. Validation of our predictions with an independent RNA sequencing dataset showed that 76.7% of genes with high scoring RRs were expressed in antenna. In addition, we found that the structural features we identified are highly conserved in RRs of orthologs in other Drosophila sibling species. This approach to identify tissue-specific regulatory structures showed comparable performance to previous approaches, but also uncovered additional interesting features because it also considered the order and orientation of motifs.

  13. Transcriptomic Sequencing Reveals a Set of Unique Genes Activated by Butyrate-Induced Histone Modification.

    PubMed

    Li, Cong-Jun; Li, Robert W; Baldwin, Ransom L; Blomberg, Le Ann; Wu, Sitao; Li, Weizhong

    2016-01-01

    Butyrate is a nutritional element with strong epigenetic regulatory activity as a histone deacetylase inhibitor. Based on the analysis of differentially expressed genes in bovine epithelial cells using RNA sequencing technology, a set of unique genes that are activated only after butyrate treatment was revealed. A complementary bioinformatics analysis of the functional category, pathway, and integrated network, using Ingenuity Pathways Analysis, indicated that these genes activated by butyrate treatment are related to major cellular functions, including cell morphological changes, cell cycle arrest, and apoptosis. Our results offered insight into the butyrate-induced transcriptomic changes and will accelerate our discerning of the molecular fundamentals of epigenomic regulation. PMID:26819550

  14. Transcriptomic Sequencing Reveals a Set of Unique Genes Activated by Butyrate-Induced Histone Modification

    PubMed Central

    Li, Cong-Jun; Li, Robert W.; Baldwin, Ransom L.; Blomberg, Le Ann; Wu, Sitao; Li, Weizhong

    2016-01-01

    Butyrate is a nutritional element with strong epigenetic regulatory activity as a histone deacetylase inhibitor. Based on the analysis of differentially expressed genes in bovine epithelial cells using RNA sequencing technology, a set of unique genes that are activated only after butyrate treatment was revealed. A complementary bioinformatics analysis of the functional category, pathway, and integrated network, using Ingenuity Pathways Analysis, indicated that these genes activated by butyrate treatment are related to major cellular functions, including cell morphological changes, cell cycle arrest, and apoptosis. Our results offered insight into the butyrate-induced transcriptomic changes and will accelerate our discerning of the molecular fundamentals of epigenomic regulation. PMID:26819550

  15. Sequence evolution and expression regulation of stress-responsive genes in natural populations of wild tomato.

    PubMed

    Fischer, Iris; Steige, Kim A; Stephan, Wolfgang; Mboup, Mamadou

    2013-01-01

    The wild tomato species Solanum chilense and S. peruvianum are a valuable non-model system for studying plant adaptation since they grow in diverse environments facing many abiotic constraints. Here we investigate the sequence evolution of regulatory regions of drought and cold responsive genes and their expression regulation. The coding regions of these genes were previously shown to exhibit signatures of positive selection. Expression profiles and sequence evolution of regulatory regions of members of the Asr (ABA/water stress/ripening induced) gene family and the dehydrin gene pLC30-15 were analyzed in wild tomato populations from contrasting environments. For S. chilense, we found that Asr4 and pLC30-15 appear to respond much faster to drought conditions in accessions from very dry environments than accessions from more mesic locations. Sequence analysis suggests that the promoter of Asr2 and the downstream region of pLC30-15 are under positive selection in some local populations of S. chilense. By investigating gene expression differences at the population level we provide further support of our previous conclusions that Asr2, Asr4, and pLC30-15 are promising candidates for functional studies of adaptation. Our analysis also demonstrates the power of the candidate gene approach in evolutionary biology research and highlights the importance of wild Solanum species as a genetic resource for their cultivated relatives.

  16. Regions in the promoter of the yeast FBP1 gene implicated in catabolite repression may bind the product of the regulatory gene MIG1.

    PubMed

    Mercado, J J; Vincent, O; Gancedo, J M

    1991-10-01

    We have identified in the promoter of the yeast FBP1 gene two sites able to bind nuclear proteins. These sites have a nucleotide sequence strongly similar to that of sites which bind the regulatory protein MIG1 in the promoters of GAL4 and SUC2. Deletions performed in the FBP1 promoter showed that one of the sites contributes to catabolite repression of this gene. In this same promoter, another region was identified with a strong effect on the catabolite repression of FBP1. In this region a sequence similar to the consensus for the binding site of the MIG1 protein was also present.

  17. Cis- and Trans-Regulatory Mechanisms of Gene Expression in the ASJ Sensory Neuron of Caenorhabditis elegans

    PubMed Central

    González-Barrios, María; Fierro-González, Juan Carlos; Krpelanova, Eva; Mora-Lorca, José Antonio; Pedrajas, José Rafael; Peñate, Xenia; Chavez, Sebastián; Swoboda, Peter; Jansen, Gert; Miranda-Vizuete, Antonio

    2015-01-01

    The identity of a given cell type is determined by the expression of a set of genes sharing common cis-regulatory motifs and being regulated by shared transcription factors. Here, we identify cis and trans regulatory elements that drive gene expression in the bilateral sensory neuron ASJ, located in the head of the nematode Caenorhabditis elegans. For this purpose, we have dissected the promoters of the only two genes so far reported to be exclusively expressed in ASJ, trx-1 and ssu-1. We hereby identify the ASJ motif, a functional cis-regulatory bipartite promoter region composed of two individual 6 bp elements separated by a 3 bp linker. The first element is a 6 bp CG-rich sequence that presumably binds the Sp family member zinc-finger transcription factor SPTF-1. Interestingly, within the C. elegans nervous system SPTF-1 is also found to be expressed only in ASJ neurons where it regulates expression of other genes in these neurons and ASJ cell fate. The second element of the bipartite motif is a 6 bp AT-rich sequence that is predicted to potentially bind a transcription factor of the homeobox family. Together, our findings identify a specific promoter signature and SPTF-1 as a transcription factor that functions as a terminal selector gene to regulate gene expression in C. elegans ASJ sensory neurons. PMID:25769980
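
    The bipartite layout described above (a 6 bp CG-rich element, a 3 bp linker, and a 6 bp AT-rich element) can be sketched as a simple pattern search. The abstract does not give the actual consensus, so the pattern below is a hypothetical stand-in that requires the first element to consist only of C/G and the second only of A/T, a stricter reading than "rich". Only the given strand is scanned.

```python
import re

# Hypothetical pattern: six C/G bases, any 3 bp linker, six A/T bases.
# The lookahead lets overlapping candidate sites be reported.
BIPARTITE = re.compile(r"(?=([CG]{6}[ACGT]{3}[AT]{6}))")

def find_bipartite_sites(seq):
    """Return (start, matched_sequence) for every candidate site in seq."""
    return [(m.start(), m.group(1)) for m in BIPARTITE.finditer(seq.upper())]
```

    A real scan would also search the reverse complement and would more likely use a position weight matrix than an exact-composition rule.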

  18. Using machine learning to predict gene expression and discover sequence motifs

    NASA Astrophysics Data System (ADS)

    Li, Xuejing

    Recently, large amounts of experimental data for complex biological systems have become available. We use tools and algorithms from machine learning to build data-driven predictive models. We first present a novel algorithm to discover gene sequence motifs associated with temporal expression patterns of genes. Our algorithm, which is based on partial least squares (PLS) regression, is able to directly model the flow of information, from gene sequence to gene expression, to learn cis regulatory motifs and characterize associated gene expression patterns. Our algorithm outperforms traditional computational methods e.g. clustering in motif discovery. We then present a study of extending a machine learning model for transcriptional regulation predictive of genetic regulatory response to Caenorhabditis elegans. We show meaningful results both in terms of prediction accuracy on the test experiments and biological information extracted from the regulatory program. The model discovers DNA binding sites ab initio. We also present a case study where we detect a signal of lineage-specific regulation. Finally we present a comparative study on learning predictive models for motif discovery, based on different boosting algorithms: Adaptive Boosting (AdaBoost), Linear Programming Boosting (LPBoost) and Totally Corrective Boosting (TotalBoost). We evaluate and compare the performance of the three boosting algorithms via both statistical and biological validation, for hypoxia response in Saccharomyces cerevisiae.

  19. Regulatory elements in the 5'region of 16SrRNA gene of Bacillus sp. strain SJ-101

    PubMed Central

    Singh, Braj R; Al-Khedhairy, Abdulaziz A; Alarifi, Saud A; Musarrat, Javed

    2009-01-01

    Advancement in bioinformatics with the development of computational tools has enabled the in-silico prediction and identification of transcription regulatory factors and other genetic elements with great ease. In this study, computational analysis of sequence homology of 546 bp 5’ region of 16SrRNA gene of Bacillus sp. strain SJ-101 resulted in identification of promoter-like sequences within the rrn gene. Using BPROM tool, the regulatory motifs like -35 and -10 boxes were mapped at 392 and 411 positions, respectively. Furthermore, the cis-acting elements as the binding sites for transcription factors (TF) cpxR and argR were identified at positions 413 and 416 at the upstream of an open reading frame (ORF). The probable functions of the putative TFs were predicted through the Uni-Prot/Swiss-Prot protein database. Search for the Shine-Dalgarno sequence (SD) found the presence of highly conserved SD sequence (AATACC), and a short 42 bp coding sequence/ORF bounded with characteristic transcription start site (AAC) and a stop codon (TGA) at positions 426 and 465 downstream to the promoter elements. A 13 amino acid long translation product of a short ORF has exhibited 100% homology with protein sequences of Bacillus spp., while showing some degree of polymorphism with other reference strains. The comparative homology of the small protein exhibited maximum similarity with Prolyl-4 hydroxylase of Chlamydomonas reinhardtii with 4.11 ZSCORE. The highly conserved regulatory elements and the putative ORF predicted within the 16SrRNA gene may help understand the role of relatively unexplored short ORFs within rrn operon, and their functional products in genetic regulatory mechanisms in eubacteria. PMID:19759811

  20. Flanking regulatory sequences of the locus encoding the murine GDNF receptor, c-ret, directs lac Z (beta-galactosidase) expression in developing somatosensory system.

    PubMed

    Sukumaran, M; Waxman, S G; Wood, J N; Pachnis, V

    2001-11-01

    RET forms the catalytic component within the receptor complex that transmits signals from the GDNF family of neurotrophic factors. To study the mechanisms regulating the cell-type specific expression of this gene, we have cloned and characterised the murine c-ret locus. A cosmid contig comprising approximately 60 kb of the mouse genome encompassing the entire structural gene and flanking sequences have been isolated and the transcription initiation site identified and promoter characterised. The murine c-ret promoter lacks a TATA initiation motif and has GC enriched DNA sequences reminiscent of CpG islands. Analysis of transgenic mice lines bearing the Lac Z (beta-galactosidase) reporter gene under the control of 5' flanking sequences show modularity in the organisation of cis-regulatory domains within the locus. Cloned 5' flanking sequences comprise a distal regulatory domain directing Lac Z expression at the primitive streak, lateral mesoderm and facial ganglia and a proximal sensory neurones specific regulatory domain inducing Lac Z expression primarily within the developing somatosensory system. The spatial and temporal progression of transgene expression precisely recapitulates endogenous gene expression in developing sensory ganglia including its induction in postnatal Isolectin B4 binding nociceptive neurones. PMID:11747074

  1. The naphthalene catabolic (nag) genes of Polaromonas naphthalenivorans CJ2: Evolutionary implications for two gene clusters and novel regulatory control

    SciTech Connect

    Jeon, C.O.; Park, M.; Ro, H.S.; Park, W.; Madsen, E.L.

    2006-02-15

    Polaromonas naphthalenivorans CJ2, found to be responsible for the degradation of naphthalene in situ at a coal tar waste-contaminated site, is able to grow on mineral salts agar media with naphthalene as the sole carbon source. Beginning from a 484-bp nagAc-like region, we used a genome walking strategy to sequence genes encoding the entire naphthalene degradation pathway and additional flanking regions. We found that the naphthalene catabolic genes in P. naphthalenivorans CJ2 were divided into one large and one small gene cluster, separated by an unknown distance. The large gene cluster is bounded by a LysR-type regulator (nagR). The small cluster is bounded by a MarR-type regulator (nagR2). The catabolic genes of P. naphthalenivorans CJ2 were homologous to many of those of Ralstonia U2, which uses the gentisate pathway to convert naphthalene to central metabolites. However, three open reading frames (nagY, nagM, and nagN), present in Ralstonia U2, were absent. Also, P. naphthalenivorans carries two copies of gentisate dioxygenase (nagI) with 77.4% DNA sequence identity to one another and 82% amino acid identity to their homologue in Ralstonia sp. strain U2. Investigation of the operons using reverse transcription PCR showed that each cluster was controlled independently by its respective promoter. Insertional inactivation and lacZ reporter assays showed that nagR2 is a negative regulator and that expression of the small cluster is not induced by naphthalene, salicylate, or gentisate. Association of two putative Azoarcus-related transposases with the large cluster and one Azoarcus-related putative salicylate 5-hydroxylase gene (ORF2) in the small cluster suggests that mobile genetic elements were likely involved in creating the novel arrangement of catabolic and regulatory genes in P. naphthalenivorans.

  2. Novel phytochrome sequences in Arabidopsis thaliana: Structure, evolution, and differential expression of a plant regulatory photoreceptor family

    SciTech Connect

    Sharrock, R.A.; Quail, P.H.

    1989-01-01

    Phytochrome is a plant regulatory photoreceptor that mediates red light effects on a wide variety of physiological and molecular responses. DNA blot analysis indicates that the Arabidopsis thaliana genome contains four to five phytochrome-related gene sequences. The authors have isolated and sequenced cDNA clones corresponding to three of these genes and have deduced the amino acid sequence of the full-length polypeptide encoded in each case. One of these proteins (phyA) shows 65-80% amino acid sequence identity with the major, etiolated-tissue phytochrome apoproteins described previously in other plant species. The other two polypeptides (phyB and phyC) are unique in that they have low sequence identity with each other, with phyA, and with all previously described phytochromes. The phyA, phyB, and phyC proteins are of similar molecular mass, have related hydropathic profiles, and contain a conserved chromophore attachment region. However, the sequence comparison data indicate that the three phy genes diverged early in plant evolution, well before the divergence of the two major groups of angiosperms, the monocots and dicots. The steady-state level of the phyA transcript is high in dark-grown A. thaliana seedlings and is down-regulated by light. In contrast, the phyB and phyC transcripts are present at lower levels and are not strongly light-regulated. These findings indicate that the red/far red light-responsive phytochrome photoreceptor system in A. thaliana, and perhaps in all higher plants, consists of a family of chromoproteins that are heterogeneous in structure and regulation.

  3. Complex Dynamic Behavior in Simple Gene Regulatory Networks

    NASA Astrophysics Data System (ADS)

    Santillán Zerón, Moisés

    2007-02-01

    Knowing the complete genome of a given species is just a piece of the puzzle. To fully unveil the systems behavior of an organism, an organ, or even a single cell, we need to understand the underlying gene regulatory dynamics. Given the complexity of the whole system, the ultimate goal is unattainable for the moment. But perhaps, by analyzing the simplest genetic systems, we may be able to develop the mathematical techniques and procedures required to tackle more complex genetic networks in the near future. In the present work, the techniques for developing mathematical models of simple bacterial gene networks, like the tryptophan and lactose operons, are introduced. Despite all of the underlying assumptions, such models can provide valuable information regarding gene regulation dynamics. Here, we pay special attention to robustness as an emergent property. These notes are organized as follows. In the first section, the long historical relation between mathematics, physics, and biology is briefly reviewed. Recently, the multidisciplinary work in biology has received great attention in the form of systems biology. The main concepts of this novel science are discussed in the second section. A very slim introduction to the essential concepts of molecular biology is given in the third section. In the fourth section, a brief introduction to chemical kinetics is presented. Finally, in the fifth section, a mathematical model for the lactose operon is developed and analyzed.

  4. Identifying sleep regulatory genes using a Drosophila model of insomnia

    PubMed Central

    Seugnet, Laurent; Suzuki, Yasuko; Thimgan, Matthew; Donlea, Jeff; Gimbel, Sarah I.; Gottschalk, Laura; Duntley, Steve P.; Shaw, Paul J.

    2009-01-01

    Although it is widely accepted that sleep must serve an essential biological function, little is known about molecules that underlie sleep regulation. Given that insomnia is a common sleep disorder that disrupts the ability to initiate and maintain restorative sleep, a better understanding of its molecular underpinning may provide crucial insights into sleep regulatory processes. Thus, we created a line of flies using laboratory selection that share traits with human insomnia. After 60 generations insomnia-like (ins-l) flies sleep 60 min a day, exhibit difficulty initiating sleep, difficulty maintaining sleep, and show evidence of daytime cognitive impairment. ins-l flies are also hyperactive and hyperresponsive to environmental perturbations. In addition they have difficulty maintaining their balance, have elevated levels of dopamine, are short-lived and show increased levels of triglycerides, cholesterol, and free fatty acids. While their core molecular clock remains intact, ins-l flies lose their ability to sleep when placed into constant darkness. Whole genome profiling identified genes that are modified in ins-l flies. Among those differentially expressed transcripts, genes involved in metabolism, neuronal activity, and sensory perception constituted over-represented categories. We demonstrate that two of these genes are upregulated in human subjects following acute sleep deprivation. Together these data indicate that the ins-l flies are a useful tool that can be used to identify molecules important for sleep regulation and may provide insights into both the causes and long-term consequences of insomnia. PMID:19494137

  5. Stochastic S-system modeling of gene regulatory network.

    PubMed

    Chowdhury, Ahsan Raja; Chetty, Madhu; Evans, Rob

    2015-10-01

    Microarray gene expression data can provide insights into biological processes at a system-wide level and is commonly used for reverse engineering gene regulatory networks (GRN). Due to the amalgamation of noise from different sources, microarray expression profiles become inherently noisy leading to significant impact on the GRN reconstruction process. Microarray replicates (both biological and technical), generated to increase the reliability of data obtained under noisy conditions, have limited influence in enhancing the accuracy of reconstruction. Therefore, instead of the conventional GRN modeling approaches which are deterministic, stochastic techniques are becoming increasingly necessary for inferring GRN from noisy microarray data. In this paper, we propose a new stochastic GRN model by investigating incorporation of various standard noise measurements in the deterministic S-system model. Experimental evaluations performed for varying sizes of synthetic network, representing different stochastic processes, demonstrate the effect of noise on the accuracy of genetic network modeling and the significance of stochastic modeling for GRN reconstruction. The proposed stochastic model is subsequently applied to infer the regulations among genes in two real life networks: (1) the well-studied IRMA network, a real-life in-vivo synthetic network constructed within the Saccharomyces cerevisiae yeast, and (2) the SOS DNA repair network in Escherichia coli.
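
    For orientation, the standard deterministic S-system has the form dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij. The sketch below evaluates these rates and takes one Euler-Maruyama step with additive Gaussian noise. The additive noise term is only one possible choice and is an assumption of this example; the abstract does not specify which noise models the authors incorporated. All function and parameter names are hypothetical.

```python
import random

def s_system_rates(x, alpha, beta, g, h):
    """Deterministic S-system rates.

    dx_i/dt = alpha_i * prod_j x_j**g[i][j] - beta_i * prod_j x_j**h[i][j]
    """
    rates = []
    for i in range(len(x)):
        production, degradation = alpha[i], beta[i]
        for j, xj in enumerate(x):
            production *= xj ** g[i][j]
            degradation *= xj ** h[i][j]
        rates.append(production - degradation)
    return rates

def noisy_euler_step(x, alpha, beta, g, h, dt, sigma, rng):
    """One Euler-Maruyama step with additive Gaussian noise (illustrative only).

    Values are clamped to a small positive floor so the power-law terms
    remain defined on the next step.
    """
    rates = s_system_rates(x, alpha, beta, g, h)
    return [
        max(xi + r * dt + sigma * rng.gauss(0.0, 1.0) * dt ** 0.5, 1e-9)
        for xi, r in zip(x, rates)
    ]
```

    Setting `sigma` to 0 recovers the plain deterministic Euler update, which makes it easy to compare noisy and noise-free trajectories from the same starting state; `rng` can be a seeded `random.Random` instance for reproducibility.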

  6. Gene-regulatory activity of alpha-tocopherol.

    PubMed

    Rimbach, Gerald; Moehring, Jennifer; Huebbe, Patricia; Lodge, John K

    2010-03-01

    Vitamin E is an essential vitamin and a lipid soluble antioxidant, at least, under in vitro conditions. The antioxidant properties of vitamin E are exerted through its phenolic hydroxyl group, which donates hydrogen to peroxyl radicals, resulting in the formation of stable lipid species. Besides an antioxidant role, important cell signalling properties of vitamin E have been described. By using gene chip technology we have identified alpha-tocopherol sensitive molecular targets in vivo including Christmas factor (involved in the blood coagulation) and 5alpha-steroid reductase type 1 (catalyzes the conversion of testosterone to 5alpha-dihydrotestosterone) being upregulated and gamma-glutamyl-cysteinyl synthetase (the rate limiting enzyme in GSH synthesis) being downregulated due to alpha-tocopherol deficiency. Alpha-tocopherol regulates signal transduction cascades not only at the mRNA but also at the miRNA level since miRNA 122a (involved in lipid metabolism) and miRNA 125b (involved in inflammation) are downregulated by alpha-tocopherol. Genetic polymorphisms may determine the biological and gene-regulatory activity of alpha-tocopherol. In this context we have recently shown that genes encoding for proteins involved in peripheral alpha-tocopherol transport and degradation are significantly affected by the apoE genotype.

  7. Sequence determinants of prokaryotic gene expression level under heat stress.

    PubMed

    Xiong, Heng; Yang, Yi; Hu, Xiao-Pan; He, Yi-Ming; Ma, Bin-Guang

    2014-11-01

    Prokaryotic gene expression is environment-dependent and temperature plays an important role in shaping the gene expression profile. Revealing the regulation mechanisms of gene expression pertaining to temperature has attracted tremendous efforts in recent years, particularly owing to the transcriptome and proteome data yielded by high-throughput techniques. However, most of the previous works concentrated on the characterization of the gene expression profile of individual organisms and little effort has been made to disclose the commonality among organisms, especially for the gene sequence features. In this report, we collected the transcriptome and proteome data measured under heat stress condition from recently published literature and studied the sequence determinants for the expression level of heat-responsive genes on multiple layers. Our results showed that there indeed exist common and consistent patterns of the sequence features among organisms for the differentially expressed genes under heat stress condition. Some features are attributed to the requirement of thermostability while some are dominated by gene function. The revealed sequence determinants of bacterial gene expression level under heat stress complement the knowledge about the regulation factors of prokaryotic gene expression responding to the change of environmental conditions. Furthermore, comparisons to thermophilic adaptation have been performed to reveal the similarity and dissimilarity of the sequence determinants for the response to heat stress and for the adaptation to high habitat temperature, which elucidates the complex landscape of gene expression related to the same physical factor of temperature.

  8. Stability Depends on Positive Autoregulation in Boolean Gene Regulatory Networks

    PubMed Central

    Pinho, Ricardo; Garcia, Victor; Irimia, Manuel; Feldman, Marcus W.

    2014-01-01

    Network motifs have been identified as building blocks of regulatory networks, including gene regulatory networks (GRNs). The most basic motif, autoregulation, has been associated with bistability (when positive) and with homeostasis and robustness to noise (when negative), but its general importance in network behavior is poorly understood. Moreover, how specific autoregulatory motifs are selected during evolution and how this relates to robustness is largely unknown. Here, we used a class of GRN models, Boolean networks, to investigate the relationship between autoregulation and network stability and robustness under various conditions. We ran evolutionary simulation experiments for different models of selection, including mutation and recombination. Each generation simulated the development of a population of organisms modeled by GRNs. We found that stability and robustness positively correlate with autoregulation; in all investigated scenarios, stable networks had mostly positive autoregulation. Assuming biological networks correspond to stable networks, these results suggest that biological networks should often be dominated by positive autoregulatory loops. This seems to be the case for most studied eukaryotic transcription factor networks, including those in yeast, flies and mammals. PMID:25375153
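
    To make the Boolean-network setting concrete, the sketch below implements one common synchronous threshold update in which the diagonal of the weight matrix encodes autoregulation. The update rule, including the convention that a gene keeps its state when its net input is zero, is an assumption of this example; the abstract does not describe the authors' exact rule. The names `step` and `is_fixed_point` are hypothetical.

```python
def step(state, weights):
    """Synchronously update every gene in a Boolean threshold network.

    weights[i][j] is the effect of gene j on gene i (+1 activates,
    -1 represses, 0 no link); weights[i][i] is autoregulation.
    A gene switches on for positive net input, off for negative net
    input, and keeps its current state when the net input is zero.
    """
    new_state = []
    for i, row in enumerate(weights):
        total = sum(w * s for w, s in zip(row, state))
        if total > 0:
            new_state.append(1)
        elif total < 0:
            new_state.append(0)
        else:
            new_state.append(state[i])
    return tuple(new_state)

def is_fixed_point(state, weights):
    """True if one synchronous update leaves the state unchanged."""
    return step(state, weights) == tuple(state)
```

    With two mutually repressing genes that each activate themselves, `weights = [[1, -1], [-1, 1]]`, both `(1, 0)` and `(0, 1)` are fixed points under this rule, a small instance of the bistability associated with positive autoregulation.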

  9. Eric Davidson: Steps to a gene regulatory network for development.

    PubMed

    Rothenberg, Ellen V

    2016-04-15

    Eric Harris Davidson was a unique and creative intellectual force who grappled with the diversity of developmental processes used by animal embryos and wrestled them into an intelligible set of principles, then spent his life translating these process elements into molecularly definable terms through the architecture of gene regulatory networks. He took speculative risks in his theoretical writing but ran a highly organized, rigorous experimental program that yielded an unprecedentedly full characterization of a developing organism. His writings created logical order and a framework for mechanism from the complex phenomena at the heart of advanced multicellular organism development. This is a reminiscence of intellectual currents in his work as observed by the author through the last 30-35 years of Davidson's life. PMID:26825392

  10. An evolutionary constraint: strongly disfavored class of change in DNA sequence during divergence of cis-regulatory modules.

    PubMed

    Cameron, R Andrew; Chow, Suk Hen; Berney, Kevin; Chiu, Tsz-Yeung; Yuan, Qiu-Autumn; Krämer, Alexander; Helguero, Argelia; Ransick, Andrew; Yun, Mirong; Davidson, Eric H

    2005-08-16

    The DNA of functional cis-regulatory modules displays extensive sequence conservation in comparisons of genomes from modestly distant species. Patches of sequence that are several hundred base pairs in length within these modules are often seen to be 80-95% identical, although the flanking sequence cannot even be aligned. However, it is unlikely that base pairs located between the transcription factor target sites of cis-regulatory modules have sequence-dependent function, and the mechanism that constrains evolutionary change within cis-regulatory modules is incompletely understood. We chose five functionally characterized cis-regulatory modules from the Strongylocentrotus purpuratus (sea urchin) genome and obtained orthologous regulatory and flanking sequences from a bacterial artificial chromosome genome library of a congener, Strongylocentrotus franciscanus. As expected, single-nucleotide substitutions and small indels occur freely at many positions within the regulatory modules of these two species, as they do outside the regulatory modules. However, large indels (>20 bp) are statistically almost absent within the regulatory modules, although they are common in flanking intergenic or intronic sequence. The result helps to explain the patterns of evolutionary sequence divergence characteristic of cis-regulatory DNA.
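
    The size-class comparison in this abstract (small indels versus large indels of more than 20 bp) can be sketched as a count of gap runs in a pairwise alignment. This is a hedged illustration only: the gapped-string input format and the function names gap_lengths and count_indels are assumptions, and the authors' actual alignment and statistics pipeline is not described here.

```python
import re

def gap_lengths(aligned_a, aligned_b):
    """Return the length of every run of '-' in either aligned sequence."""
    lengths = []
    for aligned in (aligned_a, aligned_b):
        lengths.extend(len(m.group()) for m in re.finditer(r"-+", aligned))
    return lengths

def count_indels(aligned_a, aligned_b, threshold=20):
    """Split indels into 'small' (<= threshold bp) and 'large' (> threshold bp)."""
    lengths = gap_lengths(aligned_a, aligned_b)
    large = sum(1 for n in lengths if n > threshold)
    return {"small": len(lengths) - large, "large": large}

# Hypothetical aligned fragments: one 25-bp gap and one 2-bp gap.
seq_a = "ACGT" + "-" * 25 + "ACGT--A"
seq_b = "ACGT" + "A" * 25 + "ACGTTTA"
print(count_indels(seq_a, seq_b))
```

    Running the same count separately on module and flanking alignments would give the two sets of tallies that such a comparison needs.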

  11. Discovery of sequence motifs related to coexpression of genes using evolutionary computation

    PubMed Central

    Fogel, Gary B.; Weekes, Dana G.; Varga, Gabor; Dow, Ernst R.; Harlow, Harry B.; Onyia, Jude E.; Su, Chen

    2004-01-01

    Transcription factors are key regulatory elements that control gene expression. Recognition of transcription factor binding site (TFBS) motifs in the upstream region of coexpressed genes is therefore critical towards a true understanding of the regulations of gene expression. The task of discovering eukaryotic TFBSs remains a challenging problem. Here, we demonstrate that evolutionary computation can be used to search for TFBSs in upstream regions of genes known to be coexpressed. Evolutionary computation was used to search for TFBSs of genes regulated by octamer-binding factor and nuclear factor kappa B. The discovered binding sites included experimentally determined known binding motifs as well as lists of putative, previously unknown TFBSs. We believe that this method to search nucleotide sequence information efficiently for similar motifs will be useful for discovering TFBSs that affect gene regulation. PMID:15266008

  12. Messenger RNA in dormant cells of Sterkiella histriomuscorum (Oxytrichidae): identification of putative regulatory gene transcripts.

    PubMed

    Tourancheau, A B; Morin, L; Yang, T; Perasso, R

    1999-08-01

    In the absence of food, the oxytrichid Sterkiella histriomuscorum, like many ciliates, enters into dormancy and transforms into a round and walled encysted cell. When transferred back into a feeding medium, the cyst re-transforms into a vegetative cell in a few hours. This encystment-excystment pathway, which is common to many free-living and parasitic protists, is still poorly understood at the molecular level. In order to identify potential dormant transcripts in the cysts of Sterkiella, we have constructed cDNA libraries from mature cysts. Transcripts have been isolated, confirming the presence of an mRNA pool in the dormant cells. The sequence analysis of two cDNAs indicates open reading frames which show significant similarities to known proteins involved in mechanisms of regulation: 1) nifR3, an element of the nitrogen regulatory system in bacteria, and 2) CROC-1, a newly identified human transcription factor. The two corresponding macronuclear genes represent the first putative regulatory genes isolated in ciliates. From a differential screening of the cDNA library against vegetative cDNA, one cyst-specific (and very abundant) transcript has been isolated, but the product has not yet been identified. The possible involvement of these new ciliate genes in the excystment process is discussed.

  13. The analysis of Gene Regulatory Networks in plant evo-devo.

    PubMed

    Vialette-Guiraud, Aurélie C M; Andres-Robin, Amélie; Chambrier, Pierre; Tavares, Raquel; Scutt, Charles P

    2016-04-01

    We provide an overview of methods and workflows that can be used to investigate the topologies of Gene Regulatory Networks (GRNs) in the context of plant evolutionary-developmental (evo-devo) biology. Many of the species that occupy key positions in plant phylogeny are poorly adapted as laboratory models and so we focus here on techniques that can be efficiently applied to both model and non-model species of interest to plant evo-devo. We outline methods that can be used to describe gene expression patterns and also to elucidate the transcriptional, post-transcriptional, and epigenetic regulatory mechanisms underlying these patterns, in any plant species with a sequenced genome. We furthermore describe how the technique of Protein Resurrection can be used to confirm inferences on ancestral GRNs and also to provide otherwise-inaccessible points of reference in evolutionary histories by exploiting paralogues generated in gene and whole genome duplication events. Finally, we argue for the better integration of molecular data with information from paleobotanical, paleoecological, and paleogeographical studies to provide the fullest possible picture of the processes that have shaped the evolution of plant development. PMID:27006484

  14. Genome-Wide Analysis of Wilms’ Tumor 1-Controlled Gene Expression in Podocytes Reveals Key Regulatory Mechanisms

    PubMed Central

    Kann, Martin; Ettou, Sandrine; Jung, Youngsook L.; Lenz, Maximilian O.; Taglienti, Mary E.; Park, Peter J.; Schermer, Bernhard

    2015-01-01

    The transcription factor Wilms’ tumor suppressor 1 (WT1) is key to podocyte development and viability; however, WT1 transcriptional networks in podocytes remain elusive. We provide a comprehensive analysis of the genome-wide WT1 transcriptional network in podocytes in vivo using chromatin immunoprecipitation followed by sequencing (ChIPseq) and RNA sequencing techniques. Our data show a specific role for WT1 in regulating the podocyte-specific transcriptome through binding to both promoters and enhancers of target genes. Furthermore, we inferred a podocyte transcription factor network consisting of WT1, LMX1B, TCF21, Fox-class and TEAD family transcription factors, and MAFB that uses tissue-specific enhancers to control podocyte gene expression. In addition to previously described WT1-dependent target genes, ChIPseq identified novel WT1-dependent signaling systems. These targets included components of the Hippo signaling system, underscoring the power of genome-wide transcriptional-network analyses. Together, our data elucidate a comprehensive gene regulatory network in podocytes suggesting that WT1 gene regulatory function and podocyte cell-type specification can best be understood in the context of transcription factor-regulatory element network interplay. PMID:25636411

  15. Nucleotide sequence of SHV-2 beta-lactamase gene

    SciTech Connect

    Garbarg-Chenon, A.; Godard, V.; Labia, R.; Nicolas, J.C.

    1990-07-01

    The nucleotide sequence of plasmid-mediated beta-lactamase SHV-2 from Salmonella typhimurium (SHV-2pHT1) was determined. The gene was very similar to chromosomally encoded beta-lactamase LEN-1 of Klebsiella pneumoniae. Compared with the sequence of the Escherichia coli SHV-2 enzyme (SHV-2E.coli) obtained by protein sequencing, the deduced amino acid sequence of SHV-2pHT1 differed by three amino acid substitutions.

  16. Promoter elements determining weak expression of the GAL4 regulatory gene of Saccharomyces cerevisiae.

    PubMed Central

    Griggs, D W; Johnston, M

    1993-01-01

    The GAL4 gene of Saccharomyces cerevisiae (encoding the activator of transcription of the GAL genes) is poorly expressed and is repressed during growth on glucose. To determine the basis for its weak expression and to identify DNA sequences recognized by proteins that activate transcription of a gene that itself encodes an activator of transcription, we have analyzed GAL4 promoter structure. We show that the GAL4 promoter is about 90-fold weaker than the strong GAL1 promoter and at least 7-fold weaker than the feeble URA3 promoter and that this low level of GAL4 expression is primarily due to a weak promoter. By deletion mapping, the GAL4 promoter can be divided into three functional regions. Two of these regions contain positive elements; a distal region termed the UASGAL4 (upstream activation sequence) contains redundant elements that increase promoter function, and a central region termed the UESGAL4 (upstream essential sequence) is essential for even basal levels of GAL4 expression. The third element, an upstream repression sequence, mediates glucose repression of GAL4 expression and is located between the UES and the transcriptional start site. The UASGAL4 is unusual because it is not interchangeable with UAS elements in other yeast promoters; it does not function as a UAS element when inserted in a CYC1 promoter, and a normally strong UAS functions poorly in place of UASGAL4 in the GAL4 promoter. Similarly, the UES element of GAL4 does not function as a TATA element in a test promoter, and consensus TATA elements do not function in place of UES elements in the GAL4 promoter. These results suggest that GAL4 contains a weak TATA-less promoter and that the proteins regulating expression of this regulatory gene may be novel and context specific. PMID:8393142

  17. In silico analysis of the regulatory region of the Yellowtail Kingfish and Zebrafish Kiss and Kiss receptor genes.

    PubMed

    Nocillado, J N; Mechaly, A S; Elizur, A

    2013-02-01

    We have cloned and analysed the partial putative promoter sequences of the Yellowtail Kingfish (Seriola lalandi) Kiss2 and Kiss2r genes (380 and 420 bp, respectively). We obtained in silico 1.5 kb of the zebrafish (Danio rerio) Kiss1, Kiss2, Kiss1r and zfKiss2r sequences upstream of the putative transcriptional initiation site. Bioinformatic analysis revealed promoter regulatory elements including AP-1, Sp1, GR, ER, PR, AR, GATA-1, TTF-1, YY1 and C/EBP. These regulatory elements may mediate novel roles of the Kiss genes and their receptors in addition to their established role in reproductive function. PMID:22527613

  18. cis-acting regulatory elements within gag genes of avian retroviruses.

    PubMed Central

    Arrigo, S; Yun, M; Beemon, K

    1987-01-01

    A cis-acting enhancer element has been detected within the gag gene of several avian retroviruses, including Rous sarcoma virus, Fujinami sarcoma virus, and the endogenous Rous-associated virus-0. A consensus enhancer core sequence, GTGGTTTG, is present in all of these viral genomes, approximately 900 bases downstream from the site of initiation of transcription. When an internal fragment derived from the gag gene of any of these viruses (spanning nucleotides 533 to approximately 1149) was inserted into a plasmid containing the chloramphenicol acetyltransferase (cat) gene under control of the simian virus 40 promoter, 9- or 21-fold enhancement of CAT expression was observed after transfection into mouse L cells and chicken embryo fibroblasts, respectively. This enhancement was not dependent on the position of insertion of the gag fragment into the plasmid. However, there was a strong dependence on orientation, with higher levels of CAT expression in constructs in which the 5' end of the gag fragment was nearest to the promoter, suggesting a possible negative regulatory element at the 3' end of this fragment. Deletion of the 3' end of the insert resulted in a gag fragment, containing nucleotides 533 to 1017, which enhanced expression equally in either orientation. When the gag fragment was inserted into a plasmid containing the cat gene under the control of an intact Rous sarcoma virus long terminal repeat, it induced a two- to threefold increase in CAT activity and CAT mRNA levels. Translation of the gag fragment did not appear to be necessary for the observed enhancement, since two insertional mutations resulting in frameshifts in the gag insert did not affect CAT expression. However, deletion of a 330-base internal fragment from the gag insert restored a basal level of CAT activity. These results suggest that retroviruses have regulatory elements within their genes distinct from those in the long terminal repeats that flank the genes. PMID:3031470
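
    Locating the consensus enhancer core GTGGTTTG named in this abstract amounts to an exact-match scan of both strands. The following is a minimal sketch under that assumption; the function names and the test sequence are hypothetical, and no real viral genome is used.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def find_core(seq, core="GTGGTTTG"):
    """Return sorted (0-based start, strand) pairs for exact matches of the core.

    '-' hits are reported at the position of the reverse-complemented core
    on the given (forward) strand.
    """
    seq = seq.upper()
    hits = []
    for strand, motif in (("+", core), ("-", reverse_complement(core))):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)
    return sorted(hits)

# Hypothetical fragment carrying one forward and one reverse-strand copy of the core.
fragment = "AA" + "GTGGTTTG" + "CC" + "CAAACCAC" + "TT"
print(find_core(fragment))
```

    Exact matching is the simplest choice; a position weight matrix would be needed to tolerate the mismatches that a consensus sequence usually implies.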

  19. Improving the safety of viral DNA vaccines: development of vectors containing both 5' and 3' homologous regulatory sequences from non-viral origin.

    PubMed

    Martinez-Lopez, A; Encinas, P; García-Valtanen, P; Gomez-Casado, E; Coll, J M; Estepa, A

    2013-04-01

    Although some DNA vaccines have proved to be very efficient in field trials, their authorisation still remains limited to a few countries. This is in part due to safety issues because most of them contain viral regulatory sequences to drive the expression of the encoded antigen. This is the case of the only DNA vaccine against a fish rhabdovirus (a negative ssRNA virus), authorised in Canada, despite the important economic losses that these viruses cause to aquaculture all over the world. In an attempt to solve this problem and using as a model a non-authorised, but efficient DNA vaccine against the fish rhabdovirus, viral haemorrhagic septicaemia virus (VHSV), we developed a plasmid construction containing regulatory sequences exclusively from fish origin. The result was an "all-fish vector", named pJAC-G, containing 5' and 3' regulatory sequences of β-actin genes from carp and zebrafish, respectively. In vitro and in vivo, pJAC-G drove a successful expression of the VHSV glycoprotein G (G), the only antigen of the virus conferring in vivo protection. Furthermore, and by means of in vitro fusion assays, it was confirmed that G protein expressed from pJAC-G was fully functional. Altogether, these results suggest that DNA vaccines containing host-homologous gene regulatory sequences might be useful for developing safer DNA vaccines, while they also might be useful for basic studies.

  20. Phase-defined complete sequencing of the HLA genes by next-generation sequencing

    PubMed Central

    2013-01-01

    Background The human leukocyte antigen (HLA) region, the 3.8-Mb segment of the human genome at 6p21, has been associated with more than 100 different diseases, mostly autoimmune diseases. Due to the complex nature of HLA genes, there are difficulties in elucidating complete HLA gene sequences, especially HLA gene haplotype structures, by the conventional sequencing method. We propose a novel, accurate, and cost-effective method for generating phase-defined complete sequencing of HLA genes by using indexed multiplex next generation sequencing. Results A total of 33 HLA homozygous samples, 11 HLA heterozygous samples, and 3 parent-child families were subjected to phase-defined HLA gene sequencing. We applied long-range PCR to amplify six HLA genes (HLA-A, -C, -B, -DRB1, -DQB1, and -DPB1) followed by transposase-based library construction and multiplex sequencing with the MiSeq sequencer. Paired-end reads (2 × 250 bp) derived from the sequencer were aligned to the six HLA gene segments of UCSC hg19 allowing at most 80 mismatched bases. For HLA homozygous samples, the six amplicons of an individual were pooled and simultaneously sequenced and mapped as an individual-tagging method. The paired-end reads were aligned to corresponding genes of UCSC hg19 and unambiguous, continuous sequences were obtained. For HLA heterozygous samples, each amplicon was separately sequenced and mapped as a gene-tagging method. After alignments, we detected informative paired-end reads harboring SNVs on both forward and reverse reads that are used to separate two chromosomes and to generate two phase-defined sequences in an individual. Consequently, we were able to determine the phase-defined HLA gene sequences from promoter to 3′-UTR and assign up to 8-digit HLA allele numbers, regardless of whether the alleles are rare or novel. Parent-child trio-based sequencing validated our sequencing and phasing methods. Conclusions Our protocol generated phase-defined sequences of the entire

  1. Evolution of the CNS myelin gene regulatory program.

    PubMed

    Li, Huiliang; Richardson, William D

    2016-06-15

    Myelin is a specialized subcellular structure that evolved uniquely in vertebrates. A myelinated axon conducts action potentials many times faster than an unmyelinated axon of the same diameter; for the same conduction speed, the unmyelinated axon would need a much larger diameter and volume than its myelinated counterpart. Hence myelin speeds information transfer and saves space, allowing the evolution of a powerful yet portable brain. Myelination in the central nervous system (CNS) is controlled by a gene regulatory program that features a number of master transcriptional regulators including Olig1, Olig2 and Myrf. Olig family genes evolved from a single ancestral gene in non-chordates. Olig2, which executes multiple functions with regard to oligodendrocyte identity and development in vertebrates, might have evolved functional versatility through post-translational modification, especially phosphorylation, as illustrated by its evolutionarily conserved serine/threonine phospho-acceptor sites and its accumulation of serine residues during more recent stages of vertebrate evolution. Olig1, derived from a duplicated copy of Olig2 in early bony fish, is involved in oligodendrocyte development and is critical to remyelination in bony vertebrates, but is lost in birds. The origin of Myrf orthologs might be the result of DNA integration between an invading phage or bacterium and an early protist, producing a fusion protein capable of self-cleavage and DNA binding. Myrf seems to have adopted new functions in early vertebrates - initiation of the CNS myelination program as well as the maintenance of mature oligodendrocyte identity and myelin structure - by developing new ways to interact with DNA motifs specific to myelin genes. This article is part of a Special Issue entitled SI: Myelin Evolution.

  2. ZNF143 provides sequence specificity to secure chromatin interactions at gene promoters

    PubMed Central

    Bailey, Swneke D.; Zhang, Xiaoyang; Desai, Kinjal; Aid, Malika; Corradin, Olivia; Cowper-Sal·lari, Richard; Akhtar-Zaidi, Batool; Scacheri, Peter C.; Haibe-Kains, Benjamin; Lupien, Mathieu

    2015-01-01

    Chromatin interactions connect distal regulatory elements to target gene promoters guiding stimulus- and lineage-specific transcription. Few factors securing chromatin interactions have so far been identified. Here by integrating chromatin interaction maps with the large collection of transcription factor binding profiles provided by the ENCODE project, we demonstrate that the zinc-finger protein ZNF143 preferentially occupies anchors of chromatin interactions connecting promoters with distal regulatory elements. It binds directly to promoters and associates with lineage-specific chromatin interactions and gene expression. Silencing ZNF143 or modulating its DNA-binding affinity using single nucleotide polymorphisms (SNPs) as a surrogate of site-directed mutagenesis reveals the sequence dependency of chromatin interactions at gene promoters. We also find that chromatin interactions alone do not regulate gene expression. Together, our results identify ZNF143 as a novel chromatin-looping factor that contributes to the architectural foundation of the genome by providing sequence specificity at promoters connected with distal regulatory elements. PMID:25645053

  3. A regulatory gene network related to the porcine umami taste receptor (TAS1R1/TAS1R3).

    PubMed

    Kim, J M; Ren, D; Reverter, A; Roura, E

    2016-02-01

    Taste perception plays an important role in the mediation of food choices in mammals. The first porcine taste receptor genes identified, sequenced and characterized, TAS1R1 and TAS1R3, were related to the dimeric receptor for umami taste. However, little is known about their regulatory network. The objective of this study was to unfold the genetic network involved in porcine umami taste perception. We performed a meta-analysis of 20 gene expression studies spanning 480 porcine microarray chips and screened 328 taste-related genes by selective mining steps among the available 12,320 genes. A porcine umami taste-specific regulatory network was constructed based on the normalized coexpression data of the 328 genes across 27 tissues. From the network, we revealed the 'taste module' and identified a coexpression cluster for the umami taste according to the first connector with the TAS1R1/TAS1R3 genes. Our findings identify several taste-related regulatory genes and extend previous genetic background of porcine umami taste.

  4. Coordinated regulation of biosynthetic and regulatory genes coincides with anthocyanin accumulation in developing eggplant fruit

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Violet to black pigmentation of eggplant (Solanum melongena) fruit is attributed to anthocyanin accumulation. Model systems support the interaction of biosynthetic and regulatory genes for anthocyanin biosynthesis. Anthocyanin structural gene transcription requires the expression of at least one m...

  5. Structure and sequence divergence of two archaebacterial genes.

    PubMed Central

    Cue, D; Beckler, G S; Reeve, J N; Konisky, J

    1985-01-01

    The DNA sequences of a region that includes the hisA gene of two related methanogenic archaebacteria, Methanococcus voltae and Methanococcus vannielii, have been compared. Both organisms show a similar genome organization in this region, displaying three open reading frames (ORFs) separated by regions of very high A + T content. Two of the ORFs, including ORFHisA, show significant DNA sequence homology. As might be expected for organisms having a genome that is A + T-rich, there is a high preference for A and U as the third base in codons. Although the regions upstream of the structural genes contain prokaryotic-like promoter sequences, it is not known whether they are recognized as promoters in these archaebacterial cells. A ribosome binding site, G-G-T-G, is located 6 base pairs preceding the ATG translation initiation sequence of both hisA genes. The sequences upstream of the two hisA genes show only limited sequence homology. The M. voltae intergenic region contains four tandemly arranged repetitions of an 11-base-pair sequence, whereas the M. vannielii sequence contains both direct and inverted repetitive sequences. Based on the degree of hisA sequence homology, we conclude that M. voltae and M. vannielii are less closely related taxonomically than are members of the enteric group of eubacteria. PMID:3923489
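
    The codon bias noted in this abstract (a preference for A and U in the third codon position) can be quantified as the fraction of third positions that are A or T in the DNA reading frame. The sketch below assumes an in-frame ORF given as a DNA string; the function name and the example ORF are hypothetical.

```python
def third_position_at_fraction(orf):
    """Fraction of codons whose third base is A or T (U in the mRNA).

    Assumes the string starts in frame; trailing bases that do not
    complete a codon are ignored.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    thirds = [orf[i + 2] for i in range(0, usable, 3)]
    if not thirds:
        return 0.0
    return sum(1 for base in thirds if base in "AT") / len(thirds)

# Hypothetical four-codon ORF: ATG AAA TTT GGC (third bases G, A, T, C).
print(third_position_at_fraction("ATGAAATTTGGC"))
```

    Comparing this value with the overall A + T content of the same ORF indicates whether the third position is enriched beyond the genome-wide base composition.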

  6. The Impact of Gene Expression Variation on the Robustness and Evolvability of a Developmental Gene Regulatory Network

    PubMed Central

    Garfield, David A.; Runcie, Daniel E.; Babbitt, Courtney C.; Haygood, Ralph; Nielsen, William J.; Wray, Gregory A.

    2013-01-01

    Regulatory interactions buffer development against genetic and environmental perturbations, but adaptation requires phenotypes to change. We investigated the relationship between robustness and evolvability within the gene regulatory network underlying development of the larval skeleton in the sea urchin Strongylocentrotus purpuratus. We find extensive variation in gene expression in this network throughout development in a natural population, some of which has a heritable genetic basis. Switch-like regulatory interactions predominate during early development, buffer expression variation, and may promote the accumulation of cryptic genetic variation affecting early stages. Regulatory interactions during later development are typically more sensitive (linear), allowing variation in expression to affect downstream target genes. Variation in skeletal morphology is associated primarily with expression variation of a few, primarily structural, genes at terminal positions within the network. These results indicate that the position and properties of gene interactions within a network can have important evolutionary consequences independent of their immediate regulatory role. PMID:24204211

  7. Regulatory component analysis: a semi-blind extraction approach to infer gene regulatory networks with imperfect biological knowledge.

    PubMed

    Wang, Chen; Xuan, Jianhua; Shih, Ie-Ming; Clarke, Robert; Wang, Yue

    2012-08-01

    With the advent of high-throughput biotechnology capable of monitoring genomic signals, it becomes increasingly promising to understand molecular cellular mechanisms through systems biology approaches. One of the active research topics in systems biology is to infer gene transcriptional regulatory networks using various genomic data; this inference problem can be formulated as a linear model with latent signals associated with some regulatory proteins called transcription factors (TFs). As common statistical assumptions may not hold for genomic signals, typical latent variable algorithms such as independent component analysis (ICA) are incapable of revealing the true underlying regulatory signals. Liao et al. [1] proposed to perform inference using an approach named network component analysis (NCA), the optimization of which is achieved by a least-squares fitting approach with biological knowledge constraints. However, the incompleteness of biological knowledge and its inconsistency with gene expression data are not considered in the original NCA solution, which could greatly affect the inference accuracy. To overcome these limitations, we propose a linear extraction scheme, namely regulatory component analysis (RCA), to infer underlying regulatory signals even with partial biological knowledge. Numerical simulations show a significant improvement of our proposed RCA over NCA, not only when the signal-to-noise ratio (SNR) is low, but also when the given biological knowledge is incomplete and inconsistent with gene expression data. Furthermore, real biological experiments on E. coli are performed for regulatory network inference in comparison with several typical linear latent variable methods, which again demonstrates the effectiveness and improved performance of the proposed algorithm.

  8. An algebra-based method for inferring gene regulatory networks

    PubMed Central

    2014-01-01

    Background The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. Results This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also

  9. Single molecule targeted sequencing for cancer gene mutation detection.

    PubMed

    Gao, Yan; Deng, Liwei; Yan, Qin; Gao, Yongqian; Wu, Zengding; Cai, Jinsen; Ji, Daorui; Li, Gailing; Wu, Ping; Jin, Huan; Zhao, Luyang; Liu, Song; Ge, Liangjin; Deem, Michael W; He, Jiankui

    2016-01-01

    With the rapid decline in the cost of sequencing, it is now affordable to examine multiple genes in a single disease-targeted clinical test using next generation sequencing. Current targeted sequencing methods require a separate step of targeted capture enrichment during sample preparation before sequencing. Although there are fast sample preparation methods available on the market, the library preparation process is still relatively complicated for physicians to use routinely. Here, we introduced an amplification-free Single Molecule Targeted Sequencing (SMTS) technology, which combined targeted capture and sequencing in one step. We demonstrated that this technology can detect low-frequency mutations using an artificially synthesized DNA sample. SMTS has several potential advantages, including simple sample preparation, so that no biases or errors are introduced by PCR. SMTS has the potential to be an easy and quick sequencing technology for clinical diagnosis such as cancer gene mutation detection, infectious disease detection, inherited condition screening and noninvasive prenatal diagnosis. PMID:27193446

  10. Single molecule targeted sequencing for cancer gene mutation detection

    PubMed Central

    Gao, Yan; Deng, Liwei; Yan, Qin; Gao, Yongqian; Wu, Zengding; Cai, Jinsen; Ji, Daorui; Li, Gailing; Wu, Ping; Jin, Huan; Zhao, Luyang; Liu, Song; Ge, Liangjin; Deem, Michael W.; He, Jiankui

    2016-01-01

    With the rapid decline in the cost of sequencing, it is now affordable to examine multiple genes in a single disease-targeted clinical test using next generation sequencing. Current targeted sequencing methods require a separate step of targeted capture enrichment during sample preparation before sequencing. Although there are fast sample preparation methods available on the market, the library preparation process is still relatively complicated for physicians to use routinely. Here, we introduced an amplification-free Single Molecule Targeted Sequencing (SMTS) technology, which combined targeted capture and sequencing in one step. We demonstrated that this technology can detect low-frequency mutations using an artificially synthesized DNA sample. SMTS has several potential advantages, including simple sample preparation, so that no biases or errors are introduced by PCR. SMTS has the potential to be an easy and quick sequencing technology for clinical diagnosis such as cancer gene mutation detection, infectious disease detection, inherited condition screening and noninvasive prenatal diagnosis. PMID:27193446

  11. The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome

    PubMed Central

    Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A.

    2015-01-01

    A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser. PMID:25324314
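    The abstract above describes a TTS as a stretch of DNA composed of polypurines. As a rough illustration only, the sketch below scans one strand for runs of A/G. The function name and the minimum length of 15 bases are assumptions made for this example; they are not TTSMI's actual search criteria, which also involve stability and specificity.

```python
import re

def find_polypurine_tracts(seq, min_len=15):
    """Return (start, end, tract) for every run of purines (A or G).

    Coordinates are 0-based and end-exclusive. min_len is an
    illustrative threshold, not a value taken from the TTSMI paper.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [(m.start(), m.end(), m.group())
            for m in pattern.finditer(seq.upper())]

# Hypothetical sequence: a 17-base purine run flanked by pyrimidines.
example = "CTCT" + "AGGAAGGAGAAGGGAGA" + "TCTC"
tracts = find_polypurine_tracts(example, min_len=15)
```

    Only the given strand is scanned here; a polypurine tract on the opposite strand appears as a polypyrimidine run and would need a second pass over the reverse complement.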

  12. The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome.

    PubMed

    Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A

    2015-01-01

    A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser.

  13. Using evolutionary computations to understand the design and evolution of gene and cell regulatory networks.

    PubMed

    Spirov, Alexander; Holloway, David

    2013-07-15

    This paper surveys modeling approaches for studying the evolution of gene regulatory networks (GRNs). Modeling of the design or 'wiring' of GRNs has become increasingly common in developmental and medical biology, as a means of quantifying gene-gene interactions, the response to perturbations, and the overall dynamic motifs of networks. Drawing from developments in GRN 'design' modeling, a number of groups are now using simulations to study how GRNs evolve, both for comparative genomics and to uncover general principles of evolutionary processes. Such work can generally be termed evolution in silico. Complementary to these biologically-focused approaches, a now well-established field of computer science is Evolutionary Computations (ECs), in which highly efficient optimization techniques are inspired from evolutionary principles. In surveying biological simulation approaches, we discuss the considerations that must be taken with respect to: (a) the precision and completeness of the data (e.g. are the simulations for very close matches to anatomical data, or are they for more general exploration of evolutionary principles); (b) the level of detail to model (we proceed from 'coarse-grained' evolution of simple gene-gene interactions to 'fine-grained' evolution at the DNA sequence level); (c) to what degree is it important to include the genome's cellular context; and (d) the efficiency of computation. With respect to the latter, we argue that developments in computer science EC offer the means to perform more complete simulation searches, and will lead to more comprehensive biological predictions.

  14. The qa repressor gene of Neurospora crassa: wild-type and mutant nucleotide sequences.

    PubMed Central

    Huiet, L; Giles, N H

    1986-01-01

    The qa-1S gene, one of two regulatory genes in the qa gene cluster of Neurospora crassa, encodes the qa repressor. The qa-1S gene, together with the qa-1F gene, which encodes the qa activator protein, controls the expression of all seven qa genes, including those encoding the inducible enzymes responsible for the utilization of quinic acid as a carbon source. The nucleotide sequence of the qa-1S gene and its flanking regions has been determined. The deduced coding sequence for the qa-1S protein encodes 918 amino acids with a calculated molecular weight of 100,650 and is interrupted by a single 66-base-pair intervening sequence. Both constitutive and noninducible mutants occur in the qa-1S gene and two different mutations of each type have been cloned and sequenced. All four mutations occur within the predicted coding region of the qa-1S gene. This result strongly supports the hypothesis that the qa-1S gene encodes a repressor. All four mutations are located within codons for the last 300 amino acids of the qa-1S protein. The mutations in three of the mutants involve amino acid substitutions, while the fourth mutant, which has a constitutive phenotype, contains a frameshift mutation. The two constitutive mutations occur in the most distal region of the gene, possibly implicating the COOH-terminal region of the qa repressor in binding to its target. The two noninducible mutations occur in a region proximal to the constitutive mutations, possibly implicating this region of the qa repressor in binding the inducer. PMID:3010294

  15. Regulatory region in choline acetyltransferase gene directs developmental and tissue-specific expression in transgenic mice.

    PubMed Central

    Lönnerberg, P; Lendahl, U; Funakoshi, H; Arhlund-Richter, L; Persson, H; Ibáñez, C F

    1995-01-01

    Acetylcholine, one of the main neurotransmitters in the nervous system, is synthesized by the enzyme choline acetyltransferase (ChAT; acetyl-CoA:choline O-acetyltransferase, EC 2.3.1.6). The molecular mechanisms controlling the establishment, maintenance, and plasticity of the cholinergic phenotype in vivo are largely unknown. A previous report showed that a 3800-bp, but not a 1450-bp, 5' flanking segment from the rat ChAT gene promoter directed cell type-specific expression of a reporter gene in cholinergic cells in vitro. Now we have characterized a distal regulatory region of the ChAT gene that confers cholinergic specificity on a heterologous downstream promoter in a cholinergic cell line and in transgenic mice. A 2342-bp segment from the 5' flanking region of the ChAT gene behaved as an enhancer in cholinergic cells but as a repressor in noncholinergic cells in an orientation-independent manner. Combined with a heterologous basal promoter, this fragment targeted transgene expression to several cholinergic regions of the central nervous system of transgenic mice, including basal forebrain, cortex, pons, and spinal cord. In eight independent transgenic lines, the pattern of transgene expression paralleled qualitatively and quantitatively that displayed by endogenous ChAT mRNA in various regions of the rat central nervous system. In the lumbar enlargement of the spinal cord, 85-90% of the transgene expression was targeted to the ventral part of the cord, where cholinergic alpha-motor neurons are located. Transgene expression in the spinal cord was developmentally regulated and responded to nerve injury in a similar way as the endogenous ChAT gene, indicating that the 2342-bp regulatory sequence contains elements controlling the plasticity of the cholinergic phenotype in developing and injured neurons. PMID:7732028

  16. Rearrangement of Upstream Regulatory Elements Leads to Ectopic Expression of the Drosophila Mulleri Adh-2 Gene

    PubMed Central

    Falb, D.; Fischer, J.; Maniatis, T.

    1992-01-01

    The Adh-2 gene of Drosophila mulleri is expressed in the larval fat body and the adult fat body and hindgut, and a 1500-bp element located 2-3 kb upstream of the Adh-2 promoter is necessary for maximal levels of transcription. Previous work demonstrated that deletion of sequences between this upstream element and the Adh-2 promoter results in Adh-2 gene expression in a novel larval tissue, the middle midgut. In this study we show that the upstream element possesses all of the characteristics of a transcriptional enhancer: its activity is independent of orientation, it acts on a heterologous promoter, and it functions at various positions both 5' and 3' to the Adh-2 gene. Full enhancer function can be localized to a 750-bp element, although other regions possess some redundant activity. The ectopic expression pattern is dependent on the proximity of at least two sequence elements. Thus, tissue-specific transcription can involve complex proximity-dependent interactions among combinations of regulatory elements. PMID:1459428

  17. Molecular cloning, nucleotide sequence and expression of a Sulfolobus solfataricus gene encoding a class II fumarase.

    PubMed

    Colombo, S; Grisa, M; Tortora, P; Vanoni, M

    1994-01-01

    Fumarase catalyzes the interconversion of L-malate and fumarate. A Sulfolobus solfataricus fumarase gene (fumC) was cloned and sequenced. Typical archaebacterial regulatory sites were identified in the region flanking the fumC open reading frame. The fumC gene encodes a protein of 438 amino acids (47,899 Da) which shows several significant similarities with class II fumarases from both eubacterial and eukaryotic sources as well as with aspartases. S. solfataricus fumarase expressed in Escherichia coli retains enzymatic activity and its thermostability is comparable to that of S. solfataricus purified enzyme despite an 11-amino-acid C-terminal deletion.

  18. Gene Regulatory Scenarios of Primary 1,25-Dihydroxyvitamin D3 Target Genes in a Human Myeloid Leukemia Cell Line

    PubMed Central

    Ryynänen, Jussi; Seuter, Sabine; Campbell, Moray J.; Carlberg, Carsten

    2013-01-01

    Genome- and transcriptome-wide data has significantly increased the amount of available information about primary 1,25-dihydroxyvitamin D3 (1,25(OH)2D3) target genes in cancer cell models, such as human THP-1 myelomonocytic leukemia cells. In this study, we investigated the genes G0S2, CDKN1A and MYC as master examples of primary vitamin D receptor (VDR) targets being involved in the control of cellular proliferation. The chromosomal domains of G0S2 and CDKN1A are 140–170 kb in size and contain one and three VDR binding sites, respectively. This is rather compact compared to the MYC locus that is 15 times larger and accommodates four VDR binding sites. All eight VDR binding sites were studied by chromatin immunoprecipitation in THP-1 cells. Interestingly, the site closest to the transcription start site of the down-regulated MYC gene showed 1,25(OH)2D3-dependent reduction of VDR binding and is not associated with open chromatin. Four of the other seven VDR binding regions contain a typical DR3-type VDR binding sequence, three of which are also occupied with VDR in macrophage-like cells. In conclusion, the three examples suggest that each VDR target gene has an individual regulatory scenario. However, some general components of these scenarios may be useful for the development of new therapy regimens. PMID:24202443

  19. Evolution of gene sequence in response to chromosomal location.

    PubMed

    Díaz-Castillo, Carlos; Golic, Kent G

    2007-09-01

    Evolutionary forces acting on the repetitive DNA of heterochromatin are not constrained by the same considerations that apply to protein-coding genes. Consequently, such sequences are subject to rapid evolutionary change. By examining the Troponin C gene family of Drosophila melanogaster, which has euchromatic and heterochromatic members, we find that protein-coding genes also evolve in response to their chromosomal location. The heterochromatic members of the family show a reduced CG content and increased variation in DNA sequence. We show that the CG reduction applies broadly to the protein-coding sequences of genes located at the heterochromatin:euchromatin interface, with a very strong correlation between CG content and the distance from centric heterochromatin. We also observe a similar trend in the transition from telomeric heterochromatin to euchromatin. We propose that the methylation of DNA is one of the forces driving this sequence evolution.
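    The CG content comparison described above can be illustrated with a short calculation. This is a minimal sketch, not the authors' pipeline: the function names, the choice to ignore ambiguous bases such as N, and the fixed-window scan are all assumptions made for the example.

```python
def gc_content(seq):
    """Fraction of C and G among the unambiguous bases (A, C, G, T) of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0  # nothing to measure (empty or fully ambiguous sequence)
    return sum(base in "GC" for base in counted) / len(counted)

def windowed_gc(seq, window, step):
    """GC content in consecutive windows, as (window_start, fraction) pairs."""
    return [(start, gc_content(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]

# Hypothetical toy sequence: an AT-only half followed by a GC-only half.
profile = windowed_gc("AAAAGGGG", window=4, step=4)
```

    Plotting such a windowed profile against distance from centric heterochromatin would be one way to look for the kind of trend the abstract reports.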

  20. Integrative analysis of time course microarray data and DNA sequence data via log-linear models for identifying dynamic transcriptional regulatory networks.

    PubMed

    Choi, Hyung-Seok; Kim, Youngchul; Cho, Kwang-Hyun; Park, Taesung

    2013-01-01

    Since eukaryotic transcription is regulated by sets of Transcription Factors (TFs) having various transcriptional time delays, identification of temporal combinations of activated TFs is important to reconstruct Transcriptional Regulatory Networks (TRNs). Our methods combine time course microarray data, information on physical binding between the TFs and their targets and the regulatory sequences of genes using a log-linear model to reconstruct dynamic functional TRNs of the yeast cell cycle and human apoptosis. In conclusion, our results suggest that the proposed dynamic motif search method is more effective in reconstructing TRNs than the static motif search method.

  1. The myxoma virus thymidine kinase gene: sequence and transcriptional mapping.

    PubMed

    Jackson, R J; Bults, H G

    1992-02-01

    The myxoma virus thymidine kinase (TK) gene is encoded on a 1.6 kb SacI-SalI restriction fragment located between 57.7 and 59.3 kb on the 163 kb genomic map. The nucleotide sequence of this fragment as well as 228 bp from the adjacent SalI-AA2 fragment was determined and found to encode four major open reading frames (ORFs). Three of these ORFs are similar in nucleotide sequence to ORFs L5R and J1R, and the TK gene of vaccinia virus (VV). The fourth ORF, MF8a, shows similarity to the ORFs found in the same position relative to the TK genes of Shope fibroma virus, Kenya sheep-1 virus and swine-pox virus. A search of the complete VV nucleotide sequence for regions of similarity to MF8a identified the host specificity gene C7L. Northern blot analysis of early viral RNA identified transcripts of approximately 700 nucleotides for both the TK gene and ORF MF8a. The 5' ends of the TK gene and ORF MF8a early mRNAs were mapped by primer extension to initiation sites 13 nucleotides downstream of sequences with similarity to the VV early promoter consensus. The sizes of the TK and MF8a mRNAs are consistent with transcription termination and polyadenylation occurring downstream of the sequence TTTTTNT, which is identical to the consensus sequence for the VV transcription termination signal.
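    The termination signal TTTTTNT mentioned above, where N stands for any base, is simple to locate with a pattern search. The sketch below is illustrative only; the function name and the test sequence are made up for this example, and it scans just the strand it is given.

```python
import re

# TTTTTNT with N = any of A, C, G, T. The lookahead lets overlapping
# occurrences (for example in a long run of T) each be reported.
TERMINATION_SIGNAL = re.compile(r"(?=(TTTTT[ACGT]T))")

def find_termination_signals(seq):
    """Return 0-based start positions of every TTTTTNT match in seq."""
    return [m.start() for m in TERMINATION_SIGNAL.finditer(seq.upper())]

# Hypothetical sequence containing one signal starting at index 2.
positions = find_termination_signals("GGTTTTTATCC")
```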

  2. Flagellin gene sequence variation in the genus Pseudomonas.

    PubMed

    Bellingham, N F; Morgan, J A; Saunders, J R; Winstanley, C

    2001-07-01

    Flagellin gene (fliC) sequences from 18 strains of Pseudomonas sensu stricto representing 8 different species, and 9 representative fliC sequences from other members of the gamma sub-division of proteobacteria, were compared. Analysis was performed on N-terminal, C-terminal and whole fliC sequences. The fliC analyses confirmed the inferred relationship between P. mendocina, P. oleovorans and P. aeruginosa based on 16S rRNA sequence comparisons. In addition, the analyses indicated that P. putida PRS2000 was closely related to P. fluorescens SBW25 and P. fluorescens NCIMB 9046T, but suggested that P. putida PaW8 and P. putida PRS2000 were more closely related to other Pseudomonas spp. than they were to each other. There were a number of inconsistencies in inferred evolutionary relationships between strains, depending on the analysis performed. In particular, whole flagellin gene comparisons often differed from those obtained using N- and C-terminal sequences. However, there were also inconsistencies between the terminal region analyses, suggesting that phylogenetic relationships inferred on the basis of fliC sequence should be treated with caution. Although the central domain of fliC is highly variable between Pseudomonas strains, there was evidence of sequence similarities between the central domains of different Pseudomonas fliC sequences. This indicates the possibility of recombination in the central domain of fliC genes within Pseudomonas species, and between these genes and those from other bacteria. PMID:11518318

  3. Nucleotide sequence analysis of a candidate gene for ataxia-telangiectasia group D (ATDC)

    SciTech Connect

    Leonhardt, E.A.; Kapp, L.N.; Young, B.R.; Murnane, J.P.

    1994-01-01

    A radioresistant cell clone (1B3) was previously isolated after transfection of an ataxia-telangiectasia (AT) group D cell line with a human cosmid library. A cosmid rescued from the integration site in 1B3 contained human DNA from chromosome position 11q23, the same region shown by both genetic linkage and chromosome transfer to contain the genes for AT complementation groups A/B, C, and D. A gene within the cosmid (ATDC) was found to produce mRNAs of different sizes. A cDNA for one of the most abundant mRNAs (3.0 kb) was isolated from a HeLa cell library. In the present study, the authors sequenced the 3.0-kb cDNA and the surrounding intron DNA in the cosmids. They used polymerase chain reaction, with primers in the introns, to confirm the number of exons and to analyze DNA from AT group D cells for mutations within this gene. Although no mutations were found, they do not rule out the possibility that mutations may be present within the regulatory sequences or coding sequences found in other mRNAs specific for this gene. From the sequence analysis, they found that the ATDC gene product is one of a group of proteins that share multiple zinc finger motifs and an adjacent leucine zipper motif. These proteins have been proposed to form homo- or hetero-dimers involved in nucleic acid binding, consistent with the fact that many of these proteins appear to be transcriptional regulatory factors involved in carcinogenesis and/or differentiation. The likelihood that the ATDC gene product is involved in transcriptional regulation could explain the pleiomorphic characteristics of AT, including abnormal cell cycle regulation. 36 refs., 5 figs., 2 tabs.

  4. GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences.

    PubMed

    Antonov, Ivan; Baranov, Pavel; Borodovsky, Mark

    2013-01-01

    Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively little attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) could be caused by sequencing errors or indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events (that evolved to control expression of some genes). Earlier, we developed an algorithm and software program, GeneTack, for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at http://topaz.gatech.edu/GeneTack/db.html) containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206 991 fs-genes from 1106 complete prokaryotic genomes and 45 295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position and frameshift direction (-1, +1). The fs-genes can be retrieved by similarity search to a given query sequence via a web interface, by fs-gene cluster browsing, etc. Clusters of fs-genes are characterized with respect to their likely origin, such as pseudogenization, phase variation, etc. The largest clusters contain fs-genes with programmed frameshifts (related to recoding events).

  5. Clinical characteristics and prognosis of acute myeloid leukemia associated with DNA-methylation regulatory gene mutations.

    PubMed

    Ryotokuji, Takeshi; Yamaguchi, Hiroki; Ueki, Toshimitsu; Usuki, Kensuke; Kurosawa, Saiko; Kobayashi, Yutaka; Kawata, Eri; Tajika, Kenji; Gomi, Seiji; Kanda, Junya; Kobayashi, Anna; Omori, Ikuko; Marumo, Atsushi; Fujiwara, Yusuke; Yui, Shunsuke; Terada, Kazuki; Fukunaga, Keiko; Hirakawa, Tsuneaki; Arai, Kunihito; Kitano, Tomoaki; Kosaka, Fumiko; Tamai, Hayato; Nakayama, Kazutaka; Wakita, Satoshi; Fukuda, Takahiro; Inokuchi, Koiti

    2016-09-01

    In recent years, it has been reported that the frequency of DNA-methylation regulatory gene mutations - mutations of the genes that regulate gene expression through DNA methylation - is high in acute myeloid leukemia. The objective of the present study was to elucidate the clinical characteristics and prognosis of acute myeloid leukemia with associated DNA-methylation regulatory gene mutation. We studied 308 patients with acute myeloid leukemia. DNA-methylation regulatory gene mutations were observed in 135 of the 308 cases (43.8%). Acute myeloid leukemia associated with a DNA-methylation regulatory gene mutation was more frequent in older patients (P<0.0001) and in patients with intermediate cytogenetic risk (P<0.0001) accompanied by a high white blood cell count (P=0.0032). DNA-methylation regulatory gene mutation was an unfavorable prognostic factor for overall survival in the whole cohort (P=0.0018), in patients aged ≤70 years, in patients with intermediate cytogenetic risk, and in FLT3-ITD-negative patients (P=0.0409). Among the patients with DNA-methylation regulatory gene mutations, 26.7% were found to have two or more such mutations and prognosis worsened with increasing number of mutations. In multivariate analysis DNA-methylation regulatory gene mutation was an independent unfavorable prognostic factor for overall survival (P=0.0424). However, patients with a DNA-methylation regulatory gene mutation who underwent allogeneic stem cell transplantation in first remission had a significantly better prognosis than those who did not undergo such transplantation (P=0.0254). Our study establishes that DNA-methylation regulatory gene mutation is an important unfavorable prognostic factor in acute myeloid leukemia.

  6. Clinical characteristics and prognosis of acute myeloid leukemia associated with DNA-methylation regulatory gene mutations

    PubMed Central

    Ryotokuji, Takeshi; Yamaguchi, Hiroki; Ueki, Toshimitsu; Usuki, Kensuke; Kurosawa, Saiko; Kobayashi, Yutaka; Kawata, Eri; Tajika, Kenji; Gomi, Seiji; Kanda, Junya; Kobayashi, Anna; Omori, Ikuko; Marumo, Atsushi; Fujiwara, Yusuke; Yui, Shunsuke; Terada, Kazuki; Fukunaga, Keiko; Hirakawa, Tsuneaki; Arai, Kunihito; Kitano, Tomoaki; Kosaka, Fumiko; Tamai, Hayato; Nakayama, Kazutaka; Wakita, Satoshi; Fukuda, Takahiro; Inokuchi, Koiti

    2016-01-01

    In recent years, it has been reported that the frequency of DNA-methylation regulatory gene mutations – mutations of the genes that regulate gene expression through DNA methylation – is high in acute myeloid leukemia. The objective of the present study was to elucidate the clinical characteristics and prognosis of acute myeloid leukemia with associated DNA-methylation regulatory gene mutation. We studied 308 patients with acute myeloid leukemia. DNA-methylation regulatory gene mutations were observed in 135 of the 308 cases (43.8%). Acute myeloid leukemia associated with a DNA-methylation regulatory gene mutation was more frequent in older patients (P<0.0001) and in patients with intermediate cytogenetic risk (P<0.0001) accompanied by a high white blood cell count (P=0.0032). DNA-methylation regulatory gene mutation was an unfavorable prognostic factor for overall survival in the whole cohort (P=0.0018), in patients aged ≤70 years, in patients with intermediate cytogenetic risk, and in FLT3-ITD-negative patients (P=0.0409). Among the patients with DNA-methylation regulatory gene mutations, 26.7% were found to have two or more such mutations and prognosis worsened with increasing number of mutations. In multivariate analysis DNA-methylation regulatory gene mutation was an independent unfavorable prognostic factor for overall survival (P=0.0424). However, patients with a DNA-methylation regulatory gene mutation who underwent allogeneic stem cell transplantation in first remission had a significantly better prognosis than those who did not undergo such transplantation (P=0.0254). Our study establishes that DNA-methylation regulatory gene mutation is an important unfavorable prognostic factor in acute myeloid leukemia. PMID:27247325

  7. Signaling Pathways and Gene Regulatory Networks in Cardiomyocyte Differentiation

    PubMed Central

    Parikh, Abhirath; Wu, Jincheng; Blanton, Robert M.

    2015-01-01

    Strategies for harnessing stem cells as a source to treat cell loss in heart disease are the subject of intense research. Human pluripotent stem cells (hPSCs) can be expanded extensively in vitro and therefore can potentially provide sufficient quantities of patient-specific differentiated cardiomyocytes. Although multiple stimuli direct heart development, the differentiation process is driven in large part by signaling activity. The engineering of hPSCs to heart cell progeny has extensively relied on establishing proper combinations of soluble signals, which target genetic programs thereby inducing cardiomyocyte specification. Pertinent differentiation strategies have relied as a template on the development of embryonic heart in multiple model organisms. Here, information on the regulation of cardiomyocyte development from in vivo genetic and embryological studies is critically reviewed. A fresh interpretation is provided of in vivo and in vitro data on signaling pathways and gene regulatory networks (GRNs) underlying cardiopoiesis. The state-of-the-art understanding of signaling pathways and GRNs presented here can inform the design and optimization of methods for the engineering of tissues for heart therapies. PMID:25813860

  8. Creating and validating cis-regulatory maps of tissue-specific gene expression regulation

    PubMed Central

    O'Connor, Timothy R.; Bailey, Timothy L.

    2014-01-01

    Predicting which genomic regions control the transcription of a given gene is a challenge. We present a novel computational approach for creating and validating maps that associate genomic regions (cis-regulatory modules–CRMs) with genes. The method infers regulatory relationships that explain gene expression observed in a test tissue using widely available genomic data for ‘other’ tissues. To predict the regulatory targets of a CRM, we use cross-tissue correlation between histone modifications present at the CRM and expression at genes within 1 Mbp of it. To validate cis-regulatory maps, we show that they yield more accurate models of gene expression than carefully constructed control maps. These gene expression models predict observed gene expression from transcription factor binding in the CRMs linked to that gene. We show that our maps are able to identify long-range regulatory interactions and improve substantially over maps linking genes and CRMs based on either the control maps or a ‘nearest neighbor’ heuristic. Our results also show that it is essential to include CRMs predicted in multiple tissues during map-building, that H3K27ac is the most informative histone modification, and that CAGE is the most informative measure of gene expression for creating cis-regulatory maps. PMID:25200088
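    The linking step described above, correlating a histone-modification signal at a CRM with the expression of genes within 1 Mbp across tissues, can be sketched as follows. This is a simplified illustration, not the authors' implementation: the use of Pearson correlation, the function names, the data layout and the toy numbers are all assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def rank_candidate_targets(crm_pos, crm_signal, genes, max_dist=1_000_000):
    """Rank genes within max_dist bp of a CRM by cross-tissue correlation.

    crm_signal: histone-mark signal at the CRM, one value per tissue.
    genes: dict mapping gene name -> (TSS position, expression per tissue),
           with tissues in the same order as crm_signal.
    """
    scores = {name: pearson(crm_signal, expr)
              for name, (tss, expr) in genes.items()
              if abs(tss - crm_pos) <= max_dist}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical data for four tissues; geneC lies outside the 1 Mbp window.
genes = {
    "geneA": (500_000, [2.0, 4.0, 6.0, 8.0]),
    "geneB": (900_000, [4.0, 3.0, 2.0, 1.0]),
    "geneC": (5_000_000, [1.0, 2.0, 3.0, 4.0]),
}
ranked = rank_candidate_targets(100_000, [1.0, 2.0, 3.0, 4.0], genes)
```

    In this toy case geneA ranks first because its expression rises and falls with the CRM signal, while the distant geneC is never scored regardless of its correlation.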

  9. Alu sequence involvement in transcriptional insulation of the keratin 18 gene in transgenic mice.

    PubMed Central

    Thorey, I S; Ceceña, G; Reynolds, W; Oshima, R G

    1993-01-01

    The human keratin 18 (K18) gene is expressed in a variety of adult simple epithelial tissues, including liver, intestine, lung, and kidney, but is not normally found in skin, muscle, heart, spleen, or most of the brain. Transgenic animals derived from the cloned K18 gene express the transgene in appropriate tissues at levels directly proportional to the copy number and independently of the sites of integration. We have investigated in transgenic mice the dependence of K18 gene expression on the distal 5' and 3' flanking sequences and upon the RNA polymerase III promoter of an Alu repetitive DNA transcription unit immediately upstream of the K18 promoter. Integration site-independent expression of tandemly duplicated K18 transgenes requires the presence of either an 825-bp fragment of the 5' flanking sequence or the 3.5-kb 3' flanking sequence. Mutation of the RNA polymerase III promoter of the Alu element within the 825-bp fragment abolishes copy number-dependent expression in kidney but does not abolish integration site-independent expression when assayed in the absence of the 3' flanking sequence of the K18 gene. The characteristics of integration site-independent expression and copy number-dependent expression are separable. In addition, the formation of the chromatin state of the K18 gene, which likely restricts the tissue-specific expression of this gene, is not dependent upon the distal flanking sequences of the 10-kb K18 gene but rather may depend on internal regulatory regions of the gene. PMID:7692231

  10. Structure and sequence divergence of two archaebacterial genes

    SciTech Connect

    Cue, D.; Beckler, G.S.; Reeve, J.N.; Konisky, J.

    1985-06-01

    The DNA sequences of a region that includes the hisA gene of two related methanogenic archaebacteria, Methanococcus voltae and Methanococcus vannielii, have been compared. Both organisms show a similar genome organization in this region, displaying three open reading frames (ORFs) separated by regions of very high A+T content. Two of the ORFs, including ORFHisA, show significant DNA sequence homology. As might be expected for organisms having a genome that is A+T-rich, there is a high preference for A and U as the third base in codons. A ribosome binding site, G-G-T-G, is located 6 base pairs preceding the ATG translation initiation sequence of both hisA genes. The sequences upstream of the two hisA genes show only limited sequence homology. The M. voltae intergenic region contains four tandemly arranged repetitions of an 11-base-pair sequence, whereas the M. vannielii sequence contains both direct and inverted repetitive sequences. Based on the degree of hisA sequence homology, the authors conclude that M. voltae and M. vannielii are less closely related taxonomically than are members of the enteric group of eubacteria.
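    The codon-usage observation above, a preference for A and U as the third base in codons, can be checked on any open reading frame with a few lines of code. The sketch below is an assumption-laden illustration: the function name and test sequence are invented, and the ORF is given as DNA, so U in the mRNA corresponds to T here.

```python
def third_position_at_fraction(orf):
    """Fraction of complete codons in a DNA ORF whose third base is A or T.

    The ORF is read in frame from its first base; trailing bases that do
    not fill a complete codon are ignored.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    thirds = [orf[i + 2] for i in range(0, usable, 3)]
    if not thirds:
        return 0.0
    return sum(base in "AT" for base in thirds) / len(thirds)

# Hypothetical ORF fragment: codons ATG, AAA, TTT, GGC -> third bases G, A, T, C.
fraction = third_position_at_fraction("ATGAAATTTGGC")
```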

  11. Enhancer sequence variants and transcription-factor deregulation synergize to construct pathogenic regulatory circuits in B-cell lymphoma.

    PubMed

    Koues, Olivia I; Kowalewski, Rodney A; Chang, Li-Wei; Pyfrom, Sarah C; Schmidt, Jennifer A; Luo, Hong; Sandoval, Luis E; Hughes, Tyler B; Bednarski, Jeffrey J; Cashen, Amanda F; Payton, Jacqueline E; Oltz, Eugene M

    2015-01-20

    Most B-cell lymphomas arise in the germinal center (GC), where humoral immune responses evolve from potentially oncogenic cycles of mutation, proliferation, and clonal selection. Although lymphoma gene expression diverges significantly from GC B cells, underlying mechanisms that alter the activities of corresponding regulatory elements (REs) remain elusive. Here we define the complete pathogenic circuitry of human follicular lymphoma (FL), which activates or decommissions REs from normal GC B cells and commandeers enhancers from other lineages. Moreover, independent sets of transcription factors, whose expression was deregulated in FL, targeted commandeered versus decommissioned REs. Our approach revealed two distinct subtypes of low-grade FL, whose pathogenic circuitries resembled GC B or activated B cells. FL-altered enhancers also were enriched for sequence variants, including somatic mutations, which disrupt transcription-factor binding and expression of circuit-linked genes. Thus, the pathogenic regulatory circuitry of FL reveals distinct genetic and epigenetic etiologies for GC B-cell transformation.

  12. Massively parallel decoding of mammalian regulatory sequences supports a flexible organizational model.

    PubMed

    Smith, Robin P; Taher, Leila; Patwardhan, Rupali P; Kim, Mee J; Inoue, Fumitaka; Shendure, Jay; Ovcharenko, Ivan; Ahituv, Nadav

    2013-09-01

    Despite continual progress in the cataloging of vertebrate regulatory elements, little is known about their organization and regulatory architecture. Here we describe a massively parallel experiment to systematically test the impact of copy number, spacing, combination and order of transcription factor binding sites on gene expression. A complex library of ∼5,000 synthetic regulatory elements containing patterns from 12 liver-specific transcription factor binding sites was assayed in mice and in HepG2 cells. We find that certain transcription factors act as direct drivers of gene expression in homotypic clusters of binding sites, independent of spacing between sites, whereas others function only synergistically. Heterotypic enhancers are stronger than their homotypic analogs and favor specific transcription factor binding site combinations, mimicking putative native enhancers. Exhaustive testing of binding site permutations suggests that there is flexibility in binding site order. Our findings provide quantitative support for a flexible model of regulatory element activity and suggest a framework for the design of synthetic tissue-specific enhancers. PMID:23892608

  13. Inference of gene regulatory subnetworks from time course gene expression data

    PubMed Central

    2012-01-01

    Background Identifying gene regulatory networks (GRNs) from time course gene expression data has attracted increasing attention. Due to the computational complexity, most approaches for GRN reconstruction are limited to a small number of genes and low connectivity of the underlying networks. These approaches can only identify a single network for a given set of genes. However, for a large-scale gene network, there might exist multiple potential sub-networks, in which genes are only functionally related to others in the sub-networks. Results We propose the network and community identification (NCI) method for identifying multiple subnetworks from gene expression data by incorporating community structure information into GRN inference. The proposed algorithm iteratively solves two optimization problems, and can promisingly be applied to large-scale GRNs. Furthermore, we present the efficient Block PCA method for searching communities in GRNs. Conclusions The NCI method is effective in identifying multiple subnetworks in a large-scale GRN. With the splitting algorithm, the Block PCA method shows promise for exploring communities in a large-scale GRN. PMID:22901088

  14. Evading the annotation bottleneck: using sequence similarity to search non-sequence gene data

    PubMed Central

    Gilchrist, Michael J; Christensen, Mikkel B; Harland, Richard; Pollet, Nicolas; Smith, James C; Ueno, Naoto; Papalopulu, Nancy

    2008-01-01

    Background Non-sequence gene data (images, literature, etc.) can be found in many different public databases. Access to these data is mostly by text based methods using gene names; however, gene annotation is neither complete, nor fully systematic between organisms, and is also not generally stable over time. This provides some challenges for text based access, especially for cross-species searches. We propose a method for non-sequence data retrieval based on sequence similarity, which removes dependence on annotation and text searches. This work was motivated by the need to provide better access to large numbers of in situ images, and the observation that such image data were usually associated with a specific gene sequence. Sequence similarity searches are found in existing gene oriented databases, but mostly give indirect access to non-sequence data via navigational links. Results Three applications were built to explore the proposed method: accessing image data, literature and gene names. Searches are initiated with the sequence of the user's gene of interest, which is searched against a database of sequences associated with the target data. The matching (non-sequence) target data are returned directly to the user's browser, organised by sequence similarity. The method worked well for the intended application in image data management. Comparison with text based searches of the image data set showed the accuracy of the method. Applied to literature searches it facilitated retrieval of mostly high relevance references. Applied to gene name data it provided a useful analysis of name variation of related genes within and between species. Conclusion This method makes a powerful and useful addition to existing methods for searching gene data based on text retrieval or curated gene lists. In particular the method facilitates cross-species comparisons, and enables the handling of novel or otherwise un-annotated genes. 
Applications using the method are quick and easy to

  15. Nucleotide sequence of a human tRNA gene heterocluster

    SciTech Connect

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-05-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both [3'-³²P]-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these γ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  16. Mechanism of Gene Amplification via Yeast Autonomously Replicating Sequences

    PubMed Central

    Dhar, M. K.

    2015-01-01

    The present investigation was aimed at understanding the molecular mechanism of gene amplification. Interplay of fragile sites in promoting gene amplification was also elucidated. The amplification promoting sequences (APS) were chosen from the Saccharomyces cerevisiae ARS, 5S rRNA regions of Plantago ovata and P. lagopus, proposed sites of replication pausing at the Ste20 gene locus of S. cerevisiae, and the bent DNA sequences within fragile site FRA11A in humans. The gene amplification assays showed that plasmids bearing APS from yeast and humans led to enhanced protein concentration as compared to the wild type. Both the in silico and in vitro analyses pointed to the strong bending potential of these APS. In addition, high mitotic stability and the presence of TTTT repeats and SAR amongst these sequences encourage gene amplification. Phylogenetic analysis of S. cerevisiae ARS was also conducted. The combinatorial power of different aspects of APS analyzed in the present investigation was harnessed to reach a consensus about the factors which stimulate gene expression in the presence of these sequences. It was concluded that the mechanism of gene amplification was that AT rich tracts present in fragile sites of yeast serve as binding sites for MAR/SAR and DNA unwinding elements. The DNA protein interactions necessary for ORC activation are facilitated by DNA bending. These specific bindings at ORC promote repeated rounds of DNA replication leading to gene amplification. PMID:25685838

  17. Gene annotation and functional analysis of a newly sequenced Synechococcus strain.

    PubMed

    Li, Y; Rao, N N; Yang, Y; Zhang, Y; Gu, Y N

    2015-10-16

    Synechococcus sp. PCC 7336 represents a newly sequenced strain, and its genome is obviously different from that of other Synechococcus strains. In this analysis, local alignment and annotation databases were constructed and combined with various bioinformatic tools to carry out gene annotation and functional analysis of this strain. From this analysis, we identified 5096 protein-coding genes and 47 RNA genes. Of these, 116 genes that were classified into 9 categories were associated with photosynthesis, and type V polymerase proteins that were identified are unique for this strain. An additional 107 genes were closely related to signal transduction pathways, which primarily comprised parts of two-component regulatory systems. Gene ontology analysis showed that 2377 genes were annotated with a total number of 9791 functional categories, and specifically that 41 genes distributed in 4 protein complexes were involved in oxidative phosphorylation. Clusters of orthologous groups classification showed that there were 1463 homologous proteins associated with 17 specific metabolic pathways, and that most of the proteins participated in primary metabolic processes such as binding and catalysis. The phylogenetic tree based on 16S rRNA sequences indicated that Synechococcus PCC 7336 is highly likely to represent a new branch.

  18. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: Combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance

    SciTech Connect

    Wu, Gang; Nie, Lei; Zhang, Weiwen

    2006-05-26

    The context-dependent expression of genes is at the core of biological activities, and significant attention has been given to the identification of various factors contributing to gene expression at genomic scale. However, so far this type of analysis has focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs or on the correlation between mRNA abundance and non-random features in coding sequences (e.g. codon usage and amino acid usage). In this study multiple regression analyses of the mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal of investigating how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together...

  19. Evolution of chorion structural genes and regulatory mechanisms in two wild silkmoths: a preliminary analysis.

    PubMed

    Moschonas, N K; Thireos, G; Kafatos, F C

    1988-01-01

    We report a preliminary analysis of structural and regulatory evolution of the A and B chorion gene families in two wild silkmoths, Antheraea pernyi and Antheraea polyphemus. Homospecific and heterospecific dot hybridizations were performed between previously characterized A. polyphemus complementary DNA clones and total or stage-specific follicular mRNAs from the two species. The hybridization patterns indicated substantial interspecies changes in the abundance of corresponding mRNA sequences (heteroposic evolution) without substantial changes in their developmental specificities (heterochronic evolution). In addition, the proteins encoded in the two species by corresponding mRNAs were determined by hybrid-selected translation followed by electrophoretic analysis. The results suggested that the proteins evolve in size, presumably through internal deletions and duplications.

  20. Identification of cis-acting repressive sequences within the negative regulatory element of human immunodeficiency virus type 1.

    PubMed Central

    Lu, Y C; Touzjian, N; Stenzel, M; Dorfman, T; Sodroski, J G; Haseltine, W A

    1990-01-01

    The negative regulatory element of human immunodeficiency virus type 1 is a 260-nucleotide-long sequence that decreases the rate of RNA transcription initiation specified by the long terminal repeat. This region has the potential to bind several cellular transcription factors. Here it is shown that sequences which recognize the NFAT-1 and USF cellular transcription factors contribute to this negative regulatory effect. The sequences within the negative regulatory element which resemble the AP-1 site and the URS do not negatively regulate human immunodeficiency virus long terminal repeat transcription initiation. PMID:2398545

  1. cis-Regulatory control of the initial neurogenic pattern of onecut gene expression in the sea urchin embryo.

    PubMed

    Barsi, Julius C; Davidson, Eric H

    2016-01-01

    Specification of the ciliated band (CB) of echinoid embryos executes three spatial functions essential for postgastrular organization. These are establishment of a band about 5 cells wide which delimits and bounds other embryonic territories; definition of a neurogenic domain within this band; and generation within it of arrays of ciliary cells that bear the special long cilia from which the structure derives its name. In Strongylocentrotus purpuratus the spatial coordinates of the future ciliated band are initially and exactly determined by the disposition of a ring of cells that transcriptionally activate the onecut homeodomain regulatory gene, beginning in blastula stage, long before the appearance of the CB per se. Thus the cis-regulatory apparatus that governs onecut expression in the blastula directly reveals the genomic sequence code by which these aspects of the spatial organization of the embryo are initially determined. We screened the entire onecut locus and its flanking region for transcriptionally active cis-regulatory elements, and by means of BAC recombineered deletions identified three separated and required cis-regulatory modules that execute different functions. The operating logic of the crucial spatial control module accounting for the spectacularly precise and beautiful early onecut expression domain depends on spatial repression. Previously predicted oral ectoderm and aboral ectoderm repressors were identified by cis-regulatory mutation as the products of goosecoid and irxa genes respectively, while the pan-ectodermal activator SoxB1 supplies a transcriptional driver function.

  2. The evolutionary origination and diversification of a dimorphic gene regulatory network through parallel innovations in cis and trans.

    PubMed

    Camino, Eric M; Butts, John C; Ordway, Alison; Vellky, Jordan E; Rebeiz, Mark; Williams, Thomas M

    2015-04-01

    The origination and diversification of morphological characteristics represent a key problem in understanding the evolution of development. Morphological traits result from gene regulatory networks (GRNs) that form a web of transcription factors, which regulate multiple cis-regulatory element (CRE) sequences to control the coordinated expression of differentiation genes. The formation and modification of GRNs must ultimately be understood at the level of individual regulatory linkages (i.e., transcription factor binding sites within CREs) that constitute the network. Here, we investigate how elements within a network originated and diversified to generate a broad range of abdominal pigmentation phenotypes among Sophophora fruit flies. Our data indicate that the coordinated expression of two melanin synthesis enzymes, Yellow and Tan, recently evolved through novel CRE activities that respond to the spatial patterning inputs of Hox proteins and the sex-specific input of Bric-à-brac transcription factors. Once established, it seems that these newly evolved activities were repeatedly modified by evolutionary changes in the network's trans-regulators to generate large-scale changes in pigment pattern. By elucidating how yellow and tan are connected to the web of abdominal trans-regulators, we discovered that the yellow and tan abdominal CREs are composed of distinct regulatory inputs that exhibit contrasting responses to the same Hox proteins and Hox cofactors. These results provide an example in which CRE origination underlies a recently evolved novel trait, and highlight how coordinated expression patterns can evolve in parallel through the generation of unique regulatory linkages.

  3. Sequencing, genomic organization, and preliminary promoter analysis of a black cherry (R)-(+)-mandelonitrile lyase gene.

    PubMed

    Hu, Z; Poulton, J E

    1997-12-01

    The flavoprotein (R)-(+)-mandelonitrile lyase (MDL; EC 4.1.2.10) plays a key role in cyanogenesis in rosaceous stone fruits. An MDL gene (mdl3) and its corresponding cDNA (MDL3) were isolated from black cherry (Prunus serotina) and characterized. The mdl3 gene contains 2292 bp of the 5' flanking region, the entire coding region, and 300 bp of the 3' flanking region. The coding region is interrupted by three short introns, of which one possesses the unusual GC-AG splice junction dinucleotides. This gene encodes a polypeptide of 573 amino acids that includes a putative signal sequence, 13 potential N-glycosylation sites, and a presumptive flavin adenine dinucleotide-binding site. To determine whether the 5' flanking region of the mdl3 gene is capable of driving MDL expression, it was fused to the beta-glucuronidase reporter gene for Agrobacterium-mediated transformation into tobacco. Matching endogenous MDL expression patterns, beta-glucuronidase staining was observed in maturing embryos and seeds; it also occurred in postembryonic tissues, especially in association with vascular tissues. After developing a homologous transient transformation system to facilitate identification of putative regulatory sequences, we demonstrated that 125 bp (-107 to +18) of the 5' flanking sequence of the mdl3 gene is sufficient for MDL expression in protoplasts derived from immature black cherry embryos. PMID:9414550

  4. Comparison of the aflR gene sequences of strains in Aspergillus section Flavi.

    PubMed

    Lee, Chao-Zong; Liou, Guey-Yuh; Yuan, Gwo-Fang

    2006-01-01

    Aflatoxins are polyketide-derived secondary metabolites produced by Aspergillus parasiticus, Aspergillus flavus, Aspergillus nomius and a few other species. The toxic effects of aflatoxins have adverse consequences for human health and agricultural economics. The aflR gene, a regulatory gene for aflatoxin biosynthesis, encodes a protein containing a zinc-finger DNA-binding motif. Although Aspergillus oryzae and Aspergillus sojae, which are used in fermented foods and in ingredient manufacture, have no record of producing aflatoxin, they have been shown to possess an aflR gene. This study examined 34 strains of Aspergillus section Flavi. The aflR gene of 23 of these strains was successfully amplified and sequenced. No aflR PCR products were found in five A. sojae strains or six strains of A. oryzae. These PCR results suggested that the aflR gene is absent or significantly different in some A. sojae and A. oryzae strains. The sequenced aflR genes from the 23 positive strains had greater than 96.6 % similarity, which was particularly conserved in the zinc-finger DNA-binding domain. The aflR gene of A. sojae has two obvious characteristics: an extra CTCATG sequence fragment and a C to T transition that causes premature termination of AFLR protein synthesis. Differences between A. parasiticus/A. sojae and A. flavus/A. oryzae aflR genes were also identified. Some strains of A. flavus as well as A. flavus var. viridis, A. oryzae var. viridis and A. oryzae var. effuses have an A. oryzae-type aflR gene. For all strains with the A. oryzae-type aflR gene, there was no evidence of aflatoxin production. It is suggested that for safety reasons, the aflR gene could be examined to assess possible aflatoxin production by Aspergillus section Flavi strains.

  5. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

    PubMed Central

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

    2013-01-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
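
    The k-mer sequence features that feed such a classifier can be illustrated with a short sketch. This is not the kmer-SVM implementation: the function names, the default k = 6, and the choice to merge each k-mer with its reverse complement are assumptions made for illustration, and the SVM training step is omitted entirely.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Reverse complement of an upper-case DNA string over A, C, G, T."""
    return kmer.translate(_COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int = 6) -> Counter:
    """Count k-mers in seq, merging each k-mer with its reverse complement.

    Assumption for this sketch: strand is ignored, so a k-mer and its reverse
    complement share one feature, keyed by the lexicographically smaller string.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip windows containing N or other symbols
            counts[min(kmer, reverse_complement(kmer))] += 1
    return counts


# Toy input for illustration: windows AC, CG, GT are counted; TN is skipped.
toy = kmer_counts("ACGTN", k=2)
```

    Each sequence's counts would then be arranged into a fixed-length vector over all possible k-mers before being passed to a classifier of the reader's choice.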

  6. Honey bee promoter sequences for targeted gene expression.

    PubMed

    Schulte, C; Leboulle, G; Otte, M; Grünewald, B; Gehne, N; Beye, M

    2013-08-01

    The honey bee, Apis mellifera, displays a rich behavioural repertoire, social organization and caste differentiation, and has an interesting mode of sex determination, but we still know little about its underlying genetic programs. We lack stable transgenic tools in honey bees that would allow genetic control of gene activity in stable transgenic lines. As an initial step towards a transgenic method, we identified promoter sequences in the honey bee that can drive constitutive, tissue-specific and cold shock-induced gene expression. We identified the promoter sequences of Am-actin5c, elp2l, Am-hsp83 and Am-hsp70 and showed that, except for the elp2l sequence, the identified sequences were able to drive reporter gene expression in Sf21 cells. We further demonstrated through electroporation experiments that the putative neuron-specific elp2l promoter sequence can direct gene expression in the honey bee brain. The identification of these promoter sequences is an important initial step in studying the function of genes with transgenic experiments in the honey bee, an organism with a rich set of interesting phenotypes. PMID:23668189

  7. The regions of sequence variation in caulimovirus gene VI.

    PubMed

    Sanger, M; Daubert, S; Goodman, R M

    1991-06-01

    The sequence of gene VI from figwort mosaic virus (FMV) clone x4 was determined and compared with that previously published for FMV clone DxS. Both clones originated from the same virus isolation, but the virus used to clone DxS was propagated extensively in a host of a different family prior to cloning whereas that used to clone x4 was not. Differences in the amino acid sequence inferred from the DNA sequences occurred in two clusters. An N-terminal conserved region preceded two regions of variation separated by a central conserved region. Variation in cauliflower mosaic virus (CaMV) gene VI sequences, all of which were derived from virus isolates from hosts from one host family, was similar to that seen in the FMV comparison, though the extent of variation was less. Alignment of gene VI domains from FMV and CaMV revealed regions of amino acid sequence identical in both viruses within the conserved regions. The similarity in the pattern of conserved and variable domains of these two viruses suggests common host-interactive functions in caulimovirus gene VI homologues, and possibly an analogy between caulimoviruses and certain animal viruses in the influence of the host on sequence variability of viral genes.

  8. Repression of the murine interferon alpha 11 gene: identification of negatively acting sequences.

    PubMed Central

    Civas, A; Dion, M; Vodjdani, G; Doly, J

    1991-01-01

    The uninducible murine interferon alpha 11 gene (Mu IFN-alpha 11) shows strong homology with the highly inducible Mu IFN-alpha 4 gene in the promoter region. Negative regulatory sequences located between positions -470 and -145 were characterized in the Mu IFN-alpha 11 promoter. The removal of these sequences leads to virus-inducibility of Mu IFN-alpha 11, while their insertion in the corresponding region of Mu IFN-alpha 4 significantly reduced the inducibility of the Mu IFN-alpha 4 promoter. On the other hand, the virus-responsive element (VRE) of the Mu IFN-alpha 11 differs by a single nucleotide substitution at position -78 from the VRE alpha 4. Constructions carrying either VRE alpha 11 or VRE alpha 4 upstream of a heterologous promoter displayed different virus inducibilities. The -78 A/G substitution affects the inducibility by decreasing the affinity of VRE-binding trans-regulators. Our results suggest that the combined effect of the negative regulatory sequences and of the mutation in the VRE alpha 11 completely silences the Mu IFN-alpha 11 gene. PMID:1886773

  9. Cloning and sequencing of the gene for human β-casein

    SciTech Connect

    Loennerdal, B.; Bergstroem, S.; Andersson, Y.; Hialmarsson, K.; Sundgyist, A.; Hernell, O.

    1990-02-26

    Human β-casein is a major protein in human milk. This protein is part of the casein micelle and has been suggested to have several physiological functions in the newborn. Since there is limited information on β-casein and the factors that affect its concentration in human milk, the authors have isolated and sequenced the gene for this protein. A human mammary gland cDNA library (Clontech) in gt 11 was screened by plaque hybridization using a 42-mer synthetic ³²P-labelled oligonucleotide. Positive clones were identified and isolated, DNA was prepared and the gene isolated by cleavage with EcoRI. Following subcloning (pUC18), restriction mapping and Southern blotting, DNA for sequencing was prepared. The gene was sequenced by the dideoxy method. Human β-casein has 212 amino acids and the amino acid sequence deduced from the nucleotide sequence is 91% identical to the published sequence for human β-casein show a high degree of conservation at the leader peptide and the highly phosphorylated sequences, but also deletions and divergence at several positions. These results provide insight into the structure of the human β-casein gene and will facilitate studies on factors affecting its expression.

  10. Structure and sequence of the gene encoding human keratocan.

    PubMed

    Tasheva, E S; Funderburgh, J L; Funderburgh, M L; Corpuz, L M; Conrad, G W

    1999-01-01

    Keratocan is one of the three major keratan sulfate proteoglycans characteristically expressed in cornea. We have isolated cDNA and genomic clones and determined the sequence of the entire human keratocan (Kera) gene. The gene is spread over 7.65 kb of DNA and contains three exons. An open reading frame starting at the beginning of the second exon encodes a protein of 352 aa. The amino acid sequence of keratocan shows high identity among mammalian species. This evolutionary conservation between the keratocan proteins as well as the restricted expression of Kera gene in cornea suggests that this molecule might be important in developing and maintaining corneal transparency.

  11. Coelacanth genome sequence reveals the evolutionary history of vertebrate genes.

    PubMed

    Noonan, James P; Grimwood, Jane; Danke, Joshua; Schmutz, Jeremy; Dickson, Mark; Amemiya, Chris T; Myers, Richard M

    2004-12-01

    The coelacanth is one of the nearest living relatives of tetrapods. However, a teleost species such as zebrafish or Fugu is typically used as the outgroup in current tetrapod comparative sequence analyses. Such studies are complicated by the fact that teleost genomes have undergone a whole-genome duplication event, as well as individual gene-duplication events. Here, we demonstrate the value of coelacanth genome sequence by complete sequencing and analysis of the protocadherin gene cluster of the Indonesian coelacanth, Latimeria menadoensis. We found that coelacanth has 49 protocadherin cluster genes organized in the same three ordered subclusters, alpha, beta, and gamma, as the 54 protocadherin cluster genes in human. In contrast, whole-genome and tandem duplications have generated two zebrafish protocadherin clusters comprised of at least 97 genes. Additionally, zebrafish protocadherins are far more prone to homogenizing gene conversion events than coelacanth protocadherins, suggesting that recombination- and duplication-driven plasticity may be a feature of teleost genomes. Our results indicate that coelacanth provides the ideal outgroup sequence against which tetrapod genomes can be measured. We therefore present L. menadoensis as a candidate for whole-genome sequencing.

  12. SxtA gene sequence analysis of dinoflagellate Alexandrium minutum

    NASA Astrophysics Data System (ADS)

    Norshaha, Safida Anira; Latib, Norhidayu Abdul; Usup, Gires; Yusof, Nurul Yuziana Mohd

    2015-09-01

    The dinoflagellate Alexandrium minutum is typically known for the production of potent neurotoxins such as saxitoxin, affecting the health of human seafood consumers via paralytic shellfish poisoning (PSP). This phenomenon is related to harmful algal blooms (HABs), which are believed to be influenced by environmental and nutritional factors. A previous study revealed that sxtA is the starting gene involved in the saxitoxin production pathway. The aim of this study was to analyse the sequence of the sxtA gene in A. minutum. The dinoflagellate culture was grown at 26°C with a 16:8-hour light:dark photocycle. After the samples were harvested, RNA was extracted, and complementary DNA (cDNA) was synthesised and amplified by polymerase chain reaction (PCR). The PCR products were then purified and cloned before being sequenced. The sxtA sequence obtained was then analyzed in order to identify the presence of the sxtA gene in Alexandrium minutum.

  13. Analysis of simple sequence repeats in mammalian cell cycle genes.

    PubMed

    Trivedi, Seema; Wills, Christopher; Metzgar, David

    2014-01-01

    Simple sequence repeats (SSRs), or microsatellites, are hyper-mutable and can lead to disorders. Here we explore SSR distribution in cell cycle-associated genes [grouped into: checkpoint; regulation; replication, repair, and recombination (RRR); and transition] in humans and orthologues of eight mammals. Among the gene groups studied, transition genes have the highest SSR density. Trinucleotide repeats are not abundant and introns have higher repeat density than exons. Many repeats in human genes are conserved; however, CG motifs are conserved only in regulation genes. SSR variability in cell cycle genes represents a genetic Achilles' heel, yet SSRs are common in all groups of genes. This tolerance may be due to i) positions in introns where they do not disrupt gene function, ii) essential roles in regulation, iii) specific value of adaptability, and/or iv) lack of negative selection pressure. The present study may be useful for further exploration of their medical relevance and potential functionality.
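    To make the notion of SSR detection and density concrete, the following is a minimal sketch of a perfect-tandem-repeat scanner. It is not the authors' pipeline: the function names, the unit-length range (1 to 6 bp), and the minimum repeat count are illustrative assumptions.

```python
import re


def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return sorted (start, unit, repeat_count) tuples for perfect tandem repeats.

    Thresholds are illustrative assumptions, not values from the abstract.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of unit_len bases followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter motif
            # (e.g. "AA" is reported once, as the mononucleotide "A").
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)


def ssr_density_per_kb(seq, **kwargs):
    """Number of SSR loci per 1000 bases of sequence."""
    return 1000.0 * len(find_ssrs(seq, **kwargs)) / len(seq) if seq else 0.0


if __name__ == "__main__":
    print(find_ssrs("TTCAGCAGCAGCAGTT"))
```

    Running the scanner separately on intron and exon sequences would give the kind of per-region densities the abstract compares; real analyses usually also tolerate imperfect repeats, which this sketch does not.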

  14. Data on meq gene sequence analysis of Ludhiana MDV isolates.

    PubMed

    Gupta, Mridula; Deka, Dipak; Ramneek

    2016-12-01

    The data described are related to the article entitled "Sequence Analysis of Meq oncogene among Indian isolates of Marek's Disease Herpesvirus" M. Gupta, D. Deka, Ramneek, 2016. Seven meq genes of Ludhiana Marek's disease virus (MDV) field isolates were PCR amplified using the proofreading Platinum Pfx DNA polymerase enzyme, sequenced and then analyzed for distinct polymorphisms and point mutations. The sequences were named LDH 1758, LDH 2003, LDH 2483, LDH 2614, LDH 2700, LDH 2929 and LDH 3262. Their deduced Meq amino acid sequences were then compared with the deduced amino acid sequences of previously sequenced meq genes from around the world available in GenBank, to study their identity/similarity with each other. PMID:27656677

  15. Regulatory Divergence between Parental Alleles Determines Gene Expression Patterns in Hybrids

    PubMed Central

    Combes, Marie-Christine; Hueber, Yann; Dereeper, Alexis; Rialle, Stéphanie; Herrera, Juan-Carlos; Lashermes, Philippe

    2015-01-01

    Both hybridization and allopolyploidization generate novel phenotypes by conciliating divergent genomes and regulatory networks in the same cellular context. To understand the rewiring of gene expression in hybrids, the total expression of 21,025 genes and the allele-specific expression of over 11,000 genes were quantified in interspecific hybrids and their parental species, Coffea canephora and Coffea eugenioides using RNA-seq technology. Between parental species, cis- and trans-regulatory divergences affected around 32% and 35% of analyzed genes, respectively, with nearly 17% of them showing both. The relative importance of trans-regulatory divergences between both species could be related to their low genetic divergence and perennial habit. In hybrids, among divergently expressed genes between parental species and hybrids, 77% was expressed like one parent (expression level dominance), including 65% like C. eugenioides. Gene expression was shown to result from the expression of both alleles affected by intertwined parental trans-regulatory factors. A strong impact of C. eugenioides trans-regulatory factors on the upregulation of C. canephora alleles was revealed. The gene expression patterns appeared determined by complex combinations of cis- and trans-regulatory divergences. In particular, the observed biased expression level dominance seemed to be derived from the asymmetric effects of trans-regulatory parental factors on regulation of alleles. More generally, this study illustrates the effects of divergent trans-regulatory parental factors on the gene expression pattern in hybrids. The characteristics of the transcriptional response to hybridization appear to be determined by the compatibility of gene regulatory networks and therefore depend on genetic divergences between the parental species and their evolutionary history. PMID:25819221
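    The cis/trans partition described above can be illustrated with a small sketch. It follows a commonly used framework in which the allelic ratio inside the hybrid estimates the cis effect and the remainder of the parental difference is attributed to trans effects; whether the authors used exactly this scheme is not stated in the abstract. The function name and the fold-change threshold are hypothetical, and a real analysis would use statistical tests on replicate counts rather than a fixed cutoff.

```python
import math


def classify_divergence(parent_a, parent_b, hybrid_a, hybrid_b, min_log2=1.0):
    """Classify regulatory divergence for one gene from expression values.

    parent_a / parent_b: expression in each parental species.
    hybrid_a / hybrid_b: allele-specific expression of each allele in the hybrid.
    min_log2 is an illustrative threshold, not a value from the abstract.
    """
    # In the hybrid both alleles share one trans environment, so any
    # imbalance between them is attributed to cis-regulatory divergence.
    cis = math.log2(hybrid_a / hybrid_b)
    # The parental difference reflects cis and trans effects together.
    total = math.log2(parent_a / parent_b)
    trans = total - cis

    has_cis = abs(cis) >= min_log2
    has_trans = abs(trans) >= min_log2
    if has_cis and has_trans:
        label = "cis+trans"
    elif has_cis:
        label = "cis only"
    elif has_trans:
        label = "trans only"
    else:
        label = "conserved"
    return label, cis, trans


if __name__ == "__main__":
    print(classify_divergence(400, 100, 400, 100))
    print(classify_divergence(400, 100, 100, 100))
```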

  16. Biased distribution of DNA uptake sequences towards genome maintenance genes.

    PubMed

    Davidsen, Tonje; Rødland, Einar A; Lagesen, Karin; Seeberg, Erling; Rognes, Torbjørn; Tønjum, Tone

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H.influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions. These results imply that the high frequency of DUS in genome maintenance genes is conserved among phylogenetically divergent species and thus are of significant biological importance. Increased DUS density is expected to enhance DNA uptake and the over-representation of DUS in genome maintenance genes might reflect facilitated recovery of genome preserving functions. For example, transient and beneficial increase in genome instability can be allowed during pathogenesis simply through loss of antimutator genes, since these DUS-containing sequences will be preferentially recovered. Furthermore, uptake of such genes could provide a mechanism for facilitated recovery from DNA damage after genotoxic stress. PMID:14960717
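    As an illustration of how a DUS density could be compared between gene groups, the sketch below counts a motif on both strands and normalises per kilobase. The helper names are hypothetical, and the example motif GCCGTCTGAA (the 10-mer commonly cited as the Neisseria DUS) is supplied here as an assumption rather than taken from the abstract.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motif(seq, motif):
    """Count occurrences of motif on either strand (overlaps allowed)."""
    seq, motif = seq.upper(), motif.upper()
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return sum(seq[i:i + k] in targets for i in range(len(seq) - k + 1))


def motif_density_per_kb(sequences, motif):
    """Motif occurrences per 1000 bases across a group of gene sequences."""
    total_len = sum(len(s) for s in sequences)
    total_hits = sum(count_motif(s, motif) for s in sequences)
    return 1000.0 * total_hits / total_len if total_len else 0.0


if __name__ == "__main__":
    dus = "GCCGTCTGAA"  # assumed example motif
    gene = "AAAA" + dus + "CCCC" + reverse_complement(dus) + "AA"
    print(count_motif(gene, dus), motif_density_per_kb([gene], dus))
```

    Computing this density separately for genome maintenance genes and for every other annotated group would give the kind of comparison the abstract reports; assessing significance would additionally need a statistical test.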

  17. Pi class glutathione S-transferase genes are regulated by Nrf2 through an evolutionarily conserved regulatory element in zebrafish

    PubMed Central

    Suzuki, Takafumi; Takagi, Yaeko; Osanai, Hitoshi; Li, Li; Takeuchi, Miki; Katoh, Yasutake; Kobayashi, Makoto; Yamamoto, Masayuki

    2005-01-01

    Pi class GSTs (glutathione S-transferases) are members of the vertebrate GST family of proteins that catalyse the conjugation of GSH to electrophilic compounds. The expression of Pi class GST genes can be induced by exposure to electrophiles. We demonstrated previously that the transcription factor Nrf2 (NF-E2 p45-related factor 2) mediates this induction, not only in mammals, but also in fish. In the present study, we have isolated the genomic region of zebrafish containing the genes gstp1 and gstp2. The regulatory regions of zebrafish gstp1 and gstp2 have been examined by GFP (green fluorescent protein)-reporter gene analyses using microinjection into zebrafish embryos. Deletion and point-mutation analyses of the gstp1 promoter showed that an ARE (antioxidant-responsive element)-like sequence, located 50 bp upstream of the transcription initiation site, is essential for Nrf2 transactivation. Using EMSA (electrophoretic mobility-shift assay) analysis we showed that the zebrafish Nrf2–MafK heterodimer specifically bound to this sequence. All the vertebrate Pi class GST genes harbour a similar ARE-like sequence in their promoter regions. We propose that this sequence is a conserved target site for Nrf2 in the Pi class GST genes. PMID:15654768

  18. Mining Gene Regulatory Networks by Neural Modeling of Expression Time-Series.

    PubMed

    Rubiolo, Mariano; Milone, Diego H; Stegmayer, Georgina

    2015-01-01

    Discovering gene regulatory networks from data is one of the most studied topics in recent years. Neural networks can be successfully used to infer an underlying gene network by modeling expression profiles as time series. This work proposes a novel method based on a pool of neural networks for obtaining a gene regulatory network from a gene expression dataset. The networks are used for modeling each possible interaction between pairs of genes in the dataset, and a set of mining rules is applied to accurately detect the underlying relations among genes. The results obtained on artificial and real datasets confirm the method's effectiveness for discovering regulatory networks from a proper modeling of the temporal dynamics of gene expression profiles.

  19. Sequence Variability in Staphylococcal Enterotoxin Genes seb, sec, and sed

    PubMed Central

    Johler, Sophia; Sihto, Henna-Maria; Macori, Guerrino; Stephan, Roger

    2016-01-01

    Ingestion of staphylococcal enterotoxins preformed by Staphylococcus aureus in food leads to staphylococcal food poisoning, the most prevalent foodborne intoxication worldwide. There are five major staphylococcal enterotoxins: SEA, SEB, SEC, SED, and SEE. While variants of these toxins have been described and were linked to specific hosts or levels of enterotoxin production, data on sequence variation is still limited. In this study, we aim to extend the knowledge on promoter and gene variants of the major enterotoxins SEB, SEC, and SED. To this end, we determined seb, sec, and sed promoter and gene sequences of a well-characterized set of enterotoxigenic Staphylococcus aureus strains originating from foodborne outbreaks, human infections, human nasal colonization, rabbits, and cattle. New nucleotide sequence variants were detected for all three enterotoxins and a novel amino acid sequence variant of SED was detected in a strain associated with human nasal colonization. While the seb promoter and gene sequences exhibited a high degree of variability, the sec and sed promoter and gene sequences were more conserved. Interestingly, a truncated variant of sed was detected in all tested sed-harboring rabbit strains. The generated data represent a further step towards improved understanding of strain-specific differences in enterotoxin expression and host-specific variation in enterotoxin sequences. PMID:27258311

  20. Sequence Variability in Staphylococcal Enterotoxin Genes seb, sec, and sed.

    PubMed

    Johler, Sophia; Sihto, Henna-Maria; Macori, Guerrino; Stephan, Roger

    2016-01-01

    Ingestion of staphylococcal enterotoxins preformed by Staphylococcus aureus in food leads to staphylococcal food poisoning, the most prevalent foodborne intoxication worldwide. There are five major staphylococcal enterotoxins: SEA, SEB, SEC, SED, and SEE. While variants of these toxins have been described and were linked to specific hosts or levels of enterotoxin production, data on sequence variation is still limited. In this study, we aim to extend the knowledge on promoter and gene variants of the major enterotoxins SEB, SEC, and SED. To this end, we determined seb, sec, and sed promoter and gene sequences of a well-characterized set of enterotoxigenic Staphylococcus aureus strains originating from foodborne outbreaks, human infections, human nasal colonization, rabbits, and cattle. New nucleotide sequence variants were detected for all three enterotoxins and a novel amino acid sequence variant of SED was detected in a strain associated with human nasal colonization. While the seb promoter and gene sequences exhibited a high degree of variability, the sec and sed promoter and gene sequences were more conserved. Interestingly, a truncated variant of sed was detected in all tested sed-harboring rabbit strains. The generated data represent a further step towards improved understanding of strain-specific differences in enterotoxin expression and host-specific variation in enterotoxin sequences.

  1. Molecular dissection of cis-acting regulatory elements from 5'-proximal regions of a vaccinia virus late gene cluster.

    PubMed

    Miner, J N; Weinrich, S L; Hruby, D E

    1988-01-01

    Promoter elements responsible for directing the transcription of six tightly clustered vaccinia virus (VV) late genes (open reading frames [ORFs] D11, D12, D13, A1, A2, and A3) from the HindIII D/A region of the viral genome were identified within the upstream sequences proximal to each individual locus. These regions were identified as promoters by excising them from the VV genome, abutting them to the bacterial chloramphenicol acetyl transferase gene, and demonstrating their ability to drive expression of the reporter gene in transient-expression assays in an orientation-specific manner. To delineate the 5' boundary of the upstream elements, two of the VV late gene (A1 and D13) promoter: CAT constructs were subjected to deletion mutagenesis procedures. A series of 5' deletions of the ORF A1 promoter from -114 to -24 showed no reduction in promoter activity, whereas additional deletion of the sequences from -24 to +2 resulted in the complete loss of activity. Deletion of the ORF A1 fragment from -114 to -104 resulted in a 24% increase in activity, suggesting the presence of a negative regulatory region. In marked contrast to previous 5' deletion analyses which have identified VV late promoters as 20- to 30-base-pair cap-proximal sequences, 5' deletions to define the upstream boundary of the ORF D13 promoter identified two positive regulatory regions, the first between -235 and -170 and the second between -123 and -106. Background levels of chloramphenicol acetyltransferase expression were obtained with deletions past -88. Significantly, this places the ORF D13 regulatory regions within the upstream coding sequences of the ORF A1. A high-stringency computer search for homologies between VV late promoters that have been thus far characterized was carried out. Several potential consensus sequences were found just upstream from RNA start sites of temporally related promoter elements. Three major conclusions are drawn from these experiments. (i) The presence of

  2. Vitamin C deficiency improves somatic embryo development through distinct gene regulatory networks in Arabidopsis

    PubMed Central

    Becker, Michael G.; Chan, Ainsley; Mao, Xingyu; Girard, Ian J.; Lee, Samantha; Elhiti, Mohamed; Stasolla, Claudio; Belmonte, Mark F.

    2014-01-01

    Changes in the endogenous ascorbate redox status through genetic manipulation of cellular ascorbate levels were shown to accelerate cell proliferation during the induction phase and improve maturation of somatic embryos in Arabidopsis. Mutants defective in ascorbate biosynthesis such as vtc2-5 contained ~70 % less cellular ascorbate compared with their wild-type (WT; Columbia-0) counterparts. Depletion of cellular ascorbate accelerated cell division processes and cellular reorganization and improved the number and quality of mature somatic embryos grown in culture by 6-fold compared with WT tissues. To gain insight into the molecular mechanisms underlying somatic embryogenesis (SE), we profiled dynamic changes in the transcriptome and analysed dominant patterns of gene activity in the WT and vtc2-5 lines across the somatic embryo culturing process. Our results provide insight into the gene regulatory networks controlling SE in Arabidopsis based on the association of transcription factors with DNA sequence motifs enriched in biological processes of large co-expressed gene sets. These data provide the first detailed account of temporal changes in the somatic embryo transcriptome starting with the zygotic embryo, through tissue dedifferentiation, and ending with the mature somatic embryo, and impart insight into possible mechanisms for the improved culture of somatic embryos in the vtc2-5 mutant line. PMID:25151615

  3. A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks

    PubMed Central

    Santra, Tapesh

    2014-01-01

    Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances. PMID:25152886

  4. Cell type-selective disease-association of genes under high regulatory load

    PubMed Central

    Galhardo, Mafalda; Berninger, Philipp; Nguyen, Thanh-Phuong; Sauter, Thomas; Sinkkonen, Lasse

    2015-01-01

    We previously showed that disease-linked metabolic genes are often under combinatorial regulation. Using the genome-wide ChIP-Seq binding profiles for 93 transcription factors in nine different cell lines, we show that genes under high regulatory load are significantly enriched for disease-association across cell types. We find that transcription factor load correlates with the enhancer load of the genes and thereby allows the identification of genes under high regulatory load by epigenomic mapping of active enhancers. Identification of the high enhancer load genes across 139 samples from 96 different cell and tissue types reveals a consistent enrichment for disease-associated genes in a cell type-selective manner. The underlying genes are not limited to super-enhancer genes and show several types of disease-association evidence beyond genetic variation (such as biomarkers). Interestingly, the high regulatory load genes are involved in more KEGG pathways than expected by chance, exhibit increased betweenness centrality in the interaction network of liver disease genes, and carry longer 3′ UTRs with more microRNA (miRNA) binding sites than genes on average, suggesting a role as hubs integrating signals within regulatory networks. In summary, epigenetic mapping of active enhancers presents a promising and unbiased approach for identification of novel disease genes in a cell type-selective manner. PMID:26338775

  5. Cell type-selective disease-association of genes under high regulatory load.

    PubMed

    Galhardo, Mafalda; Berninger, Philipp; Nguyen, Thanh-Phuong; Sauter, Thomas; Sinkkonen, Lasse

    2015-10-15

    We previously showed that disease-linked metabolic genes are often under combinatorial regulation. Using the genome-wide ChIP-Seq binding profiles for 93 transcription factors in nine different cell lines, we show that genes under high regulatory load are significantly enriched for disease-association across cell types. We find that transcription factor load correlates with the enhancer load of the genes and thereby allows the identification of genes under high regulatory load by epigenomic mapping of active enhancers. Identification of the high enhancer load genes across 139 samples from 96 different cell and tissue types reveals a consistent enrichment for disease-associated genes in a cell type-selective manner. The underlying genes are not limited to super-enhancer genes and show several types of disease-association evidence beyond genetic variation (such as biomarkers). Interestingly, the high regulatory load genes are involved in more KEGG pathways than expected by chance, exhibit increased betweenness centrality in the interaction network of liver disease genes, and carry longer 3' UTRs with more microRNA (miRNA) binding sites than genes on average, suggesting a role as hubs integrating signals within regulatory networks. In summary, epigenetic mapping of active enhancers presents a promising and unbiased approach for identification of novel disease genes in a cell type-selective manner.

  6. The nucleosome landscape of Plasmodium falciparum reveals chromatin architecture and dynamics of regulatory sequences

    PubMed Central

    Kensche, Philip Reiner; Hoeijmakers, Wieteke Anna Maria; Toenhake, Christa Geeke; Bras, Maaike; Chappell, Lia; Berriman, Matthew; Bártfai, Richárd

    2016-01-01

    In eukaryotes, the chromatin architecture has a pivotal role in regulating all DNA-associated processes and it is central to the control of gene expression. For Plasmodium falciparum, a causative agent of human malaria, the nucleosome positioning profile of regulatory regions deserves particular attention because of their extreme AT-content. With the aid of a highly controlled MNase-seq procedure we reveal how positioning of nucleosomes provides a structural and regulatory framework to the transcriptional unit by demarcating landmark sites (transcription/translation start and end sites). In addition, our analysis provides strong indications for the function of positioned nucleosomes in splice site recognition. Transcription start sites (TSSs) are bordered by a small nucleosome-depleted region, but lack the stereotypic downstream nucleosome arrays, highlighting a key difference in chromatin organization compared to model organisms. Furthermore, we observe transcription-coupled eviction of nucleosomes on strong TSSs during intraerythrocytic development and demonstrate that nucleosome positioning and dynamics can be predictive for the functionality of regulatory DNA elements. Collectively, the strong nucleosome positioning over splice sites and surrounding putative transcription factor binding sites highlights the regulatory capacity of the nucleosome landscape in this deadly human pathogen. PMID:26578577

  7. The nucleosome landscape of Plasmodium falciparum reveals chromatin architecture and dynamics of regulatory sequences.

    PubMed

    Kensche, Philip Reiner; Hoeijmakers, Wieteke Anna Maria; Toenhake, Christa Geeke; Bras, Maaike; Chappell, Lia; Berriman, Matthew; Bártfai, Richárd

    2016-03-18

    In eukaryotes, the chromatin architecture has a pivotal role in regulating all DNA-associated processes and it is central to the control of gene expression. For Plasmodium falciparum, a causative agent of human malaria, the nucleosome positioning profile of regulatory regions deserves particular attention because of their extreme AT-content. With the aid of a highly controlled MNase-seq procedure we reveal how positioning of nucleosomes provides a structural and regulatory framework to the transcriptional unit by demarcating landmark sites (transcription/translation start and end sites). In addition, our analysis provides strong indications for the function of positioned nucleosomes in splice site recognition. Transcription start sites (TSSs) are bordered by a small nucleosome-depleted region, but lack the stereotypic downstream nucleosome arrays, highlighting a key difference in chromatin organization compared to model organisms. Furthermore, we observe transcription-coupled eviction of nucleosomes on strong TSSs during intraerythrocytic development and demonstrate that nucleosome positioning and dynamics can be predictive for the functionality of regulatory DNA elements. Collectively, the strong nucleosome positioning over splice sites and surrounding putative transcription factor binding sites highlights the regulatory capacity of the nucleosome landscape in this deadly human pathogen.

  8. Molecular cloning and characterization of a chlorophyll degradation regulatory gene (ZjSGR) from Zoysia japonica.

    PubMed

    Teng, K; Chang, Z H; Xiao, G Z; Guo, W E; Xu, L X; Chao, Y H; Han, L B

    2016-01-01

    The stay-green gene (SGR) is a key regulatory factor for chlorophyll degradation and senescence. However, to date, little is known about SGR in Zoysia japonica. In this study, ZjSGR was cloned, using rapid amplification of cDNA ends-polymerase chain reaction (PCR). The target sequence is 831 bp in length, corresponding to 276 amino acids. Protein BLAST results showed that ZjSGR belongs to the stay-green superfamily. A phylogenetic analysis implied that ZjSGR is most closely related to ZmSGR1. The subcellular localization of ZjSGR was investigated, using an Agrobacterium-mediated transient expression assay in Nicotiana benthamiana. Our results demonstrated that ZjSGR protein is localized in the chloroplasts. Quantitative real time PCR was carried out to investigate the expression characteristics of ZjSGR. The expression level of ZjSGR was found to be highest in leaves, and could be strongly induced by natural senescence, darkness, abscisic acid (ABA), and methyl jasmonate treatment. Moreover, an in vivo function analysis indicated that transient overexpression of ZjSGR could accelerate chlorophyll degradation, up-regulate the expression of SAG113, and activate ABA biosynthesis. Taken together, these results provide evidence that ZjSGR could play an important regulatory role in leaf chlorophyll degradation and senescence in plants at the molecular level. PMID:27173268
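    The relation between the 831 bp target sequence and the 276 amino acids can be checked with a short calculation. The sketch assumes that the 831 bp figure is an open reading frame that includes the stop codon; the abstract does not say so explicitly, but the numbers are consistent with it (831 / 3 = 277 codons, one of which is the stop). The function name is hypothetical.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon is counted in the ORF length but encodes no residue.
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    print(protein_length_from_orf(831))
```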

  9. Transcriptome Analysis of an Insecticide Resistant Housefly Strain: Insights about SNPs and Regulatory Elements in Cytochrome P450 Genes

    PubMed Central

    Asp, Torben; Kristensen, Michael

    2016-01-01

    Background: Insecticide resistance in the housefly, Musca domestica, has been investigated for more than 60 years. It will enter a new era after the recent publication of the housefly genome and the development of multiple next generation sequencing technologies. The genetic background of the xenobiotic response can now be investigated in greater detail. Here, we investigate the 454-pyrosequencing transcriptome of the spinosad-resistant 791spin strain in relation to the housefly genome with focus on P450 genes. Results: The de novo assembly of clean reads gave 35,834 contigs consisting of 21,780 sequences of the spinosad resistant strain. The 3,648 sequences were annotated with an enzyme code EC number and were mapped to 124 KEGG pathways, with metabolic processes as the most highly represented pathway. One hundred and twenty contigs were annotated as P450s covering 44 different P450 genes of housefly. Eight differentially expressed P450 genes were identified and investigated for SNPs, CpG islands and common regulatory motifs in promoter and coding regions. Functional annotation clustering of metabolic related genes and motif analysis of P450s revealed their association with epigenetic, transcription and gene expression related functions. The sequence variation analysis resulted in 12 SNPs, eight of which were found in cyp6d1. There is variation in location, size and frequency of CpG islands, and specific motifs were also identified in these P450s. Moreover, identified motifs were associated with GO terms and transcription factors using bioinformatic tools. Conclusion: Transcriptome data of a spinosad resistant strain provide, together with genome data, fundamental support for future research to understand evolution of resistance in houseflies. Here, we report for the first time the SNPs, CpG islands and common regulatory motifs in differentially expressed P450s. Taken together our findings will serve as a stepping stone to advance understanding of the mechanism and role of P450s
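    The abstract does not say how CpG islands were called. As an illustration only, the sketch below computes the two quantities most often used for this purpose, GC content and the observed/expected CpG ratio, and applies the widely used Gardiner-Garden and Frommer thresholds (length of at least 200 bp, GC content above 50%, observed/expected above 0.6). The function names are hypothetical and the thresholds are an assumption about what a typical tool would use.

```python
def cpg_stats(seq):
    """Return (gc_content, observed_over_expected_cpg) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cg = seq.count("CG")  # the dinucleotide CG cannot overlap itself
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C * G) / N.
    obs_exp = (cg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed length, GC and observed/expected thresholds."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content > min_gc and obs_exp > min_obs_exp


if __name__ == "__main__":
    print(cpg_stats("CGCGCGCG"))
    print(looks_like_cpg_island("CG" * 100))
```

    In practice these statistics are evaluated in a sliding window along promoter and coding regions, which is what allows the location, size and frequency of islands to be compared between genes.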

  10. Gene regulatory network inference using fused LASSO on multiple data sets.

    PubMed

    Omranian, Nooshin; Eloundou-Mbebi, Jeanne M O; Mueller-Roeber, Bernd; Nikoloski, Zoran

    2016-02-11

    Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed in a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from prokaryotic species, Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions.
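    For orientation, a generic fused LASSO objective that encodes the first two constraints can be written as follows. This is a textbook-style sketch, not the formulation from the paper: the symbols (expression vector y_i^{(k)} of gene i in data set k, transcription factor expression matrix X^{(k)}, coefficient vectors beta_i^{(k)}, penalties lambda_1 and lambda_2) are assumptions introduced here, and constraint (3) is not modelled.

```latex
\min_{\beta_i^{(1)},\dots,\beta_i^{(K)}}\;
  \sum_{k=1}^{K} \bigl\| y_i^{(k)} - X^{(k)} \beta_i^{(k)} \bigr\|_2^2
  \;+\; \lambda_1 \sum_{k=1}^{K} \bigl\| \beta_i^{(k)} \bigr\|_1
  \;+\; \lambda_2 \sum_{k < k'} \bigl\| \beta_i^{(k)} - \beta_i^{(k')} \bigr\|_1
```

    The first penalty favours a small number of regulators per gene (constraint 1), and the second penalises differences between the coefficient vectors fitted on different data sets, so that the inferred networks stay similar (constraint 2).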

  11. Gene regulatory network inference using fused LASSO on multiple data sets

    PubMed Central

    Omranian, Nooshin; Eloundou-Mbebi, Jeanne M. O.; Mueller-Roeber, Bernd; Nikoloski, Zoran

    2016-01-01

    Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed in a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from prokaryotic species, Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions. PMID:26864687

  12. Large-scale modeling of condition-specific gene regulatory networks by information integration and inference

    PubMed Central

    Ellwanger, Daniel Christian; Leonhardt, Jörn Florian; Mewes, Hans-Werner

    2014-01-01

    Understanding how regulatory networks globally coordinate the response of a cell to changing conditions, such as perturbations by shifting environments, is an elementary challenge in systems biology which has yet to be met. Genome-wide gene expression measurements are high dimensional as these are reflecting the condition-specific interplay of thousands of cellular components. The integration of prior biological knowledge into the modeling process of systems-wide gene regulation enables the large-scale interpretation of gene expression signals in the context of known regulatory relations. We developed COGERE (http://mips.helmholtz-muenchen.de/cogere), a method for the inference of condition-specific gene regulatory networks in human and mouse. We integrated existing knowledge of regulatory interactions from multiple sources to a comprehensive model of prior information. COGERE infers condition-specific regulation by evaluating the mutual dependency between regulator (transcription factor or miRNA) and target gene expression using prior information. This dependency is scored by the non-parametric, nonlinear correlation coefficient η² (eta squared) that is derived by a two-way analysis of variance. We show that COGERE significantly outperforms alternative methods in predicting condition-specific gene regulatory networks on simulated data sets. Furthermore, by inferring the cancer-specific gene regulatory network from the NCI-60 expression study, we demonstrate the utility of COGERE to promote hypothesis-driven clinical research.
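    To show what an eta squared score measures, here is a minimal sketch of the simpler one-way form, the share of total variance in target expression explained by grouping samples according to regulator level. COGERE derives its score from a two-way analysis of variance whose details are not given in the abstract, so this is an illustration of the statistic rather than of the tool; the function name and the grouping scheme are assumptions.

```python
def eta_squared(groups):
    """One-way eta squared: SS_between / SS_total.

    groups: list of lists, each holding target-gene expression values for
    samples that share one (binned) regulator level.
    """
    values = [v for g in groups for v in g]
    if not values:
        return 0.0
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups if g
    )
    return ss_between / ss_total if ss_total else 0.0


if __name__ == "__main__":
    # Target expression fully determined by the regulator group.
    print(eta_squared([[1, 1], [3, 3]]))
    # Target expression unrelated to the regulator group.
    print(eta_squared([[1, 3], [1, 3]]))
```

    Because the score depends only on group means and variances, it can pick up nonlinear dependencies that a linear correlation coefficient would miss.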

  13. Large-scale modeling of condition-specific gene regulatory networks by information integration and inference.

    PubMed

    Ellwanger, Daniel Christian; Leonhardt, Jörn Florian; Mewes, Hans-Werner

    2014-12-01

    Understanding how regulatory networks globally coordinate the response of a cell to changing conditions, such as perturbations by shifting environments, is an elementary challenge in systems biology which has yet to be met. Genome-wide gene expression measurements are high dimensional as these are reflecting the condition-specific interplay of thousands of cellular components. The integration of prior biological knowledge into the modeling process of systems-wide gene regulation enables the large-scale interpretation of gene expression signals in the context of known regulatory relations. We developed COGERE (http://mips.helmholtz-muenchen.de/cogere), a method for the inference of condition-specific gene regulatory networks in human and mouse. We integrated existing knowledge of regulatory interactions from multiple sources to a comprehensive model of prior information. COGERE infers condition-specific regulation by evaluating the mutual dependency between regulator (transcription factor or miRNA) and target gene expression using prior information. This dependency is scored by the non-parametric, nonlinear correlation coefficient η² (eta squared) that is derived by a two-way analysis of variance. We show that COGERE significantly outperforms alternative methods in predicting condition-specific gene regulatory networks on simulated data sets. Furthermore, by inferring the cancer-specific gene regulatory network from the NCI-60 expression study, we demonstrate the utility of COGERE to promote hypothesis-driven clinical research.

  14. Large scale gene regulatory network inference with a multi-level strategy.

    PubMed

    Wu, Jun; Zhao, Xiaodong; Lin, Zongli; Shao, Zhifeng

    2016-02-01

    Transcriptional regulation is a basis of many crucial molecular processes and an accurate inference of the gene regulatory network is a helpful and essential task to understand cell functions and gain insights into biological processes of interest in systems biology. Inspired by the Dialogue for Reverse Engineering Assessments and Methods (DREAM) projects, many excellent gene regulatory network inference algorithms have been proposed. However, it is still a challenging problem to infer a gene regulatory network from gene expression data on a large scale. In this paper, we propose a gene regulatory network inference method based on a multi-level strategy (GENIMS), which can give results that are more accurate and robust than the state-of-the-art methods. The proposed method mainly consists of three levels, which are an original feature selection step based on guided regularized random forest, normalization of individual feature selection and the final refinement step according to the topological property of the gene regulatory network. To prove the accuracy and robustness of our method, we compare our method with the state-of-the-art methods on the DREAM4 and DREAM5 benchmark networks and the results indicate that the proposed method can significantly improve the performance of gene regulatory network inference. Additionally, we also discuss the influence of the selection of different parameters in our method. PMID:26687446

  15. Reverse engineering and analysis of genome-wide gene regulatory networks from gene expression profiles using high-performance computing.

    PubMed

    Belcastro, Vincenzo; Gregoretti, Francesco; Siciliano, Velia; Santoro, Michele; D'Angelo, Giovanni; Oliva, Gennaro; di Bernardo, Diego

    2012-01-01

    Regulation of gene expression is a carefully regulated phenomenon in the cell. “Reverse-engineering” algorithms try to reconstruct the regulatory interactions among genes from genome-scale measurements of gene expression profiles (microarrays). Mammalian cells express tens of thousands of genes; hence, hundreds of gene expression profiles are necessary in order to have acceptable statistical evidence of interactions between genes. As the number of profiles to be analyzed increases, so do computational costs and memory requirements. In this work, we designed and developed a parallel computing algorithm to reverse-engineer genome-scale gene regulatory networks from thousands of gene expression profiles. The algorithm is based on computing pairwise Mutual Information for each gene pair. We successfully tested it to reverse engineer the Mus musculus (mouse) gene regulatory network in liver from gene expression profiles collected from a public repository. A parallel hierarchical clustering algorithm was implemented to discover “communities” within the gene network. Network communities are enriched for genes involved in the same biological functions. The inferred network was used to identify two mitochondrial proteins.
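
The pairwise mutual information step described above can be sketched as follows. This is a serial, plug-in estimate on already discretized profiles; the parallelization, the discretization scheme and the function names are assumptions for illustration and do not reproduce the authors' implementation.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(x, y):
    """Mutual information (in bits) between two equally long discrete profiles."""
    n = len(x)
    count_x, count_y = Counter(x), Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts.
        mi += (c / n) * log2(c * n / (count_x[a] * count_y[b]))
    return mi


def pairwise_mi(profiles):
    """Score every unordered gene pair; `profiles` maps gene name -> discretized profile."""
    return {
        (g1, g2): mutual_information(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }
```

Because each pair is scored independently, the loop over pairs is the natural unit of work to distribute across processors, and the number of pairs grows quadratically with the number of genes.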

  16. The Association between Infants' Self-Regulatory Behavior and MAOA Gene Polymorphism

    ERIC Educational Resources Information Center

    Zhang, Minghao; Chen, Xinyin; Way, Niobe; Yoshikawa, Hirokazu; Deng, Huihua; Ke, Xiaoyan; Yu, Weiwei; Chen, Ping; He, Chuan; Chi, Xia; Lu, Zuhong

    2011-01-01

    Self-regulatory behavior in early childhood is an important characteristic that has considerable implications for the development of adaptive and maladaptive functioning. The present study investigated the relations between a functional polymorphism in the upstream region of monoamine oxidase A gene (MAOA) and self-regulatory behavior in a sample…

  17. Natural selection on coding and noncoding DNA sequences is associated with virulence genes in a plant pathogenic fungus.

    PubMed

    Rech, Gabriel E; Sanz-Martín, José M; Anisimova, Maria; Sukno, Serenella A; Thon, Michael R

    2014-09-04

    Natural selection leaves imprints on DNA, offering the opportunity to identify functionally important regions of the genome. Identifying the genomic regions affected by natural selection within pathogens can aid in the pursuit of effective strategies to control diseases. In this study, we analyzed genome-wide patterns of selection acting on different classes of sequences in a worldwide sample of eight strains of the model plant-pathogenic fungus Colletotrichum graminicola. We found evidence of selective sweeps, balancing selection, and positive selection affecting both protein-coding and noncoding DNA of pathogenicity-related sequences. Genes encoding putative effector proteins and secondary metabolite biosynthetic enzymes show evidence of positive selection acting on the coding sequence, consistent with an Arms Race model of evolution. The 5' untranslated regions (UTRs) of genes coding for effector proteins and genes upregulated during infection show an excess of high-frequency polymorphisms likely the consequence of balancing selection and consistent with the Red Queen hypothesis of evolution acting on these putative regulatory sequences. Based on the findings of this work, we propose that even though adaptive substitutions on coding sequences are important for proteins that interact directly with the host, polymorphisms in the regulatory sequences may confer flexibility of gene expression in the virulence processes of this important plant pathogen.

  18. Natural Selection on Coding and Noncoding DNA Sequences Is Associated with Virulence Genes in a Plant Pathogenic Fungus

    PubMed Central

    Rech, Gabriel E.; Sanz-Martín, José M.; Anisimova, Maria; Sukno, Serenella A.; Thon, Michael R.

    2014-01-01

    Natural selection leaves imprints on DNA, offering the opportunity to identify functionally important regions of the genome. Identifying the genomic regions affected by natural selection within pathogens can aid in the pursuit of effective strategies to control diseases. In this study, we analyzed genome-wide patterns of selection acting on different classes of sequences in a worldwide sample of eight strains of the model plant-pathogenic fungus Colletotrichum graminicola. We found evidence of selective sweeps, balancing selection, and positive selection affecting both protein-coding and noncoding DNA of pathogenicity-related sequences. Genes encoding putative effector proteins and secondary metabolite biosynthetic enzymes show evidence of positive selection acting on the coding sequence, consistent with an Arms Race model of evolution. The 5′ untranslated regions (UTRs) of genes coding for effector proteins and genes upregulated during infection show an excess of high-frequency polymorphisms likely the consequence of balancing selection and consistent with the Red Queen hypothesis of evolution acting on these putative regulatory sequences. Based on the findings of this work, we propose that even though adaptive substitutions on coding sequences are important for proteins that interact directly with the host, polymorphisms in the regulatory sequences may confer flexibility of gene expression in the virulence processes of this important plant pathogen. PMID:25193312

  19. A Collection of Conserved Noncoding Sequences to Study Gene Regulation in Flowering Plants.

    PubMed

    Van de Velde, Jan; Van Bel, Michiel; Vaneechoutte, Dries; Vandepoele, Klaas

    2016-08-01

    Transcription factors (TFs) regulate gene expression by binding cis-regulatory elements, of which the identification remains an ongoing challenge owing to the prevalence of large numbers of nonfunctional TF binding sites. Powerful comparative genomics methods, such as phylogenetic footprinting, can be used for the detection of conserved noncoding sequences (CNSs), which are functionally constrained and can greatly help in reducing the number of false-positive elements. In this study, we applied a phylogenetic footprinting approach for the identification of CNSs in 10 dicot plants, yielding 1,032,291 CNSs associated with 243,187 genes. To annotate CNSs with TF binding sites, we made use of binding site information for 642 TFs originating from 35 TF families in Arabidopsis (Arabidopsis thaliana). In three species, the identified CNSs were evaluated using TF chromatin immunoprecipitation sequencing data, resulting in significant overlap for the majority of data sets. To identify ultraconserved CNSs, we included genomes of additional plant families and identified 715 binding sites for 501 genes conserved in dicots, monocots, mosses, and green algae. Additionally, we found that genes that are part of conserved mini-regulons have a higher coherence in their expression profile than other divergent gene pairs. All identified CNSs were integrated in the PLAZA 3.0 Dicots comparative genomics platform (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/) together with new functionalities facilitating the exploration of conserved cis-regulatory elements and their associated genes. The availability of this data set in a user-friendly platform enables the exploration of functional noncoding DNA to study gene regulation in a variety of plant species, including crops. PMID:27261064

  20. A Collection of Conserved Noncoding Sequences to Study Gene Regulation in Flowering Plants

    PubMed Central

    2016-01-01

    Transcription factors (TFs) regulate gene expression by binding cis-regulatory elements, of which the identification remains an ongoing challenge owing to the prevalence of large numbers of nonfunctional TF binding sites. Powerful comparative genomics methods, such as phylogenetic footprinting, can be used for the detection of conserved noncoding sequences (CNSs), which are functionally constrained and can greatly help in reducing the number of false-positive elements. In this study, we applied a phylogenetic footprinting approach for the identification of CNSs in 10 dicot plants, yielding 1,032,291 CNSs associated with 243,187 genes. To annotate CNSs with TF binding sites, we made use of binding site information for 642 TFs originating from 35 TF families in Arabidopsis (Arabidopsis thaliana). In three species, the identified CNSs were evaluated using TF chromatin immunoprecipitation sequencing data, resulting in significant overlap for the majority of data sets. To identify ultraconserved CNSs, we included genomes of additional plant families and identified 715 binding sites for 501 genes conserved in dicots, monocots, mosses, and green algae. Additionally, we found that genes that are part of conserved mini-regulons have a higher coherence in their expression profile than other divergent gene pairs. All identified CNSs were integrated in the PLAZA 3.0 Dicots comparative genomics platform (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/) together with new functionalities facilitating the exploration of conserved cis-regulatory elements and their associated genes. The availability of this data set in a user-friendly platform enables the exploration of functional noncoding DNA to study gene regulation in a variety of plant species, including crops. PMID:27261064

  1. Diverse nucleotide compositions and sequence fluctuation in Rubisco protein genes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Dehipawala, S.; Cheung, E.; Bienaime, R.; Ye, J.; Tremberger, G., Jr.; Schneider, P.; Lieberman, D.; Cheung, T.

    2011-10-01

    The Rubisco protein-enzyme is arguably the most abundant protein on Earth. The biology dogma of transcription and translation necessitates the study of the Rubisco genes and Rubisco-like genes in various species. Stronger correlation of fractal dimension of the atomic number fluctuation along a DNA sequence with Shannon entropy has been observed in the studied Rubisco-like gene sequences, suggesting a more diverse evolutionary pressure and constraints in the Rubisco sequences. The strategy of using metal for structural stabilization appears to be an ancient mechanism, with data from the porphobilinogen deaminase gene in Capsaspora owczarzaki and Monosiga brevicollis. Using the chi-square distance probability, our analysis supports the conjecture that the more ancient Rubisco-like sequence in Microcystis aeruginosa would have experienced very different evolutionary pressure and bio-chemical constraint as compared to Bordetella bronchiseptica, the two microbes occupying either end of the correlation graph. Our exploratory study would indicate that high fractal dimension Rubisco sequence would support high carbon dioxide rate via the Michaelis-Menten coefficient; with implication for the control of the whooping cough pathogen Bordetella bronchiseptica, a microbe containing a high fractal dimension Rubisco-like sequence (2.07). Using the internal comparison of chi-square distance probability for 16S rRNA (~ E-22) versus radiation repair Rec-A gene (~ E-05) in high GC content Deinococcus radiodurans, our analysis supports the conjecture that high GC content microbes containing Rubisco-like sequence are likely to include an extra-terrestrial origin, relative to Deinococcus radiodurans. Similar photosynthesis process that could utilize host star radiation would not compete with radiation resistant process from the biology dogma perspective in environments such as Mars and exoplanets.

  2. A saturation screen for cis-acting regulatory DNA in the Hox genes of Ciona intestinalis

    SciTech Connect

    Keys, David N.; Lee, Byung-in; Di Gregorio, Anna; Harafuji, Naoe; Detter, Chris; Wang, Mei; Kahsai, Orsalem; Ahn, Sylvia; Arellano, Andre; Zhang, Quin; Trong, Stephan; Doyle, Sharon A.; Satoh, Noriyuki; Satou, Yutaka; Saiga, Hidetoshi; Christian, Allen; Rokhsar, Dan; Hawkins, Trevor L.; Levine, Mike; Richardson, Paul

    2005-01-05

    A screen for the systematic identification of cis-regulatory elements within large (>100 kb) genomic domains containing Hox genes was performed by using the basal chordate Ciona intestinalis. Randomly generated DNA fragments from bacterial artificial chromosomes containing two clusters of Hox genes were inserted into a vector upstream of a minimal promoter and lacZ reporter gene. A total of 222 resultant fusion genes were separately electroporated into fertilized eggs, and their regulatory activities were monitored in larvae. In sum, 21 separable cis-regulatory elements were found. These include eight Hox linked domains that drive expression in nested anterior-posterior domains of ectodermally derived tissues. In addition to vertebrate-like CNS regulation, the discovery of cis-regulatory domains that drive epidermal transcription suggests that C. intestinalis has arthropod-like Hox patterning in the epidermis.

  3. Sequences contained within the promoter of the human thymidine kinase gene can direct cell-cycle regulation of heterologous fusion genes.

    PubMed Central

    Kim, Y K; Wells, S; Lau, Y F; Lee, A S

    1988-01-01

    Recent evidence on the transcriptional regulation of the human thymidine kinase (TK) gene raises the possibility that cell-cycle regulatory sequences may be localized within its promoter. A hybrid gene that combines the TK 5' flanking sequence and the coding region of the bacterial neomycin-resistance gene (neo) has been constructed. Upon transfection into a hamster fibroblast cell line K12, the hybrid gene exhibits cell-cycle-dependent expression. Deletion analysis reveals that the region important for cell-cycle regulation is within -441 to -63 nucleotides from the transcriptional initiation site. This region (-441 to -63) also confers cell-cycle regulation to the herpes simplex virus thymidine kinase (HSVtk) promoter, which is not expressed in a cell-cycle manner. We conclude that the -441 to -63 sequence within the human TK promoter is important for cell-cycle-dependent expression. PMID:3413063

  4. Targeting of AID-mediated sequence diversification to immunoglobulin genes.

    PubMed

    Kothapalli, Naga Rama; Fugmann, Sebastian D

    2011-04-01

    Activation-induced cytidine deaminase (AID) is a key enzyme for antibody-mediated immune responses. Antibodies are encoded by the immunoglobulin genes and AID acts as a transcription-dependent DNA mutator on these genes to improve antibody affinity and effector functions. An emerging theme in the field is that many transcribed genes are potential targets of AID, presenting an obvious danger to genomic integrity. Thus there are mechanisms in place to ensure that mutagenic outcomes of AID activity are specifically restricted to the immunoglobulin loci. Cis-regulatory targeting elements mediate this effect and their mode of action is probably a combination of immunoglobulin gene specific activation of AID and a perversion of faithful DNA repair towards error-prone outcomes.

  5. Identifying Gene Regulatory Networks in Arabidopsis by In Silico Prediction, Yeast-1-Hybrid, and Inducible Gene Profiling Assays.

    PubMed

    Sparks, Erin E; Benfey, Philip N

    2016-01-01

    A system-wide understanding of gene regulation will provide deep insights into plant development and physiology. In this chapter we describe a threefold approach to identify the gene regulatory networks in Arabidopsis thaliana that function in a specific tissue or biological process. Since no single method is sufficient to establish comprehensive and high-confidence gene regulatory networks, we focus on the integration of three approaches. First, we describe an in silico prediction method of transcription factor-DNA binding, then an in vivo assay of transcription factor-DNA binding by yeast-1-hybrid and lastly the identification of co-expression clusters by transcription factor induction in planta. Each of these methods provides a unique tool to advance our understanding of gene regulation, and together provide a robust model for the generation of gene regulatory networks.

  6. Mechanistic Explanations for Restricted Evolutionary Paths That Emerge from Gene Regulatory Networks

    PubMed Central

    Cotterell, James; Sharpe, James

    2013-01-01

    The extent and the nature of the constraints to evolutionary trajectories are central issues in biology. Constraints can be the result of systems dynamics causing a non-linear mapping between genotype and phenotype. How prevalent are these developmental constraints and what is their mechanistic basis? Although this has been extensively explored at the level of epistatic interactions between nucleotides within a gene, or amino acids within a protein, selection acts at the level of the whole organism, and therefore epistasis between disparate genes in the genome is expected due to their functional interactions within gene regulatory networks (GRNs) which are responsible for many aspects of organismal phenotype. Here we explore epistasis within GRNs capable of performing a common developmental function – converting a continuous morphogen input into discrete spatial domains. By exploring the full complement of GRN wiring designs that are able to perform this function, we analyzed all possible mutational routes between functional GRNs. Through this study we demonstrate that mechanistic constraints are common for GRNs that perform even a simple function. We demonstrate a common mechanistic cause for such a constraint involving complementation between counter-balanced gene-gene interactions. Furthermore we show how such constraints can be bypassed by means of “permissive” mutations that buffer changes in a direct route between two GRN topologies that would normally be unviable. We show that such bypasses are common and thus we suggest that unlike what was observed in protein sequence-function relationships, the “tape of life” is less reproducible when one considers higher levels of biological organization. PMID:23613807

  7. Illumina MiSeq sequencing disfavours a sequence motif in the GFP reporter gene.

    PubMed

    Van den Hoecke, Silvie; Verhelst, Judith; Saelens, Xavier

    2016-01-01

    Green fluorescent protein (GFP) is one of the most used reporter genes. We have used next-generation sequencing (NGS) to analyse the genetic diversity of a recombinant influenza A virus that expresses GFP and found a remarkable coverage dip in the GFP coding sequence. This coverage dip was present when virus-derived RT-PCR product or the parental plasmid DNA was used as starting material for NGS and regardless of whether Nextera XT transposase or Covaris shearing was used for DNA fragmentation. Therefore, the sequence coverage dip in the GFP coding sequence was not the result of emerging GFP mutant viruses or a bias introduced by Nextera XT fragmentation. Instead, we found that the Illumina MiSeq sequencing method disfavours the 'CCCGCC' motif in the GFP coding sequence. PMID:27193250
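
As a small illustration of how one might locate the motif named above in a reference sequence before inspecting coverage around it, here is a minimal sketch. The function name is hypothetical, only the forward strand is scanned, and nothing here reproduces the authors' analysis pipeline.

```python
def motif_positions(sequence, motif="CCCGCC"):
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one base so that overlapping occurrences are also reported.
        start = sequence.find(motif, start + 1)
    return positions
```

The returned positions could then be compared with per-base read depth to check whether coverage dips coincide with motif occurrences.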

  8. Illumina MiSeq sequencing disfavours a sequence motif in the GFP reporter gene

    PubMed Central

    Van den Hoecke, Silvie; Verhelst, Judith; Saelens, Xavier

    2016-01-01

    Green fluorescent protein (GFP) is one of the most used reporter genes. We have used next-generation sequencing (NGS) to analyse the genetic diversity of a recombinant influenza A virus that expresses GFP and found a remarkable coverage dip in the GFP coding sequence. This coverage dip was present when virus-derived RT-PCR product or the parental plasmid DNA was used as starting material for NGS and regardless of whether Nextera XT transposase or Covaris shearing was used for DNA fragmentation. Therefore, the sequence coverage dip in the GFP coding sequence was not the result of emerging GFP mutant viruses or a bias introduced by Nextera XT fragmentation. Instead, we found that the Illumina MiSeq sequencing method disfavours the ‘CCCGCC’ motif in the GFP coding sequence. PMID:27193250

  9. Molecular cloning, sequencing analysis, and chromosomal localization of the human protease inhibitor 4 (Kallistatin) gene (PI4)

    SciTech Connect

    Chai, K.X.; Chao, J.; Chao, L.; Ward, D.C.

    1994-09-15

    The gene encoding human protease inhibitor 4 (kallistatin; gene symbol PI4), a novel serine proteinase inhibitor (serpin), has been isolated and completely sequenced. The kallistatin gene is 9618 bp in length and contains five exons and four introns. The structure and organization of the kallistatin gene are similar to those of the genes encoding α1-antichymotrypsin. The kallistatin gene is also similar to the genes encoding rat and mouse kallikrein-binding proteins. The first exon of the kallistatin gene is a noncoding 89-bp fragment, as determined by primer extension. The fifth exon, which contains 308 bp of noncoding sequence, encodes the reactive center of kallistatin. In the 5'-flanking region of the kallistatin gene, 1125 bp have been sequenced and a consensus promoter segment with potential transcription regulatory sites, including CAAT and TATA boxes, an AP-2 binding site, a GC-rich region, a cAMP response element, and an AP-1 binding site, has been identified within this region. The kallistatin gene was localized by in situ hybridization to human chromosome 14q31-q32.1, close to the serpin genes encoding α1-antichymotrypsin, protein C inhibitor, α1-antitrypsin, and corticosteroid-binding globulin. In a genomic DNA Southern blot, kallistatin-related genes were identified in monkey, mouse, rat, bovine, dog, cat, and a ground mole. The patterns of hybridization revealed clues of human serpin evolution. 34 refs., 6 figs.

  10. Inferring gene regulatory networks via nonlinear state-space models and exploiting sparsity.

    PubMed

    Noor, Amina; Serpedin, Erchin; Nounou, Mohamed; Nounou, Hazem N

    2012-01-01

    This paper considers the problem of learning the structure of gene regulatory networks from gene expression time series data. A more realistic scenario when the state space model representing a gene network evolves nonlinearly is considered while a linear model is assumed for the microarray data. To capture the nonlinearity, a particle filter-based state estimation algorithm is considered instead of the contemporary linear approximation-based approaches. The parameters characterizing the regulatory relations among various genes are estimated online using a Kalman filter. Since a particular gene interacts with a few other genes only, the parameter vector is expected to be sparse. The state estimates delivered by the particle filter and the observed microarray data are then subjected to a LASSO-based least squares regression operation which yields a parsimonious and efficient description of the regulatory network by setting the irrelevant coefficients to zero. The performance of the aforementioned algorithm is compared with the extended Kalman filter (EKF) and Unscented Kalman Filter (UKF) employing the Mean Square Error (MSE) as the fidelity criterion in recovering the parameters of gene regulatory networks from synthetic data and real biological data. Extensive computer simulations illustrate that the proposed particle filter-based network inference algorithm outperforms EKF and UKF, and therefore, it can serve as a natural framework for modeling gene regulatory networks with nonlinear and sparse structure. PMID:22350207
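
The way a LASSO penalty sets irrelevant coefficients exactly to zero can be illustrated with the soft-thresholding operator, which gives the LASSO solution coordinate by coordinate in the special case of an orthonormal design. This is a minimal sketch of that textbook special case only; it is not the particle-filter pipeline of the paper, and the function names and the penalty weight `lam` are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink `value` toward zero by `lam`; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def sparse_coefficients(least_squares_coefs, lam):
    """Apply soft-thresholding to each least-squares coefficient (orthonormal case)."""
    return [soft_threshold(c, lam) for c in least_squares_coefs]
```

Weak regulatory weights fall inside the threshold band and vanish, which is what produces a parsimonious network description.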

  11. Identification of a regulatory domain controlling the Nppa-Nppb gene cluster during heart development and stress.

    PubMed

    Sergeeva, Irina A; Hooijkaas, Ingeborg B; Ruijter, Jan M; van der Made, Ingeborg; de Groot, Nina E; van de Werken, Harmen J G; Creemers, Esther E; Christoffels, Vincent M

    2016-06-15

    The paralogous genes Nppa and Nppb are organized in an evolutionarily conserved cluster and provide a valuable model for studying co-regulation and regulatory landscape organization during heart development and disease. Here, we analyzed the chromatin conformation, epigenetic status and enhancer potential of sequences of the Nppa-Nppb cluster in vivo. Our data indicate that the regulatory landscape of the cluster is present within a 60-kb domain centered around Nppb. Both promoters and several potential regulatory elements interact with each other in a similar manner in different tissues and developmental stages. The distribution of H3K27ac and the association of Pol2 across the locus changed during cardiac hypertrophy, revealing their potential involvement in stress-mediated gene regulation. Functional analysis of double-reporter transgenic mice revealed that Nppa and Nppb share developmental, but not stress-response, enhancers, responsible for their co-regulation. Moreover, the Nppb promoter was required, but not sufficient, for hypertrophy-induced Nppa expression. In summary, the developmental regulation and stress response of the Nppa-Nppb cluster involve the concerted action of multiple enhancers and epigenetic changes distributed across a structurally rigid regulatory domain. PMID:27048739

  12. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species for which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in intron 1. In the 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was observed as an exact match only in ovine, with one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed a large sequence identity. Notably, the octapeptide repeat region was observed in all the species but frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences.
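
The comparison of a functional sequence such as TTTTTAT across species, allowing one or two mismatches, can be sketched as a sliding-window Hamming-distance scan. This is a minimal illustration under stated assumptions: the function name and the example sequences in the tests are hypothetical, and the abstract does not say which tool the authors used.

```python
def approximate_matches(sequence, motif, max_mismatches):
    """Return (position, window, mismatches) for windows within the mismatch budget."""
    hits = []
    for i in range(len(sequence) - len(motif) + 1):
        window = sequence[i:i + len(motif)]
        # Hamming distance between the window and the motif.
        mismatches = sum(1 for a, b in zip(window, motif) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits
```

Calling it with `max_mismatches=0` reports only exact matches, while a budget of 2 would also report near matches of the kind the abstract describes for the other species.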

  13. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications.

    PubMed

    Herzog, M; Maroteaux, L

    1986-11-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage.

  14. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications

    PubMed Central

    Herzog, Michel; Maroteaux, Luc

    1986-01-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage. PMID:16578795

  15. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications.

    PubMed

    Herzog, M; Maroteaux, L

    1986-11-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage. PMID:16578795

  16. Promoter-like sequences regulating transcriptional activity in neurexin and neuroligin genes.

    PubMed

    Runkel, Fabian; Rohlmann, Astrid; Reissner, Carsten; Brand, Stefan-Martin; Missler, Markus

    2013-10-01

    Synapse function requires the cell-adhesion molecules neurexins (Nrxn) and neuroligins (Nlgn). Although these molecules are essential for neurotransmission and prefer distinct isoform combinations for interaction, little is known about their transcriptional regulation. Here, we started to explore this important aspect because expression of Nrxn1-3 and Nlgn1-3 genes is altered in mice lacking the transcriptional regulator methyl-CpG-binding protein2 (MeCP2). Since MeCP2 can bind to methylated CpG-dinucleotides and Nrxn/Nlgn contain CpG-islands, we tested genomic sequences for transcriptional activity in reporter gene assays. We found that their influences on transcription are differentially activating or inhibiting. As we observed an activity difference between heterologous and neuronal cell lines for distinct Nrxn1 and Nlgn2 sequences, we dissected their putative promoter regions. In both genes, we identify regions in exon1 that can induce transcription, in addition to the alternative transcriptional start points in exon2. While the 5'-regions of Nrxn1 and Nlgn2 contain two CpG-rich elements that show distinct methylation frequency and binding to MeCP2, other regions may act independently of this transcriptional regulator. These data provide first insights into regulatory sequences of Nrxn and Nlgn genes that may represent an important aspect of their function at synapses in health and disease.
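
    The record above refers to CpG islands and CpG-rich elements without saying how they were identified. As a rough illustration only, the sketch below computes two statistics commonly used to flag CpG-rich sequence: GC fraction and the observed/expected CpG ratio. The thresholds (length of at least 200 bp, GC fraction above 0.5, ratio above 0.6) are widely used conventional values assumed here, not taken from the abstract, and the function names are hypothetical.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    s = seq.upper()
    n = len(s)
    c, g = s.count("C"), s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently is (C * G) / N.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply the assumed conventional thresholds to a single sequence window."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp
```

    A real scan would slide such a window along the genomic sequence; this sketch only scores one window.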

  17. Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

    PubMed Central

    Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael

    2013-01-01

    Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343

  18. Widespread contribution of transposable elements to the innovation of gene regulatory networks.

    PubMed

    Sundaram, Vasavi; Cheng, Yong; Ma, Zhihai; Li, Daofeng; Xing, Xiaoyun; Edge, Peter; Snyder, Michael P; Wang, Ting

    2014-12-01

    Transposable elements (TEs) have been shown to contain functional binding sites for certain transcription factors (TFs). However, the extent to which TEs contribute to the evolution of TF binding sites is not well known. We comprehensively mapped binding sites for 26 pairs of orthologous TFs in two pairs of human and mouse cell lines (representing two cell lineages), along with epigenomic profiles, including DNA methylation and six histone modifications. Overall, we found that 20% of binding sites were embedded within TEs. This number varied across different TFs, ranging from 2% to 40%. We further identified 710 TF-TE relationships in which genomic copies of a TE subfamily contributed a significant number of binding peaks for a TF, and we found that LTR elements dominated these relationships in human. Importantly, TE-derived binding peaks were strongly associated with open and active chromatin signatures, including reduced DNA methylation and increased enhancer-associated histone marks. On average, 66% of TE-derived binding events were cell type-specific with a cell type-specific epigenetic landscape. Most of the binding sites contributed by TEs were species-specific, but we also identified binding sites conserved between human and mouse, the functional relevance of which was supported by a signature of purifying selection on DNA sequences of these TEs. Interestingly, several TFs had significantly expanded binding site landscapes only in one species, which were linked to species-specific gene functions, suggesting that TEs are an important driving force for regulatory innovation. Taken together, our data suggest that TEs have significantly and continuously shaped gene regulatory networks during mammalian evolution.

  19. Widespread contribution of transposable elements to the innovation of gene regulatory networks

    PubMed Central

    Sundaram, Vasavi; Cheng, Yong; Ma, Zhihai; Li, Daofeng; Xing, Xiaoyun; Edge, Peter

    2014-01-01

    Transposable elements (TEs) have been shown to contain functional binding sites for certain transcription factors (TFs). However, the extent to which TEs contribute to the evolution of TF binding sites is not well known. We comprehensively mapped binding sites for 26 pairs of orthologous TFs in two pairs of human and mouse cell lines (representing two cell lineages), along with epigenomic profiles, including DNA methylation and six histone modifications. Overall, we found that 20% of binding sites were embedded within TEs. This number varied across different TFs, ranging from 2% to 40%. We further identified 710 TF–TE relationships in which genomic copies of a TE subfamily contributed a significant number of binding peaks for a TF, and we found that LTR elements dominated these relationships in human. Importantly, TE-derived binding peaks were strongly associated with open and active chromatin signatures, including reduced DNA methylation and increased enhancer-associated histone marks. On average, 66% of TE-derived binding events were cell type-specific with a cell type-specific epigenetic landscape. Most of the binding sites contributed by TEs were species-specific, but we also identified binding sites conserved between human and mouse, the functional relevance of which was supported by a signature of purifying selection on DNA sequences of these TEs. Interestingly, several TFs had significantly expanded binding site landscapes only in one species, which were linked to species-specific gene functions, suggesting that TEs are an important driving force for regulatory innovation. Taken together, our data suggest that TEs have significantly and continuously shaped gene regulatory networks during mammalian evolution. PMID:25319995

  20. Reconstruction of the Regulatory Network for Bacillus subtilis and Reconciliation with Gene Expression Data

    PubMed Central

    Faria, José P.; Overbeek, Ross; Taylor, Ronald C.; Conrad, Neal; Vonstein, Veronika; Goelzer, Anne; Fromion, Vincent; Rocha, Miguel; Rocha, Isabel; Henry, Christopher S.

    2016-01-01

    We introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of Bacillus subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs, and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches, and small regulatory RNAs. Overall, regulatory information is included in the model for ∼2500 of the ∼4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same “ON” and “OFF” gene expression profiles across multiple samples of experimental data. We show how ARs for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how ARs can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking. During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome expression metadata relating to experimental

  1. Reconstruction of the Regulatory Network for Bacillus subtilis and Reconciliation with Gene Expression Data.

    PubMed

    Faria, José P; Overbeek, Ross; Taylor, Ronald C; Conrad, Neal; Vonstein, Veronika; Goelzer, Anne; Fromion, Vincent; Rocha, Miguel; Rocha, Isabel; Henry, Christopher S

    2016-01-01

    We introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of Bacillus subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs, and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches, and small regulatory RNAs. Overall, regulatory information is included in the model for ∼2500 of the ∼4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same "ON" and "OFF" gene expression profiles across multiple samples of experimental data. We show how ARs for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how ARs can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking. During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome expression metadata relating to experimental conditions
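
    The abstract defines Atomic Regulons as sets of genes sharing the same "ON" and "OFF" expression profile across samples. The following is a deliberately simplified sketch of that idea, not the authors' reconstruction procedure: it binarizes each gene's expression with a single assumed threshold and groups genes with identical ON/OFF patterns. The gene names, threshold, and function name are hypothetical.

```python
from collections import defaultdict


def group_by_on_off_profile(expression: dict[str, list[float]],
                            threshold: float) -> list[list[str]]:
    """Group genes whose binarized (ON/OFF) profiles are identical.

    expression maps a gene name to its expression values, one per sample.
    A value at or above `threshold` counts as ON (an assumed cutoff).
    Only groups with at least two genes are returned.
    """
    groups = defaultdict(list)
    for gene, values in expression.items():
        profile = tuple(v >= threshold for v in values)
        groups[profile].append(gene)
    return [sorted(genes) for genes in groups.values() if len(genes) > 1]


# Hypothetical toy data: geneA and geneB are ON, OFF, ON; geneC is OFF, ON, OFF.
toy = {"geneA": [5.0, 0.0, 7.0], "geneB": [6.0, 1.0, 9.0], "geneC": [0.0, 8.0, 0.0]}
regulon_like_sets = group_by_on_off_profile(toy, threshold=3.0)
```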

  2. Combining Hi-C data with phylogenetic correlation to predict the target genes of distal regulatory elements in human genome.

    PubMed

    Lu, Yulan; Zhou, Yuanpeng; Tian, Weidong

    2013-12-01

    Defining the target genes of distal regulatory elements (DREs), such as enhancers, repressors and insulators, is a challenging task. The recently developed Hi-C technology is designed to capture chromosome conformation structure by high-throughput sequencing, and can potentially be used to determine the target genes of DREs. However, Hi-C data are noisy, making it difficult to directly use Hi-C data to identify DRE-target gene relationships. In this study, we show that DRE-gene pairs that are confirmed by Hi-C data are strongly phylogenetically correlated, and have thus developed a method that combines Hi-C read counts with phylogenetic correlation to predict long-range DRE-target gene relationships. Analysis of predicted DRE-target gene pairs shows that genes regulated by a large number of DREs tend to have essential functions, and genes regulated by the same DREs tend to be functionally related and co-expressed. In addition, we show with a couple of examples that the predicted target genes of DREs can help explain the causal roles of disease-associated single-nucleotide polymorphisms located in the DREs. As such, these predictions will be of importance not only for our understanding of the function of DREs but also for elucidating the causal roles of disease-associated noncoding single-nucleotide polymorphisms.
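
    The abstract does not spell out how phylogenetic correlation is computed. One plausible reading, assumed here purely for illustration, is the Pearson correlation between the conservation profiles of a DRE and a gene across a set of species (for example, 1 if a conserved copy is present in a species and 0 otherwise). The profiles and function name below are hypothetical.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length profiles; 0.0 if either is constant."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("profiles must be non-empty and of equal length")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical presence/absence profiles across four species.
dre_profile = [1, 1, 0, 0]
gene_profile = [1, 1, 0, 0]
score = pearson(dre_profile, gene_profile)
```

    How such a score would be combined with Hi-C read counts is specific to the published method and is not reproduced here.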

  3. Tissue Specificity and Sex-Specific Regulatory Variation Permit the Evolution of Sex-Biased Gene Expression.

    PubMed

    Dean, Rebecca; Mank, Judith E

    2016-09-01

    Genetic correlations between males and females are often thought to constrain the evolution of sexual dimorphism. However, sexually dimorphic traits and the underlying sexually dimorphic gene expression patterns are often rapidly evolving. We explore this apparent paradox by measuring the genetic correlation in gene expression between males and females (Cmf) across broad evolutionary timescales, using two RNA-sequencing data sets spanning multiple populations and multiple species. We find that unbiased genes have higher Cmf than sex-biased genes, consistent with intersexual genetic correlations constraining the evolution of sexual dimorphism. However, we found that highly sex-biased genes (both male and female biased) also had higher tissue specificity, and unbiased genes had greater expression breadth, suggesting that pleiotropy may constrain the breakdown of intersexual genetic correlations. Finally, we show that genes with high Cmf showed some degree of sex-specific changes in gene expression in males and females. Together, our results suggest that genetic correlations between males and females may be less important in constraining the evolution of sex-biased gene expression than pleiotropy. Sex-specific regulatory variation and tissue specificity may resolve the paradox of widespread sex bias within a largely shared genome.

  4. A validated gene regulatory network and GWAS identifies early regulators of T cell-associated diseases.

    PubMed

    Gustafsson, Mika; Gawel, Danuta R; Alfredsson, Lars; Baranzini, Sergio; Björkander, Janne; Blomgran, Robert; Hellberg, Sandra; Eklund, Daniel; Ernerudh, Jan; Kockum, Ingrid; Konstantinell, Aelita; Lahesmaa, Riita; Lentini, Antonio; Liljenström, H Robert I; Mattson, Lina; Matussek, Andreas; Mellergård, Johan; Mendez, Melissa; Olsson, Tomas; Pujana, Miguel A; Rasool, Omid; Serra-Musach, Jordi; Stenmarker, Margaretha; Tripathi, Subhash; Viitala, Miro; Wang, Hui; Zhang, Huan; Nestor, Colm E; Benson, Mikael

    2015-11-11

    Early regulators of disease may increase understanding of disease mechanisms and serve as markers for presymptomatic diagnosis and treatment. However, early regulators are difficult to identify because patients generally present after they are symptomatic. We hypothesized that early regulators of T cell-associated diseases could be found by identifying upstream transcription factors (TFs) in T cell differentiation and by prioritizing hub TFs that were enriched for disease-associated polymorphisms. A gene regulatory network (GRN) was constructed by time series profiling of the transcriptomes and methylomes of human CD4(+) T cells during in vitro differentiation into four helper T cell lineages, in combination with sequence-based TF binding predictions. The TFs GATA3, MAF, and MYB were identified as early regulators and validated by ChIP-seq (chromatin immunoprecipitation sequencing) and small interfering RNA knockdowns. Differential mRNA expression of the TFs and their targets in T cell-associated diseases supports their clinical relevance. To directly test if the TFs were altered early in disease, T cells from patients with two T cell-mediated diseases, multiple sclerosis and seasonal allergic rhinitis, were analyzed. Strikingly, the TFs were differentially expressed during asymptomatic stages of both diseases, whereas their targets showed altered expression during symptomatic stages. This analytical strategy to identify early regulators of disease by combining GRNs with genome-wide association studies may be generally applicable for functional and clinical studies of early disease development. PMID:26560356

  5. Sieve-based relation extraction of gene regulatory networks from biological literature

    PubMed Central

    2015-01-01

    Background Relation extraction is an essential procedure in literature mining. It focuses on extracting semantic relations between parts of text, called mentions. Biomedical literature includes an enormous amount of textual descriptions of biological entities, their interactions and results of related experiments. To extract them in an explicit, computer readable format, these relations were at first extracted manually from databases. Manual curation was later replaced with automatic or semi-automatic tools with natural language processing capabilities. The current challenge is the development of information extraction procedures that can directly infer more complex relational structures, such as gene regulatory networks. Results We develop a computational approach for extraction of gene regulatory networks from textual data. Our method is designed as a sieve-based system and uses linear-chain conditional random fields and rules for relation extraction. With this method we successfully extracted the sporulation gene regulation network in the bacterium Bacillus subtilis for the information extraction challenge at the BioNLP 2013 conference. To enable extraction of distant relations using first-order models, we transform the data into skip-mention sequences. We infer multiple models, each of which is able to extract different relationship types. Following the shared task, we conducted additional analysis using different system settings that resulted in reducing the reconstruction error of bacterial sporulation network from 0.73 to 0.68, measured as the slot error rate between the predicted and the reference network. We observe that all relation extraction sieves contribute to the predictive performance of the proposed approach. Also, features constructed by considering mention words and their prefixes and suffixes are the most important features for higher accuracy of extraction. Analysis of distances between different mention types in the text shows that our choice
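
    The record above reports reconstruction error as a slot error rate between the predicted and reference networks. A minimal sketch of that metric follows, assuming the usual definition SER = (substitutions + deletions + insertions) / number of reference slots, and assuming (as a simplification) that each ordered gene pair carries at most one relation type. The edge labels and function name are hypothetical.

```python
def slot_error_rate(predicted: dict[tuple[str, str], str],
                    reference: dict[tuple[str, str], str]) -> float:
    """Slot error rate between two networks given as {(source, target): relation}.

    substitution: the pair is in both networks but the relation type differs
    insertion:    the pair is predicted but absent from the reference
    deletion:     the pair is in the reference but was not predicted
    """
    if not reference:
        raise ValueError("reference network must contain at least one edge")
    substitutions = sum(1 for pair, rel in predicted.items()
                        if pair in reference and reference[pair] != rel)
    insertions = sum(1 for pair in predicted if pair not in reference)
    deletions = sum(1 for pair in reference if pair not in predicted)
    return (substitutions + deletions + insertions) / len(reference)


# Hypothetical toy networks: one substitution, one insertion, one deletion.
reference_net = {("A", "B"): "activation", ("B", "C"): "inhibition", ("C", "D"): "activation"}
predicted_net = {("A", "B"): "activation", ("B", "C"): "activation", ("D", "E"): "inhibition"}
error = slot_error_rate(predicted_net, reference_net)
```

    Because insertions are counted, the rate can exceed 1 when a system predicts many spurious edges; lower values indicate a closer reconstruction.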

  6. Variable Genome Sequences of the Murine Pneumotropic Virus (Polyomaviridae) Regulatory Region Isolated from an Infected Mouse Tissue Viral Suspension

    PubMed Central

    Libbey, Jane E.

    2016-01-01

    The murine pneumotropic virus genome, isolated from an infected murine tissue homogenate, was sequenced to completion. The lungs, liver, spleen, and kidneys were the source of the tissue homogenate in order to mirror the heterogeneity of the virus population in vivo. The regulatory region sequence was found to be highly variable. PMID:27231357

  7. Conserved sequences in both coding and 5' flanking regions of mammalian opal suppressor tRNA genes.

    PubMed Central

    Pratt, K; Eden, F C; You, K H; O'Neill, V A; Hatfield, D

    1985-01-01

    The rabbit genome encodes an opal suppressor tRNA gene. The coding region is strictly conserved between the rabbit gene and the corresponding gene in the human genome. The rabbit opal suppressor gene contains the consensus sequence in the 3' internal control region but like the human and chicken genes, the rabbit 5' internal control region contains two additional nucleotides. The 5' flanking sequences of the rabbit and the human opal suppressor genes contain extensive regions of homology. A subset of these homologies is also present 5' to the chicken opal suppressor gene. Both the rabbit and the human genomes also encode a pseudogene. That of the rabbit lacks the 3' half of the coding region. Neither pseudogene has homologous regions to the 5' flanking regions of the genes. The presence of 5' homologies flanking only the transcribed genes and not the pseudogenes suggests that these regions may be regulatory control elements specifically involved in the expression of the eukaryotic opal suppressor gene. Moreover the strict conservation of coding sequences indicates functional importance for the opal suppressor tRNA genes. PMID:4022772

  8. Non-coding-regulatory regions of human brain genes delineated by bacterial artificial chromosome knock-in mice

    PubMed Central

    2013-01-01

    Background The next big challenge in human genetics is understanding the 98% of the genome that comprises non-coding DNA. Hidden in this DNA are sequences critical for gene regulation, and new experimental strategies are needed to understand the functional role of gene-regulation sequences in health and disease. In this study, we build upon our HuGX ('high-throughput human genes on the X chromosome') strategy to expand our understanding of human gene regulation in vivo. Results In all, ten human genes known to express in therapeutically important brain regions were chosen for study. For eight of these genes, human bacterial artificial chromosome clones were identified, retrofitted with a reporter, knocked single-copy into the Hprt locus in mouse embryonic stem cells, and mouse strains derived. Five of these human genes expressed in mouse, and all expressed in the adult brain region for which they were chosen. This defined the boundaries of the genomic DNA sufficient for brain expression, and refined our knowledge regarding the complexity of gene regulation. We also characterized for the first time the expression of human MAOA and NR2F2, two genes for which the mouse homologs have been extensively studied in the central nervous system (CNS), and AMOTL1 and NOV, for which roles in CNS have been unclear. Conclusions We have demonstrated the use of the HuGX strategy to functionally delineate non-coding-regulatory regions of therapeutically important human brain genes. Our results also show that a careful investigation, using publicly available resources and bioinformatics, can lead to accurate predictions of gene expression. PMID:24124870

  9. Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation

    PubMed Central

    Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P. M.; Zhu, Xin-Guang

    2016-01-01

    Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5′UTR, 3′UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5′UTR, 3′UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. PMID:27436282

  10. Identification of genes involved in regulatory mechanism of pigments in broiler chickens.

    PubMed

    Tarique, T M; Yang, S; Mohsina, Z; Qiu, J; Yan, Z; Chen, G; Chen, A

    2014-01-01

    The chicken is an important model organism that bridges the evolutionary gap between mammals and other vertebrates and provides a major source of protein from meat and eggs for the world population. However, specific genes underlying the regulatory mechanism of broiler pigmentation have not yet been determined. In order to better understand the genes involved in the mechanism of pigmentation in the muscle tissues of broilers, the Affymetrix microarray hybridization experiment platform was used to identify gene expression profiles at 7 weeks of age. Broilers fed canthaxanthin, natural lutein, and orangeII pigments (100 mg/kg) were used to explore gene expression profiles. Our data showed that the 7th week of age was a very important phase with regard to gene expression profiles. We identified a number of differentially expressed genes; in canthaxanthin, natural lutein, and orangeII, there were 54 (32 upregulated and 22 downregulated), 23 (15 upregulated and 8 downregulated), and 7 (5 upregulated and 2 downregulated) known genes, respectively. Our data indicate that more of the differentially expressed genes were upregulated than downregulated, and several genes showed conserved signaling to previously known functions. Thus, functional characterization of differentially expressed genes revealed several categories that are involved in important biological processes, including pigmentation, growth, molecular mechanisms, fat metabolism, cell proliferation, immune response, lipid metabolism, and protein synthesis and degradation. The results of the present study demonstrate that the genes associated with canthaxanthin, natural lutein, and orangeII are key regulatory genes that control the regulatory mechanisms of pigmentation.

  11. Genetic Variation of Goat Interferon Regulatory Factor 3 Gene and Its Implication in Goat Evolution

    PubMed Central

    Shu, Liping; Zhang, Yesheng; Wang, Yangzi; Sanni, Timothy M.; Imumorin, Ikhide G.; Peters, Sunday O.; Zhang, Jiajin; Dong, Yang; Wang, Wen

    2016-01-01

    Immune systems are vital for the evolution and survival of species; as such, selection patterns in innate immune loci are of special interest in molecular evolutionary research. The interferon regulatory factor (IRF) gene family controls many different aspects of the innate and adaptive immune responses in vertebrates. Among these, IRF3 is known to take an active part in many biological processes. We assembled and evaluated 1356 base pairs of the IRF3 gene coding region in domesticated goats from Africa (Nigeria, Ethiopia and South Africa) and Asia (Iran and China) and the wild goat (Capra aegagrus). Five segregating sites with θ value of 0.0009 for this gene demonstrated a low diversity across the goat populations. Fu and Li tests were significantly positive but Tajima’s D test was significantly negative, suggesting deviation from neutrality. A neighbor-joining tree of the IRF3 gene in domesticated goats, wild goat and sheep showed that all domesticated goats are more closely related to one another than to the wild goat and sheep. A maximum likelihood tree of the gene showed that different domesticated goats share a common ancestor, suggesting a single origin. Four unique haplotypes were observed across all the sequences, of which one was particularly common to African goats (MOCH-K14-0425, Poitou and WAD). In assessing the mode of evolution of the gene, we found that the codon model dN/dS ratio for all goats was greater than one. Phylogenetic Analysis by Maximum Likelihood (PAML) gave a ω0 (dN/dS) value of 0.067 with LnL value of -6900.3 for the first Model (M1) while ω2 = 1.667 in model M2 with LnL value of -6900.3 with positive selection inferred in 3 codon sites. Mechanistic empirical combination (MEC) model for evaluating adaptive selection pressure on particular codons also confirmed adaptive selection pressure in three codons (207, 358 and 408) in the IRF3 gene. Positive diversifying selection inferred with recent evolutionary changes in domesticated goat

  12. Genetic Variation of Goat Interferon Regulatory Factor 3 Gene and Its Implication in Goat Evolution.

    PubMed

    Okpeku, Moses; Esmailizadeh, Ali; Adeola, Adeniyi C; Shu, Liping; Zhang, Yesheng; Wang, Yangzi; Sanni, Timothy M; Imumorin, Ikhide G; Peters, Sunday O; Zhang, Jiajin; Dong, Yang; Wang, Wen

    2016-01-01

    Immune systems are vital for the evolution and survival of species; as such, selection patterns in innate immune loci are of special interest in molecular evolutionary research. The interferon regulatory factor (IRF) gene family controls many different aspects of the innate and adaptive immune responses in vertebrates. Among these, IRF3 is known to take an active part in many biological processes. We assembled and evaluated 1356 base pairs of the IRF3 gene coding region in domesticated goats from Africa (Nigeria, Ethiopia and South Africa) and Asia (Iran and China) and the wild goat (Capra aegagrus). Five segregating sites with θ value of 0.0009 for this gene demonstrated a low diversity across the goat populations. Fu and Li tests were significantly positive but Tajima's D test was significantly negative, suggesting deviation from neutrality. A neighbor-joining tree of the IRF3 gene in domesticated goats, wild goat and sheep showed that all domesticated goats are more closely related to one another than to the wild goat and sheep. A maximum likelihood tree of the gene showed that different domesticated goats share a common ancestor, suggesting a single origin. Four unique haplotypes were observed across all the sequences, of which one was particularly common to African goats (MOCH-K14-0425, Poitou and WAD). In assessing the mode of evolution of the gene, we found that the codon model dN/dS ratio for all goats was greater than one. Phylogenetic Analysis by Maximum Likelihood (PAML) gave a ω0 (dN/dS) value of 0.067 with LnL value of -6900.3 for the first Model (M1) while ω2 = 1.667 in model M2 with LnL value of -6900.3 with positive selection inferred in 3 codon sites. Mechanistic empirical combination (MEC) model for evaluating adaptive selection pressure on particular codons also confirmed adaptive selection pressure in three codons (207, 358 and 408) in the IRF3 gene. Positive diversifying selection inferred with recent evolutionary changes in domesticated goat IRF3
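
    The abstract reports five segregating sites over 1356 base pairs and a θ of 0.0009 but does not say which estimator was used. If θ is Watterson's estimator (an assumption), it is computed per site as S / (a_n · L), where a_n is the harmonic number over n − 1 sampled sequences. The sketch below implements that formula; the number of sequences is not given in the abstract, so any value passed for it is hypothetical.

```python
def watterson_theta_per_site(segregating_sites: int, n_sequences: int,
                             n_sites: int) -> float:
    """Watterson's theta per site: S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i."""
    if n_sequences < 2:
        raise ValueError("need at least two sequences")
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / (a_n * n_sites)


# S = 5 and L = 1356 come from the abstract; n_sequences = 50 is a made-up sample size.
theta_example = watterson_theta_per_site(5, n_sequences=50, n_sites=1356)
```

    Larger samples increase a_n, so the same number of segregating sites implies a smaller per-site θ.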

  13. Gene regulatory evolution and the origin of macroevolutionary novelties: insights from the neural crest.

    PubMed

    Van Otterloo, Eric; Cornell, Robert A; Medeiros, Daniel Meulemans; Garnett, Aaron T

    2013-07-01

    The appearance of novel anatomic structures during evolution is driven by changes to the networks of transcription factors, signaling pathways, and downstream effector genes controlling development. The nature of the changes to these developmental gene regulatory networks (GRNs) is poorly understood. A striking test case is the evolution of the GRN controlling development of the neural crest (NC). NC cells emerge from the neural plate border (NPB) and contribute to multiple adult structures. While all chordates have a NPB, only in vertebrates do NPB cells express all the genes constituting the neural crest GRN (NC-GRN). Interestingly, invertebrate chordates express orthologs of NC-GRN components in other tissues, revealing that during vertebrate evolution new regulatory connections emerged between transcription factors primitively expressed in the NPB and genes primitively expressed in other tissues. Such interactions could have evolved by two mechanisms. First, transcription factors primitively expressed in the NPB may have evolved new DNA and/or cofactor binding properties (protein neofunctionalization). Alternately, cis-regulatory elements driving NPB expression may have evolved near genes primitively expressed in other tissues (cis-regulatory neofunctionalization). Here we discuss how gene duplication can, in principle, promote either form of neofunctionalization. We review recent published examples of interspecies gene-swap, or regulatory-element-swap, experiments that test both models. Such experiments have yielded little evidence to support the importance of protein neofunctionalization in the emergence of the NC-GRN, but do support the importance of novel cis-regulatory elements in this process. The NC-GRN is an excellent model for the study of gene regulatory and macroevolutionary innovation.

  14. Spliced synthetic genes as internal controls in RNA sequencing experiments.

    PubMed

    Hardwick, Simon A; Chen, Wendy Y; Wong, Ted; Deveson, Ira W; Blackburn, James; Andersen, Stacey B; Nielsen, Lars K; Mattick, John S; Mercer, Tim R

    2016-09-01

    RNA sequencing (RNA-seq) can be used to assemble spliced isoforms, quantify expressed genes and provide a global profile of the transcriptome. However, the size and diversity of the transcriptome, the wide dynamic range in gene expression and inherent technical biases confound RNA-seq analysis. We have developed a set of spike-in RNA standards, termed 'sequins' (sequencing spike-ins), that represent full-length spliced mRNA isoforms. Sequins have an entirely artificial sequence with no homology to natural reference genomes, but they align to gene loci encoded on an artificial in silico chromosome. The combination of multiple sequins across a range of concentrations emulates alternative splicing and differential gene expression, and it provides scaling factors for normalization between samples. We demonstrate the use of sequins in RNA-seq experiments to measure sample-specific biases and determine the limits of reliable transcript assembly and quantification in accompanying human RNA samples. In addition, we have designed a complementary set of sequins that represent fusion genes arising from rearrangements of the in silico chromosome to aid in cancer diagnosis. RNA sequins provide a qualitative and quantitative reference with which to navigate the complexity of the human transcriptome. PMID:27502218
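
    The idea of using spike-ins at known concentrations to calibrate measurements and derive between-sample scaling factors can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the log-log least-squares fit, the geometric-mean scaling factor, and all numbers are assumptions made for the example.

```python
import math

def fit_loglog(known_conc, measured):
    """Least-squares fit of log2(measured) = slope * log2(known) + intercept."""
    xs = [math.log2(c) for c in known_conc]
    ys = [math.log2(m) for m in measured]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def scaling_factor(measured_a, measured_b):
    """Factor that rescales sample B onto sample A using shared spike-ins.

    Uses the geometric mean of per-spike-in ratios (an assumed choice).
    """
    log_ratios = [math.log2(a / b) for a, b in zip(measured_a, measured_b)]
    return 2 ** (sum(log_ratios) / len(log_ratios))

# Hypothetical spike-in data: known input concentrations and measured abundances.
known = [1.0, 2.0, 4.0, 8.0, 16.0]
sample_a = [10.0, 20.0, 40.0, 80.0, 160.0]
sample_b = [5.0, 10.0, 20.0, 40.0, 80.0]

slope, intercept = fit_loglog(known, sample_a)
factor = scaling_factor(sample_a, sample_b)
```

    A slope near 1 on the log-log scale indicates that measured abundance tracks input concentration proportionally; the scaling factor is what sample B would be multiplied by to match sample A.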

  15. Sequence Validation of Candidates for Selectively Important Genes in Sunflower

    PubMed Central

    Chapman, Mark A.; Mandel, Jennifer R.; Burke, John M.

    2013-01-01

    Analyses aimed at identifying genes that have been targeted by past selection provide a powerful means for investigating the molecular basis of adaptive differentiation. In the case of crop plants, such studies have the potential to not only shed light on important evolutionary processes, but also to identify genes of agronomic interest. In this study, we test for evidence of positive selection at the DNA sequence level in a set of candidate genes previously identified in a genome-wide scan for genotypic evidence of selection during the evolution of cultivated sunflower. In the majority of cases, we were able to confirm the effects of selection in shaping diversity at these loci. Notably, the genes that were found to be under selection via our sequence-based analyses were devoid of variation in the cultivated sunflower gene pool. This result confirms a possible strategy for streamlining the search for adaptively important loci by pre-screening the derived population to identify the strongest candidates before sequencing them in the ancestral population. PMID:23991009

  16. Nucleotide sequence of the vaccinia virus hemagglutinin gene.

    PubMed

    Shida, H

    1986-04-30

    Vaccinia virus hemagglutinin (HA) is expressed late in the infection cycle and is nonessential for virus growth. The location of the HA structural gene was determined by hybrid-arrested and hybrid-selected translation methods to be at the right terminus of the HindIII A fragment. The position of the HA gene was confirmed by the production of the complete HA protein in cells transfected with a plasmid containing that region. Examination of the nucleotide sequence revealed the positions of cleavage sites for a number of restriction endonucleases. The deduced amino acid sequence revealed that the HA protein is a typical surface membrane glycoprotein. Comparison of the nucleotide sequence upstream of the HA coding region with the corresponding regions of other late genes suggested the existence of the consensus decanucleotide TTCATTTa/tGT between 34 and 18 bp upstream of the initiation codon, followed by a cluster of A or T, a unique feature of the late genes of vaccinia virus. These results, in conjunction with the ease of isolating HA- mutants, provide a basis for a new site suitable for inserting foreign genes.
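
    A search for the consensus decanucleotide in the window upstream of an initiation codon might look like the sketch below. The function name, the example sequence, and the exact reading of the window (positions -34 through -18, counted back from the A of ATG) are assumptions for illustration, not details taken from the paper.

```python
import re

# Consensus decanucleotide TTCATTTa/tGT, with the eighth base either A or T.
LATE_MOTIF = re.compile(r"TTCATTT[AT]GT")

def find_late_motif(seq, atg_index, far=34, near=18):
    """Return upstream distances (in bp) of motif starts inside the window.

    The window covers positions -far .. -near relative to the ATG at atg_index,
    and a match must lie entirely within it.
    """
    seq = seq.upper()
    start = max(0, atg_index - far)
    end = max(0, atg_index - near + 1)
    window = seq[start:end]
    return [atg_index - (start + m.start()) for m in LATE_MOTIF.finditer(window)]

# Hypothetical sequence: motif placed 30 bp upstream of the ATG.
example = "G" * 10 + "TTCATTTAGT" + "A" * 20 + "ATGAAA"
hits = find_late_motif(example, atg_index=40)
```

    Here `hits` holds one entry per match, expressed as the distance of the motif's first base from the initiation codon.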

  17. Glycoprotein Gene Sequence Variation in Rhesus Monkey Rhadinovirus

    PubMed Central

    Shin, Young C.; Jones, Leandro R.; Manrique, Julieta; Lauer, William; Carville, Angela; Mansfield, Keith G.; Desrosiers, Ronald C.

    2010-01-01

    Gene sequences for seven glycoproteins from 20 independent isolates of rhesus monkey rhadinovirus (RRV) and of the corresponding seven glycoprotein genes from nine strains of the Kaposi’s sarcoma-associated herpesvirus (KSHV) were obtained and analyzed. Phylogenetic analysis revealed two discrete groupings of RRV gH sequences, two discrete groupings of RRV gL sequences and two discrete groupings of RRV gB sequences. We called these phylogenetic groupings gHa, gHb, gLa, gLb, gBa and gBb. gHa was always paired with gLa and gHb was always paired with gLb for any individual RRV isolate. Since gH and gL are known to be interacting partners, these results suggest the need of matching sequence types for function of these cooperating proteins. gB phylogenetic grouping was not associated with gH/gL phylogenetic grouping. Our results demonstrate two distinct, distantly-related phylogenetic groupings of gH and gL of RRV despite a remarkable degree of sequence conservation within each individual phylogenetic group. PMID:20172576

  18. Isolation of nine gene sequences induced by silica in murine macrophages

    SciTech Connect

    Segade, F.; Claudio, E.; Wrobel, K.; Ramos, S.; Lazo, P.S.

    1995-03-01

    Macrophage activation by silica is the initial step in the development of silicosis. To identify genes that might be involved in silica-mediated activation, RAW 264.7 mouse macrophages were treated with silica for 48 h, and a subtracted cDNA library enriched for silica-induced genes (SIG) was constructed and differentially screened. Nine cDNA clones (designated SIG-12, -14, -20, -41, -61, -81, -91, and -111) were partially sequenced and compared with sequences in GenBank/EMBL databases. SIG-12, -14, and -20 corresponded to the genes for ribosomal proteins L13A, L32, and L26, respectively. SIG-61 is the mouse homologue of p21 RhoC. SIG-91 is identical to the 67-kDa high-affinity laminin receptor. Four genes were not identified and are novel. All of the mRNAs corresponding to the nine cloned cDNAs were inducible by silica. Steady-state levels of mRNAs in RAW 264.7 cells treated with various macrophage activators and inducers of signal transduction pathways were determined. A complex pattern of induction and repression was found, indicating that upon phagocytosis of silica particles, many regulatory mechanisms of gene expression are simultaneously triggered. 55 refs., 4 figs., 1 tab.

  19. Rapid Sequence and Expression Divergence Suggest Selection for Novel Function in Primate-Specific KRAB-ZNF Genes

    PubMed Central

    Nowick, Katja; Hamilton, Aaron T.; Zhang, Huimin; Stubbs, Lisa

    2010-01-01

    Recent segmental duplications (SDs), arising from duplication events that occurred within the past 35–40 My, have provided a major resource for the evolution of proteins with primate-specific functions. KRAB zinc finger (KRAB-ZNF) transcription factor genes are overrepresented among genes contained within these recent human SDs. Here, we examine the structural and functional diversity of the 70 human KRAB-ZNF genes involved in the most recent primate SD events including genes that arose in the hominid lineage. Despite their recent advent, many parent–daughter KRAB-ZNF gene pairs display significant differences in zinc finger structure and sequence, expression, and splicing patterns, each of which could significantly alter the regulatory functions of the paralogous genes. Paralogs that emerged on the lineage to humans and chimpanzees have undergone more evolutionary changes per unit of time than genes already present in the common ancestor of rhesus macaques and great apes. Taken together, these data indicate that a substantial fraction of the recently evolved primate-specific KRAB-ZNF gene duplicates have acquired novel functions that may possibly define novel regulatory pathways and suggest an active ongoing selection for regulatory diversity in primates. PMID:20573777

  20. Targeted enrichment of the black cottonwood (Populus trichocarpa) gene space using sequence capture

    PubMed Central

    2012-01-01

    Background: High-throughput re-sequencing is rapidly becoming the method of choice for studies of neutral and adaptive processes in natural populations across taxa. As re-sequencing the genome of large numbers of samples is still cost-prohibitive in many cases, methods for genome complexity reduction have been developed in attempts to capture most ecologically-relevant genetic variation. One of these approaches is sequence capture, in which oligonucleotide baits specific to genomic regions of interest are synthesized and used to retrieve and sequence those regions. Results: We used sequence capture to re-sequence most predicted exons, their upstream regulatory regions, as well as numerous random genomic intervals in a panel of 48 genotypes of the angiosperm tree Populus trichocarpa (black cottonwood, or ‘poplar’). A total of 20.76 Mb (5%) of the poplar genome was targeted, corresponding to 173,040 baits. With 12 indexed samples run in each of four lanes on an Illumina HiSeq instrument (2x100 paired-end), 86.8% of the bait regions were on average sequenced at a depth ≥10X. Few off-target regions (>250 bp away from any bait) were present in the data, but on average ~80 bp on either side of the baits were captured and sequenced to an acceptable depth (≥10X) to call heterozygous SNPs. Nucleotide diversity estimates within and adjacent to protein-coding genes were similar to those previously reported in Populus spp., while intergenic regions had higher values consistent with a relaxation of selection. Conclusions: Our results illustrate the efficiency and utility of sequence capture for re-sequencing highly heterozygous tree genomes, and suggest design considerations to optimize the use of baits in future studies. PMID:23241106
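
    The design figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The variable names are illustrative, and the derived values are only approximate because the 5% figure is rounded; the per-bait value also assumes the targeted bases are spread evenly over non-overlapping baits, which the abstract does not state.

```python
# Figures taken from the abstract.
target_bp = 20.76e6        # total targeted sequence, in base pairs
n_baits = 173_040          # number of baits
target_fraction = 0.05     # targeted share of the genome (rounded in the abstract)
samples_per_lane = 12
lanes = 4

# Derived quantities (approximate; see assumptions in the lead-in).
mean_target_per_bait = target_bp / n_baits         # roughly 120 bp per bait
implied_genome_bp = target_bp / target_fraction    # roughly 415 Mb
total_samples = samples_per_lane * lanes           # matches the 48-genotype panel
```

    The sample count recovers the 48 genotypes in the panel, which is a quick consistency check on the multiplexing scheme described.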

  1. Gene identification in prokaryotic genomes, phages, metagenomes, and EST sequences with GeneMarkS suite.

    PubMed

    Borodovsky, Mark; Lomsadze, Alex

    2014-01-01

    This unit describes how to use several gene-finding programs from the GeneMark line developed for finding protein-coding ORFs in genomic DNA of prokaryotic species, in genomic DNA of eukaryotic species with intronless genes, in genomes of viruses and phages, and in prokaryotic metagenomic sequences, as well as in EST sequences with spliced-out introns. These bioinformatics tools were demonstrated to have state-of-the-art accuracy, and have been frequently used for gene annotation in novel nucleotide sequences. An additional advantage of these sequence-analysis tools is that the problem of algorithm parameterization is solved automatically, with parameters estimated by iterative self-training (unsupervised training). PMID:24510847
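
    For orientation only, the simplest form of the task (locating candidate protein-coding ORFs in intronless DNA) can be sketched as a naive forward-strand scan. This is explicitly not the GeneMark or GeneMarkS algorithm, which relies on self-trained statistical models; the function name and the minimum-length threshold are assumptions for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Naive forward-strand ORF scan: first ATG to the next in-frame stop.

    Returns (start, end) index pairs, end exclusive and including the stop codon.
    min_codons counts codons before the stop, including the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs

# Hypothetical sequence containing one short ORF.
example = "CCATGAAACCCGGGTAACC"
orfs = find_orfs(example)
```

    A scan like this ignores the reverse strand, alternative start codons, and coding statistics, which is why model-based gene finders are used in practice.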

  2. Phenotype Sequencing: Identifying the Genes That Cause a Phenotype Directly from Pooled Sequencing of Independent Mutants

    PubMed Central

    Harper, Marc A.; Chen, Zugen; Toy, Traci; Machado, Iara M. P.; Nelson, Stanley F.; Liao, James C.; Lee, Christopher J.

    2011-01-01

    Random mutagenesis and phenotype screening provide a powerful method for dissecting microbial functions, but their results can be laborious to analyze experimentally. Each mutant strain may contain 50–100 random mutations, necessitating extensive functional experiments to determine which one causes the selected phenotype. To solve this problem, we propose a “Phenotype Sequencing” approach in which genes causing the phenotype can be identified directly from sequencing of multiple independent mutants. We developed a new computational analysis method showing that (1) causal genes can be identified with high probability from even a modest number of mutant genomes; and (2) costs can be cut many-fold compared with a conventional genome sequencing approach via an optimized strategy of library-pooling (multiple strains per library) and tag-pooling (multiple tagged libraries per sequencing lane). We have performed extensive validation experiments on a set of E. coli mutants with increased isobutanol biofuel tolerance. We generated a range of sequencing experiments varying from 3 to 32 mutant strains, with pooling on 1 to 3 sequencing lanes. Our statistical analysis of these data (4099 mutations from 32 mutant genomes) successfully identified 3 genes (acrB, marC, acrA) that have been independently validated as causing this experimental phenotype. It must be emphasized that our approach reduces mutant sequencing costs enormously. Whereas a conventional genome sequencing experiment would have cost $7,200 in reagents alone, our Phenotype Sequencing design yielded the same information value for only $1,200. In fact, our smallest experiments reliably identified acrB and marC at a cost of only $110–$340. PMID:21364744
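
    The intuition behind identifying causal genes from independent mutants is that a gene hit repeatedly across separately derived strains is unlikely to be hit by chance. The sketch below only counts recurrence and is not the authors' statistical method; the function name and the mutation table are hypothetical, with gene names borrowed from the abstract purely as labels. The cost comparison uses the two reagent figures quoted in the abstract.

```python
from collections import Counter

def genes_by_recurrence(mutants):
    """Rank genes by the number of independent mutants in which they are hit."""
    counts = Counter()
    for genes in mutants.values():
        counts.update(set(genes))  # count each gene once per mutant
    return counts.most_common()

# Hypothetical mutation calls for four independent mutant strains.
mutants = {
    "strain1": ["acrB", "geneX"],
    "strain2": ["acrB", "marC"],
    "strain3": ["acrB", "marC", "geneY"],
    "strain4": ["geneZ"],
}
ranking = genes_by_recurrence(mutants)

# Reagent costs quoted in the abstract, in US dollars.
conventional_cost = 7200
pooled_cost = 1200
fold_saving = conventional_cost / pooled_cost
```

    A real analysis would also need to account for gene length and the background mutation rate before calling a recurrent gene significant.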

  3. High resolution sequencing and modeling identifies distinct dynamic RNA regulatory strategies

    PubMed Central

    Rabani, Michal; Raychowdhury, Raktima; Jovanovic, Marko; Rooney, Michael; Stumpo, Deborah J.; Hacohen, Nir; Schier, Alexander F.; Blackshear, Perry J.; Friedman, Nir; Amit, Ido; Regev, Aviv

    2014-01-01

    Summary: Cells control dynamic transitions in transcript levels by regulating transcription, processing and/or degradation through an integrated regulatory strategy. Here, we combine RNA metabolic labeling, rRNA-depleted RNA-seq, and DRiLL, a novel computational framework, to quantify the level, editing sites, and transcription, processing and degradation rates of each transcript at a splice junction resolution during the LPS response of mouse dendritic cells. Four key regulatory strategies, dominated by RNA transcription changes, generate most temporal gene expression patterns. Non-canonical strategies that also employ dynamic posttranscriptional regulation control only a minority of genes, but provide unique signal processing features. We validate Tristetraprolin (TTP) as a major regulator of RNA degradation in one non-canonical strategy. Applying DRiLL to the regulation of non-coding RNAs and to zebrafish embryogenesis demonstrates its broad utility. Our study provides a new quantitative approach to discover transcriptional and post-transcriptional events that control dynamic changes in transcript levels using RNA-Seq data. PMID:25497548
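
    How transcription and degradation rates jointly set transcript levels can be illustrated with the textbook first-order model dX/dt = alpha - beta * X. This is a deliberate simplification and not the DRiLL framework: it assumes constant rates and omits the processing step, and the function name and parameter values are hypothetical.

```python
import math

def transcript_level(t, x0, alpha, beta):
    """Closed-form solution of dX/dt = alpha - beta * X with X(0) = x0.

    alpha: transcription rate (molecules per unit time)
    beta:  degradation rate constant (per unit time)
    """
    steady_state = alpha / beta
    return steady_state + (x0 - steady_state) * math.exp(-beta * t)

# Hypothetical parameters: a transcript induced from a low starting level.
x0, alpha, beta = 2.0, 10.0, 0.5
half_life = math.log(2) / beta
level_at_half_life = transcript_level(half_life, x0, alpha, beta)
```

    In this model the steady-state level is alpha / beta, while the speed of approach depends only on beta, so a change in degradation rate alters both the final level and the response time.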

  4. Structural basis for regulation of rhizobial nodulation and symbiosis gene expression by the regulatory protein NolR

    PubMed Central

    Lee, Soon Goo; Krishnan, Hari B.; Jez, Joseph M.

    2014-01-01

    The symbiosis between rhizobial microbes and host plants involves the coordinated expression of multiple genes, which leads to nodule formation and nitrogen fixation. As part of the transcriptional machinery for nodulation and symbiosis across a range of Rhizobium, NolR serves as a global regulatory protein. Here, we present the X-ray crystal structures of NolR in the unliganded form and complexed with two different 22-base pair (bp) double-stranded operator sequences (oligos AT and AA). Structural and biochemical analysis of NolR reveals protein–DNA interactions with an asymmetric operator site and defines a mechanism for conformational switching of a key residue (Gln56) to accommodate variation in target DNA sequences from diverse rhizobial genes for nodulation and symbiosis. This conformational switching alters the energetic contributions to DNA binding without changes in affinity for the target sequence. Two possible models for the role of NolR in the regulation of different nodulation and symbiosis genes are proposed. To our knowledge, these studies provide the first structural insight on the regulation of genes involved in the agriculturally and ecologically important symbiosis of microbes and plants that leads to nodule formation and nitrogen fixation. PMID:24733893

  5. Structural basis for regulation of rhizobial nodulation and symbiosis gene expression by the regulatory protein NolR.

    PubMed

    Lee, Soon Goo; Krishnan, Hari B; Jez, Joseph M

    2014-04-29

    The symbiosis between rhizobial microbes and host plants involves the coordinated expression of multiple genes, which leads to nodule formation and nitrogen fixation. As part of the transcriptional machinery for nodulation and symbiosis across a range of Rhizobium, NolR serves as a global regulatory protein. Here, we present the X-ray crystal structures of NolR in the unliganded form and complexed with two different 22-base pair (bp) double-stranded operator sequences (oligos AT and AA). Structural and biochemical analysis of NolR reveals protein-DNA interactions with an asymmetric operator site and defines a mechanism for conformational switching of a key residue (Gln56) to accommodate variation in target DNA sequences from diverse rhizobial genes for nodulation and symbiosis. This conformational switching alters the energetic contributions to DNA binding without changes in affinity for the target sequence. Two possible models for the role of NolR in the regulation of different nodulation and symbiosis genes are proposed. To our knowledge, these studies provide the first structural insight on the regulation of genes involved in the agriculturally and ecologically important symbiosis of microbes and plants that leads to nodule formation and nit