Science.gov

Sample records for gene sequences regulatory

  1. Learning gene regulatory networks from next generation sequencing data.

    PubMed

    Jia, Bochao; Xu, Suwa; Xiao, Guanghua; Lamba, Vishal; Liang, Faming

    2017-03-10

    In recent years, next generation sequencing (NGS) has gradually replaced microarray as the major platform for measuring gene expression. Compared to microarray, NGS has many advantages, such as less noise and higher throughput. However, the discreteness of NGS data also challenges the existing statistical methodology. In particular, the literature still lacks an appropriate statistical method for reconstructing gene regulatory networks from NGS data. The existing local Poisson graphical model method is not consistent and can only infer certain local structures of the network. In this article, we propose a random effect model-based transformation to continuize NGS data; we then transform the continuized data to Gaussian via a semiparametric transformation and apply an equivalent partial correlation selection method to reconstruct gene regulatory networks. The proposed method is consistent. The numerical results indicate that the proposed method can lead to much more accurate inference of gene regulatory networks than the local Poisson graphical model and other existing methods. The proposed data-continuized transformation fills the theoretical gap for how to transform discrete data to continuous data and facilitates NGS data analysis. The proposed data-continuized transformation also makes it feasible to integrate different types of data, such as microarray and RNA-seq data, in reconstruction of gene regulatory networks.
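
    The abstract above describes a pipeline of the form "transform counts toward Gaussian, then use partial correlations to decide on network edges". The sketch below only illustrates that general idea and is not the authors' method: it swaps in a generic rank-based normal-score transform for their random effect model-based continuization and semiparametric transformation, and a textbook three-variable partial correlation for their equivalent partial correlation selection. The function names `normal_scores`, `pearson` and `partial_corr` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def normal_scores(values):
    """Map values to Gaussian scores through their ranks (ties share an average rank).

    Assumption: a generic rank-based transform, used here as a stand-in only.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # Dividing by n + 1 keeps every probability strictly inside (0, 1).
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Invented toy read counts for three genes across six samples.
gene_a = [0, 3, 1, 7, 2, 9]
gene_b = [1, 4, 0, 8, 3, 6]
gene_c = [5, 2, 4, 0, 3, 1]
scores = [normal_scores(g) for g in (gene_a, gene_b, gene_c)]
print(partial_corr(scores[0], scores[1], scores[2]))
```

    In a network setting, a partial correlation close to zero between two genes, given the others, would argue against drawing an edge between them; how that decision is made consistently for many genes is the subject of the article itself.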

  2. Phylogenetic Relationships and the Evolution of Regulatory Gene Sequences in the Parrotfishes

    PubMed Central

    Smith, Lydia L.; Fessler, Jennifer L.; Alfaro, Michael E.; Streelman, J. Todd; Westneat, Mark W.

    2008-01-01

    Regulatory genes control the expression of other genes and are key components of developmental processes such as segmentation and embryonic construction of the skull in vertebrates. Here we examine the variability and evolution of three vertebrate regulatory genes, addressing issues of their utility for phylogenetics and comparing the rates of genetic change seen in regulatory loci to the rates seen in other genes in the parrotfishes. The parrotfishes are a diverse group of colorful fishes from coral reefs and seagrasses worldwide and have been placed phylogenetically within the family Labridae. We tested phylogenetic hypotheses among the parrotfishes, with a focus on the genera Chlorurus and Scarus, by analyzing eight gene fragments for 42 parrotfishes and eight outgroup species. We sequenced mitochondrial 12s rRNA (967 bp), 16s rRNA (577 bp), and cytochrome b (477 bp). From the nuclear genome, we sequenced part of the protein-coding genes rag2 (715 bp), tmo4c4 (485 bp), and the developmental regulatory genes otx1 (672 bp), bmp4 (488 bp), and dlx2 (522 bp). Bayesian, likelihood, and parsimony analyses on the resulting 4903 bp of DNA sequence produced similar topologies that confirm the monophyly of the scarines and provide a phylogeny at the species level for portions of the genera Scarus and Chlorurus. Four major clades of Scarus were recovered, with three distributed in the Indo-Pacific and one containing Caribbean/Atlantic taxa. Molecular rates suggest a Miocene origin of the parrotfishes (22 mya) and a recent divergence of species within Scarus and Chlorurus, within the past 5 million years. Developmentally important genes made a significant contribution to phylogenetic structure, and rates of genetic evolution were high in bmp4, similar to other coding nuclear genes, but low in otx1 and the dlx2 exons. Synonymous and nonsynonymous substitution patterns in developmental regulatory genes support the hypothesis of stabilizing selection during the history of

  3. High sequence turnover in the regulatory regions of the developmental gene hunchback in insects.

    PubMed

    Hancock, J M; Shaw, P J; Bonneton, F; Dover, G A

    1999-02-01

    Extensive sequence analysis of the developmental gene hunchback and its 5' and 3' regulatory regions in Drosophila melanogaster, Drosophila virilis, Musca domestica, and Tribolium castaneum, using a variety of computer algorithms, reveals regions of high sequence simplicity probably generated by slippage-like mechanisms of turnover. No regions are entirely refractory to the action of slippage, although the density and composition of simple sequence motifs vary from region to region. Interestingly, the 5' and 3' flanking regions share short repetitive motifs despite their separation by the gene itself, and the motifs are different in composition from those in the exons and introns. Furthermore, there are high levels of conservation of motifs in equivalent orthologous regions. Detailed sequence analysis of the P2 promoter and DNA footprinting assays reveal that the number, orientation, sequence, spacing, and protein-binding affinities of the BICOID-binding sites vary between species and that the 'P2' promoter, the nanos response element in the 3' untranslated region, and several conserved boxes of sequence in the gene (e.g., the two zinc-finger regions) are surrounded by cryptically-simple-sequence DNA. We argue that high sequence turnover and genetic redundancy permit both the general maintenance of promoter functions through the establishment of coevolutionary (compensatory) changes in cis- and trans-acting genetic elements and, at the same time, the possibility of subtle changes in the regulation of hunchback in the different species.

  4. Two lamprey Hedgehog genes share non-coding regulatory sequences and expression patterns with gnathostome Hedgehogs.

    PubMed

    Kano, Shungo; Xiao, Jin-Hua; Osório, Joana; Ekker, Marc; Hadzhiev, Yavor; Müller, Ferenc; Casane, Didier; Magdelenat, Ghislaine; Rétaux, Sylvie

    2010-10-13

    Hedgehog (Hh) genes play major roles in animal development and studies of their evolution, expression and function point to major differences among chordates. Here we focused on Hh genes in lampreys in order to characterize the evolution of Hh signalling at the emergence of vertebrates. Screening of a cosmid library of the river lamprey Lampetra fluviatilis and searching the preliminary genome assembly of the sea lamprey Petromyzon marinus indicate that lampreys have two Hh genes, named Hha and Hhb. Phylogenetic analyses suggest that Hha and Hhb are lamprey-specific paralogs closely related to Sonic/Indian Hh genes. Expression analysis indicates that Hha and Hhb are expressed in a Sonic Hh-like pattern. The two transcripts are expressed in largely overlapping but not identical domains in the lamprey embryonic brain, including a newly-described expression domain in the nasohypophyseal placode. Global alignments of genomic sequences and local alignment with known gnathostome regulatory motifs show that lamprey Hhs share conserved non-coding elements (CNE) with gnathostome Hhs albeit with sequences that have significantly diverged and dispersed. Functional assays using zebrafish embryos demonstrate gnathostome-like midline enhancer activity for CNEs contained in intron2. We conclude that lamprey Hh genes are gnathostome Shh-like in terms of expression and regulation. In addition, they show some lamprey-specific features, including duplication and structural (but not functional) changes in the intronic/regulatory sequences.

  5. Variation in sequence and organization of splicing regulatory elements in vertebrate genes

    PubMed Central

    Yeo, Gene; Hoon, Shawn; Venkatesh, Byrappa; Burge, Christopher B.

    2004-01-01

    Although core mechanisms and machinery of pre-mRNA splicing are conserved from yeast to human, the details of intron recognition often differ, even between closely related organisms. For example, genes from the pufferfish Fugu rubripes generally contain one or more introns that are not properly spliced in mouse cells. Exploiting available genome sequence data, a battery of sequence analysis techniques was used to reach several conclusions about the organization and evolution of splicing regulatory elements in vertebrate genes. The classical splice site and putative branch site signals are completely conserved across the vertebrates studied (human, mouse, pufferfish, and zebrafish), and exonic splicing enhancers also appear broadly conserved in vertebrates. However, another class of splicing regulatory elements, the intronic splicing enhancers, appears to differ substantially between mammals and fish, with G triples (GGG) very abundant in mammalian introns but comparatively rare in fish. Conversely, short repeats of AC and GT are predicted to function as intronic splicing enhancers in fish but are not enriched in mammalian introns. Consistent with this pattern, exonic splicing enhancer-binding SR proteins are highly conserved across all vertebrates, whereas heterogeneous nuclear ribonucleoproteins, which bind many intronic sequences, vary in domain structure and even presence/absence between mammals and fish. Exploiting differences in intronic sequence composition, a statistical model was developed to predict the splicing phenotype of Fugu introns in mammalian systems and was used to engineer the spliceability of a Fugu intron in human cells by insertion of specific sequences, thereby rescuing splicing in human cells. PMID:15505203
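
    The compositional contrast described above (G triples versus AC and GT repeats in introns) rests on counting short motifs in intronic sequence. A minimal sketch of such a count follows; it is not the statistical model from the article, the function name `motif_density` is hypothetical, and both example sequences are invented purely for illustration.

```python
def motif_density(seq, motif="GGG"):
    """Return overlapping occurrences of `motif` per 1000 bases of `seq`."""
    seq = seq.upper()
    m = len(motif)
    # Slide a window of the motif's length across the sequence, one base at a time,
    # so that overlapping matches (e.g. two GGG in GGGG) are each counted.
    count = sum(1 for i in range(len(seq) - m + 1) if seq[i:i + m] == motif)
    return 1000.0 * count / len(seq) if seq else 0.0


# Invented toy "introns"; real analyses would use annotated genomic sequence.
g_rich_intron = "GTAAGGGCTGGGAGGGTTTCAG"
ac_gt_rich_intron = "GTAAGACACACGTGTGTTTCAG"

for name, intron in [("G-rich", g_rich_intron), ("AC/GT-rich", ac_gt_rich_intron)]:
    print(name, motif_density(intron, "GGG"), motif_density(intron, "ACAC"))
```

    Normalizing by sequence length makes introns of different sizes comparable; a fuller analysis would also need a background expectation for each motif, which this sketch leaves out.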

  6. Detecting Functional Divergence after Gene Duplication through Evolutionary Changes in Posttranslational Regulatory Sequences

    PubMed Central

    Nguyen Ba, Alex N.; Strome, Bob; Hua, Jun Jie; Desmond, Jonathan; Gagnon-Arsenault, Isabelle; Weiss, Eric L.; Landry, Christian R.; Moses, Alan M.

    2014-01-01

    Gene duplication is an important evolutionary mechanism that can result in functional divergence in paralogs due to neo-functionalization or sub-functionalization. Consistent with functional divergence after gene duplication, recent studies have shown accelerated evolution in retained paralogs. However, little is known in general about the impact of this accelerated evolution on the molecular functions of retained paralogs. For example, do new functions typically involve changes in enzymatic activities, or changes in protein regulation? Here we study the evolution of posttranslational regulation by examining the evolution of important regulatory sequences (short linear motifs) in retained duplicates created by the whole-genome duplication in budding yeast. To do so, we identified short linear motifs whose evolutionary constraint has relaxed after gene duplication with a likelihood-ratio test that can account for heterogeneity in the evolutionary process by using a non-central chi-squared null distribution. We find that short linear motifs are more likely to show changes in evolutionary constraints in retained duplicates compared to single-copy genes. We examine changes in constraints on known regulatory sequences and show that for the Rck1/Rck2, Fkh1/Fkh2, Ace2/Swi5 paralogs, they are associated with previously characterized differences in posttranslational regulation. Finally, we experimentally confirm our prediction that for the Ace2/Swi5 paralogs, Cbk1 regulated localization was lost along the lineage leading to SWI5 after gene duplication. Our analysis suggests that changes in posttranslational regulation mediated by short regulatory motifs systematically contribute to functional divergence after gene duplication. PMID:25474245

  7. Detecting functional divergence after gene duplication through evolutionary changes in posttranslational regulatory sequences.

    PubMed

    Nguyen Ba, Alex N; Strome, Bob; Hua, Jun Jie; Desmond, Jonathan; Gagnon-Arsenault, Isabelle; Weiss, Eric L; Landry, Christian R; Moses, Alan M

    2014-12-01

    Gene duplication is an important evolutionary mechanism that can result in functional divergence in paralogs due to neo-functionalization or sub-functionalization. Consistent with functional divergence after gene duplication, recent studies have shown accelerated evolution in retained paralogs. However, little is known in general about the impact of this accelerated evolution on the molecular functions of retained paralogs. For example, do new functions typically involve changes in enzymatic activities, or changes in protein regulation? Here we study the evolution of posttranslational regulation by examining the evolution of important regulatory sequences (short linear motifs) in retained duplicates created by the whole-genome duplication in budding yeast. To do so, we identified short linear motifs whose evolutionary constraint has relaxed after gene duplication with a likelihood-ratio test that can account for heterogeneity in the evolutionary process by using a non-central chi-squared null distribution. We find that short linear motifs are more likely to show changes in evolutionary constraints in retained duplicates compared to single-copy genes. We examine changes in constraints on known regulatory sequences and show that for the Rck1/Rck2, Fkh1/Fkh2, Ace2/Swi5 paralogs, they are associated with previously characterized differences in posttranslational regulation. Finally, we experimentally confirm our prediction that for the Ace2/Swi5 paralogs, Cbk1 regulated localization was lost along the lineage leading to SWI5 after gene duplication. Our analysis suggests that changes in posttranslational regulation mediated by short regulatory motifs systematically contribute to functional divergence after gene duplication.

  8. Use of H19 Gene Regulatory Sequences in DNA-Based Therapy for Pancreatic Cancer

    PubMed Central

    Scaiewicz, V.; Sorin, V.; Fellig, Y.; Birman, T.; Mizrahi, A.; Galula, J.; Abu-lail, R.; Shneider, T.; Ohana, P.; Buscail, L.; Hochberg, A.; Czerniak, A.

    2010-01-01

    Pancreatic cancer is the eighth most common cause of death from cancer in the world, for which palliative treatments are not effective and are frequently accompanied by severe side effects. We propose a DNA-based therapy for pancreatic cancer using a nonviral vector, expressing the diphtheria toxin A chain under the control of the H19 gene regulatory sequences. The H19 gene encodes an oncofetal RNA expressed during embryo development and in several types of cancer. We tested expression of the H19 gene in patients and found that 65% of human pancreatic tumors analyzed showed moderate to strong expression of the gene. In vitro experiments showed that the vector was effective in reducing Luciferase protein activity in pancreatic carcinoma cell lines. In vivo experiments revealed tumor growth arrest in different animal models for pancreatic cancer. Differences in tumor size between control and treated groups reached 75% in the heterotopic model (P = .037) and 50% in the orthotopic model (P = .007). In addition, no visible metastases were found in the treated group of the orthotopic model. These results indicate that treatment with the vector DTA-H19 might be a viable new therapeutic option for patients with unresectable pancreatic cancer. PMID:21052499

  9. Coordinate cytokine regulatory sequences

    DOEpatents

    Frazer, Kelly A.; Rubin, Edward M.; Loots, Gabriela G.

    2005-05-10

    The present invention provides CNS sequences that regulate the cytokine gene expression, expression cassettes and vectors comprising or lacking the CNS sequences, host cells and non-human transgenic animals comprising the CNS sequences or lacking the CNS sequences. The present invention also provides methods for identifying compounds that modulate the functions of CNS sequences as well as methods for diagnosing defects in the CNS sequences of patients.

  10. Long-range cooperativity between gene regulatory sequences in a prokaryote.

    PubMed

    Dandanell, G; Valentin-Hansen, P; Larsen, J E; Hammer, K

    Regulation of transcription initiation by proteins binding at DNA sequences some distance from the promoter region itself seems to be a general phenomenon in both eukaryotes and prokaryotes. Proteins bound to an enhancer site in eukaryotes can turn on a distant gene, whereas efficient repression of some prokaryotic genes such as the gal, ara and deo operons of Escherichia coli, requires the presence of two operator sites, separated by 110, 200 and 600 base pairs (bp) respectively. In the deo operon, which encodes nucleoside catabolizing enzymes, we have shown that efficient and cooperative repression can be obtained when the distance between the two sites ranges from 224 to 997 bp. Here, we report that transcription initiation can be regulated from an operator site placed 1 to 5 kilobases (kb) downstream of the deoP2 promoter (and downstream of the transcribed gene), and present the first experimental data for prokaryotic regulation at distances greater than 1 kb. Our results support the model of DNA loop formation as a common regulatory mechanism explaining both some prokaryotic regulation and the action of eukaryotic enhancers.

  11. Inverted duplication of histone genes in chicken and disposition of regulatory sequences.

    PubMed Central

    Wang, S W; Robins, A J; d'Andrea, R; Wells, J R

    1985-01-01

    Sequence analysis of an 8.4 kb fragment containing five chicken histone genes shows that an H4-H2A gene pair is duplicated and inverted around a central H3 gene. A left and right region, each of 2.1 kb are 97% homologous and the boundaries of homology coincide with ten base pair repeats. These boundary regions also contain highly conserved gene promoter elements, suggesting that interaction of transcriptional machinery with histone genes may be connected with recombination in promoter regions, resulting in the inverted duplication structure seen in this cluster. PMID:4000938

  12. A Catalog of Regulatory Sequences for Trait Gene for the Genome Editing of Wheat

    PubMed Central

    Makai, Szabolcs; Tamás, László; Juhász, Angéla

    2016-01-01

    Wheat has been cultivated for 10000 years and ever since the origin of hexaploid wheat it has been exempt from natural selection. Instead, it was under the constant selective pressure of human agriculture from harvest to sowing during every year, producing a vast array of varieties. Wheat has been adopted globally, accumulating variation for genes involved in yield traits, environmental adaptation and resistance. However, one small but important part of the wheat genome has hardly changed: the regulatory regions of both the x- and y-type high molecular weight glutenin subunit (HMW-GS) genes, which are alone responsible for approximately 12% of the grain protein content. The phylogeny of the HMW-GS regulatory regions of the Triticeae demonstrates that a genetic bottleneck may have led to its decreased diversity during domestication and the subsequent cultivation. It has also highlighted the fact that the wild relatives of wheat may offer an unexploited genetic resource for the regulatory region of these genes. Significant research efforts have been made in the public sector and by international agencies, using wild crosses to exploit the available genetic variation, and as a result synthetic hexaploids are now being utilized by a number of breeding companies. However, a newly emerging tool of genome editing provides significantly improved efficiency in exploiting the natural variation in HMW-GS genes and incorporating this into elite cultivars and breeding lines. Recent advancement in the understanding of the regulation of these genes underlines the needs for an overview of the regulatory elements for genome editing purposes. PMID:27766102

  13. A Catalog of Regulatory Sequences for Trait Gene for the Genome Editing of Wheat.

    PubMed

    Makai, Szabolcs; Tamás, László; Juhász, Angéla

    2016-01-01

    Wheat has been cultivated for 10000 years and ever since the origin of hexaploid wheat it has been exempt from natural selection. Instead, it was under the constant selective pressure of human agriculture from harvest to sowing during every year, producing a vast array of varieties. Wheat has been adopted globally, accumulating variation for genes involved in yield traits, environmental adaptation and resistance. However, one small but important part of the wheat genome has hardly changed: the regulatory regions of both the x- and y-type high molecular weight glutenin subunit (HMW-GS) genes, which are alone responsible for approximately 12% of the grain protein content. The phylogeny of the HMW-GS regulatory regions of the Triticeae demonstrates that a genetic bottleneck may have led to its decreased diversity during domestication and the subsequent cultivation. It has also highlighted the fact that the wild relatives of wheat may offer an unexploited genetic resource for the regulatory region of these genes. Significant research efforts have been made in the public sector and by international agencies, using wild crosses to exploit the available genetic variation, and as a result synthetic hexaploids are now being utilized by a number of breeding companies. However, a newly emerging tool of genome editing provides significantly improved efficiency in exploiting the natural variation in HMW-GS genes and incorporating this into elite cultivars and breeding lines. Recent advancement in the understanding of the regulation of these genes underlines the needs for an overview of the regulatory elements for genome editing purposes.

  14. Conservation of position and sequence of a novel, widely expressed gene containing the major human α-globin regulatory element

    SciTech Connect

    Vyas, P.; Vickers, M.A.; Picketts, D.J.; Higgs, D.R.

    1995-10-10

    We have determined the cDNA and genomic structure of a gene (-14 gene) that lies adjacent to the human α-globin cluster. Although it is expressed in a wide range of cell lines and tissues, a previously described erythroid-specific regulatory element that controls expression of the α-globin genes lies within intron 5 of this gene. Analysis of the -14 gene promoter shows that it is GC rich and associated with a constitutively expressed DNase 1 hypersensitive site; unlike the α-globin promoter, it does not contain a TATA or CCAAT box. These and other differences in promoter structure may explain why the erythroid regulatory element interacts specifically with the α-globin promoters and not the -14 gene promoter, which lies between the α promoters and their regulatory element. Interspecies comparisons demonstrate that the sequence and location of the -14 gene adjacent to the α cluster have been maintained since the bird/mammal divergence, 270 million years ago. 38 refs., 6 figs.

  15. Organization of the lexA gene of Escherichia coli and nucleotide sequence of the regulatory region.

    PubMed Central

    Miki, T; Ebina, Y; Kishi, F; Nakazawa, A

    1981-01-01

    The product of the lexA gene of Escherichia coli has been shown to regulate expression of several cellular functions (SOS functions) induced by treatments which abruptly inhibit DNA synthesis. We have cloned and mapped the lexA gene on a small segment of approximately 600 base pairs. The lexA promoter was located by transcription R-loop analysis, and the lexA product of 22,000 daltons was identified by protein synthesis in vitro. An unknown gene was found which directed the synthesis of a protein of 35,000 daltons in a region downstream from the lexA gene. The nucleotide sequence of the regulatory region of the lexA gene was determined. The sequence contained inverted repeats homologous to that of the recA regulatory region. These inverted repeats may be recognized by the lexA protein, because the protein is considered to repress both genes as a common repressor. PMID:6261224

  16. The Effects of Sequence Variation on Genome-wide NRF2 Binding—New Target Genes and Regulatory SNPs

    PubMed Central

    Kuosmanen, Suvi M.; Viitala, Sari; Laitinen, Tuomo; Peräkylä, Mikael; Pölönen, Petri; Kansanen, Emilia; Leinonen, Hanna; Raju, Suresh; Wienecke-Baldacchino, Anke; Närvänen, Ale; Poso, Antti; Heinäniemi, Merja; Heikkinen, Sami; Levonen, Anna-Liisa

    2016-01-01

    Transcription factor binding specificity is crucial for proper target gene regulation. Motif discovery algorithms identify the main features of the binding patterns, but the accuracy on the lower affinity sites is often poor. Nuclear factor E2-related factor 2 (NRF2) is a ubiquitous redox-activated transcription factor having a key protective role against endogenous and exogenous oxidant and electrophile stress. Herein, we decipher the effects of sequence variation on the DNA binding sequence of NRF2, in order to identify both genome-wide binding sites for NRF2 and disease-associated regulatory SNPs (rSNPs) with drastic effects on NRF2 binding. Interactions between NRF2 and DNA were studied using molecular modelling, and NRF2 chromatin immunoprecipitation-sequence datasets together with protein binding microarray measurements were utilized to study binding sequence variation in detail. The binding model thus generated was used to identify genome-wide binding sites for NRF2, and genomic binding sites with rSNPs that have strong effects on NRF2 binding and reside on active regulatory elements in human cells. As a proof of concept, miR-126–3p and -5p were identified as NRF2 target microRNAs, and a rSNP (rs113067944) residing on NRF2 target gene (Ferritin, light polypeptide, FTL) promoter was experimentally verified to decrease NRF2 binding and result in decreased transcriptional activity. PMID:26826707

  17. Identification of regulatory sequences in the gene for 5-aminolevulinate synthase from rat.

    PubMed

    Braidotti, G; Borthwick, I A; May, B K

    1993-01-15

    The housekeeping enzyme 5-aminolevulinate synthase (ALAS) regulates the supply of heme for respiratory cytochromes. Here we report on the isolation of a genomic clone for the rat ALAS gene. The 5'-flanking region was fused to the chloramphenicol acetyltransferase gene and transient expression analysis revealed the presence of both positive and negative cis-acting sequences. Expression was substantially increased by the inclusion of the first intron located in the 5'-untranslated region. Sequence analysis of the promoter identified two elements at positions -59 and -88 bp with strong similarity to the binding site for nuclear respiratory factor 1 (NRF-1). Gel shift analysis revealed that both NRF-1 elements formed nucleoprotein complexes which could be abolished by an authentic NRF-1 oligomer. Mutagenesis of each NRF-1 motif in the ALAS promoter gave substantially lowered levels of chloramphenicol acetyltransferase expression, whereas mutagenesis of both NRF-1 motifs resulted in the almost complete loss of expression. These results establish that the NRF-1 motifs in the ALAS promoter are critical for promoter activity. NRF-1 binding sites have been identified in the promoters of several nuclear genes encoding mitochondrial proteins concerned with oxidative phosphorylation. The present studies suggest that NRF-1 may co-ordinate the supply of mitochondrial heme with the synthesis of respiratory cytochromes by regulating expression of ALAS. In erythroid cells, NRF-1 may be less important for controlling heme levels since an erythroid ALAS gene is strongly expressed and the promoter for this gene apparently lacks NRF-1 binding sites.

  18. Population sequencing of two endocannabinoid metabolic genes identifies rare and common regulatory variants associated with extreme obesity and metabolite level

    PubMed Central

    2010-01-01

    Background: Targeted re-sequencing of candidate genes in individuals at the extremes of a quantitative phenotype distribution is a method of choice to gain information on the contribution of rare variants to disease susceptibility. The endocannabinoid system mediates signaling in the brain and peripheral tissues involved in the regulation of energy balance, is highly active in obese patients, and represents a strong candidate pathway to examine for genetic association with body mass index (BMI). Results: We sequenced two intervals (covering 188 kb) encoding the endocannabinoid metabolic enzymes fatty-acid amide hydrolase (FAAH) and monoglyceride lipase (MGLL) in 147 normal controls and 142 extremely obese cases. After applying quality filters, we called 1,393 high quality single nucleotide variants, 55% of which are rare, and 143 indels. Using single marker tests and collapsed marker tests, we identified four intervals associated with BMI: the FAAH promoter, the MGLL promoter, MGLL intron 2, and MGLL intron 3. Two of these intervals are composed of rare variants and the majority of the associated variants are located in promoter sequences or in predicted transcriptional enhancers, suggesting a regulatory role. The set of rare variants in the FAAH promoter associated with BMI is also associated with increased level of FAAH substrate anandamide, further implicating a functional role in obesity. Conclusions: Our study, which is one of the first reports of a sequence-based association study using next-generation sequencing of candidate genes, provides insights into study design and analysis approaches and demonstrates the importance of examining regulatory elements rather than exclusively focusing on exon sequences. PMID:21118518

  19. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  20. Nucleotide sequence conservation of novel and established cis-regulatory sites within the tyrosine hydroxylase gene promoter

    PubMed Central

    Wang, Meng; Banerjee, Kasturi; Baker, Harriet; Cave, John W.

    2015-01-01

    Tyrosine hydroxylase (TH) is the rate-limiting enzyme in catecholamine biosynthesis and its gene proximal promoter (<1 kb upstream from the transcription start site) is essential for regulating transcription in both the developing and adult nervous systems. Several putative regulatory elements within the TH proximal promoter have been reported, but evolutionary conservation of these elements has not been thoroughly investigated. Since many vertebrate species are used to model development, function and disorders of human catecholaminergic neurons, identifying evolutionarily conserved transcription regulatory mechanisms is a high priority. In this study, we align TH proximal promoter nucleotide sequences from several vertebrate species to identify evolutionarily conserved motifs. This analysis identified three elements (a TATA box, cyclic AMP response element (CRE) and a 5′-GGTGG-3′ site) that constitute the core of an ancient vertebrate TH promoter. Focusing on only eutherian mammals, two regions of high conservation within the proximal promoter were identified: a ∼250 bp region adjacent to the transcription start site and a ∼85 bp region located approximately 350 bp further upstream. Within both regions, conservation of previously reported cis-regulatory motifs and human single nucleotide variants was evaluated. Transcription reporter assays in a TH-expressing cell line demonstrated the functionality of highly conserved motifs in the proximal promoter regions and electromobility shift assays showed that brain-region specific complexes assemble on these motifs. These studies also identified a non-canonical CRE binding (CREB) protein recognition element in the proximal promoter. Together, these studies provide a detailed analysis of evolutionary conservation within the TH promoter and identify potential cis-regulatory motifs that underlie a core set of regulatory mechanisms in mammals. PMID:25774193

  1. Mutation analysis of TRPS1 gene including core promoter, 5'UTR, and 3'UTR regulatory sequences with insight into their organization.

    PubMed

    Solc, Roman; Klugerova, Michaela; Vcelak, Josef; Baxova, Alice; Kuklik, Miloslav; Vseticka, Jan; Beharka, Rastislav; Hirschfeldova, Katerina

    2017-01-01

    The TRPS1 protein is a potent regulator of proliferation, differentiation, and apoptosis. TRPS1 gene aberrations are strongly associated with the development of rare trichorhinophalangeal syndrome (TRPS). We have conducted MLPA analysis to capture deletions within the crucial 8q24.1 chromosomal region in combination with mutation analysis of the TRPS1 gene including core promoter, 5'UTR, and 3'UTR sequences in nine TRPS patients. The low complexity or extent of untranslated regulatory sequences excluded them from analysis in previous studies. The amplicon-based next generation sequencing used in our study overcomes these technical limitations. Finally, we performed an extended in silico analysis of the organization of TRPS1 gene regulatory sequences. A single contiguous deletion and an intragenic deletion spanning several exons were detected. Mutation analysis revealed five TRPS1 gene aberrations (two structural rearrangements, two nonsense mutations, and one missense substitution), reaching an overall detection rate of 78%. Several polymorphic variants were detected within the analysed regulatory sequences but without proposed pathogenic effect. In silico analysis suggested alternative promoter usage and differing expression efficiency for different TRPS1 transcripts. Haploinsufficiency of the TRPS1 gene was responsible for most of the TRPS phenotype. The structure of TRPS1 gene regulatory sequences is indicative of generally low single allele expression and its tight control.

  2. Sox2 regulatory region 2 sequence works as a DNA nuclear targeting sequence enhancing the efficiency of an exogenous gene expression in ES cells.

    PubMed

    Funabashi, Hisakage; Takatsu, Makoto; Saito, Mikako; Matsuoka, Hideaki

    2010-10-01

    In this report, the effects of two DNA nuclear targeting sequence (DTS) candidates on the gene expression efficiency in ES cells were investigated. Reporter plasmids containing the simian virus 40 (SV40) promoter/enhancer sequence (SV40-DTS), a DTS for various types of cells but not yet reported for ES cells, and the 81 base pairs of Sox2 regulatory region 2 (SRR2) where two transcriptional factors in ES cells, Oct3/4 and Sox2, are bound (SRR2-DTS), were introduced into the cytoplasm of living cells by femtoinjection. The gene expression efficiencies of each plasmid in mouse insulinoma cell line MIN6 cells and mouse ES cells were then evaluated. Plasmids including SV40-DTS and SRR2-DTS exhibited higher gene expression efficiency compared to plasmids without these DTSs, and thus it was concluded that both sequences work as a DTS in ES cells. In addition, it was suggested that SRR2-DTS works as an ES cell-specific DTS. To the best of our knowledge, this is the first report to confirm the function of DTSs in ES cells.

  3. Oxytocin receptor gene sequences in owl monkeys and other primates show remarkable interspecific regulatory and protein coding variation.

    PubMed

    Babb, Paul L; Fernandez-Duque, Eduardo; Schurr, Theodore G

    2015-10-01

    The oxytocin (OT) hormone pathway is involved in numerous physiological processes, and one of its receptor genes (OXTR) has been implicated in pair bonding behavior in mammalian lineages. This observation is important for understanding social monogamy in primates, which occurs in only a small subset of taxa, including Azara's owl monkey (Aotus azarae). To examine the potential relationship between social monogamy and OXTR variation, we sequenced its 5' regulatory (4936bp) and coding (1167bp) regions in 25 owl monkeys from the Argentinean Gran Chaco, and examined OXTR sequences from 1092 humans from the 1000 Genomes Project. We also assessed interspecific variation of OXTR in 25 primate and rodent species that represent a set of phylogenetically and behaviorally disparate taxa. Our analysis revealed substantial variation in the putative 5' regulatory region of OXTR, with marked structural differences across primate taxa, particularly for humans and chimpanzees, which exhibited unique patterns of large motifs of dinucleotide A+T repeats upstream of the OXTR 5' UTR. In addition, we observed a large number of amino acid substitutions in the OXTR CDS region among New World primate taxa that distinguish them from Old World primates. Furthermore, primate taxa traditionally defined as socially monogamous (e.g., gibbons, owl monkeys, titi monkeys, and saki monkeys) all exhibited different amino acid motifs for their respective OXTR protein coding sequences. These findings support the notion that monogamy has evolved independently in Old World and New World primates, and that it has done so through different molecular mechanisms, not exclusively through the oxytocin pathway.

  4. A regulatory sequence from the retinoid X receptor γ gene directs expression to horizontal cells and photoreceptors in the embryonic chicken retina

    PubMed Central

    Blixt, Maria K. E.

    2016-01-01

    Purpose Combining techniques of episomal vector gene-specific Cre expression and genomic integration using the piggyBac transposon system enables studies of gene expression–specific cell lineage tracing in the chicken retina. In this work, we aimed to target the retinal horizontal cell progenitors. Methods A 208 bp gene regulatory sequence from the chicken retinoid X receptor γ gene (RXRγ208) was used to drive Cre expression. RXRγ is expressed in progenitors and photoreceptors during development. The vector was combined with a piggyBac “donor” vector containing a floxed STOP sequence followed by enhanced green fluorescent protein (EGFP), as well as a piggyBac helper vector for efficient integration into the host cell genome. The vectors were introduced into the embryonic chicken retina with in ovo electroporation. Tissue electroporation targets specific developmental time points and specific structures. Results Cells that drove Cre expression from the regulatory RXRγ208 sequence excised the floxed STOP-sequence and expressed GFP. The approach generated a stable lineage with robust expression of GFP in retinal cells that have activated transcription from the RXRγ208 sequence. Furthermore, GFP was expressed in cells that express horizontal or photoreceptor markers when electroporation was performed between developmental stages 22 and 28. Electroporation of a stage 12 optic cup gave multiple cell types in accordance with RXRγ gene expression in the early retina. Conclusions In this study, we describe an easy, cost-effective, and time-efficient method for testing regulatory sequences in general. More specifically, our results open up the possibility for further studies of the RXRγ-gene regulatory network governing the formation of photoreceptor and horizontal cells. In addition, the method presents approaches to target the expression of effector genes, such as regulators of cell fate or cell cycle progression, to these cells and their progenitors. PMID

  5. The mouse p97 (CDC48) gene. Genomic structure, definition of transcriptional regulatory sequences, gene expression, and characterization of a pseudogene.

    PubMed

    Müller, J M; Meyer, H H; Ruhrberg, C; Stamp, G W; Warren, G; Shima, D T

    1999-04-09

    Here we present the first description of the genomic organization, transcriptional regulatory sequences, and adult and embryonic gene expression for the mouse p97(CDC48) AAA ATPase. Clones representing two distinct p97 genes were isolated in a genomic library screen, one of them likely representing a non-functional processed pseudogene. The coding region of the gene encoding the functional mRNA is interrupted by 16 introns and encompasses 20.4 kilobase pairs. Definition of the transcriptional initiation site and sequence analysis showed that the gene contains a TATA-less, GC-rich promoter region with an initiator element spanning the transcription start site. Cis-acting elements necessary for basal transcription activity reside within 410 base pairs of the flanking region as determined by transient transfection assays. In immunohistological analyses, p97 was widely expressed in embryos and adults, but protein levels were tightly controlled in a cell type- and cell differentiation-dependent manner. A remarkable heterogeneity in p97 immunostaining was found on a cellular level within a given tissue, and protein amounts in the cytoplasm and nucleus varied widely, suggesting a highly regulated and intermittent function for p97. This study provides the basis for a detailed analysis of the complex regulation of p97 and the reagents required for assessing its functional significance using targeted gene manipulation in the mouse.

  6. Identification of co-expression gene networks, regulatory genes and pathways for obesity based on adipose tissue RNA Sequencing in a porcine model

    PubMed Central

    2014-01-01

    Background Obesity is a complex metabolic condition in strong association with various diseases, like type 2 diabetes, resulting in major public health and economic implications. Obesity is the result of environmental and genetic factors and their interactions, including genome-wide genetic interactions. Identification of co-expressed and regulatory genes in RNA extracted from relevant tissues representing lean and obese individuals provides an entry point for the identification of genes and pathways of importance to the development of obesity. The pig, an omnivorous animal, is an excellent model for human obesity, offering the possibility to study in-depth organ-level transcriptomic regulations of obesity, unfeasible in humans. Our aim was to reveal adipose tissue co-expression networks, pathways and transcriptional regulations of obesity using RNA Sequencing based systems biology approaches in a porcine model. Methods We selected 36 animals for RNA Sequencing from a previously created F2 pig population representing three extreme groups based on their predicted genetic risks for obesity. We applied Weighted Gene Co-expression Network Analysis (WGCNA) to detect clusters of highly co-expressed genes (modules). Additionally, regulator genes were detected using Lemon-Tree algorithms. Results WGCNA revealed five modules which were strongly correlated with at least one obesity-related phenotype (correlations ranging from -0.54 to 0.72, P < 0.001). Functional annotation identified pathways illuminating the association between obesity and other diseases, like osteoporosis (osteoclast differentiation, P = 1.4E-7), and immune-related complications (e.g. Natural killer cell mediated cytotoxicity, P = 3.8E-5; B cell receptor signaling pathway, P = 7.2E-5). Lemon-Tree identified three potential regulator genes, using confident scores, for the WGCNA module which was associated with osteoclast differentiation: CCR1, MSR1 and SI1 (probability scores respectively 95.30, 62.28, and

  7. Poisson approach to clustering analysis of regulatory sequences.

    PubMed

    Wang, Haiying; Zheng, Huiru; Hu, Jinglu

    2008-01-01

    The presence of similar patterns in regulatory sequences may aid users in identifying co-regulated genes or inferring regulatory modules. By modelling pattern occurrences in regulatory regions with Poisson statistics, this paper presents a log likelihood ratio statistics-based distance measure to calculate pair-wise similarities between regulatory sequences. We employed it within three clustering algorithms: hierarchical clustering, Self-Organising Map, and a self-adaptive neural network. The results indicate that, in comparison to traditional clustering algorithms, the incorporation of the log likelihood ratio statistics-based distance into the learning process may offer considerable improvements in the process of regulatory sequence-based classification of genes.
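
    To make the idea of a log likelihood ratio based distance concrete, here is a minimal sketch. The abstract does not give the paper's exact formula, so this uses a standard assumption: for each pattern, the occurrence counts in two regulatory regions of equal length are compared under "one shared Poisson rate" versus "two separate rates", and the resulting statistics are summed over patterns. The function names and the toy counts are hypothetical.

```python
import math

def poisson_llr(x, y):
    """2 * log likelihood ratio comparing separate Poisson rates for counts
    x and y against a single shared rate (equal-length regions assumed)."""
    pooled = (x + y) / 2.0  # maximum likelihood rate under the shared-rate model

    def term(k):
        # k * log(k / pooled); a zero count contributes nothing.
        return k * math.log(k / pooled) if k > 0 else 0.0

    return 2.0 * (term(x) + term(y))

def llr_distance(counts_a, counts_b):
    """Sum the per-pattern statistics to get a pairwise dissimilarity."""
    return sum(poisson_llr(x, y) for x, y in zip(counts_a, counts_b))

# Hypothetical pattern-occurrence counts for three regulatory sequences.
counts = {
    "geneA": [4, 0, 2],
    "geneB": [5, 1, 2],
    "geneC": [0, 6, 0],
}
print(llr_distance(counts["geneA"], counts["geneB"]))
print(llr_distance(counts["geneA"], counts["geneC"]))
```

    The resulting pairwise values could be passed to any clustering routine that accepts a precomputed dissimilarity matrix, such as hierarchical clustering.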

  8. Nucleotide sequence analysis reveals linked N-acetyl hydrolase, thioesterase, transport, and regulatory genes encoded by the bialaphos biosynthetic gene cluster of Streptomyces hygroscopicus.

    PubMed Central

    Raibaud, A; Zalacain, M; Holt, T G; Tizard, R; Thompson, C J

    1991-01-01

    Nucleotide sequence analysis of a 5,000-bp region of the bialaphos antibiotic production (bap) gene cluster defined five open reading frames (ORFs) which predicted structural genes in the order bah, ORF1, ORF2, and ORF3 followed by the regulatory gene, brpA (H. Anzai, T. Murakami, S. Imai, A. Satoh, K. Nagaoka, and C.J. Thompson, J. Bacteriol. 169:3482-3488, 1987). The four structural genes were translationally coupled and apparently cotranscribed from an undefined promoter(s) under the positive control of the brpA gene product. S1 mapping experiments indicated that brpA was transcribed by two promoters (brpAp1 and brpAp2) which initiate transcription 150 and 157 bp upstream of brpA within an intergenic region and at least one promoter further upstream within the bap gene cluster (brpAp3). All three transcripts were present at low levels during exponential growth and increased just before the stationary phase. The levels of the brpAp3 band continued to increase at the onset of stationary phase, whereas brpAp1- and brpAp2-protected fragments showed no further change. BrpA contained a possible helix-turn-helix motif at its C terminus which was similar to the C-terminal regulatory motif found in the receiver component of a family of two-component transcriptional activator proteins. This motif was not associated with the N-terminal domain conserved in other members of the family. The structural gene cluster sequenced began with bah, encoding a bialaphos acetylhydrolase which removes the N-acetyl group from bialaphos as one of the final steps in the biosynthetic pathway. The observation that Bah was similar to a rat and to a bacterial (Acinetobacter calcoaceticus) lipase probably reflects the fact that the ester bonds of triglycerides and the amide bond linking acetate to phosphinothricin are similar and hydrolysis is catalyzed by structurally related enzymes. This was followed by two regions encoding ORF1 and ORF2 which were similar to each other (48% nucleotide

  9. [Cloning and function identification of gene 'admA' and up-stream regulatory sequence related to antagonistic activity of Enterobacter cloacae B8].

    PubMed

    Zhu, Jun-Li; Li, De-Bao; Yu, Xu-Ping

    2012-04-01

    To reveal the antagonistic mechanism of B8 strain to Xanthomonas oryzae pv. oryzae, transposon tagging method and chromosome walking were deployed to clone antagonistic related fragments around Tn5 insertion site in the mutant strain B8B. The function of up-stream regulatory sequence of gene 'admA' involved in the antagonistic activity was further identified by gene knockout technique. An antagonistic related left fragment of Tn5 insertion site, 2 608 bp in length, was obtained by tagging with Kan resistance gene of Tn5. A 2 354 bp right fragment of Tn5 insertion site was amplified with 2 rounds of chromosome walking. The length of the B contig around the Tn5 insertion site was 4 611 bp, containing 7 open reading frames (ORFs). Bioinformatic analysis revealed that these ORFs corresponded to the partial coding regions of glyceraldehyde-3-phosphate dehydrogenase, two LysR family transcriptional regulators, hypothetical protein VSWAT3-20465 of Vibrionales and admA, admB, and partial sequence of admC gene of Pantoea agglomerans biosynthetic gene cluster, respectively. Tn5 was inserted 200 bp or 894 bp upstream of the sequence corresponding to the anrP ORF or the admA gene in B8B, respectively. The B-1 and B-2 mutants that lost antagonistic activity were selected by homologous recombination technology in association with the knockout plasmid pMB-BG. These results suggested that the transcription and expression of anrP gene might be disrupted as a result of the knocking out of up-stream regulatory sequence by Tn5 in B8B strain, which in turn affects the biosynthetic regulation of the antagonistic related gene cluster. Thus, the antagonistic related genes in B8 strain form a gene family similar to the andrimid biosynthetic gene cluster, and the upstream regulatory region appears to be critical for the antibiotics biosynthesis.

  10. Plant nitrogen regulatory P-PII genes

    DOEpatents

    Coruzzi, Gloria M.; Lam, Hon-Ming; Hsieh, Ming-Hsiun

    2001-01-01

    The present invention generally relates to plant nitrogen regulatory PII gene (hereinafter P-PII gene), a gene involved in regulating plant nitrogen metabolism. The invention provides P-PII nucleotide sequences, expression constructs comprising said nucleotide sequences, and host cells and plants having said constructs and, optionally expressing the P-PII gene from said constructs. The invention also provides substantially pure P-PII proteins. The P-PII nucleotide sequences and constructs of the

  11. p38 MAPK down-regulates fibulin 3 expression through methylation of gene regulatory sequences: role in migration and invasion.

    PubMed

    Arechederra, María; Priego, Neibla; Vázquez-Carballo, Ana; Sequera, Celia; Gutiérrez-Uzquiza, Álvaro; Cerezo-Guisado, María Isabel; Ortiz-Rivero, Sara; Roncero, Cesáreo; Cuenda, Ana; Guerrero, Carmen; Porras, Almudena

    2015-02-13

    p38 MAPKs regulate migration and invasion. However, the mechanisms involved are only partially known. We had previously identified fibulin 3, which plays a role in migration, invasion, and tumorigenesis, as a gene regulated by p38α. We have characterized in detail how p38 MAPK regulates fibulin 3 expression and its role. We describe here for the first time that p38α, p38γ, and p38δ down-regulate fibulin 3 expression. p38α has a stronger effect, and it does so through hypermethylation of CpG sites in the regulatory sequences of the gene. This would be mediated by the DNA methylase, DNMT3A, which is down-regulated in cells lacking p38α, but once re-introduced represses Fibulin 3 expression. p38α through HuR stabilizes dnmt3a mRNA leading to an increase in DNMT3A protein levels. Moreover, by knocking-down fibulin 3, we have found that Fibulin 3 inhibits migration and invasion in MEFs by mechanisms involving p38α/β inhibition. Hence, p38α pro-migratory/invasive effect might be, at least in part, mediated by fibulin 3 down-regulation in MEFs. In contrast, in HCT116 cells, Fibulin 3 promotes migration and invasion through a mechanism dependent on p38α and/or p38β activation. Furthermore, Fibulin 3 promotes in vitro and in vivo tumor growth of HCT116 cells through a mechanism dependent on p38α, which surprisingly acts as a potent inducer of tumor growth. At the same time, p38α limits fibulin 3 expression, which might represent a negative feed-back loop.

  12. The upstream regulatory sequence of the light harvesting complex Lhcf2 gene of the marine diatom Phaeodactylum tricornutum enhances transcription in an orientation- and distance-independent fashion.

    PubMed

    Russo, Monia Teresa; Annunziata, Rossella; Sanges, Remo; Ferrante, Maria Immacolata; Falciatore, Angela

    2015-12-01

    Diatoms are a key phytoplankton group in the contemporary ocean, showing extraordinary adaptation capacities to rapidly changing environments. The recent availability of whole genome sequences from representative species has revealed distinct features in their genomes, like novel combinations of genes encoding distinct metabolisms and a significant number of diatom-specific genes. However, the regulatory mechanisms driving diatom gene expression are still largely uncharacterized. Considering the wide variety of fields of study orbiting diatoms, ranging from ecology, evolutionary biology to biotechnology, it is thus essential to increase our understanding of fundamental gene regulatory processes such as transcriptional regulation. To this aim, we explored the functional properties of the 5'-flanking region of the Phaeodactylum tricornutum Lhcf2 gene, encoding a member of the Light Harvesting Complex superfamily and we showed that this region enhances transcription of a GUS reporter gene in an orientation- and distance-independent fashion. This represents the first example of a cis-regulatory sequence with enhancer-like features discovered in diatoms and it is instrumental for the generation of novel genetic tools and diatom exploitation in different areas of study.

  13. Two distinct nuclear factors bind the conserved regulatory sequences of a rabbit major histocompatibility complex class II gene.

    PubMed Central

    Sittisombut, N

    1988-01-01

    The constitutive coexpression of the major histocompatibility complex (MHC) class II genes in B lymphocytes requires positive, trans-acting transcriptional factors. The need for these trans-acting factors has been suggested by the reversion of the MHC class II-negative phenotype of rare B-lymphocyte mutants through somatic cell fusion with B cells or T-cell lines. The mechanism by which the trans-acting factors exert their effect on gene transcription is unknown. The possibility that two highly conserved DNA sequences, located 90 to 100 base pairs (bp) (the A sequence) and 60 to 70 bp (the B sequence) upstream of the transcription start site of the class II genes, are recognized by the trans-acting factors was investigated in this study. By using the gel electrophoresis retardation assay, a minimum of two proteins which specifically bound the conserved A or B sequence of a rabbit DP beta gene were identified in murine nuclear extracts of a B-lymphoma cell line, A20-2J. Fractionation of nuclear extract through a heparin-agarose column allowed the identification of one protein, designated NF-MHCIIB, which bound an oligonucleotide containing the B sequence and protected the entire B sequence in the DNase I protection analysis. Another protein, designated NF-MHCIIA, which bound an oligonucleotide containing the A sequence and partially protected the 3' half of this sequence, was also identified. NF-MHCIIB did not protect a CCAAT sequence located 17 bp downstream of the B sequence. The possible relationship between these DNA-binding factors and the trans-acting factors identified in the cell fusion experiments is discussed. PMID:3133552

  14. Identification of an upstream regulatory sequence that mediates the transcription of mox genes in Methylobacterium extorquens AM1.

    PubMed

    Zhang, Meng; FitzGerald, Kelly A; Lidstrom, Mary E

    2005-11-01

    A multiple A-tract sequence has been identified in the promoter regions for the mxaF, pqqA, mxaW, mxbD and mxcQ genes involved in methanol oxidation in Methylobacterium extorquens AM1, a facultative methylotroph. Site-directed mutagenesis was exploited to delete or change this conserved sequence. Promoter-xylE transcriptional fusions were used to assess promoter activity in these mutants. A fiftyfold drop in the XylE activity was observed for the mxaF and pqqA promoters without this sequence, and a five- to sixfold drop in the XylE activity was observed for the mxbD and mxcQ promoters without this sequence. Mutants were generated in the chromosomal copies in which this sequence was either deleted or altered, and these mutants were unable to grow on methanol. When one of these sequences was added to Plac of Escherichia coli, which is a weak constitutive promoter in M. extorquens AM1, the activity increased two- to threefold. These results suggest that this sequence is essential for normal expression of these genes in M. extorquens AM1, and may serve as a general enhancer element for genetic constructs in this bacterium.

  15. DNA sequence of Rhizobium trifolii nodulation genes reveals a reiterated and potentially regulatory sequence preceding nodABC and nodFE.

    PubMed Central

    Schofield, P R; Watson, J M

    1986-01-01

    The Rhizobium trifolii nod genes required for host-specific nodulation of clovers are located on 14 kb of Sym (symbiotic) plasmid DNA. Analysis of the nucleotide sequence of a 3.7 kb portion of this region has revealed open reading frames corresponding to the nodABCDEF genes. A DNA sequencing technique, using primer extension from within Tn5, has been used to determine the precise locations of Tn5 mutations within the nod genes and the phenotypes of the corresponding mutants correlate with their mapped locations. The predicted nodA and nodB genes overlap by four nucleotides and the nodF and nodE genes overlap by a single nucleotide, suggesting that translational coupling may ensure the synthesis of equimolar amounts of these gene products. The nodABC and nodFE genes constitute separate transcriptional units and each is preceded by a conserved 76-bp sequence which may be involved in the regulation of expression of these genes. PMID:3008100

  16. Structural analysis of the regulatory elements of the type-II procollagen gene. Conservation of promoter and first intron sequences between human and mouse.

    PubMed Central

    Vikkula, M; Metsäranta, M; Syvänen, A C; Ala-Kokko, L; Vuorio, E; Peltonen, L

    1992-01-01

    Transcription of the type-II procollagen gene (COL2A1) is very specifically restricted to a limited number of tissues, particularly cartilages. In order to identify transcription-control motifs we have sequenced the promoter region and the first intron of the human and mouse COL2A1 genes. With the assumption that these motifs should be well conserved during evolution, we have searched for potential elements important for the tissue-specific transcription of the COL2A1 gene by aligning the two sequences with each other and with the available rat type-II procollagen sequence for the promoter. With this approach we could identify specific evolutionarily well-conserved motifs in the promoter area. On the other hand, several suggested regulatory elements in the promoter region did not show evolutionary conservation. In the middle of the first intron we found a cluster of well-conserved transcription-control elements and we conclude that these conserved motifs most probably possess a significant function in the control of the tissue-specific transcription of the COL2A1 gene. We also describe locations of additional, highly conserved nucleotide stretches, which are good candidate regions in the search for binding sites of yet-uncharacterized cartilage-specific transcription regulators of the COL2A1 gene. PMID:1637314

  17. Identifying Distal cis-acting Gene-Regulatory Sequences by Expressing BACs Functionalized with loxP-Tn10 Transposons in Zebrafish.

    PubMed

    Chatterjee, Pradeep K; Shakes, Leighcraft A; Wolf, Hope M; Mujalled, Mohammad A; Zhou, Constance; Hatcher, Charles; Norford, Derek C

    2013-06-21

    Bacterial Artificial Chromosomes (BACs) are large pieces of DNA from the chromosomes of organisms propagated faithfully in bacteria as large extra-chromosomal plasmids. Expression of genes contained in BACs can be monitored after functionalizing the BAC DNA with reporter genes and other sequences that allow stable maintenance and propagation of the DNA in the new host organism. The DNA in BACs can be altered within its bacterial host in several ways. Here we discuss one such approach, using Tn10 mini-transposons, to introduce exogenous sequences into BACs for a variety of purposes. The largely random insertions of Tn10 transposons carrying lox sites have been used to position mammalian cell-selectable antibiotic resistance genes, enhancer-traps and inverted repeat ends of the vertebrate transposon Tol2 precisely at the ends of the genomic DNA insert in BACs. These modified BACs are suitable for expression in zebrafish or mouse, and have been used to functionally identify important long-range gene regulatory sequences in both species. Enhancer-trapping using BACs should prove uniquely useful in analyzing multiple discontinuous DNA domains that act in concert to regulate expression of a gene, and is not limited by genome accessibility issues of traditional enhancer-trapping methods.

  18. Comparisons of Ribosomal Protein Gene Promoters Indicate Superiority of Heterologous Regulatory Sequences for Expressing Transgenes in Phytophthora infestans.

    PubMed

    Poidevin, Laetitia; Andreeva, Kalina; Khachatoorian, Careen; Judelson, Howard S

    2015-01-01

    Molecular genetics approaches in Phytophthora research can be hampered by the limited number of known constitutive promoters for expressing transgenes and the instability of transgene activity. We have therefore characterized genes encoding the cytoplasmic ribosomal proteins of Phytophthora and studied their suitability for expressing transgenes in P. infestans. Phytophthora spp. encode a standard complement of 79 cytoplasmic ribosomal proteins. Several genes are duplicated, and two appear to be pseudogenes. Half of the genes are expressed at similar levels during all stages of asexual development, and we discovered that the majority share a novel promoter motif named the PhRiboBox. This sequence is enriched in genes associated with transcription, translation, and DNA replication, including tRNA and rRNA biogenesis. Promoters from the three P. infestans genes encoding ribosomal proteins S9, L10, and L23 and their orthologs from P. capsici were tested for their ability to drive transgenes in stable transformants of P. infestans. Five of the six promoters yielded strong expression of a GUS reporter, but the stability of expression was higher using the P. capsici promoters. With the RPS9 and RPL10 promoters of P. infestans, about half of transformants stopped making GUS over two years of culture, while their P. capsici orthologs conferred stable expression. Since cross-talk between native and transgene loci may trigger gene silencing, we encourage the use of heterologous promoters in transformation studies.

  19. Developmental appearance of factors that bind specifically to cis-regulatory sequences of a gene expressed in the sea urchin embryo.

    PubMed

    Calzone, F J; Thézé, N; Thiebaud, P; Hill, R L; Britten, R J; Davidson, E H

    1988-09-01

    Previous gene-transfer experiments have identified a 2500-nucleotide 5' domain of the CyIIIa cytoskeletal actin gene, which contains cis-regulatory sequences that are necessary and sufficient for spatial and temporal control of CyIIIa gene expression during embryogenesis. This gene is activated in late cleavage, exclusively in aboral ectoderm cell lineages. In this study, we focus on interactions demonstrated in vitro between sequences of the regulatory domain and proteins present in crude extracts derived from sea urchin embryo nuclei and from unfertilized eggs. Quantitative gel-shift measurements are utilized to estimate minimum numbers of factor molecules per embryo at 24 hr postfertilization, when the CyIIIa gene is active, at 7 hr, when it is still silent, and in the unfertilized egg. We also estimate the binding affinity preferences (Kr) of the various factors for their respective sites, relative to their affinity for synthetic DNA competitors. At least 14 different specific interactions occur within the regulatory regions, some of which produce multiple DNA-protein complexes. Values of Kr range from approximately 2 x 10^4 to approximately 2 x 10^6 for these factors under the conditions applied. With one exception, the minimum factor prevalences that we measured in the 400-cell 24-hr embryo nuclear extracts fell within the range of 2 x 10^5 to 2 x 10^6 molecules per embryo, i.e., a few hundred to a few thousand molecules per nucleus. Three developmental patterns were observed with respect to factor prevalence: Factors reacting at one site were found in unfertilized egg cytoplasm at about the same level per egg or embryo as in 24-hr embryo nuclei; factors reacting with five other regions of the regulatory domain are not detectable in egg cytoplasm but in 7-hr mid-cleavage-stage embryo nuclei are already at or close to their concentrations in the 24-hr embryo nuclei; and factors reacting with five additional regions are not detectable in egg cytoplasm and
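
    The per-nucleus figures quoted in the abstract above follow from dividing the per-embryo prevalence by the number of cells. The short sketch below reproduces that arithmetic, assuming one nucleus per cell and an even distribution of factor molecules across nuclei (both simplifications made for this example).

```python
CELLS_PER_EMBRYO = 400  # the 24-hr embryo, as stated in the abstract

def per_nucleus(molecules_per_embryo, cells=CELLS_PER_EMBRYO):
    """Average molecules per nucleus, assuming one nucleus per cell
    and an even distribution across nuclei."""
    return molecules_per_embryo / cells

low = per_nucleus(2e5)   # lower end of the reported range
high = per_nucleus(2e6)  # upper end of the reported range
print(low, high)  # 500.0 5000.0
```

    The results, 500 and 5000, match the abstract's "a few hundred to a few thousand molecules per nucleus".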

  20. Analysis of sequences involved in IE2 transactivation of a baculovirus immediate-early gene promoter and identification of a new regulatory motif.

    PubMed

    Shippam-Brett, C E; Willis, L G; Theilmann, D A

    2001-05-01

    Opep-2 is a unique baculovirus early gene that has only been identified in the Orgyia pseudotsugata multiple capsid nucleopolyhedrovirus (OpMNPV). Previous analyses have shown this gene is expressed at very early times post-infection (p.i.) but is shut down by 36-48 h p.i. The promoter of opep-2 therefore represents a class of early genes that is temporally regulated. In this study, a detailed analysis of the opep-2 promoter is performed to analyze the role individual motifs play in early gene expression. A new 13 base pair regulatory element was identified and shown to be essential in controlling high-level expression of this gene. In addition, mutational analysis revealed that GATA and CACGTG motifs, which have been shown to bind cellular factors in Sf9 and Ld652Y cells, played minor roles in influencing opep-2 expression in the absence of other viral factors. The OpMNPV transactivator IE2 causes a significant activation of the opep-2 promoter. Cotransfection of an extensive number of promoter deletions and mutations did not show any sequence specificity for IE2 transactivation. This is the first detailed analysis of the sequence requirements for IE2 transactivation, and these results suggest that IE2 does not bind directly to specific elements in the opep-2 promoter.

  1. A distinct regulatory sequence is essential for the expression of a subset of nle genes in attaching and effacing Escherichia coli.

    PubMed

    García-Angulo, Víctor A; Martínez-Santos, Verónica I; Villaseñor, Tomás; Santana, Francisco J; Huerta-Saquero, Alejandro; Martínez, Luary C; Jiménez, Rafael; Lara-Ochoa, Cristina; Téllez-Sosa, Juan; Bustamante, Víctor H; Puente, José L

    2012-10-01

    Enteropathogenic Escherichia coli uses a type III secretion system (T3SS), encoded in the locus of enterocyte effacement (LEE) pathogenicity island, to translocate a wide repertoire of effector proteins into the host cell in order to subvert cell signaling cascades and promote bacterial colonization and survival. Genes encoding type III-secreted effectors are located in the LEE and scattered throughout the chromosome. While LEE gene regulation is better understood, the conditions and factors involved in the expression of effectors encoded outside the LEE are just starting to be elucidated. Here, we identified a highly conserved sequence containing a 13-bp inverted repeat (IR), located upstream of a subset of genes coding for different non-LEE-encoded effectors in A/E pathogens. Site-directed mutagenesis and deletion analysis of the nleH1 and nleB2 regulatory regions revealed that this IR is essential for the transcriptional activation of both genes. Growth conditions that favor the expression of LEE genes also facilitate the activation of nleH1 and nleB2; however, their expression is independent of the LEE-encoded positive regulators Ler and GrlA but is repressed by GrlR and the global regulator H-NS. In contrast, GrlA and Ler are required for nleA expression, while H-NS silences it. Consistent with their role in the regulation of nleA, purified Ler and H-NS bound to the regulatory region of nleA upstream of its promoter. This work shows that at least two modes of regulation control the expression of effector genes in attaching and effacing (A/E) pathogens, suggesting that a subset of effector functions may be coordinately expressed in a particular niche or time during infection.
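
    To make the notion of an inverted repeat concrete, the sketch below scans a DNA string for a stretch that is followed, after a short spacer, by its own reverse complement. The abstract does not give the IR sequence or say whether 13 bp refers to each arm or to the whole element, so the arm length, the spacer limit, the function names and the toy sequence are all assumptions made for illustration.

```python
# Hypothetical sketch: locate inverted repeats (IRs) in a DNA string.
# Arm length, spacer limit and the toy sequence are illustrative assumptions,
# not values taken from the abstract.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm: int = 13, max_spacer: int = 10):
    """Return (left_start, right_start, left_arm) for every IR found.

    An IR is taken to be a left arm of length `arm` followed, after
    0..max_spacer bases, by the reverse complement of that arm.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer
            if j + arm > len(seq):
                break
            if seq[j:j + arm] == target:
                hits.append((i, j, left))
    return hits


# Made-up test sequence: a 13-bp arm, a 4-bp spacer, then its reverse complement.
arm_seq = "ATGCCGTTAGCAT"
toy = "GGGG" + arm_seq + "TTTT" + reverse_complement(arm_seq) + "GGGG"
print(find_inverted_repeats(toy))
```

    Exact matching is used here for simplicity; a scan of real upstream regions would probably need to tolerate mismatches between the two arms.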

  2. A Distinct Regulatory Sequence Is Essential for the Expression of a Subset of nle Genes in Attaching and Effacing Escherichia coli

    PubMed Central

    García-Angulo, Víctor A.; Martínez-Santos, Verónica I.; Villaseñor, Tomás; Santana, Francisco J.; Huerta-Saquero, Alejandro; Martínez, Luary C.; Jiménez, Rafael; Lara-Ochoa, Cristina; Téllez-Sosa, Juan; Bustamante, Víctor H.

    2012-01-01

    Enteropathogenic Escherichia coli uses a type III secretion system (T3SS), encoded in the locus of enterocyte effacement (LEE) pathogenicity island, to translocate a wide repertoire of effector proteins into the host cell in order to subvert cell signaling cascades and promote bacterial colonization and survival. Genes encoding type III-secreted effectors are located in the LEE and scattered throughout the chromosome. While LEE gene regulation is better understood, the conditions and factors involved in the expression of effectors encoded outside the LEE are just starting to be elucidated. Here, we identified a highly conserved sequence containing a 13-bp inverted repeat (IR), located upstream of a subset of genes coding for different non-LEE-encoded effectors in A/E pathogens. Site-directed mutagenesis and deletion analysis of the nleH1 and nleB2 regulatory regions revealed that this IR is essential for the transcriptional activation of both genes. Growth conditions that favor the expression of LEE genes also facilitate the activation of nleH1 and nleB2; however, their expression is independent of the LEE-encoded positive regulators Ler and GrlA but is repressed by GrlR and the global regulator H-NS. In contrast, GrlA and Ler are required for nleA expression, while H-NS silences it. Consistent with their role in the regulation of nleA, purified Ler and H-NS bound to the regulatory region of nleA upstream of its promoter. This work shows that at least two modes of regulation control the expression of effector genes in attaching and effacing (A/E) pathogens, suggesting that a subset of effector functions may be coordinately expressed in a particular niche or time during infection. PMID:22904277

  3. [Gene and gene sequence patenting].

    PubMed

    Bergel, S D

    1998-01-01

    According to the author, the patenting of elements isolated or copied from the human body boils down to the issue of genes and gene sequences. He describes the current situation from the comparative law standpoint (U.S. and Spanish law mainly) and then examines the biotechnology industry's position.

  4. A site-specific, single-copy transgenesis strategy to identify 5' regulatory sequences of the mouse testis-determining gene Sry.

    PubMed

    Quinn, Alexander; Kashimada, Kenichi; Davidson, Tara-Lynne; Ng, Ee Ting; Chawengsaksophak, Kallayanee; Bowles, Josephine; Koopman, Peter

    2014-01-01

    The Y-chromosomal gene SRY acts as the primary trigger for male sex determination in mammalian embryos. Correct regulation of SRY is critical: aberrant timing or level of Sry expression is known to disrupt testis development in mice and we hypothesize that mutations that affect regulation of human SRY may account for some of the many cases of XY gonadal dysgenesis that currently remain unexplained. However, the cis-sequences involved in regulation of Sry have not been identified, precluding a test of this hypothesis. Here, we used a transgenic mouse approach aimed at identifying mouse Sry 5' flanking regulatory sequences within 8 kb of the Sry transcription start site (TSS). To avoid problems associated with conventional pronuclear injection of transgenes, we used a published strategy designed to yield single-copy transgene integration at a defined, transcriptionally open, autosomal locus, Col1a1. None of the Sry transgenes tested was expressed at levels compatible with activation of Sox9 or XX sex reversal. Our findings indicate either that the Col1a1 locus does not provide an appropriate context for the correct expression of Sry transgenes, or that the cis-sequences required for Sry expression in the developing gonads lie beyond 8 kb 5' of the TSS.

  5. Activation of the major immediate early gene of human cytomegalovirus by cis-acting elements in the promoter-regulatory sequence and by virus-specific trans-acting components.

    PubMed Central

    Stinski, M F; Roehr, T J

    1985-01-01

    Upstream of the major immediate early gene of human cytomegalovirus (Towne) is a strong promoter-regulatory region that promotes the synthesis of 1.95-kilobase mRNA (D. R. Thomsen, R. M. Stenberg, W. F. Goins, and M. F. Stinski, Proc. Natl. Acad. Sci. U.S.A. 81:659-663, 1984; M. F. Stinski, D. R. Thomsen, R. M. Stenberg, and L. C. Goldstein, J. Virol. 46:1-14, 1983). The wild-type promoter-regulatory region as well as deletions within this region were ligated upstream of the thymidine kinase, chloramphenicol acetyltransferase, or ovalbumin genes. These gene chimeras were constructed to investigate the role of the regulatory sequences in enhancing downstream expression. The regulatory region extends to approximately 465 nucleotides upstream of the cap site for the initiation of transcription. The extent and type of regulatory sequences upstream of the promoter influence the level of in vitro transcription as well as the amount of in vivo expression of the downstream gene. The regulatory elements for cis-activation appear to be repeated several times within the regulatory region. A direct correlation was established between the distribution of the 19 (5' CCCCAGTTGACGTCAATGGG 3')- and 18 (5' CACTAACGGGACTTTCCAA 3')-nucleotide repeats and the level of downstream expression. In contrast, the 16 (5' CTTGGCAGTACATCAA 3')-nucleotide repeat is not necessary for the enhancement of downstream expression. In a domain associated with the 19- or 18-nucleotide repeats are elements that can be activated in trans by a human cytomegalovirus-specified component but not a herpes simplex virus-specified component. Therefore, the regulatory sequences of the major immediate early gene of human cytomegalovirus have an important role in interacting with cellular and virus-specific factors of the transcription complex to enhance downstream expression of this critical viral gene. PMID:2991567
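
    The relationship between repeat distribution and deletion constructs can be illustrated with a small counting sketch. The three repeat sequences are copied as printed in the abstract; the two constructs, their spacer bases and the function names are hypothetical, since the abstract does not give the promoter sequence or the deletion endpoints.

```python
# Repeat sequences as printed in the abstract, keyed by the abstract's labels.
REPEATS = {
    "19-nt": "CCCCAGTTGACGTCAATGGG",
    "18-nt": "CACTAACGGGACTTTCCAA",
    "16-nt": "CTTGGCAGTACATCAA",
}


def count_overlapping(seq: str, motif: str) -> int:
    """Count exact occurrences of `motif` in `seq`, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def repeat_profile(seq: str) -> dict:
    """Return the number of exact copies of each repeat in `seq`."""
    seq = seq.upper()
    return {name: count_overlapping(seq, motif) for name, motif in REPEATS.items()}


# Hypothetical constructs: spacer bases and repeat order are invented.
full_length = (
    "ACGT" + REPEATS["19-nt"] + "TTAA" + REPEATS["18-nt"]
    + "GGCC" + REPEATS["16-nt"] + "ATAT" + REPEATS["19-nt"]
)
# A made-up deletion that keeps everything before the 16-nt repeat.
deletion = full_length[: full_length.index(REPEATS["16-nt"])]

print("full length:", repeat_profile(full_length))
print("deletion:   ", repeat_profile(deletion))
```

    Only exact copies are counted, which is a simplification: repeated regulatory elements in a real promoter are usually imperfect, so a practical scan would allow mismatches.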

  6. Comparison of loline alkaloid gene clusters across fungal endophytes: predicting the co-regulatory sequence motifs and the evolutionary history.

    PubMed

    Kutil, Brandi L; Greenwald, Charles; Liu, Gang; Spiering, Martin J; Schardl, Christopher L; Wilkinson, Heather H

    2007-10-01

    LOL, a fungal secondary metabolite gene cluster found in Epichloë and Neotyphodium species, is responsible for production of insecticidal loline alkaloids. To analyze the genetic architecture and to predict the evolutionary history of LOL, we compared five clusters from four fungal species (single clusters from Epichloë festucae, Neotyphodium sp. PauTG-1, Neotyphodium coenophialum, and two clusters we previously characterized in Neotyphodium uncinatum). Using PhyloCon to compare putative lol gene promoter regions, we have identified four motifs conserved across the lol genes in all five clusters. Each motif has significant similarity to known fungal transcription factor binding sites in the TRANSFAC database. Conservation of these motifs is further support for the hypothesis that the lol genes are co-regulated. Interestingly, the history of asexual Neotyphodium spp. includes multiple interspecific hybridization events. Comparing clusters from three Neotyphodium species and E. festucae allowed us to determine which Epichloë ancestors are the most likely contributors of LOL in these asexual species. For example, while no present day Epichloë typhina isolates are known to produce lolines, our data support the hypothesis that the E. typhina ancestor(s) of three asexual endophyte species contained a LOL gene cluster. Thus, these data support a model of evolution in which the polymorphism in loline alkaloid production phenotypes among endophyte species is likely due to the loss of the trait over time.

  7. Computational identification of transcriptional regulatory elements in DNA sequence

    PubMed Central

    GuhaThakurta, Debraj

    2006-01-01

    Identification and annotation of all the functional elements in the genome, including genes and the regulatory sequences, is a fundamental challenge in genomics and computational biology. Since regulatory elements are frequently short and variable, their identification and discovery using computational algorithms is difficult. However, significant advances have been made in the computational methods for modeling and detection of DNA regulatory elements. The availability of complete genome sequence from multiple organisms, as well as mRNA profiling and high-throughput experimental methods for mapping protein-binding sites in DNA, have contributed to the development of methods that utilize these auxiliary data to inform the detection of transcriptional regulatory elements. Progress is also being made in the identification of cis-regulatory modules and higher order structures of the regulatory sequences, which is essential to the understanding of transcription regulation in the metazoan genomes. This article reviews the computational approaches for modeling and identification of genomic regulatory elements, with an emphasis on the recent developments, and current challenges. PMID:16855295

  8. RSAT 2015: Regulatory Sequence Analysis Tools

    PubMed Central

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A.; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M.; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-01-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), and a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632

  9. RSAT 2015: Regulatory Sequence Analysis Tools.

    PubMed

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-07-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), and a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/.

  10. Plant Evolution: Evolving Antagonistic Gene Regulatory Networks.

    PubMed

    Cooper, Endymion D

    2016-06-20

    Developing a structurally complex phenotype requires a complex regulatory network. A new study shows how gene duplication provides a potential source of antagonistic interactions, an important component of gene regulatory networks.

  11. Analysis of transcriptional and upstream regulatory sequence activity of two environmental stress-inducible genes, NBS-Str1 and BLEC-Str8, of rice.

    PubMed

    Ray, Swatismita; Kapoor, Sanjay; Tyagi, Akhilesh K

    2012-04-01

    Two abiotic stress-inducible upstream regulatory sequences (URSs) from rice have been identified and functionally characterized in rice. NBS-Str1 and BLEC-Str8 genes have been identified, by analysing the transcriptome data of cold, salt and desiccation stress-treated 7-day-old rice (Oryza sativa L. var. IR64) seedlings, to be preferentially responsive to desiccation and salt stress, respectively. NBS-Str1 and BLEC-Str8 genes code for putative NBS (nucleotide binding site)-LRR (leucine rich repeat) and β-lectin domain proteins, respectively. NBS-Str1 URS is induced in root tissue, preferentially in the vascular bundle, during 3 and 24 h of desiccation stress in transgenic 7-day-old rice seedlings. In mature transgenic plants, this URS shows induction in root and shoot tissue under desiccation stress as well as under prolonged (1 and 2 day) salt stress. BLEC-Str8 URS shows basal activity under unstressed conditions; however, it is inducible under salt stress in both root and leaf tissues in young seedlings and mature plants. Activity of BLEC-Str8 URS has been found to be vascular tissue preferential; however, under salt stress its activity is also found in the mesophyll tissue. NBS-Str1 and BLEC-Str8 URSs are inducible by the heavy metals copper and manganese. Interestingly, both URSs have been found to be non-responsive to ABA treatment, implying that they are part of an ABA-independent abiotic stress response pathway. These URSs could prove useful for expressing a transgene in a stress-responsive manner for development of stress-tolerant transgenic systems.

  12. Pleiotropy constrains the evolution of protein but not regulatory sequences in a transcription regulatory network influencing complex social behaviors

    PubMed Central

    Molodtsova, Daria; Harpur, Brock A.; Kent, Clement F.; Seevananthan, Kajendra; Zayed, Amro

    2014-01-01

    It is increasingly apparent that genes and networks that influence complex behavior are evolutionarily conserved, which is paradoxical considering that behavior is labile over evolutionary timescales. How does adaptive change in behavior arise if behavior is controlled by conserved, pleiotropic, and likely evolutionarily constrained genes? Pleiotropy and connectedness are known to constrain the general rate of protein evolution, prompting some to suggest that the evolution of complex traits, including behavior, is fuelled by regulatory sequence evolution. However, we seldom have data on the strength of selection on mutations in coding and regulatory sequences, and this hinders our ability to study how pleiotropy influences coding and regulatory sequence evolution. Here we use population genomics to estimate the strength of selection on coding and regulatory mutations for a transcriptional regulatory network that influences complex behavior of honey bees. We found that replacement mutations in highly connected transcription factors and target genes experience significantly stronger negative selection relative to weakly connected transcription factors and targets. Adaptively evolving proteins were significantly more likely to reside at the periphery of the regulatory network, while proteins with signs of negative selection were near the core of the network. Interestingly, connectedness and network structure had minimal influence on the strength of selection on putative regulatory sequences for both transcription factors and their targets. Our study indicates that adaptive evolution of complex behavior can arise because of positive selection on protein-coding mutations in peripheral genes, and on regulatory sequence mutations in both transcription factors and their targets throughout the network. PMID:25566318

  13. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    NASA Astrophysics Data System (ADS)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  14. Evolutionary conservation of regulatory elements in vertebrate Hox gene clusters.

    PubMed

    Santini, Simona; Boore, Jeffrey L; Meyer, Axel

    2003-06-01

    Comparisons of DNA sequences among evolutionarily distantly related genomes permit identification of conserved functional regions in noncoding DNA. Hox genes are highly conserved in vertebrates, occur in clusters, and are uninterrupted by other genes. We aligned (PipMaker) the nucleotide sequences of the HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human, and mouse, which are separated by approximately 500 million years of evolution. In support of our approach, several identified putative regulatory elements known to regulate the expression of Hox genes were recovered. The majority of the newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac database). The regulatory intergenic regions located between the genes that are expressed most anteriorly in the embryo are longer and apparently more evolutionarily conserved than those at the other end of Hox clusters. Different presumed regulatory sequences are retained in either the Aalpha or Abeta duplicated Hox clusters in the fish lineages. This suggests that the conserved elements are involved in different gene regulatory networks and supports the duplication-deletion-complementation model of functional divergence of duplicated genes.

  15. Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

    SciTech Connect

    Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

    2003-12-31

    Due to their high degree of conservation, comparisons of DNA sequences among evolutionarily distantly related genomes permit the identification of functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (PipMaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either the Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.

  16. Expression of the human granulocyte-macrophage colony stimulating factor (hGM-CSF) gene under control of the 5'-regulatory sequence of the goat alpha-S1-casein gene with and without a MAR element in transgenic mice.

    PubMed

    Burkov, I A; Serova, I A; Battulin, N R; Smirnov, A V; Babkin, I V; Andreeva, L E; Dvoryanchikov, G A; Serov, O L

    2013-10-01

    Expression of the human granulocyte-macrophage colony-stimulating factor (hGM-CSF) gene under the control of the 5'-regulatory sequence of the goat alpha-S1-casein gene with and without a matrix attachment region (MAR) element from the Drosophila histone 1 gene was studied in four and eight transgenic mouse lines, respectively. Of the four transgenic lines carrying the transgene without MAR, three had correct tissue-specific expression of the hGM-CSF gene in the mammary gland only and no signs of cell mosaicism. The concentration of hGM-CSF in the milk of transgenic females varied from 1.9 to 14 μg/ml. One line presented hGM-CSF in the blood serum, indicating ectopic expression. The values of secretion of hGM-CSF in milk of 6 transgenic lines carrying the transgene with MAR varied from 0.05 to 0.7 μg/ml, and two of these did not express hGM-CSF. Three of the four examined animals from lines of this group showed ectopic expression of the hGM-CSF gene, as determined by RT-PCR and immunofluorescence analyses, as well as the presence of hGM-CSF in the blood serum. Mosaic expression of the hGM-CSF gene in mammary epithelial cells was specific to all examined transgenic mice carrying the transgene with MAR but was never observed in the transgenic mice without MAR. The mosaic expression was not dependent on transgene copy number. Thus, the expected "protective or enhancer effect" from the MAR element on the hGM-CSF gene expression was not observed.

  17. An internal regulatory element controls troponin I gene expression

    SciTech Connect

    Yutzey, K.E.; Kline, R.L.; Konieczny, S.F. (Dept. of Biological Sciences)

    1989-04-01

    During skeletal myogenesis, approximately 20 contractile proteins and related gene products temporally accumulate as the cells fuse to form multinucleated muscle fibers. In most instances, the contractile protein genes are regulated transcriptionally, which suggests that a common molecular mechanism may coordinate the expression of this diverse and evolutionarily unrelated gene set. Recent studies have examined the muscle-specific cis-acting elements associated with numerous contractile protein genes. All of the identified regulatory elements are positioned in the 5'-flanking regions, usually within 1,500 base pairs of the transcription start site. Surprisingly, a DNA consensus sequence that is common to each contractile protein gene has not been identified. In contrast to the results of these earlier studies, the authors have found that the 5'-flanking region of the quail troponin I (TnI) gene is not sufficient to permit the normal myofiber transcriptional activation of the gene. Instead, the TnI gene utilizes a unique internal regulatory element that is responsible for the correct myofiber-specific expression pattern associated with the TnI gene. This is the first example in which a contractile protein gene has been shown to rely primarily on an internal regulatory element to elicit transcriptional activation during myogenesis. The diversity of regulatory elements associated with the contractile protein genes suggests that the temporal expression of the genes may involve individual cis-trans regulatory components specific for each gene.

  18. Sequence and expression of GLN3, a positive nitrogen regulatory gene of Saccharomyces cerevisiae encoding a protein with a putative zinc finger DNA-binding domain.

    PubMed Central

    Minehart, P L; Magasanik, B

    1991-01-01

    The GLN3 gene of Saccharomyces cerevisiae is required for the activation of transcription of a number of genes in response to the replacement of glutamine by glutamate as source of nitrogen. We cloned the GLN3 gene and constructed null alleles by gene disruption. GLN3 is not essential for growth, but increased copies of GLN3 lead to a drastic decrease in growth rate. The complete nucleotide sequence of the GLN3 gene was determined, revealing one open reading frame encoding a polypeptide of 730 amino acids, with a molecular weight of approximately 80,000. The GLN3 protein contains a single putative Cys2/Cys2 zinc finger which has homology to the Neurospora crassa NIT2 protein, the Aspergillus nidulans AREA protein, and the erythroid-specific transcription factor GATA-1. Immunoprecipitation experiments indicated that the GLN3 protein binds the nitrogen upstream activation sequence of GLN1, the gene encoding glutamine synthetase. Neither control of transcription nor control of initiation of translation of GLN3 is important for regulation in response to glutamine availability. PMID:1682800
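
    The reported size and molecular weight are mutually consistent, which a quick back-of-the-envelope check shows. The sketch assumes the common rule of thumb of roughly 110 Da per amino acid residue; that figure and the function names are not taken from the abstract.

```python
# Rule-of-thumb average residue mass; an assumption, not a value from the abstract.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_da(n_residues: int, avg: float = AVG_RESIDUE_MASS_DA) -> float:
    """Rough protein mass in daltons from residue count."""
    return n_residues * avg


def orf_length_nt(n_residues: int) -> int:
    """Length of an open reading frame in nucleotides, including the stop codon."""
    return (n_residues + 1) * 3


print(approx_mass_da(730))  # rough mass of a 730-residue polypeptide
print(orf_length_nt(730))   # coding nucleotides plus one stop codon
```

    With these assumptions a 730-residue polypeptide comes out at about 80,300 Da, in line with the reported value of approximately 80,000.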

  19. Regulatory genes in the ancestral chordate genomes.

    PubMed

    Satou, Yutaka; Wada, Shuichi; Sasakura, Yasunori; Satoh, Nori

    2008-12-01

    Changes or innovations in gene regulatory networks for the developmental program in the ancestral chordate genome appear to be a major component in the evolutionary process in which tadpole-type larvae, a unique characteristic of chordates, arose. These alterations may include new genetic interactions as well as the acquisition of new regulatory genes. Previous analyses of the Ciona genome revealed that many genes may have emerged after the divergence of the tunicate and vertebrate lineages. In this paper, we examined this possibility by examining a second non-vertebrate chordate genome. We conclude from this analysis that the ancient chordate included almost the same repertory of regulatory genes, but less redundancy than extant vertebrates, and that approximately 10% of vertebrate regulatory genes were innovated after the emergence of vertebrates. Thus, refined regulatory networks arose during vertebrate evolution mainly as preexisting regulatory genes multiplied rather than by generating new regulatory genes. The inferred regulatory gene sets of the ancestral chordate would be an important foundation for understanding how tadpole-type larvae, a unique characteristic of chordates, evolved.

  20. A Rhizobium meliloti symbiotic regulatory gene.

    PubMed

    Szeto, W W; Zimmerman, J L; Sundaresan, V; Ausubel, F M

    1984-04-01

    We have characterized a Rhizobium meliloti regulatory gene required for the expression of two closely linked symbiotic operons, the nitrogenase operon (nifHDK genes) and the "P2" operon. This regulatory gene maps to a 1.8 kb region located 5.5 kb upstream of the nifHDK operon. The regulatory gene is required for the accumulation of nifHDK and P2 mRNA and for the derepression of an R. meliloti nifH-lacZ fusion plasmid during symbiotic growth. The nifH and P2 promoters can be activated in free-living cultures of R. meliloti containing plasmids that produce the Escherichia coli ntrC(glnG) or the Klebsiella pneumoniae nifA regulatory gene products constitutively. The R. meliloti regulatory gene hybridizes to E. coli ntrC(glnG) and, to a lesser extent, to K. pneumoniae nifA DNA. Our results suggest that the R. meliloti regulatory gene acts as a positive transcriptional activator and that it is related to the K. pneumoniae nif regulatory genes.

  1. Modeling of hysteresis in gene regulatory networks.

    PubMed

    Hu, J; Qin, K R; Xiang, C; Lee, T H

    2012-08-01

    Hysteresis, observed in many gene regulatory networks, has a pivotal impact on biological systems, enhancing the robustness of cell functions. In this paper, a general model is proposed to describe the hysteretic gene regulatory network by combining the hysteresis component and the transient dynamics. The Bouc-Wen hysteresis model is modified to describe the hysteresis component in the mammalian gene regulatory networks. A rigorous mathematical analysis of the dynamical properties of the model establishes bounded-input-bounded-output (BIBO) stability and demonstrates that the original Bouc-Wen model can only generate a clockwise hysteresis loop, while the modified model can describe both clockwise and counterclockwise hysteresis loops. Simulation studies have shown that the hysteresis loops from our model are consistent with the experimental observations in three mammalian gene regulatory networks and two E. coli gene regulatory networks, which demonstrates the ability and accuracy of the mathematical model to emulate natural gene expression behavior with hysteresis. A comparison study has also been conducted to show that this model fits the experimental data significantly better than previous ones in the literature. The successful modeling of the hysteresis in all five hysteretic gene regulatory networks suggests that the new model has the potential to be a unified framework for modeling hysteresis in gene regulatory networks and to provide a better understanding of the general mechanism that drives the hysteretic function.
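
As a rough illustration of the kind of hysteresis component mentioned above, the sketch below integrates the standard textbook Bouc-Wen equation, dz/dt = A*dx/dt - beta*|dx/dt|*|z|^(n-1)*z - gamma*dx/dt*|z|^n, under a sinusoidal input. It is not the modified model of the paper, whose equations are not given in the abstract. The parameter values, the input x(t) = sin(t), the forward-Euler step, and the helper names `simulate_bouc_wen` and `loop_area` are all assumptions chosen for illustration.

```python
import math


def simulate_bouc_wen(A=1.0, beta=0.5, gamma=0.5, n=1.0,
                      cycles=3, steps_per_cycle=2000):
    """Forward-Euler integration of the standard Bouc-Wen equation.

    Input is x(t) = sin(t); returns the sampled x and z trajectories.
    Assumes n >= 1 so that |z|**(n - 1) is defined at z = 0.
    """
    dt = 2.0 * math.pi / steps_per_cycle
    xs, zs = [], []
    z = 0.0
    for k in range(cycles * steps_per_cycle):
        t = k * dt
        x = math.sin(t)
        dx = math.cos(t)  # time derivative of the input
        xs.append(x)
        zs.append(z)
        dz = (A * dx
              - beta * abs(dx) * abs(z) ** (n - 1) * z
              - gamma * dx * abs(z) ** n)
        z += dt * dz
    return xs, zs


def loop_area(xs, zs):
    """Signed area of the closed (x, z) curve via the shoelace formula.

    A positive value means the loop is traversed counterclockwise in the
    (x, z) plane, a negative value means clockwise.
    """
    m = len(xs)
    total = 0.0
    for i in range(m):
        j = (i + 1) % m
        total += xs[i] * zs[j] - xs[j] * zs[i]
    return 0.5 * total


if __name__ == "__main__":
    xs, zs = simulate_bouc_wen()
    last_x, last_z = xs[-2000:], zs[-2000:]  # final input cycle
    print("max |z| =", max(abs(v) for v in zs))
    print("signed loop area =", loop_area(last_x, last_z))
```

A nonzero loop area over the final input cycle indicates that the output depends on the history of the input and not only on its current value, which is the defining feature of hysteresis.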

  2. Evolving Robust Gene Regulatory Networks

    PubMed Central

    Noman, Nasimul; Monjo, Taku; Moscato, Pablo; Iba, Hitoshi

    2015-01-01

    Design and implementation of robust network modules is essential for construction of complex biological systems through hierarchical assembly of ‘parts’ and ‘devices’. The robustness of gene regulatory networks (GRNs) is ascribed chiefly to the underlying topology. The ability to automatically design GRN topologies that exhibit robust behavior could dramatically change the current practice in synthetic biology. A recent study shows that Darwinian evolution can gradually develop higher topological robustness. Building on this, the present work introduces an evolutionary algorithm that simulates natural evolution in silico to identify network topologies that are robust to perturbations. We present a Monte Carlo based method for quantifying topological robustness and design a fitness approximation approach for efficiently calculating topological robustness, which is computationally very intensive. The proposed framework was verified using two classic GRN behaviors, oscillation and bistability, although the framework generalizes to evolving other types of responses. The algorithm identified robust GRN architectures, which were verified through several analyses and comparisons. Analysis of the results also sheds light on the relationship among robustness, cooperativity and complexity. This study also shows that nature has already evolved very robust architectures for its crucial systems; hence simulation of this natural process can be very valuable for designing robust biological systems. PMID:25616055
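
The idea of a Monte Carlo robustness estimate can be sketched as follows: perturb the parameters of a fixed topology many times and report the fraction of perturbed networks that still show the target behavior. The example is a toy, not the authors' method. The two-gene mutual-repression (toggle switch) model, the log-normal perturbation, the bistability test, and the names `settle`, `is_bistable` and `robustness` are all assumptions made for illustration.

```python
import random


def settle(u, v, a, h, dt=0.05, steps=400):
    """Forward-Euler integration of a toy two-gene mutual-repression model:
    du/dt = a / (1 + v**h) - u,  dv/dt = a / (1 + u**h) - v.
    """
    for _ in range(steps):
        du = a / (1.0 + v ** h) - u
        dv = a / (1.0 + u ** h) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


def is_bistable(a, h, gap=0.5):
    """Heuristic test: two opposite initial states must settle into two
    clearly different end states (u high in one, v high in the other)."""
    u1, v1 = settle(a, 0.0, a, h)
    u2, v2 = settle(0.0, a, a, h)
    return (u1 - v1) > gap and (v2 - u2) > gap


def robustness(a, h, sigma=0.3, samples=200, seed=0):
    """Monte Carlo estimate: fraction of log-normally perturbed parameter
    sets for which the network is still classified as bistable."""
    rng = random.Random(seed)
    kept = 0
    for _ in range(samples):
        a_p = a * rng.lognormvariate(0.0, sigma)
        h_p = h * rng.lognormvariate(0.0, sigma)
        if is_bistable(a_p, h_p):
            kept += 1
    return kept / samples


if __name__ == "__main__":
    print("estimated robustness:", robustness(10.0, 2.0))
```

Each call to `robustness` costs `samples` full simulations, which gives a feel for why a cheaper fitness approximation becomes attractive once such an estimate sits inside an evolutionary loop.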

  3. Definition of a GC-rich motif as regulatory sequence of the human IL-3 gene: coordinate regulation of the IL-3 gene by CLE2/GC box of the GM-CSF gene in T cell activation.

    PubMed

    Nishida, J; Yoshida, M; Arai, K; Yokota, T

    1991-03-01

    The human IL-3 gene, located on chromosome 5, contains several cis-acting DNA sequences, i.e. CLE (conserved lymphokine element) and a GC-rich region, similar to the GM-CSF gene. To investigate the role of these elements, the 5' flanking region of the IL-3 gene was attached to a bacterial chloramphenicol acetyltransferase (CAT) gene. The fusion plasmids were analyzed by an in vitro transcription system using Jurkat cell nuclear extract prepared from cells stimulated with phorbol-12-myristate-13-acetate and calcium ionophore (PMA/A23187), introduced into Jurkat cells, expressed transiently, and stimulated by co-transfection of human T cell leukemia virus type I (HTLV-I) encoded transactivator, p40tax. The GC-rich region enhanced TATA-dependent transcription in the in vitro transcription system and also strongly responded to p40tax stimulation in the in vivo cotransfection assay. Using this GC-rich region as a probe, we identified a constitutive DNA-protein complex, alpha, whose binding specificity correlates with transcription activity. However, this element is not sufficient for the expression of the IL-3 gene in response to T cell activation signals (PMA/A23187) and no sequence was found within the IL-3 gene which mediates the response to PMA/A23187. The enhancer sequence which responds to T cell activation signals may be located outside the IL-3 gene and may be shared by other lymphokines, possibly by GM-CSF. We propose that the GM-CSF enhancer (CLE2/GC box) which mediates the response to T cell activation signals may stimulate the expression of the IL-3 gene.

  4. On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

    NASA Astrophysics Data System (ADS)

    Tarpine, Ryan; Istrail, Sorin

    The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.

  5. A cis-Regulatory Signature for Chordate Anterior Neuroectodermal Genes

    PubMed Central

    Christiaen, Lionel; Joly, Jean-Stéphane

    2010-01-01

    One of the striking findings of comparative developmental genetics was that expression patterns of core transcription factors are extraordinarily conserved in bilaterians. However, it remains unclear whether cis-regulatory elements of their target genes also exhibit common signatures associated with conserved embryonic fields. To address this question, we focused on genes that are active in the anterior neuroectoderm and non-neural ectoderm of the ascidian Ciona intestinalis. Following the dissection of a prototypic anterior placodal enhancer, we searched all genomic conserved non-coding elements for duplicated motifs around genes showing anterior neuroectodermal expression. Strikingly, we identified an over-represented pentamer motif corresponding to the binding site of the homeodomain protein OTX, which plays a pivotal role in the anterior development of all bilaterian species. Using an in vivo reporter gene assay, we observed that 10 of 23 candidate cis-regulatory elements containing duplicated OTX motifs are active in the anterior neuroectoderm, thus showing that this cis-regulatory signature is predictive of neuroectodermal enhancers. These results show that a common cis-regulatory signature corresponding to K50-Paired homeodomain transcription factors is found in non-coding sequences flanking anterior neuroectodermal genes in chordate embryos. Thus, field-specific selector genes impose architectural constraints in the form of combinations of short tags on their target enhancers. This could account for the strong evolutionary conservation of the regulatory elements controlling field-specific selector genes responsible for body plan formation. PMID:20419150

  6. Comparative studies of gene regulatory mechanisms.

    PubMed

    Pai, Athma A; Gilad, Yoav

    2014-12-01

    It has become increasingly clear that changes in gene regulation have played an important role in adaptive evolution both between and within species. Over the past five years, comparative studies have moved beyond simple characterizations of differences in gene expression levels within and between species to studying variation in regulatory mechanisms. We still know relatively little about the precise chain of events that lead to most regulatory adaptations, but we have taken significant steps towards understanding the relative importance of changes in different mechanisms of gene regulatory evolution. In this review, we first discuss insights from comparative studies in model organisms, where the available experimental toolkit is extensive. We then focus on a few recent comparative studies in primates, where the limited feasibility of experimental manipulation dictates the approaches that can be used to study gene regulatory evolution.

  7. Multiple regulatory mechanisms of hepatocyte growth factor expression in malignant cells with a short poly(dA) sequence in the HGF gene promoter.

    PubMed

    Sakai, Kazuko; Takeda, Masayuki; Okamoto, Isamu; Nakagawa, Kazuhiko; Nishio, Kazuto

    2015-01-01

    Hepatocyte growth factor (HGF) expression is a poor prognostic factor in various types of cancer. Expression levels of HGF have been reported to be regulated by shorter poly(dA) sequences in the promoter region. In the present study, the poly(dA) mononucleotide tract in various types of human cancer cell lines was examined and compared with the HGF expression levels in those cells. Short deoxyadenosine repeat sequences were detected in five of the 55 cell lines used in the present study. The H69, IM95, CCK-81, Sui73 and H28 cells exhibited a truncated poly(dA) sequence in which the number of poly(dA) repeats was reduced by ≥5 bp. Two of these cell lines exhibited high HGF expression, as determined by reverse transcription quantitative polymerase chain reaction and enzyme-linked immunosorbent assay. The CCK-81, Sui73 and H28 cells with shorter poly(dA) sequences exhibited low HGF expression. The cause of the suppression of HGF expression in the CCK-81, Sui73 and H28 cells was investigated by two approaches, examining suppression by methylation and single nucleotide polymorphisms in the HGF gene. Exposure to 5-Aza-dC, an inhibitor of DNA methyltransferase 1, induced an increased expression of HGF in the CCK-81 cells, but not in the other cells. Single-nucleotide polymorphism (SNP) rs72525097 in intron 1 was detected in the Sui73 and H28 cells. Taken together, it was found that the defect of poly(dA) in the HGF promoter was present in various types of cancer, including lung, stomach, colorectal and pancreatic cancer and mesothelioma. The present study proposes that methylation and the SNP in intron 1 of HGF negatively regulate HGF expression in cancer cells with short poly(dA).

  8. Formation of Regulatory Modules by Local Sequence Duplication

    PubMed Central

    Nourmohammad, Armita; Lässig, Michael

    2011-01-01

    Turnover of regulatory sequence and function is an important part of molecular evolution. But what are the modes of sequence evolution leading to rapid formation and loss of regulatory sites? Here we show that a large fraction of neighboring transcription factor binding sites in the fly genome have formed from a common sequence origin by local duplications. This mode of evolution is found to produce regulatory information: duplications can seed new sites in the neighborhood of existing sites. Duplicate seeds evolve subsequently by point mutations, often towards binding a different factor than their ancestral neighbor sites. These results are based on a statistical analysis of 346 cis-regulatory modules in the Drosophila melanogaster genome, and a comparison set of intergenic regulatory sequence in Saccharomyces cerevisiae. In fly regulatory modules, pairs of binding sites show significantly enhanced sequence similarity up to distances of about 50 bp. We analyze these data in terms of an evolutionary model with two distinct modes of site formation: (i) evolution from independent sequence origin and (ii) divergent evolution following duplication of a common ancestor sequence. Our results suggest that pervasive formation of binding sites by local sequence duplications distinguishes the complex regulatory architecture of higher eukaryotes from the simpler architecture of unicellular organisms. PMID:21998564

  9. Modeling gene regulatory network motifs using statecharts

    PubMed Central

    2012-01-01

    Background: Gene regulatory networks are widely used by biologists to describe the interactions among genes, proteins and other components at the intra-cellular level. Recently, a great effort has been devoted to giving gene regulatory networks a formal semantics based on existing computational frameworks. For this purpose, we consider Statecharts, which are a modular, hierarchical and executable formal model widely used to represent software systems. We use Statecharts for modeling small and recurring patterns of interactions in gene regulatory networks, called motifs. Results: We present an improved method for modeling gene regulatory network motifs using Statecharts and we describe the successful modeling of several motifs, including those which could not be modeled or whose models could not be distinguished using the method of a previous proposal. We model motifs in an easy and intuitive way by taking advantage of the visual features of Statecharts. Our modeling approach is able to simulate some interesting temporal properties of gene regulatory network motifs: the delay in the activation and the deactivation of the "output" gene in the coherent type-1 feedforward loop, the pulse in the incoherent type-1 feedforward loop, the bistable nature of double positive and double negative feedback loops, the oscillatory behavior of the negative feedback loop, and the "lock-in" effect of positive autoregulation. Conclusions: We present a Statecharts-based approach for the modeling of gene regulatory network motifs in biological systems. The basic motifs used to build more complex networks (that is, simple regulation, reciprocal regulation, feedback loop, feedforward loop, and autoregulation) can be faithfully described and their temporal dynamics can be analyzed. PMID:22536967
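
The activation and deactivation delays of the coherent type-1 feedforward loop mentioned above can be seen even in a very small discrete model. The sketch below is a plain synchronous Boolean simulation, not a Statecharts model and not the authors' implementation. The update rule (Y copies X one step later, and Z is a gate of X and Y from the previous step) and the function name `simulate_ffl` are assumptions made for illustration.

```python
def simulate_ffl(x_input, gate):
    """Synchronous Boolean feedforward loop X -> Y, (X, Y) -> Z.

    At each step Y takes the previous value of X, and Z takes
    gate(previous X, previous Y). Returns the time course of Z.
    """
    y, z = 0, 0
    trace = []
    for x in x_input:
        trace.append(z)
        y, z = x, gate(x, y)  # right-hand side uses the old y
    return trace


# Input pulse: X is switched on at step 1 and off at step 5.
X = [0, 1, 1, 1, 1, 0, 0, 0]

simple = simulate_ffl(X, lambda x, y: x)      # simple regulation X -> Z
and_ffl = simulate_ffl(X, lambda x, y: x & y)  # coherent FFL, AND gate
or_ffl = simulate_ffl(X, lambda x, y: x | y)   # coherent FFL, OR gate

if __name__ == "__main__":
    print("X      ", X)
    print("simple ", simple)
    print("AND FFL", and_ffl)
    print("OR FFL ", or_ffl)
```

In this toy model the AND gate delays the switch-on of Z by one step relative to simple regulation, while the OR gate delays the switch-off by one step.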

  10. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence.

    PubMed

    Gordon, Kacy L; Arthur, Robert K; Ruvinsky, Ilya

    2015-05-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.

  11. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    PubMed Central

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  12. Regulatory myosin light-chain genes of Caenorhabditis elegans.

    PubMed Central

    Cummins, C; Anderson, P

    1988-01-01

    We have cloned and analyzed the Caenorhabditis elegans regulatory myosin light-chain genes. C. elegans contains two such genes, which we have designated mlc-1 and mlc-2. The two genes are separated by 2.6 kilobases and are divergently transcribed. We determined the complete nucleotide sequences of both mlc-1 and mlc-2. A single, conservative amino acid substitution distinguishes the sequences of the two proteins. The C. elegans proteins are strongly homologous to regulatory myosin light chains of Drosophila melanogaster and vertebrates and weakly homologous to a superfamily of eucaryotic calcium-binding proteins. Both mlc-1 and mlc-2 encode abundant mRNAs. We mapped the 5' termini of these transcripts by using primer extension sequencing of mRNA templates. mlc-1 mRNAs initiate within conserved hexanucleotides at two different positions, located at -28 and -38 relative to the start of translation. The 5' terminus of mlc-2 mRNA is not encoded in the 4.8-kilobase genomic region upstream of mlc-2. Rather, mlc-2 mRNA contains at its 5' end a short, untranslated leader sequence that is identical to the trans-spliced leader sequence of three C. elegans actin genes. PMID:3244358

  13. A gene regulatory network controlling the embryonic specification of endoderm.

    PubMed

    Peter, Isabelle S; Davidson, Eric H

    2011-05-29

    Specification of endoderm is the prerequisite for gut formation in the embryogenesis of bilaterian organisms. Modern lineage labelling studies have shown that in the sea urchin embryo model system, descendants of the veg1 and veg2 cell lineages produce the endoderm, and that the veg2 lineage also gives rise to mesodermal cell types. It is known that Wnt/β-catenin signalling is required for endoderm specification and Delta/Notch signalling is required for mesoderm specification. Some direct cis-regulatory targets of these signals have been found and various phenomenological patterns of gene expression have been observed in the pre-gastrular endomesoderm. However, no comprehensive, causal explanation of endoderm specification has been conceived for sea urchins, nor for any other deuterostome. Here we propose a model, on the basis of the underlying genomic control system, that provides such an explanation, built at several levels of biological organization. The hardwired core of the control system consists of the cis-regulatory apparatus of endodermal regulatory genes, which determine the relationship between the inputs to which these genes are exposed and their outputs. The architecture of the network circuitry controlling the dynamic process of endoderm specification then explains, at the system level, a sequence of developmental logic operations, which generate the biological process. The control system initiates non-interacting endodermal and mesodermal gene regulatory networks in veg2-derived cells and extinguishes the endodermal gene regulatory network in mesodermal precursors. It also generates a cross-regulatory network that specifies future anterior endoderm in veg2 descendants and institutes a distinct network specifying posterior endoderm in veg1-derived cells. The network model provides an explanatory framework that relates endoderm specification to the genomic regulatory code.

  14. Evolutionary conservation of the eumetazoan gene regulatory landscape

    PubMed Central

    Schwaiger, Michaela; Schönauer, Anna; Rendeiro, André F.; Pribitzer, Carina; Schauer, Alexandra; Gilles, Anna F.; Schinko, Johannes B.; Renfer, Eduard; Fredman, David; Technau, Ulrich

    2014-01-01

    Despite considerable differences in morphology and complexity of body plans among animals, a great part of the gene set is shared among Bilateria and their basally branching sister group, the Cnidaria. This suggests that the common ancestor of eumetazoans already had a highly complex gene repertoire. At present it is therefore unclear how morphological diversification is encoded in the genome. Here we address the possibility that differences in gene regulation could contribute to the large morphological divergence between cnidarians and bilaterians. To this end, we generated the first genome-wide map of gene regulatory elements in a nonbilaterian animal, the sea anemone Nematostella vectensis. Using chromatin immunoprecipitation followed by deep sequencing of five chromatin modifications and a transcriptional cofactor, we identified over 5000 enhancers in the Nematostella genome and could validate 75% of the tested enhancers in vivo. We found that in Nematostella, but not in yeast, enhancers are characterized by the same combination of histone modifications as in bilaterians, and these enhancers preferentially target developmental regulatory genes. Surprisingly, the distribution and abundance of gene regulatory elements relative to these genes are shared between Nematostella and bilaterian model organisms. Our results suggest that complex gene regulation originated at least 600 million yr ago, predating the common ancestor of eumetazoans. PMID:24642862

  15. Combinatorial Gene Regulatory Functions Underlie Ultraconserved Elements in Drosophila.

    PubMed

    Warnefors, Maria; Hartmann, Britta; Thomsen, Stefan; Alonso, Claudio R

    2016-09-01

    Ultraconserved elements (UCEs) are discrete genomic elements conserved across large evolutionary distances. Although UCEs have been linked to multiple facets of mammalian gene regulation, their extreme evolutionary conservation remains largely unexplained. Here, we apply a computational approach to investigate this question in Drosophila, exploring the molecular functions of more than 1,500 UCEs shared across the genomes of 12 Drosophila species. Our data indicate that Drosophila UCEs are hubs for gene regulatory functions and suggest that UCE sequence invariance originates from their combinatorial roles in gene control. We also note that the gene regulatory roles of intronic and intergenic UCEs (iUCEs) are distinct from those found in exonic UCEs (eUCEs). In iUCEs, transcription factor (TF) and epigenetic factor binding data strongly support iUCE roles in transcriptional and epigenetic regulation. In contrast, analyses of eUCEs indicate that they are two orders of magnitude more likely than expected to simultaneously include protein-coding sequence, TF-binding sites, splice sites, and RNA editing sites but have reduced roles in transcriptional or epigenetic regulation. Furthermore, we use a Drosophila cell culture system and transgenic Drosophila embryos to validate the notion of UCE combinatorial regulatory roles using an eUCE within the Hox gene Ultrabithorax and show that its protein-coding region also contains alternative splicing regulatory information. Taken together, our experiments indicate that UCEs emerge as a result of combinatorial gene regulatory roles and highlight common features in mammalian and insect UCEs, implying that similar processes might underlie ultraconservation in diverse animal taxa.

  16. Combinatorial Gene Regulatory Functions Underlie Ultraconserved Elements in Drosophila

    PubMed Central

    Warnefors, Maria; Hartmann, Britta; Thomsen, Stefan; Alonso, Claudio R.

    2016-01-01

    Ultraconserved elements (UCEs) are discrete genomic elements conserved across large evolutionary distances. Although UCEs have been linked to multiple facets of mammalian gene regulation, their extreme evolutionary conservation remains largely unexplained. Here, we apply a computational approach to investigate this question in Drosophila, exploring the molecular functions of more than 1,500 UCEs shared across the genomes of 12 Drosophila species. Our data indicate that Drosophila UCEs are hubs for gene regulatory functions and suggest that UCE sequence invariance originates from their combinatorial roles in gene control. We also note that the gene regulatory roles of intronic and intergenic UCEs (iUCEs) are distinct from those found in exonic UCEs (eUCEs). In iUCEs, transcription factor (TF) and epigenetic factor binding data strongly support iUCE roles in transcriptional and epigenetic regulation. In contrast, analyses of eUCEs indicate that they are two orders of magnitude more likely than expected to simultaneously include protein-coding sequence, TF-binding sites, splice sites, and RNA editing sites but have reduced roles in transcriptional or epigenetic regulation. Furthermore, we use a Drosophila cell culture system and transgenic Drosophila embryos to validate the notion of UCE combinatorial regulatory roles using an eUCE within the Hox gene Ultrabithorax and show that its protein-coding region also contains alternative splicing regulatory information. Taken together, our experiments indicate that UCEs emerge as a result of combinatorial gene regulatory roles and highlight common features in mammalian and insect UCEs, implying that similar processes might underlie ultraconservation in diverse animal taxa. PMID:27247329

  17. Efficiently finding regulatory elements using correlation with gene expression.

    PubMed

    Bannai, Hideo; Inenaga, Shunsuke; Shinohara, Ayumi; Takeda, Masayuki; Miyano, Satoru

    2004-06-01

    We present an efficient algorithm for detecting putative regulatory elements in the upstream DNA sequences of genes, using gene expression information obtained from microarray experiments. Based on a generalized suffix tree, our algorithm looks for motif patterns whose appearance in the upstream region is most correlated with the expression levels of the genes. We are able to find the optimal pattern, in time linear in the total length of the upstream sequences. We implement and apply our algorithm to publicly available microarray gene expression data, and show that our method is able to discover biologically significant motifs, including various motifs which have been reported previously using the same data set. We further discuss applications for which the efficiency of the method is essential, as well as possible extensions to our algorithm.
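
The scoring idea described above (find the pattern whose presence upstream best tracks expression) can be sketched with a brute-force search. This is not the authors' suffix-tree algorithm and does not run in linear time: it simply enumerates every k-mer that occurs in the input and scores it by the Pearson correlation between its presence (0 or 1) and the expression values. The fixed pattern length `k`, the choice of Pearson correlation as the score, the toy data, and the names `pearson` and `best_kmer` are assumptions made for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length lists, or None if either
    list has zero variance."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


def best_kmer(upstream, expression, k):
    """Return (kmer, r) for the k-mer whose presence in the upstream
    sequences is most positively correlated with expression."""
    kmers = {s[i:i + k] for s in upstream for i in range(len(s) - k + 1)}
    best = (None, -2.0)
    for kmer in sorted(kmers):  # sorted for a deterministic scan order
        presence = [1.0 if kmer in s else 0.0 for s in upstream]
        r = pearson(presence, expression)
        if r is not None and r > best[1]:
            best = (kmer, r)
    return best


# Toy data (made up): two highly expressed genes share the motif ACGT.
upstream_seqs = ["TTACGTTT", "GGACGTGG", "TTTTTTTT", "GGGGGGGG"]
expression_levels = [5.0, 4.0, 1.0, 0.5]

if __name__ == "__main__":
    print(best_kmer(upstream_seqs, expression_levels, 4))
```

The brute-force scan rescans every sequence for every candidate pattern, so its cost grows with the number of distinct k-mers times the total sequence length, which is the overhead an index such as a generalized suffix tree is meant to avoid.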

  18. Phenotypic switching in gene regulatory networks.

    PubMed

    Thomas, Philipp; Popović, Nikola; Grima, Ramon

    2014-05-13

    Noise in gene expression can lead to reversible phenotypic switching. Several experimental studies have shown that the abundance distributions of proteins in a population of isogenic cells may display multiple distinct maxima. Each of these maxima may be associated with a subpopulation of a particular phenotype, the quantification of which is important for understanding cellular decision-making. Here, we devise a methodology which allows us to quantify multimodal gene expression distributions and single-cell power spectra in gene regulatory networks. Extending the commonly used linear noise approximation, we rigorously show that, in the limit of slow promoter dynamics, these distributions can be systematically approximated as a mixture of Gaussian components in a wide class of networks. The resulting closed-form approximation provides a practical tool for studying complex nonlinear gene regulatory networks that have thus far been amenable only to stochastic simulation. We demonstrate the applicability of our approach in a number of genetic networks, uncovering previously unidentified dynamical characteristics associated with phenotypic switching. Specifically, we elucidate how the interplay of transcriptional and translational regulation can be exploited to control the multimodality of gene expression distributions in two-promoter networks. We demonstrate how phenotypic switching leads to birhythmical expression in a genetic oscillator, and to hysteresis in phenotypic induction, thus highlighting the ability of regulatory networks to retain memory.

  19. Binding of tissue-specific forms of alpha A-CRYBP1 to their regulatory sequence in the mouse alpha A-crystallin-encoding gene: double-label immunoblotting of UV-crosslinked complexes.

    PubMed

    Kantorow, M; Becker, K; Sax, C M; Ozato, K; Piatigorsky, J

    1993-09-15

    The alpha A-CRYBP1 regulatory sequence (alpha A-CRYBP1RS), at nucleotides -66 to -57 of the mouse alpha A-crystallin-encoding gene (alpha A-CRY) promoter, is an important control element involved in the regulation of mouse alpha A-CRY expression. The gene encoding a protein (alpha A-CRYBP1) that specifically binds to the alpha A-CRYBP1RS sequence has been cloned from a cultured mouse lens cell line. In the present study, we have used an antibody (specific to the alpha A-CRYBP1 protein and made against a synthetic peptide) to directly identify UV-crosslinked protein-DNA complexes via a double-label immunoblotting technique. Multiple alpha A-CRYBP1 antigenically related proteins interacted with alpha A-CRYBP1RS in nuclear extracts from both a cloned mouse lens cell line (alpha TN4-1) that expresses alpha A-CRY and a mouse fibroblast line (L929) that does not express the gene. Two sizes (50 kDa and 90 kDa) of proteins reacting with the alpha A-CRYBP1-specific Ab were detected in both cell lines and, in addition, a > 200-kDa protein reacting with the Ab was unique to the fibroblast line. Thus, alpha A-CRYBP1 antigenically related proteins interact with alpha A-CRYBP1RS regardless of alpha A-CRY expression. Moreover, differential processing of the alpha A-CRYBP1 protein and/or alternative splicing of the alpha A-CRY transcript may affect expression of alpha A-CRY.

  20. Housekeeping genes tend to show reduced upstream sequence conservation

    PubMed Central

    Farré, Domènec; Bellora, Nicolás; Mularoni, Loris; Messeguer, Xavier; Albà, M Mar

    2007-01-01

    Background Understanding the constraints that operate in mammalian gene promoter sequences is of key importance to understand the evolution of gene regulatory networks. The level of promoter conservation varies greatly across orthologous genes, denoting differences in the strength of the evolutionary constraints. Here we test the hypothesis that the number of tissues in which a gene is expressed is related in a significant manner to the extent of promoter sequence conservation. Results We show that mammalian housekeeping genes, expressed in all or nearly all tissues, show significantly lower promoter sequence conservation, especially upstream of position -500 with respect to the transcription start site, than genes expressed in a subset of tissues. In addition, we evaluate the effect of gene function, CpG island content and protein evolutionary rate on promoter sequence conservation. Finally, we identify a subset of transcription factors that bind to motifs that are specifically over-represented in housekeeping gene promoters. Conclusion This is the first report that shows that the promoters of housekeeping genes show reduced sequence conservation with respect to genes expressed in a more tissue-restricted manner. This is likely to be related to simpler gene expression, requiring a smaller number of functional cis-regulatory motifs. PMID:17626644

  1. Gene regulatory networks and the underlying biology of developmental toxicity

    EPA Science Inventory

    Embryonic cells are specified by large-scale networks of functionally linked regulatory genes. Knowledge of the relevant gene regulatory networks is essential for understanding phenotypic heterogeneity that emerges from disruption of molecular functions, cellular processes or sig...

  2. Multigenome DNA sequence conservation identifies Hox cis-regulatory elements

    PubMed Central

    Kuntz, Steven G.; Schwarz, Erich M.; DeModena, John A.; De Buysscher, Tristan; Trout, Diane; Shizuya, Hiroaki; Sternberg, Paul W.; Wold, Barbara J.

    2008-01-01

    To learn how well ungapped sequence comparisons of multiple species can predict cis-regulatory elements in Caenorhabditis elegans, we made such predictions across the large, complex ceh-13/lin-39 locus and tested them transgenically. We also examined how prediction quality varied with different genomes and parameters in our comparisons. Specifically, we sequenced ∼0.5% of the C. brenneri and C. sp. 3 PS1010 genomes, and compared five Caenorhabditis genomes (C. elegans, C. briggsae, C. brenneri, C. remanei, and C. sp. 3 PS1010) to find regulatory elements in 22.8 kb of noncoding sequence from the ceh-13/lin-39 Hox subcluster. We developed the MUSSA program to find ungapped DNA sequences with N-way transitive conservation, applied it to the ceh-13/lin-39 locus, and transgenically assayed 21 regions with both high and low degrees of conservation. This identified 10 functional regulatory elements whose activities matched known ceh-13/lin-39 expression, with 100% specificity and a 77% recovery rate. One element was so well conserved that a similar mouse Hox cluster sequence recapitulated the native nematode expression pattern when tested in worms. Our findings suggest that ungapped sequence comparisons can predict regulatory elements genome-wide. PMID:18981268
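
    The idea of scanning for ungapped windows conserved across several genomes can be sketched as follows. This is a reference-anchored simplification written for illustration, not the MUSSA program: it only checks that each reference window has an ungapped match in every other sequence and does not attempt the N-way transitive criterion named in the abstract. The sequences, window width, and threshold are invented.

```python
def identity(a, b):
    """Number of positions at which two equal-length windows agree."""
    return sum(1 for x, y in zip(a, b) if x == y)

def has_ungapped_match(window, sequence, min_identity):
    """True if some same-length window of `sequence` matches without gaps."""
    width = len(window)
    return any(
        identity(window, sequence[j:j + width]) >= min_identity
        for j in range(len(sequence) - width + 1)
    )

def conserved_starts(reference, others, width, min_identity):
    """Start positions of reference windows matched in every other sequence."""
    return [
        i
        for i in range(len(reference) - width + 1)
        if all(has_ungapped_match(reference[i:i + width], s, min_identity) for s in others)
    ]

# Toy sequences (hypothetical): all three share the hexamer AATTGC.
reference = "GGCTAATTGCACCT"
others = ["TTTAATTGCAGG", "CAAATTGCTTTT"]

exact_hits = conserved_starts(reference, others, width=6, min_identity=6)
print(exact_hits)
```

    Lowering `min_identity` below `width` admits mismatches, which trades specificity for recovery in the way the abstract's parameter comparisons explore.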

  3. Autonomous Boolean modeling of gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Socolar, Joshua; Sun, Mengyang; Cheng, Xianrui

    2014-03-01

    In cases where the dynamical properties of gene regulatory networks are important, a faithful model must include three key features: a network topology; a functional response of each element to its inputs; and timing information about the transmission of signals across network links. Autonomous Boolean network (ABN) models are efficient representations of these elements and are amenable to analysis. We present an ABN model of the gene regulatory network governing cell fate specification in the early sea urchin embryo, which must generate three bands of distinct tissue types after several cell divisions, beginning from an initial condition with only two distinct cell types. Analysis of the spatial patterning problem and the dynamics of a network constructed from available experimental results reveals that a simple mechanism is at work in this case. Supported by NSF Grant DMS-10-68602

  4. Automated Identification of Core Regulatory Genes in Human Gene Regulatory Networks.

    PubMed

    Narang, Vipin; Ramli, Muhamad Azfar; Singhal, Amit; Kumar, Pavanish; de Libero, Gennaro; Poidinger, Michael; Monterola, Christopher

    2015-01-01

    Human gene regulatory networks (GRN) can be difficult to interpret due to a tangle of edges interconnecting thousands of genes. We constructed a general human GRN from extensive transcription factor and microRNA target data obtained from public databases. In a subnetwork of this GRN that is active during estrogen stimulation of MCF-7 breast cancer cells, we benchmarked automated algorithms for identifying core regulatory genes (transcription factors and microRNAs). Among these algorithms, we identified K-core decomposition, PageRank and betweenness centrality as the most effective for discovering core regulatory genes in the network, evaluated based on previously known roles of these genes in MCF-7 biology as well as on their ability to explain the up or down expression status of up to 70% of the remaining genes. Finally, we validated the use of the K-core algorithm for organizing the GRN in an easier-to-interpret layered hierarchy in which more influential regulatory genes percolate towards the inner layers. The integrated human gene and miRNA network and software used in this study are provided as supplementary materials (S1 Data) accompanying this manuscript.
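
    K-core decomposition, one of the algorithms named above, can be sketched on a toy network. The sketch assumes an undirected graph and uses invented node names (TF1, miR1, geneA, ...); the study's network is far larger and its software is not reproduced here.

```python
def core_numbers(adjacency):
    """Return {node: core number} by repeatedly peeling a minimum-degree node."""
    degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Pick the remaining node of lowest current degree (name breaks ties).
        node = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[node])  # the core level never decreases while peeling
        core[node] = k
        remaining.remove(node)
        for neighbour in adjacency[node]:
            if neighbour in remaining:
                degree[neighbour] -= 1
    return core

# Hypothetical undirected toy network: a four-node regulatory clique plus two targets.
toy_network = {
    "TF1": {"TF2", "TF3", "miR1", "geneA", "geneB"},
    "TF2": {"TF1", "TF3", "miR1", "geneA"},
    "TF3": {"TF1", "TF2", "miR1"},
    "miR1": {"TF1", "TF2", "TF3"},
    "geneA": {"TF1", "TF2"},
    "geneB": {"TF1"},
}

cores = core_numbers(toy_network)
for name in sorted(cores, key=lambda n: (-cores[n], n)):
    print(name, cores[name])
```

    Nodes with the highest core number form the innermost layer, which corresponds to the layered hierarchy the abstract describes.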

  5. Thermodynamics-based models of transcriptional regulation with gene sequence.

    PubMed

    Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing

    2015-12-01

    Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous-time, differential equation description of transcriptional dynamics. The sequence features of the promoter are exploited to derive the binding affinity, which is based on statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model yields more biologically meaningful results.
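
    A generic textbook-style sketch of the two ingredients mentioned above, sequence-derived binding affinity and a differential equation for transcription, is given below. It is not the authors' model: the consensus site, the mismatch energy, and the rate constants are all assumptions chosen for illustration.

```python
import math

CONSENSUS = "TGACTCA"   # hypothetical consensus binding site (assumed)
MISMATCH_ENERGY = 1.5   # energy penalty per mismatch, in units of kT (assumed)

def binding_energy(site):
    """Energy of a site relative to the consensus: one penalty per mismatch."""
    return MISMATCH_ENERGY * sum(1 for a, b in zip(site, CONSENSUS) if a != b)

def occupancy(site, tf_conc):
    """Equilibrium probability that the site is bound (two-state Boltzmann model)."""
    relative_affinity = math.exp(-binding_energy(site))
    return tf_conc * relative_affinity / (1.0 + tf_conc * relative_affinity)

def simulate_mrna(site, tf_conc, beta=10.0, gamma=0.5, dt=0.01, steps=5000):
    """Euler integration of dm/dt = beta * occupancy - gamma * m, starting at m = 0."""
    occ = occupancy(site, tf_conc)
    m = 0.0
    for _ in range(steps):
        m += dt * (beta * occ - gamma * m)
    return m

strong = simulate_mrna("TGACTCA", tf_conc=1.0)  # perfect consensus match
weak = simulate_mrna("TGAGTAA", tf_conc=1.0)    # two mismatches
print(round(strong, 3), round(weak, 3))
```

    At steady state the mRNA level equals beta times occupancy divided by gamma, so weaker sites translate directly into lower predicted expression in this sketch.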

  6. Regulatory Features for Odorant Receptor Genes in the Mouse Genome

    PubMed Central

    Degl’Innocenti, Andrea; D’Errico, Anna

    2017-01-01

    The odorant receptor genes, seven transmembrane receptor genes constituting the vastest mammalian gene multifamily, are expressed monogenically and monoallelically in each sensory neuron in the olfactory epithelium. This characteristic, often referred to as the one neuron–one receptor rule, is driven by mostly uncharacterized molecular dynamics, generally named odorant receptor gene choice. Much attention has been paid by the scientific community to the identification of sequences regulating the expression of odorant receptor genes within their loci, where related genes are usually arranged in genomic clusters. A number of studies identified transcription factor binding sites on odorant receptor promoter sequences. Similar binding sites were also found on a number of enhancers that regulate in cis their transcription, but have been proposed to form interchromosomal networks. Odorant receptor gene choice seems to occur via the local removal of strongly repressive epigenetic markings, put in place during the maturation of the sensory neuron on each odorant receptor locus. Here we review the fast-changing state of the art for the study of regulatory features for odorant receptor genes. PMID:28270833

  7. Regulatory Features for Odorant Receptor Genes in the Mouse Genome.

    PubMed

    Degl'Innocenti, Andrea; D'Errico, Anna

    2017-01-01

    The odorant receptor genes, seven transmembrane receptor genes constituting the vastest mammalian gene multifamily, are expressed monogenically and monoallelically in each sensory neuron in the olfactory epithelium. This characteristic, often referred to as the one neuron-one receptor rule, is driven by mostly uncharacterized molecular dynamics, generally named odorant receptor gene choice. Much attention has been paid by the scientific community to the identification of sequences regulating the expression of odorant receptor genes within their loci, where related genes are usually arranged in genomic clusters. A number of studies identified transcription factor binding sites on odorant receptor promoter sequences. Similar binding sites were also found on a number of enhancers that regulate in cis their transcription, but have been proposed to form interchromosomal networks. Odorant receptor gene choice seems to occur via the local removal of strongly repressive epigenetic markings, put in place during the maturation of the sensory neuron on each odorant receptor locus. Here we review the fast-changing state of the art for the study of regulatory features for odorant receptor genes.

  8. Dynamic Gene Regulatory Networks of Human Myeloid Differentiation.

    PubMed

    Ramirez, Ricardo N; El-Ali, Nicole C; Mager, Mikayla Anne; Wyman, Dana; Conesa, Ana; Mortazavi, Ali

    2017-03-27

    The reconstruction of gene regulatory networks underlying cell differentiation from high-throughput gene expression and chromatin data remains a challenge. Here, we derive dynamic gene regulatory networks for human myeloid differentiation using a 5-day time series of RNA-seq and ATAC-seq data. We profile HL-60 promyelocytes differentiating into macrophages, neutrophils, monocytes, and monocyte-derived macrophages. We find a rapid response in the expression of key transcription factors and lineage markers that only regulate a subset of their targets at a given time, which is followed by chromatin accessibility changes that occur later along with further gene expression changes. We observe differences between promyelocyte- and monocyte-derived macrophages at both the transcriptional and chromatin landscape level, despite using the same differentiation stimulus, which suggest that the path taken by cells in the differentiation landscape defines their end cell state. More generally, our approach of combining neighboring time points and replicates to achieve greater sequencing depth can efficiently infer footprint-based regulatory networks from long series data.

  9. Gene regulatory logic of dopaminergic neuron differentiation

    PubMed Central

    Flames, Nuria; Hobert, Oliver

    2009-01-01

    Dopamine signaling regulates a variety of complex behaviors and defects in dopaminergic neuron function or survival result in severe human pathologies, such as Parkinson's disease 1. The common denominator of all dopaminergic neurons is the expression of dopamine pathway genes, which code for a set of phylogenetically conserved proteins involved in dopamine synthesis and transport. Gene regulatory mechanisms that result in the activation of dopamine pathway genes and thereby ultimately determine the identity of dopaminergic neurons are poorly understood in any system studied to date 2. We show here that a simple cis-regulatory element, the DA motif, controls the expression of all dopamine pathway genes in all dopaminergic cell types in C. elegans. The DA motif is activated by the ETS transcription factor, AST-1. Loss of ast-1 results in the failure of all distinct dopaminergic neuronal subtypes to terminally differentiate. Ectopic expression of ast-1 is sufficient to activate the dopamine production pathway in some cellular contexts. Vertebrate dopaminergic pathway genes also contain phylogenetically conserved DA motifs that can be activated by the mouse ETS transcription factor Etv1/ER81 and a specific class of dopaminergic neurons fails to differentiate in mice lacking Etv1/ER81. Moreover, ectopic Etv1/ER81 expression induces dopaminergic fate marker expression in neuronal primary cultures. Mouse Etv1/ER81 can also functionally substitute for ast-1 in C. elegans. Our studies reveal an astoundingly simple and apparently conserved regulatory logic of dopaminergic neuron terminal differentiation and may provide new entry points into the diagnosis or therapy of conditions in which dopamine neurons are defective. PMID:19287374

  10. Genome-wide network of regulatory genes for construction of a chordate embryo.

    PubMed

    Shoguchi, Eiichi; Hamaguchi, Makoto; Satoh, Nori

    2008-04-15

    Animal development is controlled by gene regulation networks that are composed of sequence-specific transcription factors (TF) and cell signaling molecules (ST). Although housekeeping genes have been reported to show clustering in the animal genomes, whether the genes comprising a given regulatory network are physically clustered on a chromosome is uncertain. We examined this question in the present study. Ascidians are the closest living relatives of vertebrates, and their tadpole-type larva represents the basic body plan of chordates. The Ciona intestinalis genome contains 390 core TF genes and 119 major ST genes. Previous gene disruption assays led to the formulation of a basic chordate embryonic blueprint, based on over 3000 genetic interactions among 79 zygotic regulatory genes. Here, we mapped the regulatory genes, including all 79 regulatory genes, on the 14 pairs of Ciona chromosomes by fluorescent in situ hybridization (FISH). Chromosomal localization of upstream and downstream regulatory genes demonstrates that the components of coherent developmental gene networks are evenly distributed over the 14 chromosomes. Thus, this study provides the first comprehensive evidence that the physical clustering of regulatory genes, or their target genes, is not relevant for the genome-wide control of gene expression during development.

  11. WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences

    PubMed Central

    Pavesi, Giulio; Zambelli, Federico; Pesole, Graziano

    2007-01-01

    Background This work addresses the problem of detecting conserved transcription factor binding sites and in general regulatory regions through the analysis of sequences from homologous genes, an approach that is becoming more and more widely used given the ever increasing amount of genomic data available. Results We present an algorithm that identifies conserved transcription factor binding sites in a given sequence by comparing it to one or more homologs, adapting a framework we previously introduced for the discovery of sites in sequences from co-regulated genes. Differently from the most commonly used methods, the approach we present does not need or compute an alignment of the sequences investigated, nor resorts to descriptors of the binding specificity of known transcription factors. The main novel idea we introduce is a relative measure of conservation, assuming that true functional elements should present a higher level of conservation with respect to the rest of the sequence surrounding them. We present tests where we applied the algorithm to the identification of conserved annotated sites in homologous promoters, as well as in distal regions like enhancers. Conclusion Results of the tests show how the algorithm can provide fast and reliable predictions of conserved transcription factor binding sites regulating the transcription of a gene, with better performances than other available methods for the same task. We also show examples on how the algorithm can be successfully employed when promoter annotations of the genes investigated are missing, or when regulatory sites and regions are located far away from the genes. PMID:17286865

  12. Beyond antioxidant genes in the ancient NRF2 regulatory network

    PubMed Central

    Lacher, Sarah E.; Lee, Joslynn S.; Wang, Xuting; Campbell, Michelle R.; Bell, Douglas A.; Slattery, Matthew

    2016-01-01

    NRF2, a basic leucine zipper transcription factor encoded by the gene NFE2L2, is a master regulator of the transcriptional response to oxidative stress. NRF2 is structurally and functionally conserved from insects to humans, and it heterodimerizes with the small MAF transcription factors to bind a consensus DNA sequence (the antioxidant response element, or ARE) and regulate gene expression. We have used genome-wide chromatin immunoprecipitation (ChIP-seq) and gene expression data to identify direct NRF2 target genes in Drosophila and humans. These data have allowed us to construct the deeply conserved ancient NRF2 regulatory network – target genes that are conserved from Drosophila to human. The ancient network consists of canonical antioxidant genes, as well as genes related to proteasomal pathways, metabolism, and a number of less expected genes. We have also used enhancer reporter assays and electrophoretic mobility shift assays to confirm NRF2-mediated regulation of ARE activity at a number of these novel target genes. Interestingly, the ancient network also highlights a prominent negative feedback loop; this, combined with the finding that NRF2-mediated regulatory output is tightly linked to the quality of the ARE it is targeting, suggests that precise regulation of nuclear NRF2 concentration is necessary to achieve proper quantitative regulation of distinct gene sets. Together, these findings highlight the importance of balance in the NRF2-ARE pathway, and indicate that NRF2-mediated regulation of xenobiotic metabolism, glucose metabolism, and proteostasis has been central to this pathway since its inception. PMID:26163000

  13. A Provisional Gene Regulatory Atlas for Mouse Heart Development

    PubMed Central

    Chen, Hailin; VanBuren, Vincent

    2014-01-01

    Congenital Heart Disease (CHD) is one of the most common birth defects. Elucidating the molecular mechanisms underlying normal cardiac development is an important step towards early identification of abnormalities during the developmental program and towards the creation of early intervention strategies. We developed a novel computational strategy for leveraging high-content data sets, including a large selection of microarray data associated with mouse cardiac development, mouse genome sequence, ChIP-seq data of selected mouse transcription factors and Y2H data of mouse protein-protein interactions, to infer the active transcriptional regulatory network of mouse cardiac development. We identified phase-specific expression activity for 765 overlapping gene co-expression modules that were defined for obtained cardiac lineage microarray data. For each co-expression module, we identified the phase of cardiac development where gene expression for that module was higher than in other phases. Co-expression modules were found to be consistent with biological pathway knowledge in WikiPathways, and met expectations for enrichment of pathways involved in heart lineage development. Over 359,000 transcription factor-target relationships were inferred by analyzing the promoter sequences within each gene module for overrepresentation against the JASPAR database of Transcription Factor Binding Site (TFBS) motifs. The provisional regulatory network will provide a framework for studying the genetic basis of CHD. PMID:24421884

  14. Generation of oscillating gene regulatory network motifs

    NASA Astrophysics Data System (ADS)

    van Dorp, M.; Lannoo, B.; Carlon, E.

    2013-07-01

    Using an improved version of an evolutionary algorithm originally proposed by François and Hakim [Proc. Natl. Acad. Sci. USA 101, 580 (2004)], we generated small gene regulatory networks in which the concentration of a target protein oscillates in time. These networks may serve as candidates for oscillatory modules to be found in larger regulatory networks and protein interaction networks. The algorithm was run 10^5 times to produce a large set of oscillating modules, which were systematically classified and analyzed. The robustness of the oscillations against variations of the kinetic rates was also determined, to filter out the least robust cases. Furthermore, we show that the set of evolved networks can serve as a database of models whose behavior can be compared to experimentally observed oscillations. The algorithm found three smallest (core) oscillators in which nonlinearities and number of components are minimal. Two of those are two-gene modules: the mixed feedback loop, already discussed in the literature, and an autorepressed gene coupled with a heterodimer. The third one is a single gene module which is competitively regulated by a monomer and a dimer. The evolutionary algorithm also generated larger oscillating networks, which are in part extensions of the three core modules and in part genuinely new modules. The latter includes oscillators which do not rely on feedback induced by transcription factors, but are purely of post-transcriptional type. Analysis of post-transcriptional mechanisms of oscillation may provide useful information for circadian clock research, as recent experiments showed that circadian rhythms are maintained even in the absence of transcription.

  15. Synthetic muscle promoters: activities exceeding naturally occurring regulatory sequences

    NASA Technical Reports Server (NTRS)

    Li, X.; Eastman, E. M.; Schwartz, R. J.; Draghia-Akli, R.

    1999-01-01

    Relatively low levels of expression from naturally occurring promoters have limited the use of muscle as a gene therapy target. Myogenic restricted gene promoters display complex organization usually involving combinations of several myogenic regulatory elements. By random assembly of E-box, MEF-2, TEF-1, and SRE sites into synthetic promoter recombinant libraries, and screening of hundreds of individual clones for transcriptional activity in vitro and in vivo, several artificial promoters were isolated whose transcriptional potencies greatly exceed those of natural myogenic and viral gene promoters.

  16. Identifying genes of gene regulatory networks using formal concept analysis.

    PubMed

    Gebert, Jutta; Motameny, Susanne; Faigle, Ulrich; Forst, Christian V; Schrader, Rainer

    2008-03-01

    In order to understand the behavior of a gene regulatory network, it is essential to know the genes that belong to it. Identifying the correct members (e.g., in order to build a model) is a difficult task even for small subnetworks. Usually only few members of a network are known and one needs to guess the missing members based on experience or informed speculation. It is beneficial if one can additionally rely on experimental data to support this guess. In this work we present a new method based on formal concept analysis to detect unknown members of a gene regulatory network from gene expression time series data. We show that formal concept analysis is able to find a list of candidate genes for inclusion into a partially known basic network. This list can then be reduced by a statistical analysis so that the resulting genes interact strongly with the basic network and therefore should be included when modeling the network. The method has been applied to the DNA repair system of Mycobacterium tuberculosis. In this application, our method produces comparable results to an already existing method of component selection while it is applicable to a broader range of problems.
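
    Formal concept analysis can be sketched on a tiny binary context. The genes (g1 to g4) and the discretized expression events below are hypothetical, and the brute-force enumeration is only practical for very small contexts; it is not the authors' implementation.

```python
from itertools import combinations

def common_attributes(objects, context):
    """Attributes shared by all given objects (all attributes if none are given)."""
    sets = [context[o] for o in objects]
    if not sets:
        return set().union(*context.values())
    return set.intersection(*sets)

def objects_with(attributes, context):
    """Objects that carry every one of the given attributes."""
    return {o for o, attrs in context.items() if attributes <= attrs}

def formal_concepts(context):
    """All (extent, intent) pairs, found by closing every subset of objects."""
    concepts = set()
    names = sorted(context)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_attributes(subset, context)
            extent = objects_with(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

# Hypothetical context: genes (objects) and discretized time-series events (attributes).
context = {
    "g1": {"up_t1", "up_t2"},
    "g2": {"up_t1", "up_t2", "down_t3"},
    "g3": {"up_t2"},
    "g4": {"down_t3"},
}

concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))

# Candidates for a partially known network {g1}: genes sharing all of g1's events.
candidates = objects_with(context["g1"], context) - {"g1"}
print(sorted(candidates))
```

    The candidate list produced this way would then be narrowed by the statistical analysis the abstract mentions, which this sketch does not include.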

  17. Population genetics of cis-regulatory sequences that operate during embryonic development in the sea urchin Strongylocentrotus purpuratus.

    PubMed

    Garfield, David; Haygood, Ralph; Nielsen, William J; Wray, Gregory A

    2012-01-01

    Despite the fact that noncoding sequences comprise a substantial fraction of functional sites within all genomes, the evolutionary mechanisms that operate on genetic variation within regulatory elements remain poorly understood. In this study, we examine the population genetics of the core, upstream cis-regulatory regions of eight genes (AN, CyIIa, CyIIIa, Endo16, FoxB, HE, SM30 a, and SM50) that function during the early development of the purple sea urchin, Strongylocentrotus purpuratus. Quantitative and qualitative measures of segregating variation are not conspicuously different between cis-regulatory and closely linked "proxy neutral" noncoding regions containing no known functional sites. Length and compound mutations are common in noncoding sequences; conventional descriptive statistics ignore such mutations, under-representing true genetic variation by approximately 28% for these loci in this population. Patterns of variation in the cis-regulatory regions of six of the genes examined (CyIIa, CyIIIa, Endo16, FoxB, AN, and HE) are consistent with directional selection. Genetic variation within annotated transcription factor binding sites is comparable to, and frequently greater than, that of surrounding sequences. Comparisons of two paralog pairs (CyIIa/CyIIIa and AN/HE) suggest that distinct evolutionary processes have operated on their cis-regulatory regions following gene duplication. Together, these analyses provide a detailed view of the evolutionary mechanisms operating on noncoding sequences within a natural population, and underscore how little is known about how these processes operate on cis-regulatory sequences.

  18. SNPs in putative regulatory regions identified by human mouse comparative sequencing and transcription factor binding site data

    SciTech Connect

    Banerjee, Poulabi; Bahlo, Melanie; Schwartz, Jody R.; Loots, Gabriela G.; Houston, Kathryn A.; Dubchak, Inna; Speed, Terence P.; Rubin, Edward M.

    2002-01-01

    Genome-wide disease association analysis using SNPs is being explored as a method for dissecting complex genetic traits, and a vast number of SNPs have been generated for this purpose. As there are cost and throughput limitations of genotyping large numbers of SNPs and statistical issues regarding the large number of dependent tests on the same data set, to make association analysis practical it has been proposed that SNPs should be prioritized based on likely functional importance. The most easily identifiable functional SNPs are coding SNPs (cSNPs) and accordingly cSNPs have been screened in a number of studies. SNPs in gene regulatory sequences embedded in noncoding DNA are another class of SNPs suggested for prioritization due to their predicted quantitative impact on gene expression. The main challenge in evaluating these SNPs, in contrast to cSNPs, is a lack of robust algorithms and databases for recognizing regulatory sequences in noncoding DNA. Approaches that have been previously used to delineate noncoding sequences with gene regulatory activity include cross-species sequence comparisons and the search for sequences recognized by transcription factors. We combined these two methods to sift through mouse and human genomic sequences to identify putative gene regulatory elements and subsequently localized SNPs within these sequences in a 1 Megabase (Mb) region of human chromosome 5q31, orthologous to mouse chromosome 11 containing the Interleukin cluster.

  19. Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA

    PubMed Central

    Turner, Tychele N.; Hormozdiari, Fereydoun; Duyzend, Michael H.; McClymont, Sarah A.; Hook, Paul W.; Iossifov, Ivan; Raja, Archana; Baker, Carl; Hoekzema, Kendra; Stessman, Holly A.; Zody, Michael C.; Nelson, Bradley J.; Huddleston, John; Sandstrom, Richard; Smith, Joshua D.; Hanna, David; Swanson, James M.; Faustman, Elaine M.; Bamshad, Michael J.; Stamatoyannopoulos, John; Nickerson, Deborah A.; McCallion, Andrew S.; Darnell, Robert; Eichler, Evan E.

    2016-01-01

    We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism. PMID:26749308

  20. Regulatory gene networks and the properties of the developmental process

    NASA Technical Reports Server (NTRS)

    Davidson, Eric H.; McClay, David R.; Hood, Leroy

    2003-01-01

    Genomic instructions for development are encoded in arrays of regulatory DNA. These specify large networks of interactions among genes producing transcription factors and signaling components. The architecture of such networks both explains and predicts developmental phenomenology. Although network analysis is yet in its early stages, some fundamental commonalities are already emerging. Two such are the use of multigenic feedback loops to ensure the progressivity of developmental regulatory states and the prevalence of repressive regulatory interactions in spatial control processes. Gene regulatory networks make it possible to explain the process of development in causal terms and eventually will enable the redesign of developmental regulatory circuitry to achieve different outcomes.

  1. A provisional regulatory gene network for specification of endomesoderm in the sea urchin embryo

    NASA Technical Reports Server (NTRS)

    Davidson, Eric H.; Rast, Jonathan P.; Oliveri, Paola; Ransick, Andrew; Calestani, Cristina; Yuh, Chiou-Hwa; Minokawa, Takuya; Amore, Gabriele; Hinman, Veronica; Arenas-Mena, Cesar; Otim, Ochan; Brown, C. Titus; Livi, Carolina B.; Lee, Pei Yun; Revilla, Roger; Schilstra, Maria J.; Clarke, Peter J C.; Rust, Alistair G.; Pan, Zhengjun; Arnone, Maria I.; Rowen, Lee; Cameron, R. Andrew; McClay, David R.; Hood, Leroy; Bolouri, Hamid

    2002-01-01

    We present the current form of a provisional DNA sequence-based regulatory gene network that explains in outline how endomesodermal specification in the sea urchin embryo is controlled. The model of the network is in a continuous process of revision and growth as new genes are added and new experimental results become available; see http://www.its.caltech.edu/mirsky/endomeso.htm (End-mes Gene Network Update) for the latest version. The network contains over 40 genes at present, many newly uncovered in the course of this work, and most encoding DNA-binding transcriptional regulatory factors. The architecture of the network was approached initially by construction of a logic model that integrated the extensive experimental evidence now available on endomesoderm specification. The internal linkages between genes in the network have been determined functionally, by measurement of the effects of regulatory perturbations on the expression of all relevant genes in the network. Five kinds of perturbation have been applied: (1) use of morpholino antisense oligonucleotides targeted to many of the key regulatory genes in the network; (2) transformation of other regulatory factors into dominant repressors by construction of Engrailed repressor domain fusions; (3) ectopic expression of given regulatory factors, from genetic expression constructs and from injected mRNAs; (4) blockade of the beta-catenin/Tcf pathway by introduction of mRNA encoding the intracellular domain of cadherin; and (5) blockade of the Notch signaling pathway by introduction of mRNA encoding the extracellular domain of the Notch receptor. The network model predicts the cis-regulatory inputs that link each gene into the network. Therefore, its architecture is testable by cis-regulatory analysis. Strongylocentrotus purpuratus and Lytechinus variegatus genomic BAC recombinants that include a large number of the genes in the network have been sequenced and annotated. 
Tests of the cis-regulatory predictions of

  2. Close Sequence Comparisons are Sufficient to Identify Human cis-Regulatory Elements

    SciTech Connect

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.
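
    For orientation, the general Karlin-Altschul form for the significance of a local alignment score can be written as follows. This is the textbook form only; how Gumby parameterizes or adapts it is not stated in the abstract, and the symbols (S for the score, m and n for the compared sequence lengths, K and λ for scoring-system constants) are the conventional ones, assumed here rather than taken from the paper.

```latex
% Expected number of chance segments scoring at least S
E = K \, m \, n \, e^{-\lambda S}

% Probability of observing at least one such segment by chance
P = 1 - e^{-E}
```

    For small E, P is approximately equal to E, so ranking conserved elements by P-value is then nearly the same as ranking them by expected chance occurrences.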

  3. Discovering Study-Specific Gene Regulatory Networks

    PubMed Central

    Bo, Valeria; Curtis, Tanya; Lysenko, Artem; Saqi, Mansoor; Swift, Stephen; Tucker, Allan

    2014-01-01

    Microarrays are commonly used in biology because of their ability to simultaneously measure thousands of genes under different conditions. Due to their structure, typically containing a large number of variables but far fewer samples, scalable network analysis techniques are often employed. In particular, consensus approaches have been recently used that combine multiple microarray studies in order to find networks that are more robust. The purpose of this paper, however, is to combine multiple microarray studies to automatically identify subnetworks that are distinctive to specific experimental conditions rather than common to them all. To better understand key regulatory mechanisms and how they change under different conditions, we derive unique networks from multiple independent networks built using glasso, which goes beyond standard correlations. This involves calculating cluster prediction accuracies to detect the most predictive genes for a specific set of conditions. We differentiate between accuracies calculated using cross-validation within a selected cluster of studies (the intra prediction accuracy) and those calculated on a set of independent studies belonging to different study clusters (inter prediction accuracy). Finally, we compare our method's results to related state-of-the-art techniques. We explore how the proposed pipeline performs on both synthetic data and real data (wheat and Fusarium). Our results show that subnetworks can be identified reliably that are specific to subsets of studies and that these networks reflect key mechanisms that are fundamental to the experimental conditions in each of those subsets. PMID:25191999

  4. Regulatory considerations for translating gene therapy: a European Union perspective.

    PubMed

    Galli, Maria Cristina

    2009-11-11

    A preclinical study on a gene therapy approach for treatment of the severe muscle weakness associated with a variety of neuromuscular disorders provides a forum to discuss the translational challenges of gene therapy from a regulatory point of view. In this Perspective, the findings are considered from the view of European regulatory requirements for first clinical use.

  5. Rapid evolution of cis-regulatory sequences via local point mutations

    NASA Technical Reports Server (NTRS)

    Stone, J. R.; Wray, G. A.

    2001-01-01

    Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
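
    The kind of simulation described above can be illustrated with a deliberately simplified sketch: a single random sequence accumulates independent point mutations until a target motif appears. This is not the authors' model (it tracks one lineage and ignores population size and fixation), and the function names, motif, sequence length, and mutation rate are all hypothetical choices for illustration.

```python
import random

BASES = "ACGT"

def mutate(seq, mu, rng):
    """Apply independent point mutations: each base changes with probability mu."""
    out = []
    for base in seq:
        if rng.random() < mu:
            # Replace the base with one of the three other bases.
            base = rng.choice([b for b in BASES if b != base])
        out.append(base)
    return "".join(out)

def generations_until_site(seq_len, motif, mu, rng, max_gen=20000):
    """Count generations until `motif` first appears in a random sequence.

    Assumption: a single lineage under neutral mutation, no selection or drift.
    Returns (generation, sequence), or None if max_gen is exceeded.
    """
    seq = "".join(rng.choice(BASES) for _ in range(seq_len))
    for gen in range(max_gen + 1):
        if motif in seq:
            return gen, seq
        seq = mutate(seq, mu, rng)
    return None

if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical 6-bp motif in a 200-bp region with an inflated mutation rate.
    print(generations_until_site(200, "TGACTC", 0.01, rng))
```

    Repeating the run over many seeds gives a distribution of waiting times; the mutation rate here is inflated far above biological values only so that the example finishes quickly.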

  6. Modeling stochastic noise in gene regulatory systems.

    PubMed

    Meister, Arwen; Du, Chao; Li, Ye Henry; Wong, Wing Hung

    2014-03-01

    The Master equation is considered the gold standard for modeling the stochastic mechanisms of gene regulation in molecular detail, but it is too complex to solve exactly in most cases, so approximation and simulation methods are essential. However, there is still a lack of consensus about the best way to carry these out. To help clarify the situation, we review Master equation models of gene regulation, theoretical approximations based on an expansion method due to N.G. van Kampen and R. Kubo, and simulation algorithms due to D.T. Gillespie and P. Langevin. Expansion of the Master equation shows that for systems with a single stable steady-state, the stochastic model reduces to a deterministic model in a first-order approximation. Additional theory, also due to van Kampen, describes the asymptotic behavior of multistable systems. To support and illustrate the theory and provide further insight into the complex behavior of multistable systems, we perform a detailed simulation study comparing the various approximation and simulation methods applied to synthetic gene regulatory systems with various qualitative characteristics. The simulation studies show that for large stochastic systems with a single steady-state, deterministic models are quite accurate, since the probability distribution of the solution has a single peak tracking the deterministic trajectory whose variance is inversely proportional to the system size. In multistable stochastic systems, large fluctuations can cause individual trajectories to escape from the domain of attraction of one steady-state and be attracted to another, so the system eventually reaches a multimodal probability distribution in which all stable steady-states are represented proportional to their relative stability. However, since the escape time scales exponentially with system size, this process can take a very long time in large systems.
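
    As a concrete illustration of the simulation side, here is a minimal sketch of Gillespie's direct method applied to a one-species birth-death model (constant production, first-order degradation). The model, function names, and rate constants are assumptions chosen for illustration and are not taken from the review.

```python
import random

def gillespie_birth_death(k_prod, k_deg, n0, t_end, rng):
    """Simulate production (rate k_prod) and degradation (rate k_deg * n).

    Returns the event times and the copy number after each event.
    """
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod        # propensity of the production reaction
        a_deg = k_deg * n      # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0:
            break              # no reaction can fire
        t += rng.expovariate(a_total)   # exponential waiting time
        if t >= t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts

def time_average(times, counts, t_end):
    """Time-weighted mean copy number over [0, t_end]."""
    total = 0.0
    for i, n in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += n * (t_next - times[i])
    return total / t_end

if __name__ == "__main__":
    rng = random.Random(0)
    times, counts = gillespie_birth_death(20.0, 1.0, 0, 2000.0, rng)
    # The deterministic steady state for this model is k_prod / k_deg = 20.
    print(time_average(times, counts, 2000.0))
```

    For this single-steady-state model the time-averaged copy number should sit close to the deterministic value k_prod / k_deg, which is the behavior the abstract describes for systems with one stable steady state.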

  7. Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

    SciTech Connect

    Hong, R. L.; Hamaguchi, L.; Busch, M. A.; Weigel, D.

    2003-06-01

    OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.

  8. Identification of the regulatory sequence of anaerobically expressed locus aeg-46.5.

    PubMed Central

    Choe, M; Reznikoff, W S

    1993-01-01

    A newly identified anaerobically expressed locus, aeg-46.5, which is located at min 46.5 on the Escherichia coli linkage map, was cloned and analyzed. The phenotype of this gene was studied by using a lacZ operon fusion. aeg-46.5 is induced anaerobically in the presence of nitrate in wild-type and narL cells. It is repressed by the narL gene product, as it showed derepressed anaerobic expression in narL mutant cells. We postulate that aeg-46.5 is subject to multiple regulatory systems: activation as a result of anaerobiosis, narL-independent nitrate-dependent activation, and narL-mediated repression. The regulatory region of aeg-46.5 was identified. A 304-bp DNA sequence which includes the regulatory elements was obtained, and the 5' end of aeg-46.5 mRNA was identified. It was verified that the anaerobic regulation of aeg-46.5 expression is controlled on the transcriptional level. Computer analysis predicted possible control sites for the NarL and FNR proteins. The proposed NarL site was found in a perfect-symmetry element. The aeg-46.5 regulatory elements are adjacent to, but divergent from, those of the eco gene. PMID:8432709

  9. Evolution of the mammalian embryonic pluripotency gene regulatory network

    PubMed Central

    Fernandez-Tresguerres, Beatriz; Cañon, Susana; Rayon, Teresa; Pernaute, Barbara; Crespo, Miguel; Torroja, Carlos; Manzanares, Miguel

    2010-01-01

    Embryonic pluripotency in the mouse is established and maintained by a gene-regulatory network under the control of a core set of transcription factors that include octamer-binding protein 4 (Oct4; official name POU domain, class 5, transcription factor 1, Pou5f1), sex-determining region Y (SRY)-box containing gene 2 (Sox2), and homeobox protein Nanog. Although this network is largely conserved in eutherian mammals, very little information is available regarding its evolutionary conservation in other vertebrates. We have compared the embryonic pluripotency networks in mouse and chick by means of expression analysis in the pregastrulation chicken embryo, genomic comparisons, and functional assays of pluripotency-related regulatory elements in ES cells and blastocysts. We find that multiple components of the network are either novel to mammals or have acquired novel expression domains in early developmental stages of the mouse. We also find that the downstream action of the mouse core pluripotency factors is mediated largely by genomic sequence elements nonconserved with chick. In the case of Sox2 and Fgf4, we find that elements driving expression in embryonic pluripotent cells have evolved by a small number of nucleotide changes that create novel binding sites for core factors. Our results show that the network in charge of embryonic pluripotency is an evolutionary novelty of mammals that is related to the comparatively extended period during which mammalian embryonic cells need to be maintained in an undetermined state before engaging in early differentiation events. PMID:21048080

  10. Caenorhabditis elegans metabolic gene regulatory networks govern the cellular economy.

    PubMed

    Watson, Emma; Walhout, Albertha J M

    2014-10-01

    Diet greatly impacts metabolism in health and disease. In response to the presence or absence of specific nutrients, metabolic gene regulatory networks sense the metabolic state of the cell and regulate metabolic flux accordingly, for instance by the transcriptional control of metabolic enzymes. Here, we discuss recent insights regarding metazoan metabolic regulatory networks using the nematode Caenorhabditis elegans as a model, including the modular organization of metabolic gene regulatory networks, the prominent impact of diet on the transcriptome and metabolome, specialized roles of nuclear hormone receptors (NHRs) in responding to dietary conditions, regulation of metabolic genes and metabolic regulators by miRNAs, and feedback between metabolic genes and their regulators.

  11. Regulation of photoreceptor gene transcription via a highly conserved transcriptional regulatory element by vsx gene products

    PubMed Central

    Pan, Yi; Comiskey, Daniel F.; Kelly, Lisa E.; Chandler, Dawn S.

    2016-01-01

    Purpose: The photoreceptor conserved element-1 (PCE-1) sequence is found in the transcriptional regulatory regions of many genes expressed in photoreceptors. The retinal homeobox (Rx or Rax) gene product functions by binding to PCE-1 sites. However, other transcriptional regulators have also been reported to bind to PCE-1. One of these, vsx2, is expressed in retinal progenitor and bipolar cells. The purpose of this study is to identify Xenopus laevis vsx gene products and characterize vsx gene product expression and function with respect to the PCE-1 site. Methods: X. laevis vsx gene products were amplified with PCR. Expression patterns were determined with in situ hybridization using whole or sectioned X. laevis embryos and digoxigenin- or fluorescein-labeled antisense riboprobes. DNA binding characteristics of the vsx gene products were analyzed with electrophoretic mobility shift assays (EMSAs) using in vitro translated proteins and radiolabeled oligonucleotide probes. Gene transactivation assays were performed using luciferase-based reporters and in vitro transcribed effector gene products, injected into X. laevis embryos. Results: We identified one vsx1 and two vsx2 gene products. The two vsx2 gene products are generated by alternate mRNA splicing. We verified that these gene products are expressed in the developing retina and that expression resolves into distinct cell types in the mature retina. Finally, we found that vsx gene products can bind the PCE-1 site in vitro and that the two vsx2 isoforms have different gene transactivation activities. Conclusions: vsx gene products are expressed in the developing and mature neural retina. vsx gene products can bind the PCE-1 site in vitro and influence the expression of a rhodopsin promoter-luciferase reporter gene. The two isoforms of vsx have different gene transactivation activities in this reporter gene system. PMID:28003732

  12. Amylase and chitinase genes in Streptomyces lividans are regulated by reg1, a pleiotropic regulatory gene.

    PubMed Central

    Nguyen, J; Francou, F; Virolle, M J; Guérineau, M

    1997-01-01

    A regulatory gene, reg1, was identified in Streptomyces lividans. It encodes a 345-amino-acid protein (Reg1) which contains a helix-turn-helix DNA-binding motif in the N-terminal region. Reg1 exhibits similarity with the LacI/GalR family members over the entire sequence. It displays 95% identity with MalR (the repressor of malE in S. coelicolor), 65% identity with ORF-Sl (a putative regulatory gene of alpha-amylase of S. limosus), and 31% identity with CcpA (the carbon catabolite repressor in Bacillus subtilis). In S. lividans, the chromosomal disruption of reg1 affected the expression of several genes. The production of alpha-amylases of S. lividans and that of the alpha-amylase of S. limosus in S. lividans were enhanced in the reg1 mutant strains and relieved of carbon catabolite repression. As a result, the transcription level of the alpha-amylase of S. limosus was noticeably increased in the reg1 mutant strain. Moreover, the induction of chitinase production in S. lividans was relieved of carbon catabolite repression by glucose in the reg1 mutant strain, while the induction by chitin was lost. Therefore, reg1 can be regarded as a pleiotropic regulatory gene in S. lividans. PMID:9335287

  13. Deduced products of C4-dicarboxylate transport regulatory genes of Rhizobium leguminosarum are homologous to nitrogen regulatory gene products.

    PubMed Central

    Ronson, C W; Astwood, P M; Nixon, B T; Ausubel, F M

    1987-01-01

    We have sequenced two genes, dctB and dctD, required for the activation of the C4-dicarboxylate transport structural gene dctA in free-living Rhizobium leguminosarum. The hydropathic profile of the dctB gene product (DctB) suggested that its N-terminal region may be located in the periplasm and its C-terminal region in the cytoplasm. The C-terminal region of DctB was strongly conserved with similar regions of the products of several regulatory genes that may act as environmental sensors, including ntrB, envZ, virA, phoR, cpxA, and phoM. The N-terminal region of the dctD gene product (DctD) was strongly conserved with the N-terminal domains of the products of several regulatory genes thought to be transcriptional activators, including ntrC, ompR, virG, phoB and sfrA. In addition, the central and C-terminal regions of DctD were strongly conserved with the products of ntrC and nifA, transcriptional activators that require the alternate sigma factor rpoN (ntrA) as co-activator. The central region of DctD also contained a potential ATP-binding domain. These results are consistent with recent results that show that rpoN product is required for dctA activation, and suggest that DctB plus DctD-mediated transcriptional activation of dctA may be mechanistically similar to NtrB plus NtrC-mediated activation of glnA in E. coli. PMID:3671068

  14. Toward an orofacial gene regulatory network.

    PubMed

    Kousa, Youssef A; Schutte, Brian C

    2016-03-01

    Orofacial clefting is a common birth defect with significant morbidity. A panoply of candidate genes have been discovered through synergy of animal models and human genetics. Among these, variants in interferon regulatory factor 6 (IRF6) cause syndromic orofacial clefting and contribute risk toward isolated cleft lip and palate (1/700 live births). Rare variants in IRF6 can lead to Van der Woude syndrome (1/35,000 live births) and popliteal pterygium syndrome (1/300,000 live births). Furthermore, IRF6 regulates GRHL3 and rare variants in this downstream target can also lead to Van der Woude syndrome. In addition, a common variant (rs642961) in the IRF6 locus is found in 30% of the world's population and contributes risk for isolated orofacial clefting. Biochemical studies revealed that rs642961 abrogates one of four AP-2alpha binding sites. Like IRF6 and GRHL3, rare variants in TFAP2A can also lead to syndromic orofacial clefting with lip pits (branchio-oculo-facial syndrome). The literature suggests that AP-2alpha, IRF6 and GRHL3 are part of a pathway that is essential for lip and palate development. In addition to updating the pathways, players and pursuits, this review will highlight some of the current questions in the study of orofacial clefting.

  15. Genomic Aberrations Frequently Alter Chromatin Regulatory Genes in Chordoma

    PubMed Central

    Wang, Lu; Zehir, Ahmet; Nafa, Khedoudja; Zhou, Nengyi; Berger, Michael F.; Casanova, Jacklyn; Sadowska, Justyna; Lu, Chao; Allis, C. David; Gounder, Mrinal; Chandhanayingyong, Chandhanarat; Ladanyi, Marc; Boland, Patrick J; Hameed, Meera

    2016-01-01

    Chordoma is a rare primary bone neoplasm that is resistant to standard chemotherapies. Despite aggressive surgical management, local recurrence and metastasis are not uncommon. To identify the specific genetic aberrations that play key roles in chordoma pathogenesis, we utilized a genome-wide high-resolution SNP-array and next generation sequencing (NGS)-based molecular profiling platform to study 24 patient samples with typical histopathologic features of chordoma. Matching normal tissues were available for 16 samples. SNP-array analysis revealed nonrandom copy number losses across the genome, frequently involving 3, 9p, 1p, 14, 10, and 13. In contrast, copy number gain is uncommon in chordomas. Two minimum deleted regions were observed on 3p within a ~8 Mb segment at 3p21.1–p21.31, which overlaps SETD2, BAP1 and PBRM1. The minimum deleted region on 9p was mapped to the CDKN2A locus at 9p21.3, and homozygous deletion of CDKN2A was detected in 5/22 chordomas (~23%). NGS-based molecular profiling demonstrated an extremely low mutation rate in chordomas, with an average of 0.5 mutations per sample for the 16 cases with matched normal. When the mutated genes were grouped based on molecular functions, many of the mutation events (~40%) were found in chromatin regulatory genes. The combined copy number and mutation profiling revealed that SETD2 is the single gene affected most frequently in chordomas, either by deletion or by mutations. Our study demonstrated that chordoma belongs to the C-class (copy number changes) tumors whose oncogenic signature is non-random multiple copy number losses across the genome and that genomic aberrations frequently alter chromatin regulatory genes. PMID:27072194

  16. Preservation of Gene Duplication Increases the Regulatory Spectrum of Ribosomal Protein Genes and Enhances Growth under Stress.

    PubMed

    Parenteau, Julie; Lavoie, Mathieu; Catala, Mathieu; Malik-Ghulam, Mustafa; Gagnon, Jules; Abou Elela, Sherif

    2015-12-22

    In baker's yeast, the majority of ribosomal protein genes (RPGs) are duplicated, and it was recently proposed that such duplications are preserved via the functional specialization of the duplicated genes. However, the origin and nature of duplicated RPGs' (dRPGs) functional specificity remain unclear. In this study, we show that differences in dRPG functions are generated by variations in the modality of gene expression and, to a lesser extent, by protein sequence. Analysis of the sequence and expression patterns of non-intron-containing RPGs indicates that each dRPG is controlled by specific regulatory sequences modulating its expression levels in response to changing growth conditions. Homogenization of dRPG sequences reduces cell tolerance to growth under stress without changing the number of expressed genes. Together, the data reveal a model where duplicated genes provide a means for modulating the expression of ribosomal proteins in response to stress.

  17. The structure of the human peripherin gene (PRPH) and identification of potential regulatory elements

    SciTech Connect

    Foley, J.; Ley, C.A.; Parysek, L.M.

    1994-07-15

    The authors determined the complete nucleotide sequence of the coding region of the human peripherin gene (PRPH), as well as 742 bp 5′ to the cap site and 584 bp 3′ to the stop codon, and compared its structure and sequence to the rat and mouse genes. The overall structure of 9 exons separated by 8 introns is conserved among these three mammalian species. The nucleotide sequences of the human peripherin gene exons were 90% identical to the rat gene sequences, and the predicted human peripherin protein differed from rat peripherin at only 18 of 475 amino acid residues. Comparison of the 5′ flanking regions of the human peripherin gene and rodent genes revealed extensive areas of high homology. Additional conserved segments were found in introns 1 and 2. Within the 5′ region, potential regulatory sequences, including a nerve growth factor negative regulatory element, a Hox protein binding site, and a heat shock element, were identified in all peripherin genes. The positional conservation of each element suggests that they may be important in the tissue-specific, developmental-specific, and injury-specific expression of the peripherin gene. 24 refs., 2 figs., 1 tab.

  18. A General Approach for Identifying Distant Regulatory Elements Applied to the Gdf6 Gene

    PubMed Central

    Mortlock, Douglas P.; Guenther, Catherine; Kingsley, David M.

    2003-01-01

    Regulatory sequences in higher genomes can map large distances from gene coding regions, and cannot yet be identified by simple inspection of primary DNA sequence information. Here we describe an efficient method of surveying large genomic regions for gene regulatory information, and subdividing complex sets of distant regulatory elements into smaller intervals for detailed study. The mouse Gdf6 gene is expressed in a number of distinct embryonic locations that are involved in the patterning of skeletal and soft tissues. To identify sequences responsible for Gdf6 regulation, we first isolated a series of overlapping bacterial artificial chromosomes (BACs) that extend varying distances upstream and downstream of the gene. A LacZ reporter cassette was integrated into the Gdf6 transcription unit of each BAC using homologous recombination in bacteria. Each modified BAC was injected into fertilized mouse eggs, and founder transgenic embryos were analyzed for LacZ expression mid-gestation. The overlapping segments defined by the BAC clones revealed five separate regulatory regions that drive LacZ expression in 11 distinct anatomical locations. To further localize sequences that control expression in developing skeletal joints, we created a series of BAC constructs with precise deletions across a putative joint-control region. This approach further narrowed the critical control region to an area containing several stretches of sequence that are highly conserved between mice and humans. A distant 2.9-kilobase fragment containing the highly conserved regions is able to direct very specific expression of a minimal promoter/LacZ reporter in proximal limb joints. These results demonstrate that even distant, complex regulatory sequences can be identified using a combination of BAC scanning, BAC deletion, and comparative sequencing approaches. PMID:12915490

  19. Computational inference of gene regulatory networks: Approaches, limitations and opportunities.

    PubMed

    Banf, Michael; Rhee, Seung Y

    2017-01-01

    Gene regulatory networks lie at the core of cell function control. In E. coli and S. cerevisiae, the study of gene regulatory networks has led to the discovery of regulatory mechanisms responsible for the control of cell growth, differentiation and responses to environmental stimuli. In plants, computational rendering of gene regulatory networks is gaining momentum, thanks to the recent availability of high-quality genomes and transcriptomes and development of computational network inference approaches. Here, we review current techniques, challenges and trends in gene regulatory network inference and highlight challenges and opportunities for plant science. We provide plant-specific application examples to guide researchers in selecting methodologies that suit their particular research questions. Given the interdisciplinary nature of gene regulatory network inference, we tried to cater to both biologists and computer scientists to help them engage in a dialogue about concepts and caveats in network inference. Specifically, we discuss problems and opportunities in heterogeneous data integration for eukaryotic organisms and common caveats to be considered during network model evaluation. This article is part of a Special Issue entitled: Plant Gene Regulatory Mechanisms and Networks, edited by Dr. Erich Grotewold and Dr. Nathan Springer.

  20. Disease gene identification strategies for exome sequencing

    PubMed Central

    Gilissen, Christian; Hoischen, Alexander; Brunner, Han G; Veltman, Joris A

    2012-01-01

    Next generation sequencing can be used to search for Mendelian disease genes in an unbiased manner by sequencing the entire protein-coding sequence, known as the exome, or even the entire human genome. Identifying the pathogenic mutation amongst thousands to millions of genomic variants is a major challenge, and novel variant prioritization strategies are required. The choice of these strategies depends on the availability of well-phenotyped patients and family members, the mode of inheritance, the severity of the disease and its population frequency. In this review, we discuss the current strategies for Mendelian disease gene identification by exome resequencing. We conclude that exome strategies are successful and identify new Mendelian disease genes in approximately 60% of the projects. Improvements in bioinformatics as well as in sequencing technology will likely increase the success rate even further. Exome sequencing is likely to become the most commonly used tool for Mendelian disease gene identification for the coming years. PMID:22258526

  1. Regulatory Genes Controlling Anthocyanin Pigmentation Are Functionally Conserved among Plant Species and Have Distinct Sets of Target Genes.

    PubMed Central

    Quattrocchio, F; Wing, JF; Leppen, H; Mol, J; Koes, RE

    1993-01-01

    In this study, we demonstrate that in petunia at least four regulatory genes (anthocyanin-1 [an1], an2, an4, and an11) control transcription of a subset of structural genes from the anthocyanin pathway by using a combination of RNA gel blot analysis, transcription run-on assays, and transient expression assays. an2- and an11- mutants could be transiently complemented by the maize regulatory genes Leaf color (Lc) or Colorless-1 (C1), respectively, whereas an1- mutants only by Lc and C1 together. In addition, the combination of Lc and C1 induces pigment accumulation in young leaves. This indicates that Lc and C1 are both necessary and sufficient to produce pigmentation in leaf cells. Regulatory pigmentation genes in maize and petunia control different sets of structural genes. The maize Lc and C1 genes expressed in petunia differentially activate the promoters of the chalcone synthase genes chsA and chsJ in the same way that the homologous petunia genes do. This suggests that the regulatory proteins in both species are functionally similar and that the choice of target genes is determined by their promoter sequences. We present an evolutionary model that explains the differences in regulation of pigmentation pathways of maize, petunia, and snapdragon. PMID:12271045

  2. Phenotype accessibility and noise in random threshold gene regulatory networks.

    PubMed

    Pinho, Ricardo; Garcia, Victor; Feldman, Marcus W

    2014-01-01

    Evolution requires phenotypic variation in a population of organisms for selection to function. Gene regulatory processes involved in organismal development affect the phenotypic diversity of organisms. Since only a fraction of all possible phenotypes are predicted to be accessed by the end of development, organisms may evolve strategies to use environmental cues and noise-like fluctuations to produce additional phenotypic diversity, and hence to enhance the speed of adaptation. We used a generic model of organismal development --gene regulatory networks-- to investigate how different levels of noise on gene expression states (i.e. phenotypes) may affect access to new, unique phenotypes, thereby affecting phenotypic diversity. We studied additional strategies that organisms might adopt to attain larger phenotypic diversity: either by augmenting their genome or the number of gene expression states. This was done for different types of gene regulatory networks that allow for distinct levels of regulatory influence on gene expression or are more likely to give rise to stable phenotypes. We found that if gene expression is binary, increasing noise levels generally decreases phenotype accessibility for all network types studied. If more gene expression states are considered, noise can moderately enhance the speed of discovery if three or four gene expression states are allowed, and if there are enough distinct regulatory networks in the population. These results were independent of the network types analyzed, and were robust to different implementations of noise. Hence, for noise to increase the number of accessible phenotypes in gene regulatory networks, very specific conditions need to be satisfied. If the number of distinct regulatory networks involved in organismal development is large enough, and the acquisition of more genes or fine tuning of their expression states proves costly to the organism, noise can be useful in allowing access to more unique phenotypes.
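
    To make the modeling setup concrete, the following is a minimal sketch of a binary threshold network with noisy updates. It is an illustrative reading of the abstract rather than the authors' implementation: the update rule (a gene is on when its summed regulatory input is positive), the noise model (each gene's new state flips with a fixed probability), and all function names are assumptions.

```python
import random

def random_network(n_genes, density, rng):
    """Random regulatory matrix with entries in {-1, 0, +1}."""
    return [[rng.choice((-1, 1)) if rng.random() < density else 0
             for _ in range(n_genes)] for _ in range(n_genes)]

def step(weights, state, noise, rng):
    """One synchronous update; each new gene state flips with probability `noise`."""
    new_state = []
    for row in weights:
        total = sum(w * s for w, s in zip(row, state))
        value = 1 if total > 0 else 0     # threshold rule (assumed)
        if rng.random() < noise:
            value = 1 - value             # noise-induced flip
        new_state.append(value)
    return tuple(new_state)

def develop(weights, state, noise, rng, max_steps=100):
    """Iterate until a state repeats in one step; return it, else None.

    With noise > 0 this is only a one-step stability check.
    """
    state = tuple(state)
    for _ in range(max_steps):
        nxt = step(weights, state, noise, rng)
        if nxt == state:
            return state
        state = nxt
    return None

def accessible_phenotypes(weights, n_trials, noise, rng):
    """Set of distinct stable states reached from random initial states."""
    n = len(weights)
    found = set()
    for _ in range(n_trials):
        start = tuple(rng.randint(0, 1) for _ in range(n))
        end = develop(weights, start, noise, rng)
        if end is not None:
            found.add(end)
    return found

if __name__ == "__main__":
    rng = random.Random(0)
    net = random_network(8, 0.3, rng)
    for noise in (0.0, 0.05, 0.2):
        print(noise, len(accessible_phenotypes(net, 200, noise, rng)))
```

    Comparing the number of distinct stable states across noise levels gives a rough analogue of the phenotype accessibility discussed in the abstract, under the assumptions stated above.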

  3. Genome-wide identification of regulatory elements and reconstruction of gene regulatory networks of the green alga Chlamydomonas reinhardtii under carbon deprivation.

    PubMed

    Winck, Flavia Vischi; Vischi Winck, Flavia; Arvidsson, Samuel; Riaño-Pachón, Diego Mauricio; Hempel, Sabrina; Koseska, Aneta; Nikoloski, Zoran; Urbina Gomez, David Alejandro; Rupprecht, Jens; Mueller-Roeber, Bernd

    2013-01-01

    The unicellular green alga Chlamydomonas reinhardtii is a long-established model organism for studies on photosynthesis and carbon metabolism-related physiology. Under conditions of air-level carbon dioxide concentration [CO2], a carbon concentrating mechanism (CCM) is induced to facilitate cellular carbon uptake. CCM increases the availability of carbon dioxide at the site of cellular carbon fixation. To improve our understanding of the transcriptional control of the CCM, we employed FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements, followed by deep sequencing) to determine nucleosome-depleted chromatin regions of algal cells subjected to carbon deprivation. Our FAIRE data recapitulated the positions of known regulatory elements in the promoter of the periplasmic carbonic anhydrase (Cah1) gene, which is upregulated during CCM induction, and revealed new candidate regulatory elements at a genome-wide scale. In addition, time series expression patterns of 130 transcription factor (TF) and transcription regulator (TR) genes were obtained for cells cultured under photoautotrophic condition and subjected to a shift from high to low [CO2]. Groups of co-expressed genes were identified and a putative directed gene-regulatory network underlying the CCM was reconstructed from the gene expression data using the recently developed IOTA (inner composition alignment) method. Among the candidate regulatory genes, two members of the MYB-related TF family, Lcr1 (Low-CO2 response regulator 1) and Lcr2 (Low-CO2 response regulator 2), may play an important role in down-regulating the expression of a particular set of TF and TR genes in response to low [CO2]. The results obtained provide new insights into the transcriptional control of the CCM and revealed more than 60 new candidate regulatory genes. Deep sequencing of nucleosome-depleted genomic regions indicated the presence of new, previously unknown regulatory elements in the C. reinhardtii genome.
Our work can

  4. Initial deployment of the cardiogenic gene regulatory network in the basal chordate, Ciona intestinalis.

    PubMed

    Woznica, Arielle; Haeussler, Maximilian; Starobinska, Ella; Jemmett, Jessica; Li, Younan; Mount, David; Davidson, Brad

    2012-08-01

    The complex, partially redundant gene regulatory architecture underlying vertebrate heart formation has been difficult to characterize. Here, we dissect the primary cardiac gene regulatory network in the invertebrate chordate, Ciona intestinalis. The Ciona heart progenitor lineage is first specified by Fibroblast Growth Factor/Map Kinase (FGF/MapK) activation of the transcription factor Ets1/2 (Ets). Through microarray analysis of sorted heart progenitor cells, we identified the complete set of primary genes upregulated by FGF/Ets shortly after heart progenitor emergence. Combinatorial sequence analysis of these co-regulated genes generated a hypothetical regulatory code consisting of Ets binding sites associated with a specific co-motif, ATTA. Through extensive reporter analysis, we confirmed the functional importance of the ATTA co-motif in primary heart progenitor gene regulation. We then used the Ets/ATTA combination motif to successfully predict a number of additional heart progenitor gene regulatory elements, including an intronic element driving expression of the core conserved cardiac transcription factor, GATAa. This work significantly advances our understanding of the Ciona heart gene network. Furthermore, this work has begun to elucidate the precise regulatory architecture underlying the conserved, primary role of FGF/Ets in chordate heart lineage specification.
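
    A motif-pair scan of the kind described above could look roughly like the following. It is a sketch under stated assumptions, not the authors' pipeline: the Ets site is approximated by the core GGAA/GGAT on either strand, the 30 bp spacing limit is arbitrary, and the names motif_positions and paired_sites are hypothetical.

```python
# Assumed motif definitions: Ets core GGA[AT] and the ATTA co-motif, both strands.
ETS = ("GGAA", "GGAT", "TTCC", "ATCC")
ATTA = ("ATTA", "TAAT")

def motif_positions(seq, motifs):
    """Return sorted start positions of every (possibly overlapping) motif match."""
    seq = seq.upper()
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append(start)
            start = seq.find(motif, start + 1)
    return sorted(hits)

def paired_sites(seq, max_gap=30):
    """List (ets_start, atta_start) pairs whose start positions lie within max_gap bp."""
    ets = motif_positions(seq, ETS)
    atta = motif_positions(seq, ATTA)
    return [(e, a) for e in ets for a in atta if abs(e - a) <= max_gap]
```

    Sequences that return at least one pair would be candidates for reporter testing; the spacing limit and the motif definitions are the obvious parameters to vary.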

  5. Robustness and Accuracy in Sea Urchin Developmental Gene Regulatory Networks.

    PubMed

    Ben-Tabou de-Leon, Smadar

    2016-01-01

    Developmental gene regulatory networks robustly control the timely activation of regulatory and differentiation genes. The structure of these networks underlies their capacity to buffer intrinsic and extrinsic noise and maintain embryonic morphology. Here I illustrate how the use of specific architectures by the sea urchin developmental regulatory networks enables the robust control of cell fate decisions. The Wnt-βcatenin signaling pathway patterns the primary embryonic axis while the BMP signaling pathway patterns the secondary embryonic axis in the sea urchin embryo and across bilateria. Interestingly, in the sea urchin in both cases, the signaling pathway that defines the axis directly controls the expression of a set of downstream regulatory genes. I propose that this direct activation of a set of regulatory genes enables a uniform regulatory response and a clear-cut cell fate decision in the endoderm and in the dorsal ectoderm. The specification of the mesodermal pigment cell lineage is activated by Delta signaling that initiates a triple positive feedback loop that locks down the pigment specification state. I propose that the use of compound positive feedback circuitry provides the endodermal cells enough time to turn off mesodermal genes and ensures correct mesoderm vs. endoderm fate decision. Thus, I argue that understanding the control properties of repeatedly used regulatory architectures illuminates their role in embryogenesis and provides possible explanations for their resistance to evolutionary change.

  6. Ethanol utilization regulatory protein: profile alignments give no evidence of origin through aldehyde and alcohol dehydrogenase gene fusion.

    PubMed Central

    Nicholas, H. B.; Persson, B.; Jörnvall, H.; Hempel, J.

    1995-01-01

    The suggestion that the ethanol regulatory protein from Aspergillus has its evolutionary origin in a gene fusion between aldehyde and alcohol dehydrogenase genes (Hawkins AR, Lamb HK, Radford A, Moore JD, 1994, Gene 146:145-158) has been tested by profile analysis with aldehyde and alcohol dehydrogenase family profiles. We show that the degree and kind of similarity observed between these profiles and the ethanol regulatory protein sequence is that expected from random sequences of the same composition. This level of similarity fails to support the suggested gene fusion. PMID:8580855

  7. Control of Hoxd gene transcription in the mammary bud by hijacking a preexisting regulatory landscape

    PubMed Central

    Schep, Ruben; Necsulea, Anamaria; Rodríguez-Carballo, Eddie; Guerreiro, Isabel; Andrey, Guillaume; Nguyen Huynh, Thi Hanh; Marcet, Virginie; Zákány, Jozsef; Duboule, Denis; Beccari, Leonardo

    2016-01-01

    Vertebrate Hox genes encode transcription factors operating during the development of multiple organs and structures. However, the evolutionary mechanism underlying this remarkable pleiotropy remains to be fully understood. Here, we show that Hoxd8 and Hoxd9, two genes of the HoxD complex, are transcribed during mammary bud (MB) development. However, unlike in other developmental contexts, their coexpression does not rely on the same regulatory mechanism. Hoxd8 is regulated by the combined activity of closely located sequences and the most distant telomeric gene desert. On the other hand, Hoxd9 is controlled by an enhancer-rich region that is also located within the telomeric gene desert but has no impact on Hoxd8 transcription, thus constituting an exception to the global regulatory logic systematically observed at this locus. The latter DNA region is also involved in Hoxd gene regulation in other contexts and strongly interacts with Hoxd9 in all tissues analyzed thus far, indicating that its regulatory activity was already operational before the appearance of mammary glands. Within this DNA region and neighboring a strong limb enhancer, we identified a short sequence conserved in therian mammals and capable of enhancer activity in the MBs. We propose that Hoxd gene regulation in embryonic MBs evolved by hijacking a preexisting regulatory landscape that was already at work before the emergence of mammals in structures such as the limbs or the intestinal tract. PMID:27856734

  8. Experimental approaches for gene regulatory network construction: the chick as a model system

    PubMed Central

    Streit, Andrea; Tambalo, Monica; Chen, Jingchen; Grocott, Timothy; Anwar, Maryam; Sosinsky, Alona; Stern, Claudio D.

    2012-01-01

    Setting up the body plan during embryonic development requires the coordinated action of many signals and transcriptional regulators in a precise temporal sequence and spatial pattern. The last decades have seen an explosion of information describing the molecular control of many developmental processes. The next challenge is to integrate this information into logic ‘wiring diagrams’ that visualise gene actions and outputs, have predictive power and point to key control nodes. Here we provide an experimental workflow on how to construct gene regulatory networks using the chick as a model system. Keywords: transcription factors, transcriptome analysis, conserved regulatory elements PMID:23174848

  9. Hepatoma cell-specific ganciclovir-mediated toxicity of a lentivirally transduced HSV-TkEGFP fusion protein gene placed under the control of rat alpha-fetoprotein gene regulatory sequences.

    PubMed

    Uch, Rathviro; Gérolami, René; Faivre, Jamila; Hardwigsen, Jean; Mathieu, Sylvie; Mannoni, Patrice; Bagnis, Claude

    2003-09-01

    Suicide gene therapy combining herpes simplex virus thymidine kinase gene transfer and ganciclovir administration can be envisioned as a powerful therapeutic approach in the treatment of hepatocellular carcinoma; however, safety issues regarding transgene expression in parenchyma cells have to be addressed. In this study, we constructed LATKW, a lentiviral vector expressing the HSV-TkEGFP gene placed under the control of the promoter elements that control the expression of the rat alpha-fetoprotein, and assayed its specific expression in vitro in hepatocarcinoma and nonhepatocarcinoma human cell lines, and in epidermal growth factor stimulated human primary hepatocytes. Using LATKW, a strong expression of the transgene was found in transduced hepatocarcinoma cells compared to a very low expression in nonhepatocarcinoma human cell lines, as assessed by Northern blot, RT-PCR, FACS analysis and ganciclovir-mediated toxicity assay, and no expression was found in lentivirally transduced normal human hepatocytes. Altogether, these results demonstrate the possibility of using a lentivirally transduced expression unit containing the rat alpha-fetoprotein promoter to restrict the HSV-TK-mediated GCV sensitivity to human hepatocarcinoma cells.

  10. The companions: regulatory T cells and gene therapy

    PubMed Central

    Eghtesad, Saman; Morel, Penelope A; Clemens, Paula R

    2009-01-01

    Undesired immunological responses to products of therapeutic gene replacement have been obstacles to successful gene therapy. Understanding such responses of the host immune system to achieve immunological tolerance to a transferred gene product is therefore crucial. In this article, we review relevant studies of immunological responses to gene replacement therapy, the role of immunological tolerance mediated by regulatory T cells in down-regulating the unwanted immune responses, and the interrelationship of the two topics. PMID:19368560

  11. Emerging role of regulatory T cells in gene transfer.

    PubMed

    Cao, Ou; Furlan-Freguia, Christian; Arruda, Valder R; Herzog, Roland W

    2007-10-01

    Induction and maintenance of immune tolerance to therapeutic transgene products are key requirements for successful gene replacement therapies. Gene transfer may also be used to specifically induce immune tolerance and thereby augment other types of therapies. Similarly, gene therapies for treatment of autoimmune diseases are being developed in order to restore tolerance to self-antigens. Regulatory T cells have emerged as key players in many aspects of immune tolerance, and a rapidly increasing body of work documents induction and/or activation of regulatory T cells by gene transfer. Regulatory T cells may suppress antibody formation and cytotoxic T cell responses and may be critical for immune tolerance to therapeutic proteins. In this regard, CD4(+)CD25(+) regulatory T cells have been identified as important components of tolerance in several gene transfer protocols, including hepatic in vivo gene transfer. Augmentation of regulatory T cell responses should be a promising new tool to achieve tolerance and avoid immune-mediated rejection of gene therapy. During the past decade, it has become obvious that immune regulation is an important and integral component of tolerance to self-antigens and of many forms of induced tolerance. Gene therapy can only be successful if the immune system does not reject the therapeutic transgene product. Recent studies provide a rapidly growing body of evidence that regulatory T cells (T(reg)) are involved and often play a crucial role in tolerance to proteins expressed by means of gene transfer. This review seeks to provide an overview of these data and their implications for gene therapy.

  12. Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data

    PubMed Central

    2013-01-01

    Background High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique to study the transcriptome of a cell under various conditions at a systems level. Despite the wide application of RNA-Seq techniques to generate experimental data in the last few years, few computational methods are available to analyze this huge amount of transcription data. The computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are particularly lacking and urgently needed. Results We developed an automated bioinformatics method to predict gene regulatory networks from the quantitative expression values of differentially expressed genes based on RNA-Seq transcriptome data of a cell in different stages and conditions, integrating transcriptional, genomic and gene function data. We applied the method to the RNA-Seq transcriptome data generated for soybean root hair cells in three different development stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common for all three stages, and 24, 49 and 70 modules separately for the first, second and third stage, each containing both a group of co-expressed genes and several transcription factors collaboratively controlling their expression under different conditions. 8 of 10 common regulatory modules were validated by at least two kinds of validations, such as independent DNA binding motif analysis, gene function enrichment test, and previous experimental data in the literature. Conclusions We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and designing biological experiments such as ChIP-Seq, RNA interference, and yeast two hybrid experiments. PMID:24053776

  13. The molecular and gene regulatory signature of a neuron

    PubMed Central

    Hobert, Oliver; Carrera, Inés; Stefanakis, Nikolaos

    2010-01-01

    Neuron-type specific gene batteries define the morphological and functional diversity of cell types in the nervous system. Here, we discuss the composition of neuron-type specific gene batteries and illustrate gene regulatory strategies employed by distinct organisms from C. elegans to higher vertebrates, which are instrumental in determining the unique gene expression profile and molecular composition of individual neuronal cell types. Based on principles learned from prokaryotic gene regulation, we argue that neuronal, terminal gene batteries are functionally grouped into parallel acting “regulons”. The theoretical concepts discussed here provide testable hypotheses for future experimental analysis into the exact gene regulatory mechanisms that are employed in the generation of neuronal diversity and identity. PMID:20663572

  14. Gene Discovery through Expressed Sequence Tag Sequencing in Trypanosoma cruzi

    PubMed Central

    Verdun, Ramiro E.; Di Paolo, Nelson; Urmenyi, Turan P.; Rondinelli, Edson; Frasch, Alberto C. C.; Sanchez, Daniel O.

    1998-01-01

    Analysis of expressed sequence tags (ESTs) constitutes a useful approach for gene identification that, in the case of human pathogens, might result in the identification of new targets for chemotherapy and vaccine development. As part of the Trypanosoma cruzi genome project, we have partially sequenced the 5′ ends of 1,949 clones to generate ESTs. The clones were randomly selected from a normalized CL Brener epimastigote cDNA library. A total of 14.6% of the clones were homologous to previously identified T. cruzi genes, while 18.4% had significant matches to genes from other organisms in the database. A total of 67% of the ESTs had no matches in the database, and thus, some of them might be T. cruzi-specific genes. Functional groups of those sequences with matches in the database were constructed according to their putative biological functions. The two largest categories were protein synthesis (23.3%) and cell surface molecules (10.8%). The information reported in this paper should be useful for researchers in the field to analyze genes and proteins of their own interest. PMID:9784549

  15. Gene regulatory networks modelling using a dynamic evolutionary hybrid

    PubMed Central

    2010-01-01

    Background Inference of gene regulatory networks is a key goal in the quest for understanding fundamental cellular processes and revealing underlying relations among genes. With the availability of gene expression data, computational methods aiming at regulatory networks reconstruction are facing challenges posed by the data's high dimensionality, temporal dynamics or measurement noise. We propose an approach based on a novel multi-layer evolutionary trained neuro-fuzzy recurrent network (ENFRN) that is able to select potential regulators of target genes and describe their regulation type. Results The recurrent, self-organizing structure and evolutionary training of our network yield an optimized pool of regulatory relations, while its fuzzy nature avoids noise-related problems. Furthermore, we are able to assign scores for each regulation, highlighting the confidence in the retrieved relations. The approach was tested by applying it to several benchmark datasets of yeast, managing to acquire biologically validated relations among genes. Conclusions The results demonstrate the effectiveness of the ENFRN in retrieving biologically valid regulatory relations and providing meaningful insights for better understanding the dynamics of gene regulatory networks. The algorithms and methods described in this paper have been implemented in a Matlab toolbox and are available from: http://bioserver-1.bioacademy.gr/DataRepository/Project_ENFRN_GRN/. PMID:20298548

  16. Characterization of the Cis-Regulatory Region of the Drosophila Homeotic Gene Sex Combs Reduced

    PubMed Central

    Gindhart-Jr., J. G.; King, A. N.; Kaufman, T. C.

    1995-01-01

    The Drosophila homeotic gene Sex combs reduced (Scr) controls the segmental identity of the labial and prothoracic segments in the embryo and adult. It encodes a sequence-specific transcription factor that controls, in concert with other gene products, differentiative pathways of tissues in which Scr is expressed. During embryogenesis, Scr accumulation is observed in a discrete spatiotemporal pattern that includes the labial and prothoracic ectoderm, the subesophageal ganglion of the ventral nerve cord and the visceral mesoderm of the anterior and posterior midgut. Previous analyses have demonstrated that breakpoint mutations located in a 75-kb interval, including the Scr transcription unit and 50 kb of upstream DNA, cause Scr misexpression during development, presumably because these mutations remove Scr cis-regulatory sequences from the proximity of the Scr promoter. To gain a better understanding of the regulatory interactions necessary for the control of Scr transcription during embryogenesis, we have begun a molecular analysis of the Scr regulatory interval. DNA fragments from this 75-kb region were subcloned into P-element vectors containing either an Scr-lacZ or hsp70-lacZ fusion gene, and patterns of reporter gene expression were assayed in transgenic embryos. Several fragments appear to contain Scr regulatory sequences, as they direct reporter gene expression in patterns similar to those normally observed for Scr, whereas other DNA fragments direct Scr reporter gene expression in developmentally interesting but non-Scr-like patterns during embryogenesis. Scr expression in some tissues appears to be controlled by multiple regulatory elements that are separated, in some cases, by more than 20 kb of intervening DNA. Interestingly, regulatory sequences that direct reporter gene expression in an Scr-like pattern in the anterior and posterior midgut are imbedded in the regulatory region of the segmentation gene fushi tarazu (ftz), which is normally located

  17. Candidate regulatory sequence elements for cell cycle-dependent transcription in Saccharomyces cerevisiae.

    PubMed

    Wolfsberg, T G; Gabrielian, A E; Campbell, M J; Cho, R J; Spouge, J L; Landsman, D

    1999-08-01

    Recent developments in genome-wide transcript monitoring have led to a rapid accumulation of data from gene expression studies. Such projects highlight the need for methods to predict the molecular basis of transcriptional coregulation. A microarray project identified the 420 yeast transcripts whose synthesis displays cell cycle-dependent periodicity. We present here a statistical technique we developed to identify the sequence elements that may be responsible for this cell cycle regulation. Because most gene regulatory sites contain a short string of highly conserved nucleotides, any such strings that are involved in gene regulation will occur frequently in the upstream regions of the genes that they regulate, and rarely in the upstream regions of other genes. Our strategy therefore utilizes statistical procedures to identify short oligomers, five or six nucleotides in length, that are over-represented in upstream regions of genes whose expression peaks at the same phase of the cell cycle. We report, with a high level of confidence, that 9 hexamers and 12 pentamers are over-represented in the upstream regions of genes whose expression peaks at the early G(1), late G(1), S, G(2), or M phase of the cell cycle. Some of these sequence elements show a preference for a particular orientation, and others, through a separate statistical test, for a particular position upstream of the ATG start codon. The finding that the majority of the statistically significant sequence elements are located in late G(1) upstream regions correlates with other experiments that identified the late G(1)/early S boundary as a vital cell cycle control point. Our results highlight the importance of MCB, an element implicated previously in late G(1)/early S gene regulation, as most of the late G(1) oligomers contain the MCB sequence or variations thereof. It is striking that most MCB-like sequences localize to a specific region upstream of the ATG start codon. 
Additional sequences that we have
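
    The over-representation idea described above can be illustrated with a short sketch. The abstract does not specify its statistical procedure, so a one-sided hypergeometric test is swapped in here as a standard stand-in; it counts each oligomer once per upstream region, and the names hypergeom_sf, kmers and enriched_kmers are hypothetical.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n regions are drawn from N, of which K contain the oligomer."""
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

def kmers(seq, k):
    """Set of all length-k substrings of an upstream sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def enriched_kmers(cluster, background, k=6, alpha=0.01):
    """Oligomers present in more cluster regions than expected by chance.

    cluster: upstream sequences of genes peaking in the same phase.
    background: upstream sequences of all other genes.
    Returns (kmer, cluster_hits, total_hits, p_value) tuples sorted by p-value.
    No multiple-testing correction is applied in this sketch.
    """
    all_sets = [kmers(s, k) for s in cluster + background]
    n, N = len(cluster), len(all_sets)
    candidates = set().union(*all_sets[:n])
    results = []
    for kmer in candidates:
        total_hits = sum(kmer in s for s in all_sets)
        cluster_hits = sum(kmer in s for s in all_sets[:n])
        p = hypergeom_sf(cluster_hits, N, total_hits, n)
        if p < alpha:
            results.append((kmer, cluster_hits, total_hits, p))
    return sorted(results, key=lambda r: r[3])
```

    Running enriched_kmers once per cell cycle phase, with k set to 5 and then 6, mirrors the pentamer and hexamer search in outline; a real analysis would also need to correct for the number of oligomers tested.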

  18. Time-Delayed Models of Gene Regulatory Networks

    PubMed Central

    Parmar, K.; Blyuss, K. B.; Kyrychko, Y. N.; Hogan, S. J.

    2015-01-01

    We discuss different mathematical models of gene regulatory networks as relevant to the onset and development of cancer. After discussion of alternative modelling approaches, we use a paradigmatic two-gene network to focus on the role played by time delays in the dynamics of gene regulatory networks. We contrast the dynamics of the reduced model arising in the limit of fast mRNA dynamics with that of the full model. The review concludes with the discussion of some open problems. PMID:26576197
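
    To make the role of a time delay concrete, the sketch below integrates a generic two-gene mutual-repression model with a single delay using the forward Euler method. The equations, dx/dt = alpha / (1 + y(t - tau)^h) - x and the symmetric one for y, are an assumed textbook-style form, not the model analysed in the review, and all parameter names and values are illustrative.

```python
def simulate(alpha=2.0, hill=2, tau=1.0, dt=0.01, t_end=50.0, x0=0.5, y0=1.0):
    """Forward-Euler integration of a delayed two-gene mutual-repression model.

    Returns the trajectories of x and y from t = 0 to t = t_end.
    """
    lag = int(round(tau / dt))       # delay expressed in time steps
    xs = [x0] * (lag + 1)            # constant history on [-tau, 0]
    ys = [y0] * (lag + 1)
    steps = int(round(t_end / dt))
    for _ in range(steps):
        x, y = xs[-1], ys[-1]                        # current values
        x_del, y_del = xs[-1 - lag], ys[-1 - lag]    # values tau time units ago
        xs.append(x + dt * (alpha / (1 + y_del ** hill) - x))
        ys.append(y + dt * (alpha / (1 + x_del ** hill) - y))
    return xs[lag:], ys[lag:]
```

    Comparing runs with tau set to 0 and to larger values shows how the delay alone changes the transient behaviour while the steady states stay the same.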

  19. Estimating Gene Regulatory Networks with pandaR.

    PubMed

    Schlauch, Daniel; Paulson, Joseph N; Young, Albert; Glass, Kimberly; Quackenbush, John

    2017-03-11

    PANDA (Passing Attributes between Networks for Data Assimilation) is a gene regulatory network inference method that begins with a model of transcription factor-target gene interactions and uses message passing to update the network model given available transcriptomic and protein-protein interaction data. PANDA is used to estimate networks for each experimental group and the network models are then compared between groups to explore transcriptional processes that distinguish the groups. We present pandaR (bioconductor.org/packages/pandaR), a Bioconductor package that implements PANDA and provides a framework for exploratory data analysis on gene regulatory networks.

  20. Characterization of the cis-regulatory region of the Drosophila homeotic gene Sex combs reduced

    SciTech Connect

    Gindhart, J.G. Jr.; King, N.A.; Kaufman, T.C.

    1995-02-01

    The Drosophila homeotic gene Sex combs reduced (Scr) controls the segmental identity of the labial and prothoracic segments in the embryo and adult. It encodes a sequence-specific transcription factor that controls, in concert with other gene products, differentiative pathways of tissues in which Scr is expressed. During embryogenesis, Scr accumulation is observed in a discrete spatiotemporal pattern that includes the labial and prothoracic ectoderm, the subesophageal ganglion of the ventral nerve cord and the visceral mesoderm of the anterior and posterior midgut. Previous analyses have demonstrated that breakpoint mutations located in a 75-kb interval, including the Scr transcription unit and 50 kb of upstream DNA, cause Scr misexpression during development, presumably because these mutations remove Scr cis-regulatory sequences from the proximity of the Scr promoter. To gain a better understanding of the regulatory interactions necessary for the control of Scr transcription during embryogenesis, we have begun a molecular analysis of the Scr regulatory interval. DNA fragments from this 75-kb region were subcloned into P-element vectors containing either an Scr-lacZ or hsp70-lacZ fusion gene, and patterns of reporter gene expression were assayed in transgenic embryos. Several fragments appear to contain Scr regulatory sequences, as they direct reporter gene expression in patterns similar to those normally observed for Scr, whereas other DNA fragments direct Scr reporter gene expression in developmentally interesting but non-Scr-like patterns during embryogenesis. Scr expression in some tissues appears to be controlled by multiple regulatory elements that are separated, in some cases, by more than 20 kb of intervening DNA. This analysis provides an entry point for the study of how Scr transcription is regulated at the molecular level. 60 refs., 7 figs., 1 tab.

  1. Systems Approaches to Identifying Gene Regulatory Networks in Plants

    PubMed Central

    Long, Terri A.; Brady, Siobhan M.; Benfey, Philip N.

    2009-01-01

    Complex gene regulatory networks are composed of genes, noncoding RNAs, proteins, metabolites, and signaling components. The availability of genome-wide mutagenesis libraries; large-scale transcriptome, proteome, and metabolome data sets; and new high-throughput methods that uncover protein interactions underscores the need for mathematical modeling techniques that better enable scientists to synthesize these large amounts of information and to understand the properties of these biological systems. Systems biology approaches can allow researchers to move beyond a reductionist approach and to both integrate and comprehend the interactions of multiple components within these systems. Descriptive and mathematical models for gene regulatory networks can reveal emergent properties of these plant systems. This review highlights methods that researchers are using to obtain large-scale data sets, and examples of gene regulatory networks modeled with these data. Emergent properties revealed by the use of these network models and perspectives on the future of systems biology are discussed. PMID:18616425

  2. High regulatory gene use in sea urchin embryogenesis: Implications for bilaterian development and evolution.

    PubMed

    Howard-Ashby, Meredith; Materna, Stefan C; Brown, C Titus; Tu, Qiang; Oliveri, Paola; Cameron, R Andrew; Davidson, Eric H

    2006-12-01

    A global scan of transcription factor usage in the sea urchin embryo was carried out in the context of the Strongylocentrotus purpuratus genome sequencing project, and results from six individual studies are here considered. Transcript prevalence data were obtained for over 280 regulatory genes encoding sequence-specific transcription factors of every known family, but excluding genes encoding zinc finger proteins. This is a statistically inclusive proxy for the total "regulome" of the sea urchin genome. Close to 80% of the regulome is expressed at significant levels by the late gastrula stage. Most regulatory genes must be used repeatedly for different functions as development progresses. An evolutionary implication is that animal complexity at the stage when the regulome first evolved was far simpler than even the last common bilaterian ancestor, and is thus of deep antiquity.

  3. Cis-regulatory elements are harbored in Intron5 of the RUNX1 gene

    PubMed Central

    2014-01-01

    Background The human RUNX1 gene is one of the most frequent targets for chromosomal translocations associated with acute myeloid leukemia (AML) and acute lymphoid leukemia (ALL). The highest prevalence in AML is noted with the t(8;21) translocation, which represents 12 to 15% of all AML cases. Interestingly, all the breakpoints mapped to date in t(8;21) are clustered in intron 5 of the RUNX1 gene and intron 1 of the ETO gene. No homologous sequences have been found at the recombination regions, but DNase I hypersensitive sites (DHS) have been mapped to the areas of the genes involved in t(8;21). The presence of DHS is commonly associated with regulatory elements such as promoters, enhancers and silencers, among others. Results In this study we used a combination of comparative genomics, cloning and transfection assays to evaluate potential regulatory elements located in intron 5 of the RUNX1 gene. Our genomic analysis identified nine non-coding sequences that are evolutionarily conserved among rat, mouse and human. We cloned two of these regions in the pGL-3 Promoter plasmid in order to analyze their transcriptional regulatory activity. Our results demonstrate that the identified regions can indeed regulate transcription of a reporter gene in a distance- and position-independent manner; moreover, their transcriptional effect is cell type specific. Conclusions We have identified nine conserved non-coding sequences that are harbored in intron 5 of the RUNX1 gene. We have also demonstrated that two of these regions can regulate transcriptional activity in vitro. Taken together, our results suggest that intron 5 of the RUNX1 gene contains multiple potential cis-regulatory elements. PMID:24655352

  4. Gene regulatory network inference using out of equilibrium statistical mechanics

    PubMed Central

    Benecke, Arndt

    2008-01-01

    Spatiotemporal control of gene expression is fundamental to multicellular life. Despite prodigious efforts, the encoding of gene expression regulation in eukaryotes is not understood. Gene expression analyses nourish the hope to reverse engineer effector-target gene networks using inference techniques. Inference from noisy and circumstantial data relies on using robust models with few parameters for the underlying mechanisms. However, a systematic path to gene regulatory network reverse engineering from functional genomics data is still impeded by fundamental problems. Recently, Johannes Berg from the Theoretical Physics Institute of Cologne University has made two remarkable contributions that significantly advance the gene regulatory network inference problem. Berg, who uses gene expression data from yeast, has demonstrated a nonequilibrium regime for mRNA concentration dynamics and was able to map the gene regulatory process upon simple stochastic systems driven out of equilibrium. The impact of his demonstration is twofold, affecting both the understanding of the operational constraints under which transcription occurs and the capacity to extract relevant information from highly time-resolved expression data. Berg has used his observation to predict target genes of selected transcription factors, and thereby, in principle, demonstrated applicability of his out of equilibrium statistical mechanics approach to the gene network inference problem. PMID:19404429

  5. Regulatory region with putA gene of proline dehydrogenase that links to the lum and the lux operons in Photobacterium leiognathi.

    PubMed

    Lin, J W; Yu, K Y; Chen, H Y; Weng, S F

    1996-02-27

    The nucleotide sequence of the regulatory region (R & R) with the putA gene (EMBL Accession No. U39227) from Photobacterium leiognathi PL741 has been determined, and the amino acid sequence of proline dehydrogenase encoded by the putA gene has been deduced. Alignment and comparison of the proline dehydrogenase of P. leiognathi with the proline dehydrogenase domain in the PutA protein of Escherichia coli and Salmonella typhimurium show that they are homologous. The nucleotide sequence reveals that the regulatory region with the putA gene is linked to the lum and lux operons in the genome; the gene order is <--putA--R & R(I)<--ter-lumQ-lumP-R & R-luxC-luxD-luxA-luxB-luxE--> (R & R: regulatory region; ter: transcriptional terminator), where R & R is the regulatory region for the lum and the lux operons, ter is the transcriptional terminator for the lum operon, and R & R(I) apparently is the regulatory region for the putA and related genes. Nucleotide sequence analysis identifies a specific inverted repeat (SIR), a cAMP-CRP consensus sequence, a canonical -10/-35 promoter, a putative operator and a Shine-Dalgarno (SD) sequence in the regulatory region R & R(I) for the putA and related genes; this suggests that the putA and related genes are simply linked to the lum and the lux operons in the genome, and that the regulatory region R & R(I) acts independently for the putA and related genes.

  6. Portrait of Candida Species Biofilm Regulatory Network Genes.

    PubMed

    Araújo, Daniela; Henriques, Mariana; Silva, Sónia

    2017-01-01

    Most cases of candidiasis have been attributed to Candida albicans, but Candida glabrata, Candida parapsilosis and Candida tropicalis, designated as non-C. albicans Candida (NCAC), have been identified as frequent human pathogens. Moreover, Candida biofilms are an escalating clinical problem associated with significant rates of mortality. Biofilms have distinct developmental phases, including adhesion/colonisation, maturation and dispersal, controlled by complex regulatory networks. This review discusses recent advances regarding Candida species biofilm regulatory network genes, which are key components for candidiasis.

  7. 'In silico expression analysis', a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences.

    PubMed

    Bolívar, Julio C; Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated 'in silico expression analysis' was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the 'in silico expression analysis' resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the 'in silico expression analysis' predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. DATABASE URL: http://www.pathoplant.de/expression_analysis.php.
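
    The first step described above, finding the genes that harbour a submitted cis-sequence in their promoter, can be illustrated with a minimal Python sketch. This is not the PathoPlant implementation: the gene IDs and promoter strings are invented for illustration, the function names are hypothetical, and scanning both strands is an assumption. Only the WT-box sequence CGACTTTT comes from the abstract.

```python
# Toy promoter sequences keyed by made-up gene IDs (NOT real A. thaliana data).
PROMOTERS = {
    "geneA": "TTGACCGACTTTTGGA",   # WT-box on the forward strand
    "geneB": "ACGTACGTACGTACGT",   # no WT-box
    "geneC": "CCAAAAGTCGTTAGCA",   # WT-box on the reverse strand (AAAAGTCG)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """List (position, strand) for every exact motif match on either strand.

    Positions are 0-based offsets on the forward strand.
    """
    seq, motif = seq.upper(), motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


def genes_with_motif(promoters, motif):
    """Map each gene whose promoter contains the motif to its list of hits."""
    result = {}
    for gene, seq in promoters.items():
        hits = find_motif(seq, motif)
        if hits:
            result[gene] = hits
    return result


if __name__ == "__main__":
    print(genes_with_motif(PROMOTERS, "CGACTTTT"))
```

    In a real analysis the resulting gene list would then be compared against expression data; that correlation step is outside the scope of this sketch.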

  8. A Maize Gene Regulatory Network for Phenolic Metabolism.

    PubMed

    Yang, Fan; Li, Wei; Jiang, Nan; Yu, Haidong; Morohashi, Kengo; Ouma, Wilberforce Zachary; Morales-Mantilla, Daniel E; Gomez-Cano, Fabio Andres; Mukundi, Eric; Prada-Salcedo, Luis Daniel; Velazquez, Roberto Alers; Valentin, Jasmin; Mejía-Guerra, Maria Katherine; Gray, John; Doseff, Andrea I; Grotewold, Erich

    2017-03-06

    The translation of the genotype into phenotype, represented for example by the expression of genes encoding enzymes required for the biosynthesis of phytochemicals that are important for interaction of plants with the environment, is largely carried out by transcription factors (TFs) that recognize specific cis-regulatory elements in the genes that they control. TFs and their target genes are organized in gene regulatory networks (GRNs), and thus uncovering GRN architecture presents an important biological challenge necessary to explain gene regulation. Linking TFs to the genes they control, central to understanding GRNs, can be carried out using gene- or TF-centered approaches. In this study, we employed a gene-centered approach utilizing the yeast one-hybrid assay to generate a network of protein-DNA interactions that participate in the transcriptional control of genes involved in the biosynthesis of maize phenolic compounds including general phenylpropanoids, lignins, and flavonoids. We identified 1100 protein-DNA interactions involving 54 phenolic gene promoters and 568 TFs. A set of 11 TFs recognized 10 or more promoters, suggesting a role in coordinating pathway gene expression. The integration of the gene-centered network with information derived from TF-centered approaches provides a foundation for a phenolics GRN characterized by interlaced feed-forward loops that link developmental regulators with biosynthetic genes.
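
    The observation that a small set of TFs recognized 10 or more promoters amounts to counting distinct targets per TF in a list of protein-DNA interactions. The following Python sketch shows that counting on invented data; the TF and promoter names are placeholders, the function names are hypothetical, and nothing here reproduces the study's actual interaction table.

```python
from collections import defaultdict


def promoters_per_tf(interactions):
    """Group (tf, promoter) pairs into a dict: tf -> set of distinct promoters."""
    targets = defaultdict(set)
    for tf, promoter in interactions:
        targets[tf].add(promoter)
    return targets


def hub_tfs(interactions, min_promoters=10):
    """Return, sorted by name, the TFs that bind at least min_promoters promoters."""
    targets = promoters_per_tf(interactions)
    return sorted(tf for tf, proms in targets.items() if len(proms) >= min_promoters)


# Invented example: one TF binding ten promoters, one binding two.
toy_interactions = [("TF_hub", f"promoter_{i}") for i in range(10)]
toy_interactions += [("TF_small", "promoter_0"), ("TF_small", "promoter_1")]
toy_interactions += [("TF_small", "promoter_1")]  # duplicate pairs are counted once

if __name__ == "__main__":
    print(hub_tfs(toy_interactions))
```

    Using a set per TF means repeated detections of the same interaction do not inflate the count.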

  9. Sequence diversity in 36 candidate genes for cardiovascular disorders.

    PubMed Central

    Cambien, F; Poirier, O; Nicaud, V; Herrmann, S M; Mallet, C; Ricard, S; Behague, I; Hallet, V; Blanc, H; Loukaci, V; Thillet, J; Evans, A; Ruidavets, J B; Arveiler, D; Luc, G; Tiret, L

    1999-01-01

    Two strategies involving whole-genome association studies have been proposed for the identification of genes involved in complex diseases. The first one seeks to characterize all common variants of human genes and to test their association with disease. The second one seeks to develop dense maps of single-nucleotide polymorphisms (SNPs) and to detect susceptibility genes through linkage disequilibrium. We performed a molecular screening of the coding and/or flanking regions of 36 candidate genes for cardiovascular diseases. All polymorphisms identified by this screening were further genotyped in 750 subjects of European descent. In the whole set of genes, the lengths explored spanned 53.8 kb in the 5' regions, 68.4 kb in exonic regions, and 13 kb in the 3' regions. The strength of linkage disequilibrium within candidate regions suggests that genomewide maps of SNPs might be efficient ways to identify new disease-susceptibility genes, provided that the maps are sufficiently dense. However, the relatively large number of polymorphisms within coding and regulatory regions of candidate genes raises the possibility that several of them might be functional and that the pattern of genotype-phenotype association might be more complex than initially envisaged, as actually has been observed in some well-characterized genes. These results argue in favor of both genomewide association studies and detailed studies of the overall sequence variation of candidate genes, as complementary approaches. PMID:10364531

  10. Understanding the Role of Housekeeping and Stress-Related Genes in Transcription-Regulatory Networks

    NASA Astrophysics Data System (ADS)

    Heath, Allison; Kavraki, Lydia; Balázsi, Gábor

    2008-03-01

    Despite the increasing number of completely sequenced genomes, much remains to be learned about how living cells process environmental information and respond to changes in their surroundings. Accumulating evidence indicates that eukaryotic and prokaryotic genes can be classified in two distinct categories that we will call class I and class II. Class I genes are housekeeping genes, often characterized by stable, noise-resistant expression levels. In contrast, class II genes are stress-related genes and often have noisy, unstable expression levels. In this work we analyze the large-scale transcription-regulatory networks (TRN) of E. coli and S. cerevisiae and preliminary data on H. sapiens. We find that stable, housekeeping genes (class I) are preferentially utilized as transcriptional inputs while stress-related, unstable genes (class II) are utilized as transcriptional integrators. This might be the result of convergent evolution that placed the appropriate genes in the appropriate locations within transcriptional networks according to some fundamental principles that govern cellular information processing.

  11. Nucleotide sequence and functional analysis of regulatory region of the lumP and the lux operon from Photobacterium leiognathi.

    PubMed

    Lin, J W; Chao, Y F; Weng, S F

    1995-05-25

    The lumP gene is linked to the lux operon, but runs in the opposite direction in Photobacterium leiognathi PL741. The gene order of the lumP and the lux operon is < -lumP-R & R-luxC-luxD-luxA-luxB-luxN-luxE- > (R & R: regulatory region). The nucleotide sequence of the regulatory region (827 bp) between the lumP and the lux operon was determined. Sequence analysis shows that the regulatory region includes two divergent promoter systems: the PR-promoter system for the lux operon (R-operon) and the PL-promoter system for the lumP or lum operon (L-operon). Functional analysis of the regulatory region shows that both the PR- and PL-promoter systems are able to drive gene expression. Deletion experiments indicate that the PR- and PL-promoters are coordinately and negatively regulated; the PR- and PL-promoters might compete for recognition by RNA polymerase to initiate transcription. The fact that LumP is responsible for the spectral blue shift in P. leiognathi implies that the lumP gene is closely linked to the lux operon so that it can be regulated coordinately with it. In addition, glucose repression of the PR-promoter system shows that the expression of the lux operon is regulated by cAMP-CRP induction in E. coli.

  12. Role of Conserved Non-Coding Regulatory Elements in LMW Glutenin Gene Expression

    PubMed Central

    Juhász, Angéla; Makai, Szabolcs; Sebestyén, Endre; Tamás, László; Balázs, Ervin

    2011-01-01

    Transcriptional regulation of LMW glutenin genes was investigated in silico, using publicly available gene sequences and expression data. Genes were grouped into different LMW glutenin types and their promoter profiles were determined using cis-acting regulatory element databases and published results. The various cis-acting elements belong to some conserved non-coding regulatory regions (CREs) and might act in two different ways. There are elements, such as GCN4 motifs found in the long endosperm box, that could serve as key factors in tissue-specific expression. Some other elements, such as the AACA/TA motifs or the individual prolamin box variants, might modulate the level of expression. Based on the promoter sequences and expression characteristics, LMW glutenin genes might be transcribed following two different mechanisms. Most of the s- and i-type genes show a continuously increasing expression pattern. The m-type genes, however, demonstrate normal distribution in their expression profiles. Differences observed in their expression could be related to the differences found in their promoter sequences. Polymorphisms in the number and combination of cis-acting elements in their promoter regions can be of crucial importance in the diverse levels of production of single LMW glutenin gene types. PMID:22242127

  13. Regulatory links between imprinted genes: evolutionary predictions and consequences.

    PubMed

    Patten, Manus M; Cowley, Michael; Oakey, Rebecca J; Feil, Robert

    2016-02-10

    Genomic imprinting is essential for development and growth and plays diverse roles in physiology and behaviour. Imprinted genes have traditionally been studied in isolation or in clusters with respect to cis-acting modes of gene regulation, both from a mechanistic and evolutionary point of view. Recent studies in mammals, however, reveal that imprinted genes are often co-regulated and are part of a gene network involved in the control of cellular proliferation and differentiation. Moreover, a subset of imprinted genes acts in trans on the expression of other imprinted genes. Numerous studies have modulated levels of imprinted gene expression to explore phenotypic and gene regulatory consequences. Increasingly, the applied genome-wide approaches highlight how perturbation of one imprinted gene may affect other maternally or paternally expressed genes. Here, we discuss these novel findings and consider evolutionary theories that offer a rationale for such intricate interactions among imprinted genes. An evolutionary view of these trans-regulatory effects provides a novel interpretation of the logic of gene networks within species and has implications for the origin of reproductive isolation between species.

  14. Regulatory links between imprinted genes: evolutionary predictions and consequences

    PubMed Central

    Patten, Manus M.; Cowley, Michael; Oakey, Rebecca J.; Feil, Robert

    2016-01-01

    Genomic imprinting is essential for development and growth and plays diverse roles in physiology and behaviour. Imprinted genes have traditionally been studied in isolation or in clusters with respect to cis-acting modes of gene regulation, both from a mechanistic and evolutionary point of view. Recent studies in mammals, however, reveal that imprinted genes are often co-regulated and are part of a gene network involved in the control of cellular proliferation and differentiation. Moreover, a subset of imprinted genes acts in trans on the expression of other imprinted genes. Numerous studies have modulated levels of imprinted gene expression to explore phenotypic and gene regulatory consequences. Increasingly, the applied genome-wide approaches highlight how perturbation of one imprinted gene may affect other maternally or paternally expressed genes. Here, we discuss these novel findings and consider evolutionary theories that offer a rationale for such intricate interactions among imprinted genes. An evolutionary view of these trans-regulatory effects provides a novel interpretation of the logic of gene networks within species and has implications for the origin of reproductive isolation between species. PMID:26842569

  15. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    SciTech Connect

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  16. Dynamics of gene regulatory networks with cell division cycle

    NASA Astrophysics Data System (ADS)

    Chen, Luonan; Wang, Ruiqi; Kobayashi, Tetsuya J.; Aihara, Kazuyuki

    2004-07-01

    This paper focuses on modeling and analyzing the nonlinear dynamics of gene regulatory networks with the consideration of a cell division cycle with duplication process of DNA, in particular for switches and oscillators of synthetic networks. We derive two models that may correspond to the eukaryotic and prokaryotic cells, respectively. A biologically plausible three-gene model (lac, tetR, and cI) and a repressilator as switch and oscillator examples are used to illustrate our theoretical results. We show that the cell cycle may play a significant role in gene regulation due to the nonlinear dynamics of a gene regulatory network although gene expressions are usually tightly controlled by transcriptional factors.
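
    To give a feel for how a cell cycle could enter such dynamics, here is a minimal Python sketch of a three-gene repressilator whose production rate doubles once the DNA has been duplicated. It is not either of the authors' models: the protein-only equations, the step-like gene-copy factor, all parameter values and the function name are assumptions chosen for illustration, and integration uses a simple forward Euler step.

```python
def simulate(steps=20000, dt=0.01, alpha=50.0, n=2.0, cycle=20.0, replicate_at=0.5):
    """Integrate a toy repressilator with a cell-cycle-dependent gene copy number.

    Each protein p[i] is repressed by the previous one in the ring:
        dp[i]/dt = copies(t) * alpha / (1 + p[i-1]**n) - p[i]
    copies(t) is 1 before DNA duplication and 2 afterwards (assumed step function).
    """
    p = [1.0, 2.0, 3.0]              # arbitrary initial protein levels
    trajectory = [tuple(p)]
    for k in range(steps):
        t = k * dt
        phase = (t % cycle) / cycle  # position within the current cell cycle, in [0, 1)
        copies = 2.0 if phase >= replicate_at else 1.0
        new_p = []
        for i in range(3):
            repressor = p[i - 1]     # index -1 wraps around, closing the ring
            production = copies * alpha / (1.0 + repressor ** n)
            new_p.append(p[i] + dt * (production - p[i]))
        p = new_p
        trajectory.append(tuple(p))
    return trajectory


if __name__ == "__main__":
    traj = simulate()
    # Print a coarse sample of the trajectory for inspection.
    for state in traj[::2000]:
        print(" ".join(f"{x:8.3f}" for x in state))
```

    Setting replicate_at to a value above 1 disables the duplication step, which allows a comparison of the trajectories with and without the cell-cycle term.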

  17. Engineering a regulatory region of jadomycin gene cluster to improve jadomycin B production in Streptomyces venezuelae.

    PubMed

    Zheng, Jian-Ting; Wang, Sheng-Lan; Yang, Ke-Qian

    2007-09-01

    Streptomyces venezuelae ISP5230 produces a group of jadomycin congeners with cytotoxic activities. To improve jadomycin fermentation process, a genetic engineering strategy was designed to replace a 3.4-kb regulatory region of jad gene cluster that contains four regulatory genes (3' end 272 bp of jadW2, jadW3, jadR2, and jadR1) and the native promoter upstream of jadJ (P(J)) with the ermEp* promoter sequence so that ermEp* drives the expression of the jadomycin biosynthetic genes from jadJ in the engineered strain. As expected, the mutant strain produced jadomycin B without ethanol treatment, and the yield increased to about twofold that of the stressed wild-type. These results indicated that manipulation of the regulation of a biosynthetic gene cluster is an effective strategy to increase product yield.

  18. Cloning and Sequencing the First HLA Gene

    PubMed Central

    Jordan, Bertrand R.

    2010-01-01

    This Perspectives article recounts the isolation and sequencing of the first human histocompatibility gene (HLA) in 1980–1981. At the time, general knowledge of the molecules of the immune system was already fairly extensive, and gene rearrangements in the immunoglobulin complex (discovered in 1976) had generated much excitement: HLA was quite obviously the next frontier. The author was able to use a homologous murine H-2 cDNA to identify putative human HLA genomic clones in a λ-phage library and thus to isolate and sequence the first human histocompatibility gene. This personal account relates the steps that led to this result, describes the highly competitive international environment, and highlights the role of location, connections, and sheer luck in such an achievement. It also puts this work in perspective with a short description of the current knowledge of histocompatibility genes and, finally, presents some reflections on the meaning of “discovery.” PMID:20457890

  19. A regulatory gene (ECO-orf4) required for ECO-0501 biosynthesis in Amycolatopsis orientalis.

    PubMed

    Shen, Yang; Huang, He; Zhu, Li; Luo, Minyu; Chen, Daijie

    2014-02-01

    ECO-0501 is a novel linear polyene antibiotic, which was discovered from Amycolatopsis orientalis. A recent study of the ECO-0501 biosynthesis pathway revealed the presence of a regulatory gene, ECO-orf4. The A. orientalis ECO-orf4 gene from the ECO-0501 biosynthesis cluster was analyzed, and its deduced protein (ECO-orf4) was found to have amino acid sequence homology with large ATP-binding regulators of the LuxR (LAL) family. Database comparison revealed two hypothetical domains: a LuxR-type helix-turn-helix (HTH) DNA binding motif near the C-terminus and an N-terminal nucleotide triphosphate (NTP) binding motif. Deletion of the corresponding gene (ECO-orf4) resulted in complete loss of ECO-0501 production. Complementation by one copy of intact ECO-orf4 restored the polyene biosynthesis, demonstrating that ECO-orf4 is required for ECO-0501 biosynthesis. The effect of overexpressing ECO-orf4 on ECO-0501 production indicated that it is a positive regulatory gene. Gene expression analysis by reverse transcription PCR of the ECO-0501 gene cluster showed that the transcription of ECO-orf4 correlates with that of genes involved in polyketide biosynthesis. These results demonstrated that ECO-orf4 is a pathway-specific positive regulatory gene that is essential for ECO-0501 biosynthesis.

  20. Variable neighborhood search for reverse engineering of gene regulatory networks.

    PubMed

    Nicholson, Charles; Goodwin, Leslie; Clark, Corey

    2017-01-01

    A new search heuristic, Divided Neighborhood Exploration Search, designed to be used with inference algorithms such as Bayesian networks to improve on the reverse engineering of gene regulatory networks is presented. The approach systematically moves through the search space to find topologies representative of gene regulatory networks that are more likely to explain microarray data. In empirical testing it is demonstrated that the novel method is superior to the widely employed greedy search techniques in both the quality of the inferred networks and computational time.

  1. Gene regulatory networks elucidating Huanglongbing disease mechanisms

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next-generation sequencing was exploited to gain deeper insight into the response to infection by Candidatus Liberibacter asiaticus (CaLas), especially the immune dysregulation and metabolic dysfunction caused by source-sink disruption. Previous fruit transcriptome data were compared with additional...

  2. Direct interaction of the Polycomb protein with Antennapedia regulatory sequences in polytene chromosomes of Drosophila melanogaster.

    PubMed Central

    Zink, B; Engström, Y; Gehring, W J; Paro, R

    1991-01-01

    The Polycomb (Pc) gene is responsible for the elaboration and maintenance of the expression pattern of the homeotic genes during development of Drosophila. In mutant Pc- embryos, homeotic transcripts are ectopically expressed, leading to abdominal transformations in all segments. From this it was suggested that Pc+ acts as a repressor of homeotic gene transcription. We have mapped the cis-acting control sequences of the homeotic Antennapedia (Antp) gene regulated by Pc. Using Antp P1 and P2 promoter fragments linked to the E. coli lacZ reporter gene, we show different expression patterns of beta-galactosidase (beta-gal) in transformed Pc+ and Pc- embryos. In addition, we are able to visualize by immunocytochemical techniques on polytene chromosomes the direct binding of the Pc protein to the transposed cis-regulatory promoter fragments. However, short Antp P1 promoter constructs which are, due to position effects, ectopically activated in salivary glands do not reveal a Pc binding signal. PMID:1671215

  3. [Regulatory functions of Pax gene family in Drosophila development].

    PubMed

    Li, Li; Yang, Yang; Xue, Lei

    2010-02-01

    The Pax gene family encodes a group of important transcription factors that have been evolutionarily conserved from Drosophila to human. Pax genes play pivotal roles in regulating diverse signal transduction pathways and organogenesis during embryonic development through modulating cell proliferation and self-renewal, embryonic precursor cell migration, and the coordination of specific differentiation programs. Ten members of the Pax gene family, which perform crucial regulatory functions during embryonic and postembryonic development, have been identified in Drosophila. In this report, we describe the protein structures, expression patterns, and main functions of Drosophila Pax genes.

  4. The immunogenicity of viral haemorrhagic septicaemia rhabdovirus (VHSV) DNA vaccines can depend on plasmid regulatory sequences.

    PubMed

    Chico, V; Ortega-Villaizan, M; Falco, A; Tafalla, C; Perez, L; Coll, J M; Estepa, A

    2009-03-18

    A plasmid DNA encoding the viral hemorrhagic septicaemia virus (VHSV)-G glycoprotein under the control of 5' sequences (enhancer/promoter sequence plus both non-coding 1st exon and 1st intron sequences) from the carp beta-actin gene (pAE6-G(VHSV)) was compared to the commonly described vaccine plasmid in which gene expression is regulated by the human cytomegalovirus (CMV) immediate-early promoter (pMCV1.4-G(VHSV)). We observed that these two plasmids produced markedly different profiles in the level and timing of expression of the encoded antigen, and this may have a direct effect upon the intensity and suitability of the in vivo immune response. Thus, fish genetic immunisation assays were carried out to study the immune response to both plasmids. A significantly enhanced specific-antibody response against the viral glycoprotein was found in the fish immunised with pAE6-G(VHSV). However, the protective efficacy against VHSV challenge conferred by both plasmids was similar. Later analysis of the transcription profile of a set of representative immune-related genes in the DNA-immunised fish suggested that, depending on the plasmid-related regulatory sequences controlling its expression, the plasmid might activate distinct patterns of the immune system. Altogether, the results from this study mainly point out that the selection of a particular encoded-antigen/vector combination for genetic immunisation is of extraordinary importance in designing optimised DNA vaccines that, when required for inducing a protective immune response, could elicit responses biased to antigen-specific antibodies or cytotoxic T cell generation.

  5. The incorporation of epigenetics in artificial gene regulatory networks.

    PubMed

    Turner, Alexander P; Lones, Michael A; Fuente, Luis A; Stepney, Susan; Caves, Leo S D; Tyrrell, Andy M

    2013-05-01

    Artificial gene regulatory networks are computational models that draw inspiration from biological networks of gene regulation. Since their inception they have been used to infer knowledge about gene regulation and as methods of computation. These computational models have been shown to possess properties typically found in the biological world, such as robustness and self organisation. Recently, it has become apparent that epigenetic mechanisms play an important role in gene regulation. This paper describes a new model, the Artificial Epigenetic Regulatory Network (AERN) which builds upon existing models by adding an epigenetic control layer. Our results demonstrate that AERNs are more adept at controlling multiple opposing trajectories when applied to a chaos control task within a conservative dynamical system, suggesting that AERNs are an interesting area for further investigation.

  6. Angiotensin II-regulated transcription regulatory genes in adrenal steroidogenesis.

    PubMed

    Romero, Damian G; Gomez-Sanchez, Elise P; Gomez-Sanchez, Celso E

    2010-11-29

    Transcription regulatory genes are crucial modulators of cell physiology and metabolism whose intracellular levels are tightly controlled in response to extracellular stimuli. We previously reported a set of 29 transcription regulatory genes modulated by angiotensin II in H295R human adrenocortical cells and their roles in regulating the expression of the last and unique enzymes of the glucocorticoid and mineralocorticoid biosynthetic pathways, 11β-hydroxylase and aldosterone synthase, respectively, using gene expression reporter assays. To study the effect of this set of transcription regulatory genes on adrenal steroidogenesis, H295R cells were transfected by high-efficiency nucleofection and aldosterone and cortisol were measured in cell culture supernatants under basal and angiotensin II-stimulated conditions. BCL11B, BHLHB2, CITED2, ELL2, HMGA1, MAFF, NFIL3, PER1, SERTAD1, and VDR significantly stimulated aldosterone secretion, while EGR1, FOSB, and ZFP295 decreased aldosterone secretion. BTG2, HMGA1, MITF, NR4A1, and ZFP295 significantly increased cortisol secretion, while BCL11B, NFIL3, PER1, and SIX2 decreased cortisol secretion. We also report the effect of some of these regulators on the expression of endogenous aldosterone synthase and 11β-hydroxylase under basal and angiotensin II-stimulated conditions. In summary, this study reports for the first time the effects of a set of angiotensin II-modulated transcription regulatory genes on aldosterone and cortisol secretion and the expression levels of the last and unique enzymes of the mineralocorticoid and glucocorticoid biosynthetic pathways. Abnormal regulation of mineralocorticoid or glucocorticoid secretion is involved in several pathophysiological conditions. These transcription regulatory genes may be involved in adrenal steroidogenesis pathologies; thus they merit additional study as potential candidates for therapeutic intervention.

  7. A gene regulatory network armature for T-lymphocyte specification

    SciTech Connect

    Fung, Elizabeth-sharon

    2008-01-01

    Choice of a T-lymphoid fate by hematopoietic progenitor cells depends on sustained Notch-Delta signaling combined with tightly regulated activities of multiple transcription factors. To dissect the regulatory network connections that mediate this process, we have used high-resolution analysis of regulatory gene expression trajectories from the beginning to the end of specification; tests of the short-term Notch-dependence of these gene expression changes; and perturbation analyses of the effects of overexpression of two essential transcription factors, namely PU.1 and GATA-3. Quantitative expression measurements of >50 transcription factor and marker genes have been used to derive the principal components of regulatory change through which T-cell precursors progress from primitive multipotency to T-lineage commitment. Distinct parts of the path reveal separate contributions of Notch signaling, GATA-3 activity, and downregulation of PU.1. Using BioTapestry, the results have been assembled into a draft gene regulatory network for the specification of T-cell precursors and the choice of T as opposed to myeloid dendritic or mast-cell fates. This network also accommodates effects of E proteins and mutual repression circuits of Gfi1 against Egr-2 and of TCF-1 against PU.1 as proposed elsewhere, but requires additional functions that remain unidentified. Distinctive features of this network structure include the intense dose-dependence of GATA-3 effects; the gene-specific modulation of PU.1 activity based on Notch activity; the lack of direct opposition between PU.1 and GATA-3; and the need for a distinct, late-acting repressive function or functions to extinguish stem and progenitor-derived regulatory gene expression.

  8. Nemertean Toxin Genes Revealed through Transcriptome Sequencing

    PubMed Central

    Whelan, Nathan V.; Kocot, Kevin M.; Santos, Scott R.; Halanych, Kenneth M.

    2014-01-01

    Nemerteans are one of few animal groups that have evolved the ability to utilize toxins for both defense and subduing prey, but little is known about specific nemertean toxins. In particular, no study has identified specific toxin genes even though peptide toxins are known from some nemertean species. Information about toxin genes is needed to better understand evolution of toxins across animals and possibly provide novel targets for pharmaceutical and industrial applications. We sequenced and annotated transcriptomes of two free-living and one commensal nemertean and annotated an additional six publicly available nemertean transcriptomes to identify putative toxin genes. Approximately 63–74% of predicted open reading frames in each transcriptome were annotated with gene names, and all species had similar percentages of transcripts annotated with each higher-level GO term. Every nemertean analyzed possessed genes with high sequence similarities to known animal toxins including those from stonefish, cephalopods, and sea anemones. One toxin-like gene found in all nemerteans analyzed had high sequence similarity to Plancitoxin-1, a DNase II hepatotoxin that may function well at low pH, which suggests that the acidic body walls of some nemerteans could work to enhance the efficacy of protein toxins. The highest number of toxin-like genes found in any one species was seven and the lowest was three. The diversity of toxin-like nemertean genes found here is greater than previously documented, and these animals are likely an ideal system for exploring toxin evolution and industrial applications of toxins. PMID:25432940

  9. Efficient reverse-engineering of a developmental gene regulatory network.

    PubMed

    Crombach, Anton; Wotton, Karl R; Cicin-Sain, Damjan; Ashyraliyev, Maksat; Jaeger, Johannes

    2012-01-01

    Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to

  10. Efficient Reverse-Engineering of a Developmental Gene Regulatory Network

    PubMed Central

    Cicin-Sain, Damjan; Ashyraliyev, Maksat; Jaeger, Johannes

    2012-01-01

    Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to

  11. Inferring transcription factor collaborations in gene regulatory networks

    PubMed Central

    2014-01-01

    Background: Living cells are realized by complex gene expression programs that are moderated by regulatory proteins called transcription factors (TFs). The TFs control the differential expression of target genes in the context of transcriptional regulatory networks (TRNs), either individually or in groups. Deciphering the mechanisms of how the TFs control the expression of target genes is a challenging task, especially when multiple TFs collaboratively participate in the transcriptional regulation. Results: We model the underlying regulatory interactions in terms of the directions (activation or repression) and their logical roles (necessary and/or sufficient) with a modified association rule mining approach, called mTRIM. The experiment on yeast discovered 670 regulatory interactions, in which multiple TFs express their functions on common target genes collaboratively. The evaluation on yeast genetic interactions, TF knockouts and a synthetic dataset shows that our algorithm is significantly better than the existing ones. Conclusions: mTRIM is a novel method to infer TF collaborations in transcriptional regulation networks. mTRIM is available at http://www.msu.edu/~jinchen/mTRIM. PMID:24565025

  12. Compartmentalized gene regulatory network of the pathogenic fungus Fusarium graminearum

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Head blight caused by Fusarium graminearum (Fg) is a major limiting factor of wheat production with both yield loss and mycotoxin contamination. Here we report a model for global Fg gene regulatory networks (GRNs) inferred from a large collection of transcriptomic data using a machine-learning appro...

  13. Second order optimization for the inference of gene regulatory pathways.

    PubMed

    Das, Mouli; Murthy, Chivukula A; De, Rajat K

    2014-02-01

    With the increasing availability of experimental data on gene interactions, modeling of gene regulatory pathways has gained special attention. Gradient descent algorithms have been widely used for regression and classification applications. Unfortunately, results obtained after training a model by gradient descent are often highly variable. In this paper, we present a new second order learning rule based on Newton's method for inferring optimal gene regulatory pathways. Unlike the gradient descent method, the proposed optimization rule is independent of the learning parameter. The flow vectors are estimated based on biomass conservation. A set of constraints is formulated incorporating weighting coefficients. The method calculates the maximal expression of the target gene starting from a given initial gene through these weighting coefficients. Our algorithm has been benchmarked and validated on certain types of functions and on some gene regulatory networks gathered from the literature. The proposed method has been found to perform better than gradient descent learning. Extensive performance comparison with the extreme pathway analysis method has underlined the effectiveness of our proposed methodology.
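    The contrast this abstract draws between gradient descent and a Newton-type rule can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it minimizes a hypothetical one-dimensional convex function f(w) = exp(w) - 2w (minimum at w = ln 2), chosen only for illustration, and the function names are assumptions. It shows that the gradient descent result after a fixed number of steps depends on the learning rate, while the Newton update has no such parameter.

```python
import math


def grad(w):
    """First derivative of the toy objective f(w) = exp(w) - 2*w."""
    return math.exp(w) - 2.0


def hess(w):
    """Second derivative of the toy objective (always positive, so f is convex)."""
    return math.exp(w)


def gradient_descent(w0, learning_rate, steps):
    """First-order rule: the step size is set by a user-chosen learning rate."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w


def newton(w0, steps):
    """Second-order rule: the step is scaled by the curvature, no learning rate."""
    w = w0
    for _ in range(steps):
        w -= grad(w) / hess(w)
    return w


if __name__ == "__main__":
    target = math.log(2.0)
    for lr in (0.01, 0.1, 0.5):
        w = gradient_descent(0.0, lr, steps=20)
        print(f"gradient descent, lr={lr}: error {abs(w - target):.2e}")
    print(f"Newton, 6 steps: error {abs(newton(0.0, 6) - target):.2e}")
```

    On this toy function the small learning rate leaves gradient descent far from the minimum after 20 steps, whereas six Newton steps reach it to high precision. Real pathway inference problems are multivariate and constrained, so this only conveys the general idea.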

  14. A Genome-Wide Regulatory Framework Identifies Maize Pericarp Color1 Controlled Genes

    PubMed Central

    Morohashi, Kengo; Casas, María Isabel; Ferreyra, Lorena Falcone; Mejía-Guerra, María Katherine; Pourcel, Lucille; Yilmaz, Alper; Feller, Antje; Carvalho, Bruna; Emiliani, Julia; Rodriguez, Eduardo; Pellegrinet, Silvina; McMullen, Michael; Casati, Paula; Grotewold, Erich

    2012-01-01

    Pericarp Color1 (P1) encodes an R2R3-MYB transcription factor responsible for the accumulation of insecticidal flavones in maize (Zea mays) silks and red phlobaphene pigments in pericarps and other floral tissues, which makes P1 an important visual marker. Using genome-wide expression analyses (RNA sequencing) in pericarps and silks of plants with contrasting P1 alleles combined with chromatin immunoprecipitation coupled with high-throughput sequencing, we show here that the regulatory functions of P1 are much broader than the activation of genes corresponding to enzymes in a branch of flavonoid biosynthesis. P1 modulates the expression of several thousand genes, and ∼1500 of them were identified as putative direct targets of P1. Among them, we identified F2H1, corresponding to a P450 enzyme that converts naringenin into 2-hydroxynaringenin, a key branch point in the P1-controlled pathway and the first step in the formation of insecticidal C-glycosyl flavones. Unexpectedly, the binding of P1 to gene regulatory regions can result in both gene activation and repression. Our results indicate that P1 is the major regulator for a set of genes involved in flavonoid biosynthesis and a minor modulator of the expression of a much larger gene set that includes genes involved in primary metabolism and production of other specialized compounds. PMID:22822204

  15. Full-Length Minor Ampullate Spidroin Gene Sequence

    PubMed Central

    Chen, Gefei; Liu, Xiangqin; Zhang, Yunlong; Lin, Senzhu; Yang, Zijiang; Johansson, Jan; Rising, Anna; Meng, Qing

    2012-01-01

    Spider silk includes seven protein-based fibers and glue-like substances produced by glands in the spider's abdomen. Minor ampullate silk is used to make the auxiliary spiral of the orb-web and also for wrapping prey, has a high tensile strength and does not supercontract in water. So far, only partial cDNA sequences have been obtained for minor ampullate spidroins (MiSps). Here we describe the first MiSp full-length gene sequence from the spider species Araneus ventricosus, using a multidimensional PCR approach. Comparative analysis of the sequence reveals regulatory elements, as well as unique spidroin gene and protein architecture including the presence of an unusually large intron. The spliced full-length transcript of the MiSp gene is 5440 bp in size and encodes 1766 amino acid residues organized into conserved nonrepetitive N- and C-terminal domains and a central predominantly repetitive region composed of four units that are iterated in a nonregular manner. The repeats are more conserved within A. ventricosus MiSp than compared to repeats from homologous proteins, and are interrupted by two nonrepetitive spacer regions, which have 100% identity even at the nucleotide level. PMID:23251707

  16. Isolation and computer analysis of the 5'-regulatory region of the seed storage protein gene from buckwheat (Fagopyrum esculentum Moench).

    PubMed

    Milisavljević, Mira Dj; Konstantinović, Miroslav M; Brkljacić, Jelena M; Maksimović, Vesna R

    2005-03-23

    Using the modified rapid amplification of cDNA ends (5'-RACE) approach, a fragment containing the 955 bp long 5'-regulatory region of the buckwheat storage globulin gene (FeLEG1) has been amplified from the genomic DNA of buckwheat. The entire fragment was sequenced, and the sequence was analyzed by computer prediction of cis-regulatory elements possibly involved in tissue-specific and developmentally controlled seed storage protein gene expression. The promoter obtained might be interesting not only for fundamental research but also as a useful tool for biotechnological application.

  17. Impacts of Neanderthal-Introgressed Sequences on the Landscape of Human Gene Expression.

    PubMed

    McCoy, Rajiv C; Wakefield, Jon; Akey, Joshua M

    2017-02-23

    Regulatory variation influencing gene expression is a key contributor to phenotypic diversity, both within and between species. Unfortunately, RNA degrades too rapidly to be recovered from fossil remains, limiting functional genomic insights about our extinct hominin relatives. Many Neanderthal sequences survive in modern humans due to ancient hybridization, providing an opportunity to assess their contributions to transcriptional variation and to test hypotheses about regulatory evolution. We developed a flexible Bayesian statistical approach to quantify allele-specific expression (ASE) in complex RNA-seq datasets. We identified widespread expression differences between Neanderthal and modern human alleles, indicating pervasive cis-regulatory impacts of introgression. Brain regions and testes exhibited significant downregulation of Neanderthal alleles relative to other tissues, consistent with natural selection influencing the tissue-specific regulatory landscape. Our study demonstrates that Neanderthal-inherited sequences are not silent remnants of ancient interbreeding but have measurable impacts on gene expression that contribute to variation in modern human phenotypes.

  18. Molecular characterization of a maize regulatory gene

    SciTech Connect

    Wessler, S.R.

    1991-12-01

    Based on initial bombardment studies we have previously concluded that promoter diversity was responsible for the diversity of naturally occurring R alleles. During this period we have found that R is controlled at the level of translation initiation and intron 1 is alternatively spliced. The experiments described in Sections 1 and 2 sought to quantify these effects and to determine whether they contribute to the tissue specific expression of select R alleles. This study was done because very little is understood about the post-transcriptional regulation of plant genes. Sections 3 and 4 describe experiments designed to identify important structural components of the R protein.

  19. Data- and knowledge-based modeling of gene regulatory networks: an update

    PubMed Central

    Linde, Jörg; Schulze, Sylvie; Henkel, Sebastian G.; Guthke, Reinhard

    2015-01-01

    Gene regulatory network inference is a systems biology approach which predicts interactions between genes with the help of high-throughput data. In this review, we present current and updated network inference methods focusing on novel techniques for data acquisition, network inference assessment, network inference for interacting species and the integration of prior knowledge. After the advance of Next-Generation-Sequencing of cDNAs derived from RNA samples (RNA-Seq) we discuss in detail its application to network inference. Furthermore, we present progress for large-scale or even full-genomic network inference as well as for small-scale condensed network inference and review advances in the evaluation of network inference methods by crowdsourcing. Finally, we reflect the current availability of data and prior knowledge sources and give an outlook for the inference of gene regulatory networks that reflect interacting species, in particular pathogen-host interactions. PMID:27047314

  20. Inferring slowly-changing dynamic gene-regulatory networks

    PubMed Central

    2015-01-01

    Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in isolation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a class of models that connect the network with conditional independence relationships between random variables. By interpreting these random variables as gene activities and the conditional independence relationships as functional non-relatedness, graphical models have been used to describe gene-regulatory networks. Whereas the literature has been focused on static networks, most time-course experiments are designed in order to tease out temporal changes in the underlying network. It is typically reasonable to assume that changes in genomic networks are few, because biological systems tend to be stable. We introduce a new model for estimating slow changes in dynamic gene-regulatory networks, which is suitable for high-dimensional data, e.g. time-course microarray data. Our aim is to estimate a dynamically changing genomic network based on temporal activity measurements of the genes in the network. Our method is based on the penalized likelihood with an ℓ1-norm penalty that penalizes conditional dependencies between genes as well as differences between conditional independence elements across time points. We also present a heuristic search strategy to find optimal tuning parameters. We re-write the penalized maximum likelihood problem into a standard convex optimization problem subject to linear equality constraints. We show that our method performs well in simulation studies. Finally, we apply the proposed model to a time-course T-cell dataset. PMID:25917062
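    A penalized likelihood of the kind this abstract describes can be written, under assumptions, in a generic fused graphical lasso form. The notation below is hypothetical and not taken from the paper: \Theta_t is the precision (inverse covariance) matrix at time point t, S_t the sample covariance at that time point, and \lambda_1, \lambda_2 the tuning parameters. The model actually proposed may differ in its details.

```latex
\[
\{\hat{\Theta}_t\}_{t=1}^{T}
  = \arg\max_{\Theta_1,\dots,\Theta_T \succ 0}\;
    \sum_{t=1}^{T} \Bigl[ \log\det\Theta_t - \operatorname{tr}(S_t \Theta_t) \Bigr]
    \;-\; \lambda_1 \sum_{t=1}^{T} \sum_{i \neq j} \bigl|\theta^{(t)}_{ij}\bigr|
    \;-\; \lambda_2 \sum_{t=2}^{T} \sum_{i \neq j} \bigl|\theta^{(t)}_{ij} - \theta^{(t-1)}_{ij}\bigr|
\]
```

    The first penalty encourages sparse networks (few conditional dependencies between genes), and the second encourages the network to change slowly from one time point to the next.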

  1. The Transcriptional and Gene Regulatory Network of Lactococcus lactis MG1363 during Growth in Milk

    PubMed Central

    de Jong, Anne; Hansen, Morten E.; Kuipers, Oscar P.; Kilstrup, Mogens; Kok, Jan

    2013-01-01

    In the present study we examine the changes in the expression of genes of Lactococcus lactis subspecies cremoris MG1363 during growth in milk. To reveal which specific classes of genes (pathways, operons, regulons, COGs) are important, we performed a transcriptome time series experiment. Global analysis of gene expression over time showed that L. lactis adapted quickly to the environmental changes. Using upstream sequences of genes with correlated gene expression profiles, we uncovered a substantial number of putative DNA binding motifs that may be relevant for L. lactis fermentative growth in milk. All available novel and literature-derived data were integrated into network reconstruction building blocks, which were used to reconstruct and visualize the L. lactis gene regulatory network. This network enables easy mining in the chrono-transcriptomics data. A freely available website at http://milkts.molgenrug.nl gives full access to all transcriptome data, to the reconstructed network and to the individual network building blocks. PMID:23349698

  2. Next-generation tag sequencing for cancer gene expression profiling.

    PubMed

    Morrissy, A Sorana; Morin, Ryan D; Delaney, Allen; Zeng, Thomas; McDonald, Helen; Jones, Steven; Zhao, Yongjun; Hirst, Martin; Marra, Marco A

    2009-10-01

    We describe a new method, Tag-seq, which employs ultra high-throughput sequencing of 21 base pair cDNA tags for sensitive and cost-effective gene expression profiling. We compared Tag-seq data to LongSAGE data and observed improved representation of several classes of rare transcripts, including transcription factors, antisense transcripts, and intronic sequences, the latter possibly representing novel exons or genes. We observed increases in the diversity, abundance, and dynamic range of such rare transcripts and took advantage of the greater dynamic range of expression to identify, in cancers and normal libraries, altered expression ratios of alternative transcript isoforms. The strand-specific information of Tag-seq reads further allowed us to detect altered expression ratios of sense and antisense (S-AS) transcripts between cancer and normal libraries. S-AS transcripts were enriched in known cancer genes, while transcript isoforms were enriched in miRNA targeting sites. We found that transcript abundance had a stronger GC-bias in LongSAGE than Tag-seq, such that AT-rich tags were less abundant than GC-rich tags in LongSAGE. Tag-seq also performed better in gene discovery, identifying >98% of genes detected by LongSAGE and profiling a distinct subset of the transcriptome characterized by AT-rich genes, which was expressed at levels below those detectable by LongSAGE. Overall, Tag-seq is sensitive to rare transcripts, has less sequence composition bias relative to LongSAGE, and allows differential expression analysis for a greater range of transcripts, including transcripts encoding important regulatory molecules.

  3. Allelic polymorphism in transcriptional regulatory regions of HLA-DQB genes

    PubMed Central

    1991-01-01

    Class II genes of the human major histocompatibility complex (MHC) are highly polymorphic. Allelic variation of structural genes provides diversity in immune cell interactions, contributing to the formation of the T cell repertoire and to susceptibility to certain autoimmune diseases. We now report that allelic polymorphism also exists in the promoter and upstream regulatory regions (URR) of human histocompatibility leukocyte antigen (HLA) class II genes. Nucleotide sequencing of these regulatory regions of seven alleles of the DQB locus reveals a number of allele-specific polymorphisms, some of which lie in functionally critical consensus regions thought to be highly conserved in class II promoters. These sequence differences also correspond to allelic differences in binding of nuclear proteins to the URR. Fragments of the URR of two DQB alleles were analyzed for binding to nuclear proteins extracted from human B lymphoblastoid cell lines (B-LCL). Gel retardation assays showed substantially different banding patterns to the two promoters, including prominent variation in nuclear protein binding to the partially conserved X box regions and a novel upstream polymorphic sequence element. Comparison of these two polymorphic alleles in a transient expression system demonstrated a marked difference in their promoter strengths determined by relative abilities to initiate transcription of the chloramphenicol acetyltransferase reporter gene in human B-LCL. Shuttling of URR sequences between alleles showed that functional variation corresponded to both the X box and upstream sequence polymorphic sites. These findings identify an important source of MHC class II diversity, and suggest the possibility that such regulatory region polymorphisms may confer allelic differences in expression, inducibility, and/or tissue specificity of class II molecules. PMID:1985121

  4. Topological origin of global attractors in gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Zhang, YunJun; Ouyang, Qi; Geng, Zhi

    2015-02-01

    Fixed-point attractors with global stability manifest themselves in a number of gene regulatory networks. This property indicates the stability of regulatory networks against small state perturbations and is closely related to other complex dynamics. In this paper, we aim to reveal the core modules in regulatory networks that determine their global attractors and the relationship between these core modules and other motifs. This work has been done in three steps. Firstly, inspired by the signal transmission in the regulation process, we extract the model of a chain-like network from regulation networks. We propose a module, the "ideal transmission chain (ITC)", which is proved sufficient and necessary (under certain conditions) to form a global fixed point in the context of a chain-like network. Secondly, by examining two well-studied regulatory networks (i.e., the cell-cycle regulatory networks of budding yeast and fission yeast), we identify the ideal modules in true regulation networks and demonstrate that the modules have a superior contribution to network stability (quantified by the relative size of the biggest attraction basin). Thirdly, in these two regulation networks, we find that the double negative feedback loops, which are the key motifs of forming bistability in regulation, are connected to these core modules with high network stability. These results have shed new light on the connection between the topological feature and the dynamic property of regulatory networks.
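    The stability measure mentioned in this abstract, the relative size of the biggest attraction basin, can be sketched for a small Boolean network. The example below is illustrative only: the three-node interaction matrix is hypothetical, and the synchronous threshold update rule (activate if net input is positive, deactivate if negative, otherwise keep the current state) is an assumption of the kind often used in Boolean cell-cycle models, not something confirmed by the abstract.

```python
from itertools import product

# Hypothetical signed interactions: WEIGHTS[i][j] is the effect of node j on node i.
WEIGHTS = [
    [-1, 0, 0],   # node A: self-degradation
    [1, 0, -1],   # node B: activated by A, inhibited by C
    [0, 1, 0],    # node C: activated by B
]


def step(state):
    """Synchronous threshold update of every node."""
    nxt = []
    for i, row in enumerate(WEIGHTS):
        total = sum(w * s for w, s in zip(row, state))
        if total > 0:
            nxt.append(1)
        elif total < 0:
            nxt.append(0)
        else:
            nxt.append(state[i])  # zero net input: keep the current value
    return tuple(nxt)


def attractor_of(state):
    """Follow the trajectory until a state repeats; return the cycle it ends in."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return frozenset(seen[seen.index(state):])


def basin_sizes(n_nodes):
    """Count how many of the 2**n initial states flow into each attractor."""
    sizes = {}
    for state in product((0, 1), repeat=n_nodes):
        att = attractor_of(state)
        sizes[att] = sizes.get(att, 0) + 1
    return sizes


if __name__ == "__main__":
    sizes = basin_sizes(len(WEIGHTS))
    for att, size in sizes.items():
        print(sorted(att), size)
    print("relative size of biggest basin:", max(sizes.values()) / 2 ** len(WEIGHTS))
```

    Exhaustive enumeration is feasible only for small networks, since the state space doubles with every added node.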

  5. Modularity and evolutionary constraints in a baculovirus gene regulatory network

    PubMed Central

    2013-01-01

    Background: The structure of regulatory networks remains an open question in our understanding of complex biological systems. Interactions during complete viral life cycles present unique opportunities to understand how host-parasite networks take shape and behave. The Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) is a large double-stranded DNA virus, whose genome may encode 152 open reading frames (ORFs). Here we present the analysis of the ordered cascade of the AgMNPV gene expression. Results: We observed an earlier onset of the expression than previously reported for other baculoviruses, especially for genes involved in DNA replication. Most ORFs were expressed at higher levels in a more permissive host cell line. Genes with more than one copy in the genome had distinct expression profiles, which could indicate the acquisition of new functionalities. The transcription gene regulatory network (GRN) for 149 ORFs had a modular topology comprising five communities of highly interconnected nodes that separated key genes that are functionally related into different communities, possibly maximizing redundancy and GRN robustness by compartmentalization of important functions. Core conserved functions showed expression synchronicity, distinct GRN features and significantly less genetic diversity, consistent with evolutionary constraints imposed in key elements of biological systems. This reduced genetic diversity also had a positive correlation with the importance of the gene in our estimated GRN, supporting a relationship between phylogenetic data of baculovirus genes and network features inferred from expression data. We also observed that gene arrangement in overlapping transcripts was conserved among related baculoviruses, suggesting a principle of genome organization. Conclusions: Albeit with a reduced number of nodes (149), the AgMNPV GRN had a topology and key characteristics similar to those observed in complex cellular organisms, which indicates

  6. Gap Gene Regulatory Dynamics Evolve along a Genotype Network

    PubMed Central

    Crombach, Anton; Wotton, Karl R.; Jiménez-Guri, Eva; Jaeger, Johannes

    2016-01-01

    Developmental gene networks implement the dynamic regulatory mechanisms that pattern and shape the organism. Over evolutionary time, the wiring of these networks changes, yet the patterning outcome is often preserved, a phenomenon known as “system drift.” System drift is illustrated by the gap gene network—involved in segmental patterning—in dipteran insects. In the classic model organism Drosophila melanogaster and the nonmodel scuttle fly Megaselia abdita, early activation and placement of gap gene expression domains show significant quantitative differences, yet the final patterning output of the system is essentially identical in both species. In this detailed modeling analysis of system drift, we use gene circuits which are fit to quantitative gap gene expression data in M. abdita and compare them with an equivalent set of models from D. melanogaster. The results of this comparative analysis show precisely how compensatory regulatory mechanisms achieve equivalent final patterns in both species. We discuss the larger implications of the work in terms of “genotype networks” and the ways in which the structure of regulatory networks can influence patterns of evolutionary change (evolvability). PMID:26796549

  7. Regulatory hotspots are associated with plant gene expression under varying soil phosphorus supply in Brassica rapa.

    PubMed

    Hammond, John P; Mayes, Sean; Bowen, Helen C; Graham, Neil S; Hayden, Rory M; Love, Christopher G; Spracklen, William P; Wang, Jun; Welham, Sue J; White, Philip J; King, Graham J; Broadley, Martin R

    2011-07-01

    Gene expression is a quantitative trait that can be mapped genetically in structured populations to identify expression quantitative trait loci (eQTL). Genes and regulatory networks underlying complex traits can subsequently be inferred. Using a recently released genome sequence, we have defined cis- and trans-eQTL and their environmental response to low phosphorus (P) availability within a complex plant genome and found hotspots of trans-eQTL within the genome. Interval mapping, using P supply as a covariate, revealed 18,876 eQTL. trans-eQTL hotspots occurred on chromosomes A06 and A01 within Brassica rapa; these were enriched with P metabolism-related Gene Ontology terms (A06) as well as chloroplast- and photosynthesis-related terms (A01). We have also attributed heritability components to measures of gene expression across environments, allowing the identification of novel gene expression markers and gene expression changes associated with low P availability. Informative gene expression markers were used to map eQTL and P use efficiency-related QTL. Genes responsive to P supply had large environmental and heritable variance components. Regulatory loci and genes associated with P use efficiency identified through eQTL analysis are potential targets for further characterization and may have potential for crop improvement.

  8. Identification of cancer-related genes and motifs in the human gene regulatory network.

    PubMed

    Carson, Matthew B; Gu, Jianlei; Yu, Guangjun; Lu, Hui

    2015-08-01

    The authors investigated the regulatory network motifs and corresponding motif positions of cancer-related genes. First, they mapped disease-related genes to a transcription factor regulatory network. Next, they calculated statistically significant motifs and subsequently identified positions within these motifs that were enriched in cancer-related genes. Potential mechanisms of these motifs and positions are discussed. These results could be used to identify other disease- and cancer-related genes and could also suggest mechanisms for how these genes relate to co-occurring diseases.

  9. Cis-regulatory sequence variation and association with Mycoplasma load in natural populations of the house finch (Carpodacus mexicanus)

    PubMed Central

    Backström, Niclas; Shipilina, Daria; Blom, Mozes P K; Edwards, Scott V

    2013-01-01

    Characterization of the genetic basis of fitness traits in natural populations is important for understanding how organisms adapt to the changing environment and to novel events, such as epizootics. However, candidate fitness-influencing loci, such as regulatory regions, are usually unavailable in nonmodel species. Here, we analyze sequence data from targeted resequencing of the cis-regulatory regions of three candidate genes for disease resistance (CD74, HSP90α, and LCP1) in populations of the house finch (Carpodacus mexicanus) historically exposed (Alabama) and naïve (Arizona) to Mycoplasma gallisepticum. Our study, the first to quantify variation in regulatory regions in wild birds, reveals that the upstream regions of CD74 and HSP90α are GC-rich, with the former exhibiting unusually low sequence variation for this species. We identified two SNPs, located in a GC-rich region immediately upstream of an inferred promoter site in the gene HSP90α, that were significantly associated with Mycoplasma pathogen load in the two populations. The SNPs are closely linked and situated in potential regulatory sequences: one in a binding site for the transcription factor nuclear NFYα and the other in a dinucleotide microsatellite ((GC)6). The genotype associated with pathogen load in the putative NFYα binding site was significantly overrepresented in the Alabama birds. However, we did not see strong effects of selection at this SNP, perhaps because selection has acted on standing genetic variation over an extremely short time in a highly recombining region. Our study is a useful starting point to explore functional relationships between sequence polymorphisms, gene expression, and phenotypic traits, such as pathogen resistance that affect fitness in the wild. PMID:23532859

  10. Constraint and contingency in multifunctional gene regulatory circuits.

    PubMed

    Payne, Joshua L; Wagner, Andreas

    2013-01-01

    Gene regulatory circuits drive the development, physiology, and behavior of organisms from bacteria to humans. The phenotypes or functions of such circuits are embodied in the gene expression patterns they form. Regulatory circuits are typically multifunctional, forming distinct gene expression patterns in different embryonic stages, tissues, or physiological states. Any one circuit with a single function can be realized by many different regulatory genotypes. Multifunctionality presumably constrains this number, but we do not know to what extent. We here exhaustively characterize a genotype space harboring millions of model regulatory circuits and all their possible functions. As a circuit's number of functions increases, the number of genotypes with a given number of functions decreases exponentially but can remain very large for a modest number of functions. However, the sets of circuits that can form any one set of functions become increasingly fragmented. As a result, historical contingency becomes widespread in circuits with many functions. Whether a circuit can acquire an additional function in the course of its evolution becomes increasingly dependent on the functions it already has. Circuits with many functions also become increasingly brittle and sensitive to mutation. These observations are generic properties of a broad class of circuits and independent of any one circuit genotype or phenotype.

  11. Genomic imprinting-an epigenetic gene-regulatory model.

    PubMed

    Koerner, Martha V; Barlow, Denise P

    2010-04-01

    Epigenetic mechanisms (Box 1) are considered to play major gene-regulatory roles in development, differentiation and disease. However, the relative importance of epigenetics in defining the mammalian transcriptome in normal and disease states is unknown. The mammalian genome contains only a few model systems where epigenetic gene regulation has been shown to play a major role in transcriptional control. These model systems are important not only to investigate the biological function of known epigenetic modifications but also to identify new and unexpected epigenetic mechanisms in the mammalian genome. Here we review recent progress in understanding how epigenetic mechanisms control imprinted gene expression.

  12. Additive Functions in Boolean Models of Gene Regulatory Network Modules

    PubMed Central

    Darabos, Christian; Di Cunto, Ferdinando; Tomassini, Marco; Moore, Jason H.; Provero, Paolo; Giacobini, Mario

    2011-01-01

    Gene-on-gene regulations are key components of every living organism. Dynamical abstract models of genetic regulatory networks help explain the genome's evolvability and robustness. These properties can be attributed to the structural topology of the graph formed by genes, as vertices, and regulatory interactions, as edges. Moreover, the actual gene interaction of each gene is believed to play a key role in the stability of the structure. With advances in biology, some effort was deployed to develop update functions in Boolean models that include recent knowledge. We combine real-life gene interaction networks with novel update functions in a Boolean model. We use two sub-networks of biological organisms, the yeast cell-cycle and the mouse embryonic stem cell, as topological support for our system. On these structures, we substitute the original random update functions by a novel threshold-based dynamic function in which the promoting and repressing effect of each interaction is considered. We use a third real-life regulatory network, along with its inferred Boolean update functions, to validate the proposed update function. Results of this validation hint at increased biological plausibility of the threshold-based function. To investigate the dynamical behavior of this new model, we visualized the phase transition between order and chaos into the critical regime using Derrida plots. We complement the qualitative nature of Derrida plots with an alternative measure, the criticality distance, which also makes it possible to discriminate between regimes in a quantitative way. Simulations on both real-life genetic regulatory networks show that there exists a set of parameters that allows the systems to operate in the critical region. This new model includes experimentally derived biological information and recent discoveries, which makes it potentially useful to guide experimental research. The update function confers additional realism to the model, while reducing the complexity
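
    To make the idea of a threshold-based update concrete, the following is a minimal sketch of one common additive rule for signed Boolean networks. It is not the authors' exact function: the edge encoding, the default threshold of 0, the "keep the current state on a tie" rule, and the toy three-gene network are all assumptions made for illustration. The `hamming` helper shows the distance between two states, the quantity that a Derrida plot compares before and after one update step.

```python
from typing import Dict, List, Tuple

# (regulator, target, sign): sign is +1 for a promoting interaction,
# -1 for a repressing one. This encoding is an assumption of the sketch.
Edge = Tuple[str, str, int]
State = Dict[str, int]


def step(state: State, edges: List[Edge], threshold: int = 0) -> State:
    """Synchronously update every gene from the signed sum of its active inputs."""
    totals = {gene: 0 for gene in state}
    for regulator, target, sign in edges:
        if state[regulator] == 1:
            totals[target] += sign

    next_state = {}
    for gene, current in state.items():
        if totals[gene] > threshold:
            next_state[gene] = 1
        elif totals[gene] < threshold:
            next_state[gene] = 0
        else:
            # Tie rule (assumed): the gene keeps its current value.
            next_state[gene] = current
    return next_state


def hamming(a: State, b: State) -> int:
    """Number of genes whose values differ between two states."""
    return sum(a[gene] != b[gene] for gene in a)


# Hypothetical toy network: A promotes B, B promotes C, C represses A.
edges = [("A", "B", +1), ("B", "C", +1), ("C", "A", -1)]
state = {"A": 1, "B": 0, "C": 0}
for _ in range(3):
    state = step(state, edges)
    print(state)
```

    Replacing random truth tables with a rule of this kind means each gene's behavior is determined by the signs of its incoming edges and a single threshold parameter, rather than by a separately specified table per gene.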

  13. Additive functions in boolean models of gene regulatory network modules.

    PubMed

    Darabos, Christian; Di Cunto, Ferdinando; Tomassini, Marco; Moore, Jason H; Provero, Paolo; Giacobini, Mario

    2011-01-01

    Gene-on-gene regulations are key components of every living organism. Dynamical abstract models of genetic regulatory networks help explain the genome's evolvability and robustness. These properties can be attributed to the structural topology of the graph formed by genes, as vertices, and regulatory interactions, as edges. Moreover, the actual gene interaction of each gene is believed to play a key role in the stability of the structure. With advances in biology, some effort was deployed to develop update functions in boolean models that include recent knowledge. We combine real-life gene interaction networks with novel update functions in a boolean model. We use two sub-networks of biological organisms, the yeast cell-cycle and the mouse embryonic stem cell, as topological support for our system. On these structures, we substitute the original random update functions by a novel threshold-based dynamic function in which the promoting and repressing effect of each interaction is considered. We use a third real-life regulatory network, along with its inferred boolean update functions, to validate the proposed update function. Results of this validation hint at increased biological plausibility of the threshold-based function. To investigate the dynamical behavior of this new model, we visualized the phase transition between order and chaos into the critical regime using Derrida plots. We complement the qualitative nature of Derrida plots with an alternative measure, the criticality distance, which also makes it possible to discriminate between regimes in a quantitative way. Simulations on both real-life genetic regulatory networks show that there exists a set of parameters that allows the systems to operate in the critical region. This new model includes experimentally derived biological information and recent discoveries, which makes it potentially useful to guide experimental research. The update function confers additional realism to the model, while reducing the complexity

  14. Third-Generation Sequencing and Analysis of Four Complete Pig Liver Esterase Gene Sequences in Clones Identified by Screening BAC Library

    PubMed Central

    Zhou, Qiongqiong; Sun, Wenjuan; Liu, Xiyan; Wang, Xiliang; Xiao, Yuncai; Bi, Dingren; Yin, Jingdong; Shi, Deshi

    2016-01-01

    Aim: Pig liver carboxylesterase (PLE) gene sequences in GenBank are incomplete, which has led to difficulties in studying the genetic structure and regulation mechanisms of gene expression of PLE family genes. The aim of this study was to obtain and analyse complete gene sequences of the PLE family by screening a Rongchang pig BAC library and applying third-generation PacBio sequencing. Methods: After a number of existing incomplete PLE isoform gene sequences were analysed, primers were designed based on conserved regions in PLE exons, and the whole pig genome was used as a template for polymerase chain reaction (PCR) amplification. Specific primers were then selected based on the PCR amplification results. A three-step PCR screening method was used to identify PLE-positive clones by screening a Rongchang pig BAC library, and PacBio third-generation sequencing was performed. BLAST comparisons and other bioinformatics methods were applied for sequence analysis. Results: Five PLE-positive BAC clones, designated BAC-10, BAC-70, BAC-75, BAC-119 and BAC-206, were identified. Sequence analysis yielded the complete sequences of four PLE genes, PLE1, PLE-B9, PLE-C4, and PLE-G2. Complete PLE gene sequences were defined as those containing regulatory sequences, exons, and introns. Not only did the PLE exon sequences of the four genes show a high degree of homology, but the intron sequences were also highly similar. Additionally, the regulatory region of the genes contained two 720-bp reverse-complement sequences that may have an important function in the regulation of PLE gene expression. Significance: This is the first report to confirm the complete sequences of four PLE genes. In addition, the study demonstrates that each PLE isoform is encoded by a single gene and that the various genes exhibit a high degree of sequence homology, suggesting that the PLE family evolved from a single ancestral gene. Obtaining the complete sequences of these PLE genes
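
    For readers unfamiliar with the term, two DNA segments are reverse complements when one equals the other read backwards with each base swapped for its pairing partner (A with T, C with G). A minimal sketch of that check follows; the function names and the short example sequences are illustrative and are not taken from the PLE data.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def are_reverse_complements(a: str, b: str) -> bool:
    """True if sequence a is the reverse complement of sequence b (case-insensitive)."""
    return a.upper() == reverse_complement(b).upper()


print(reverse_complement("ATGC"))
print(are_reverse_complements("AAGC", "GCTT"))
```

    Applied to two candidate regions of equal length, a check like this would confirm whether they form the kind of inverted pair described in the Results.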

  15. Effects of Four Different Regulatory Mechanisms on the Dynamics of Gene Regulatory Cascades

    PubMed Central

    Hansen, Sabine; Krishna, Sandeep; Semsey, Szabolcs; Lo Svenningsen, Sine

    2015-01-01

    Gene regulatory cascades (GRCs) are common motifs in cellular molecular networks. A given logical function in these cascades, such as the repression of the activity of a transcription factor, can be implemented by a number of different regulatory mechanisms. The potential consequences for the dynamic performance of the GRC of choosing one mechanism over another have not been analysed systematically. Here, we report the construction of a synthetic GRC in Escherichia coli, which allows us for the first time to directly compare and contrast the dynamics of four different regulatory mechanisms, affecting the transcription, translation, stability, or activity of a transcriptional repressor. We developed a biologically motivated mathematical model which is sufficient to reproduce the response dynamics determined by experimental measurements. Using the model, we explored the potential response dynamics that the constructed GRC can perform. We conclude that dynamic differences between regulatory mechanisms at an individual step in a GRC are often concealed in the overall performance of the GRC, and suggest that the presence of a given regulatory mechanism in a certain network environment does not necessarily mean that it represents a single optimal evolutionary solution. PMID:26184971

  16. Genome-Wide Identification of Regulatory Elements and Reconstruction of Gene Regulatory Networks of the Green Alga Chlamydomonas reinhardtii under Carbon Deprivation

    PubMed Central

    Vischi Winck, Flavia; Arvidsson, Samuel; Riaño-Pachón, Diego Mauricio; Hempel, Sabrina; Koseska, Aneta; Nikoloski, Zoran; Urbina Gomez, David Alejandro; Rupprecht, Jens; Mueller-Roeber, Bernd

    2013-01-01

    The unicellular green alga Chlamydomonas reinhardtii is a long-established model organism for studies on photosynthesis and carbon metabolism-related physiology. Under conditions of air-level carbon dioxide concentration [CO2], a carbon concentrating mechanism (CCM) is induced to facilitate cellular carbon uptake. CCM increases the availability of carbon dioxide at the site of cellular carbon fixation. To improve our understanding of the transcriptional control of the CCM, we employed FAIRE-seq (formaldehyde-assisted Isolation of Regulatory Elements, followed by deep sequencing) to determine nucleosome-depleted chromatin regions of algal cells subjected to carbon deprivation. Our FAIRE data recapitulated the positions of known regulatory elements in the promoter of the periplasmic carbonic anhydrase (Cah1) gene, which is upregulated during CCM induction, and revealed new candidate regulatory elements at a genome-wide scale. In addition, time series expression patterns of 130 transcription factor (TF) and transcription regulator (TR) genes were obtained for cells cultured under photoautotrophic condition and subjected to a shift from high to low [CO2]. Groups of co-expressed genes were identified and a putative directed gene-regulatory network underlying the CCM was reconstructed from the gene expression data using the recently developed IOTA (inner composition alignment) method. Among the candidate regulatory genes, two members of the MYB-related TF family, Lcr1 (Low-CO2 response regulator 1) and Lcr2 (Low-CO2 response regulator 2), may play an important role in down-regulating the expression of a particular set of TF and TR genes in response to low [CO2]. The results obtained provide new insights into the transcriptional control of the CCM and revealed more than 60 new candidate regulatory genes. Deep sequencing of nucleosome-depleted genomic regions indicated the presence of new, previously unknown regulatory elements in the C. reinhardtii genome. Our work can

  17. Mapping gene regulatory circuitry of Pax6 during neurogenesis

    PubMed Central

    Thakurela, Sudhir; Tiwari, Neha; Schick, Sandra; Garding, Angela; Ivanek, Robert; Berninger, Benedikt; Tiwari, Vijay K

    2016-01-01

    Pax6 is a highly conserved transcription factor among vertebrates and is important in various aspects of central nervous system development. However, the gene regulatory circuitry of Pax6 underlying these functions remains elusive. We find that Pax6 targets a large number of promoters in neural progenitor cells. Intriguingly, many of these sites are also bound by another progenitor factor, Sox2, which cooperates with Pax6 in gene regulation. A combinatorial analysis of the Pax6-binding data set with transcriptome changes in Pax6-deficient neural progenitors reveals a dual role for Pax6, in which it activates the neuronal (ectodermal) genes while concurrently repressing the mesodermal and endodermal genes, thereby ensuring the unidirectionality of lineage commitment towards neuronal differentiation. Furthermore, Pax6 is critical for inducing the activity of transcription factors that elicit neurogenesis and for repressing others that promote non-neuronal lineages. In addition to many established downstream effectors, Pax6 directly binds and activates a number of genes that are specifically expressed in neural progenitors but have not been previously implicated in neurogenesis. The in utero knockdown of one such gene, Ift74, during brain development impairs polarity and migration of newborn neurons. These findings demonstrate new aspects of the gene regulatory circuitry of Pax6, revealing how it functions to control neuronal development at multiple levels to ensure unidirectionality and proper execution of the neurogenic program. PMID:27462442

  18. Characterization of nif regulatory genes in Rhodopseudomonas capsulata using lac gene fusions.

    PubMed

    Kranz, R G; Haselkorn, R

    1985-01-01

    Translational fusions of the Escherichia coli lacZYA operon to Rhodopseudomonas capsulata nif genes were obtained by using mini-MudII1734 [Castilho et al., J. Bacteriol. 158 (1984) 488-495] inserts into cloned fragments of R. capsulata DNA. A lac fusion to the nifH gene, which encodes dinitrogenase reductase, was used to classify Nif- mutations occurring in regulatory genes. Nine mutations were unable to activate nifHDK transcription. The nine mutations define four nif regulatory genes. Three of these genes are located on the same R. capsulata 8.4-kb EcoRI fragment. Each is transcribed independently. One of these (complementing mutant J61) is partially homologous with the ntrC gene of Escherichia coli, based on Southern hybridization. The fourth nif regulatory gene (complementing mutants LJ1, AH1 and AH3) is unlinked to the others. Lac fusions to all four regulatory genes were constructed. Each regulatory gene is weakly expressed compared to derepressed nifH and partially repressed in the presence of ammonia.

  19. EXAMINE: a computational approach to reconstructing gene regulatory networks.

    PubMed

    Deng, Xutao; Geng, Huimin; Ali, Hesham

    2005-08-01

    Reverse-engineering of gene networks using linear models often results in an underdetermined system because of excessive unknown parameters. In addition, the practical utility of linear models has remained unclear. We address these problems by developing an improved method, EXpression Array MINing Engine (EXAMINE), to infer gene regulatory networks from time-series gene expression data sets. EXAMINE takes advantage of sparse graph theory to overcome the excessive-parameter problem with an adaptive-connectivity model and fitting algorithm. EXAMINE also guarantees that the most parsimonious network structure will be found with its incremental adaptive fitting process. Compared to previous linear models, where a fully connected model is used, EXAMINE reduces the number of parameters by O(N), thereby increasing the chance of recovering the underlying regulatory network. The fitting algorithm increments the connectivity during the fitting process until a satisfactory fit is obtained. We performed a systematic study to explore the data mining ability of linear models. A guideline for using linear models is provided: If the system is small (3-20 elements), more than 90% of the regulation pathways can be determined correctly. For a large-scale system, either clustering is needed or it is necessary to integrate information in addition to expression profile. Coupled with the clustering method, we applied EXAMINE to rat central nervous system development (CNS) data with 112 genes. We were able to efficiently generate regulatory networks with statistically significant pathways that have been predicted previously.

  20. Detection and sequence analysis of accessory gene regulator genes of Staphylococcus pseudintermedius isolates

    PubMed Central

    Chitra, M. Ananda; Jayanthy, C.; Nagarajan, B.

    2015-01-01

    SP contains serine and produce lactone ring structured AIP. Conclusion: Presence of AgrA, B, and D in all SP isolates implies the importance of this regulatory system in the virulence genes expression of the SP bacteria. SP isolates can be typed based on the AgrD auto-inducible protein sequences as it is being carried out for typing of S. aureus isolates. However, further studies are required to elucidate the mechanism of controlling of virulence genes by agr gene locus in the pathogenesis of soft tissue infection by SP. PMID:27047173

  1. Exceptionally high heterologous protein levels in transgenic dicotyledonous seeds using Phaseolus vulgaris regulatory sequences.

    PubMed

    De Jaeger, Geert; Angenon, Geert; Depicker, Ann

    2003-01-01

    Seeds are concentrated sources of protein and thus may be ideal 'bioreactors' for the production of heterologous proteins. For this application, strong seed-specific expression signals are required. A set of expression cassettes was designed using 5' and 3' regulatory sequences of the seed storage protein gene arcelin 5-I (arc5-I) from Phaseolus vulgaris, and evaluated for the production of heterologous proteins in dicotyledonous plant species. A murine single-chain variable fragment (scFv) was chosen as model protein because of the current industrial interest in producing antibodies and derived fragments in crops. Because the highest scFv accumulation in seed had previously been achieved in the endoplasmic reticulum (ER), the scFv-encoding sequence was provided with signal sequences for accumulation in the ER. Transgenic Arabidopsis seed stocks, expressing the scFv under control of the 35S promoter, contained scFv accumulation levels in the range of 1% of total soluble protein (TSP). However, the seed storage promoter constructs boosted the scFv to exceptionally high levels. Maximum scFv levels were obtained in homozygous seed stocks, being 12.5% of TSP under control of the arc5-I regulatory sequences and even up to 36.5% of TSP upon replacing the arc5-I promoter by the beta-phaseolin promoter of Phaseolus vulgaris. Even at such very high levels, the scFv proteins retain their full antigen-binding activity. Moreover, the presence of very high scFv levels has only minor effects on seed germination and no effect on seed production. These results demonstrate that the expression levels of arcelin 5-I and beta-phaseolin seed storage protein genes can be transferred to heterologous proteins, giving exceptionally high levels of heterologous proteins, which can be of great value for the molecular farming industry by raising production yield and lowering bio-mass production and purification costs. Finally, the feasibility of heterologous protein production using the

  2. Gene therapy for cancer: regulatory considerations for approval.

    PubMed

    Husain, S R; Han, J; Au, P; Shannon, K; Puri, R K

    2015-12-01

    The rapidly changing field of gene therapy promises a number of innovative treatments for cancer patients. Advances in genetic modification of cancer and immune cells and the use of oncolytic viruses and bacteria have led to numerous clinical trials for cancer therapy, with several progressing to late-stage product development. At the time of this writing, no gene therapy product has been approved by the United States Food and Drug Administration (FDA). Some of the key scientific and regulatory issues include understanding of gene transfer vector biology, safety of vectors in vitro and in animal models, optimum gene transfer, long-term persistence or integration in the host, shedding of a virus and ability to maintain transgene expression in vivo for a desired period of time. Because of the biological complexity of these products, the FDA encourages a flexible, data-driven approach for preclinical safety testing programs. The clinical trial design should be based on the unique features of gene therapy products, and should ensure the safety of enrolled subjects. This article focuses on regulatory considerations for gene therapy product development and also discusses guidance documents that have been published by the FDA.

  3. Gene structure, regulatory control, and evolution of black widow venom latrotoxins

    PubMed Central

    Bhere, Kanaka Varun; Haney, Robert A.; Ayoub, Nadia A.; Garb, Jessica E.

    2014-01-01

    Black widow venom contains α-latrotoxin, infamous for causing intense pain. Combining 33 kb of Latrodectus hesperus genomic DNA with RNA-Seq, we characterized the α-latrotoxin gene and discovered a paralog, 4.5 kb downstream. Both paralogs exhibit venom gland specific transcription, and may be regulated post-transcriptionally via musashi-like proteins. A 4 kb intron interrupts the α-latrotoxin coding sequence, while a 10 kb intron in the 3′ UTR of the paralog may cause nonsense-mediated decay. Phylogenetic analysis confirms these divergent latrotoxins diversified through recent tandem gene duplications. Thus, latrotoxin genes have more complex structures, regulatory controls, and sequence diversity than previously proposed. PMID:25217831

  4. Transcriptomic sequencing reveals a set of unique genes activated by butyrate-induced histone modification

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Butyrate is a nutritional element with strong epigenetic regulatory activity as an inhibitor of histone deacetylases (HDACs). Based on the analysis of differentially expressed genes induced by butyrate in the bovine epithelial cell using deep RNA-sequencing technology (RNA-seq), a set of unique gen...

  5. DMRT gene cluster analysis in the platypus: new insights into genomic organization and regulatory regions.

    PubMed

    El-Mogharbel, Nisrine; Wakefield, Matthew; Deakin, Janine E; Tsend-Ayush, Enkhjargal; Grützner, Frank; Alsop, Amber; Ezaz, Tariq; Marshall Graves, Jennifer A

    2007-01-01

    We isolated and characterized a cluster of platypus DMRT genes and compared their arrangement, location, and sequence across vertebrates. The DMRT gene cluster on human 9p24.3 harbors, in order, DMRT1, DMRT3, and DMRT2, which share a DM domain. DMRT1 is highly conserved and involved in sexual development in vertebrates, and deletions in this region cause sex reversal in humans. Sequence comparisons of DMRT genes between species have been valuable in identifying exons, control regions, and conserved nongenic regions (CNGs). The addition of platypus sequences is expected to be particularly valuable, since monotremes fill a gap in the vertebrate genome coverage. We therefore isolated and fully sequenced platypus BAC clones containing DMRT3 and DMRT2 as well as DMRT1 and then generated multispecies alignments and ran prediction programs followed by experimental verification to annotate this gene cluster. We found that the three genes have 58-66% identity to their human orthologues, lie in the same order as in other vertebrates, and colocate on 1 of the 10 platypus sex chromosomes, X5. We also predict that optimal annotation of the newly sequenced platypus genome will be challenging. The analysis of platypus sequence revealed differences in structure and sequence of the DMRT gene cluster. Multispecies comparison was particularly effective for detecting CNGs, revealing several novel potential regulatory regions within DMRT3 and DMRT2 as well as DMRT1. RT-PCR indicated that platypus DMRT1 and DMRT3 are expressed specifically in the adult testis (and not ovary), but DMRT2 has a wider expression profile, as it does for other mammals. The platypus DMRT1 expression pattern, and its location on an X chromosome, suggests an involvement in monotreme sexual development.

  6. Conserved cis-regulatory modules in promoters of genes encoding wheat high-molecular-weight glutenin subunits

    PubMed Central

    Ravel, Catherine; Fiquet, Samuel; Boudet, Julie; Dardevet, Mireille; Vincent, Jonathan; Merlino, Marielle; Michard, Robin; Martre, Pierre

    2014-01-01

    The concentration and composition of the gliadin and glutenin seed storage proteins (SSPs) in wheat flour are the most important determinants of its end-use value. In cereals, the synthesis of SSPs is predominantly regulated at the transcriptional level by a complex network involving at least five cis-elements in gene promoters. The high-molecular-weight glutenin subunits (HMW-GS) are encoded by two tightly linked genes located on the long arms of group 1 chromosomes. Here, we sequenced and annotated the HMW-GS gene promoters of 22 electrophoretic wheat alleles to identify putative cis-regulatory motifs. We focused on 24 motifs known to be involved in SSP gene regulation. Most of them were identified in at least one HMW-GS gene promoter sequence. A common regulatory framework was observed in all the HMW-GS gene promoters, as they shared conserved cis-regulatory modules (CCRMs) including all the five motifs known to regulate the transcription of SSP genes. This common regulatory framework comprises a composite box made of the GATA motifs and GCN4-like Motifs (GLMs) and was shown to be functional as the GLMs are able to bind a bZIP transcriptional factor SPA (Storage Protein Activator). In addition to this regulatory framework, each HMW-GS gene promoter had additional motifs organized differently. The promoters of most highly expressed x-type HMW-GS genes contain an additional box predicted to bind R2R3-MYB transcriptional factors. However, the differences in annotation between promoter alleles could not be related to their level of expression. In summary, we identified a common modular organization of HMW-GS gene promoters but the lack of correlation between the cis-motifs of each HMW-GS gene promoter and their level of expression suggests that other cis-elements or other mechanisms regulate HMW-GS gene expression. PMID:25429295

  7. Analyzing stationary states of gene regulatory network using petri nets.

    PubMed

    Gambin, Anna; Lasota, Sławomir; Rutkowski, Michał

    2006-01-01

    We introduce and formally define the notion of a stationary state for Petri nets. We also propose a fully automatic method for finding such states. The procedure makes use of the Presburger arithmetic to describe all the stationary states. Finally we apply this novel approach to find stationary states of a gene regulatory network describing the flower morphogenesis of A. thaliana. This shows that the proposed method can be successfully applied in the study of biological systems.

  8. Analyzing stationary states of gene regulatory network using petri nets.

    PubMed

    Gambin, Anna; Lasota, Sławomir; Rutkowski, Michał

    2011-01-01

    We introduce and formally define the notion of a stationary state for Petri nets. We also propose a fully automatic method for finding such states. The procedure makes use of the Presburger arithmetic to describe all the stationary states. Finally we apply this novel approach to find stationary states of a gene regulatory network describing the flower morphogenesis of A. thaliana. This shows that the proposed method can be successfully applied in the study of biological systems.

  9. Enhancer Sequence Variants and Transcription Factor Deregulation Synergize to Construct Pathogenic Regulatory Circuits in B Cell Lymphoma

    PubMed Central

    Koues, Olivia I.; Kowalewski, Rodney A.; Chang, Li-Wei; Pyfrom, Sarah C.; Schmidt, Jennifer A.; Luo, Hong; Sandoval, Luis E.; Hughes, Tyler B.; Bednarski, Jeffrey J.; Cashen, Amanda F.; Payton, Jacqueline E.; Oltz, Eugene M.

    2014-01-01

    Most B cell lymphomas arise in the germinal center (GC), where humoral immune responses evolve from potentially oncogenic cycles of mutation, proliferation, and clonal selection. Although lymphoma gene expression diverges significantly from GC-B cells, underlying mechanisms that alter the activities of corresponding regulatory elements (REs) remain elusive. Here we define the complete pathogenic circuitry of human follicular lymphoma (FL), which activates or decommissions REs from normal GC-B cells and commandeers enhancers from other lineages. Moreover, independent sets of transcription factors, whose expression was deregulated in FL, targeted commandeered versus decommissioned REs. Our approach revealed two distinct subtypes of low-grade FL, whose pathogenic circuitries resembled GC-B or activated B cells. FL-altered enhancers also were enriched for sequence variants, including somatic mutations, which disrupt transcription factor binding and expression of circuit-linked genes. Thus, the pathogenic regulatory circuitry of FL reveals distinct genetic and epigenetic etiologies for GC-B transformation. PMID:25607463

  10. Detection of Weakly Conserved Ancestral Mammalian Regulatory Sequences by Primate Comparisons

    SciTech Connect

    Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng, Jan-Fang; Rubin, Edward M.; Boffelli, Dario

    2006-06-01

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect cryptic functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  11. Phase transitions in the evolution of gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Skanata, Antun; Kussell, Edo

    The role of gene regulatory networks is to respond to environmental conditions and optimize growth of the cell. A typical example is found in bacteria, where metabolic genes are activated in response to nutrient availability, and are subsequently turned off to conserve energy when their specific substrates are depleted. However, in fluctuating environmental conditions, regulatory networks could experience strong evolutionary pressures not only to turn the right genes on and off, but also to respond optimally under a wide spectrum of fluctuation timescales. The outcome of evolution is predicted by the long-term growth rate, which differentiates between optimal strategies. Here we present an analytic computation of the long-term growth rate in randomly fluctuating environments, by using mean-field and higher order expansion in the environmental history. We find that optimal strategies correspond to distinct regions in the phase space of fluctuations, separated by first and second order phase transitions. The statistics of environmental randomness are shown to dictate the possible evolutionary modes, which either change the structure of the regulatory network abruptly, or gradually modify and tune the interactions between its components.

  12. Functional studies of regulatory genes in the sea urchin embryo.

    PubMed

    Cavalieri, Vincenzo; Di Bernardo, Maria; Spinelli, Giovanni

    2009-01-01

    Sea urchin embryos are characterized by an extremely simple mode of development, rapid cleavage, high transparency, and well-defined cell lineage. Although they are not suitable for genetic studies, other approaches are successfully used to unravel mechanisms and molecules involved in cell fate specification and morphogenesis. Microinjection is the elective method to study gene function in sea urchin embryos. It is used to deliver precise amounts of DNA, RNA, oligonucleotides, peptides, or antibodies into the eggs or even into blastomeres. Here we describe microinjection as it is currently applied in our laboratory and show how it has been used in gene perturbation analyses and dissection of cis-regulatory DNA elements.

  13. Roles of lignin biosynthesis and regulatory genes in plant development

    PubMed Central

    Yoon, Jinmi; Choi, Heebak

    2015-01-01

    Lignin is an important factor affecting agricultural traits, biofuel production, and the pulping industry. Most lignin biosynthesis genes and their regulatory genes are expressed mainly in the vascular bundles of stems and leaves, preferentially in tissues undergoing lignification. Other genes are poorly expressed during normal stages of development, but are strongly induced by abiotic or biotic stresses. Some are expressed in non-lignifying tissues such as the shoot apical meristem. Alterations in lignin levels affect plant development. Suppression of lignin biosynthesis genes causes abnormal phenotypes such as collapsed xylem, bending stems, and growth retardation. The loss of expression by genes that function early in the lignin biosynthesis pathway results in more severe developmental phenotypes when compared with plants that have mutations in later genes. Defective lignin deposition is also associated with phenotypes of seed shattering or brittle culm. MYB and NAC transcriptional factors function as switches, and some homeobox proteins negatively control lignin biosynthesis genes. Ectopic deposition caused by overexpression of lignin biosynthesis genes or master switch genes induces curly leaf formation and dwarfism. PMID:26297385

  14. Roles of lignin biosynthesis and regulatory genes in plant development.

    PubMed

    Yoon, Jinmi; Choi, Heebak; An, Gynheung

    2015-11-01

    Lignin is an important factor affecting agricultural traits, biofuel production, and the pulping industry. Most lignin biosynthesis genes and their regulatory genes are expressed mainly in the vascular bundles of stems and leaves, preferentially in tissues undergoing lignification. Other genes are poorly expressed during normal stages of development, but are strongly induced by abiotic or biotic stresses. Some are expressed in non-lignifying tissues such as the shoot apical meristem. Alterations in lignin levels affect plant development. Suppression of lignin biosynthesis genes causes abnormal phenotypes such as collapsed xylem, bending stems, and growth retardation. The loss of expression by genes that function early in the lignin biosynthesis pathway results in more severe developmental phenotypes when compared with plants that have mutations in later genes. Defective lignin deposition is also associated with phenotypes of seed shattering or brittle culm. MYB and NAC transcriptional factors function as switches, and some homeobox proteins negatively control lignin biosynthesis genes. Ectopic deposition caused by overexpression of lignin biosynthesis genes or master switch genes induces curly leaf formation and dwarfism.

  15. Identification of C4 photosynthesis metabolism and regulatory-associated genes in Eleocharis vivipara by SSH.

    PubMed

    Chen, Taiyu; Ye, Rongjian; Fan, Xiaolei; Li, Xianghua; Lin, Yongjun

    2011-09-01

    This is the first effort to investigate the candidate genes involved in Kranz developmental regulation and C4 metabolic fluxes in Eleocharis vivipara, which is a leafless freshwater amphibious plant and possesses a distinct culms anatomy structure and photosynthetic pattern in contrasting environments. A terrestrial specific SSH library was constructed to investigate the genes involved in Kranz anatomy developmental regulation and C4 metabolic fluxes. A total of 73 ESTs and 56 unigenes in 384 clones were identified by array hybridization and sequencing. In total, 50 unigenes had homologous genes in the databases of rice and Arabidopsis. The real-time quantitative PCR results showed that most of the genes were accumulated in terrestrial culms and ABA-induced culms. The C4 marker genes were stably accumulated during the culms development process in terrestrial culms. With respect to C3 culms, C4 photosynthesis metabolism consumed much more transporters and translocators related to ion metabolism, organic acids and carbohydrate metabolism, phosphate metabolism, amino acids metabolism, and lipids metabolism. Additionally, ten regulatory genes including five transcription factors, four receptor-like proteins, and one BURP protein were identified. These regulatory genes, which co-accumulated with the culms developmental stages, may play important roles in culms structure developmental regulation, bundle sheath chloroplast maturation, and environmental response. These results shed new light on the C4 metabolic fluxes, environmental response, and anatomy structure developmental regulation in E. vivipara.

  16. Annotation of cis-regulatory elements by identification, subclassification, and functional assessment of multispecies conserved sequences

    PubMed Central

    Hughes, Jim R.; Cheng, Jan-Fang; Ventress, Nicki; Prabhakar, Shyam; Clark, Kevin; Anguita, Eduardo; De Gobbi, Marco; de Jong, Pieter; Rubin, Eddy; Higgs, Douglas R.

    2005-01-01

    An important step toward improving the annotation of the human genome is to identify cis-acting regulatory elements from primary DNA sequence. One approach is to compare sequences from multiple, divergent species. This approach distinguishes multispecies conserved sequences (MCS) in noncoding regions from more rapidly evolving neutral DNA. Here, we have analyzed a region of ≈238 kb containing the human α globin cluster that was sequenced and/or annotated across the syntenic region in 22 species spanning 500 million years of evolution. Using a variety of bioinformatic approaches and correlating the results with many aspects of chromosome structure and function in this region, we were able to identify and evaluate the importance of 24 individual MCSs. This approach sensitively and accurately identified previously characterized regulatory elements but also discovered unidentified promoters, exons, splicing, and transcriptional regulatory elements. Together, these studies demonstrate an integrated approach by which to identify, subclassify, and predict the potential importance of MCSs. PMID:15998734

  17. The Metarhizium anisopliae trp1 gene: cloning and regulatory analysis.

    PubMed

    Staats, Charley Christian; Silva, Marcia Suzana Nunes; Pinto, Paulo Marcos; Vainstein, Marilene Henning; Schrank, Augusto

    2004-07-01

    The trp1 gene from the entomopathogenic fungus Metarhizium anisopliae, cloned by heterologous hybridization with the plasmid carrying the trpC gene from Aspergillus nidulans, was sequence characterized. The predicted translation product has the conserved catalytic domains of glutamine amidotransferase (G domain), indoleglycerolphosphate synthase (C domain), and phosphoribosyl anthranilate isomerase (F domain) organized as NH2-G-C-F-COOH. The ORF is interrupted by a single intron of 60 nt that is position conserved in relation to trp genes from Ascomycetes and length conserved in relation to Basidiomycetes species. RT-PCR analysis suggests constitutive expression of trp1 gene in M. anisopliae.

  18. An Arabidopsis gene regulatory network for secondary cell wall synthesis

    SciTech Connect

    Taylor-Teeples, M.; Lin, L.; de Lucas, M.; Turco, G.; Toal, T. W.; Gaudinier, A.; Young, N. F.; Trabucco, G. M.; Veling, M. T.; Lamothe, R.; Handakumbura, P. P.; Xiong, G.; Wang, C.; Corwin, J.; Tsoukalas, A.; Zhang, L.; Ware, D.; Pauly, M.; Kliebenstein, D. J.; Dehesh, K.; Tagkopoulos, I.; Breton, G.; Pruneda-Paz, J. L.; Ahnert, S. E.; Kay, S. A.; Hazen, S. P.; Brady, S. M.

    2014-12-24

    The plant cell wall is an important factor for determining cell shape, function and response to the environment. Secondary cell walls, such as those found in xylem, are composed of cellulose, hemicelluloses and lignin and account for the bulk of plant biomass. The coordination between transcriptional regulation of synthesis for each polymer is complex and vital to cell function. A regulatory hierarchy of developmental switches has been proposed, although the full complement of regulators remains unknown. In this paper, we present a protein–DNA network between Arabidopsis thaliana transcription factors and secondary cell wall metabolic genes with gene expression regulated by a series of feed-forward loops. This model allowed us to develop and validate new hypotheses about secondary wall gene regulation under abiotic stress. Distinct stresses are able to perturb targeted genes to potentially promote functional adaptation. Finally, these interactions will serve as a foundation for understanding the regulation of a complex, integral plant component.

  19. BLG-e1 - a novel regulatory element in the distal region of the beta-lactoglobulin gene promoter.

    PubMed

    Reichenstein, Moshe; German, Tania; Barash, Itamar

    2005-04-11

    beta-Lactoglobulin (BLG) is a major ruminant milk protein. A regulatory element, termed BLG-e1, was defined in the distal region of the ovine BLG gene promoter. This 299-bp element lacks the established cis-regulatory sequences that affect milk-protein gene expression. Nevertheless, it alters the binding of downstream BLG sequences to histone H4 and the sensitivity of the histone-DNA complexes to trichostatin A treatment. In mammary cells cultured under favorable lactogenic conditions, BLG-e1 acts as a potent, position-independent silencer of BLG/luciferase expression, and similarly affects the promoter activity of the mouse whey acidic protein gene. Intragenic sequences upstream of BLG exon 2 reverse the silencing effect of BLG-e1 in vitro and in transgenic mice.

  20. Demystifying the secret mission of enhancers: linking distal regulatory elements to target genes

    PubMed Central

    Yao, Lijing; Berman, Benjamin P.; Farnham, Peggy J.

    2015-01-01

    Enhancers are short regulatory sequences bound by sequence-specific transcription factors and play a major role in the spatiotemporal specificity of gene expression patterns in development and disease. While it is now possible to identify enhancer regions genomewide in both cultured cells and primary tissues using epigenomic approaches, it has been more challenging to develop methods to understand the function of individual enhancers because enhancers are located far from the gene(s) that they regulate. However, it is essential to identify target genes of enhancers not only so that we can understand the role of enhancers in disease but also because this information will assist in the development of future therapeutic options. After reviewing models of enhancer function, we discuss recent methods for identifying target genes of enhancers. First, we describe chromatin structure-based approaches for directly mapping interactions between enhancers and promoters. Second, we describe the use of correlation-based approaches to link enhancer state with the activity of nearby promoters and/or gene expression. Third, we describe how to test the function of specific enhancers experimentally by perturbing enhancer–target relationships using high-throughput reporter assays and genome editing. Finally, we conclude by discussing as yet unanswered questions concerning how enhancers function, how target genes can be identified, and how to distinguish direct from indirect changes in gene expression mediated by individual enhancers. PMID:26446758

  1. Inference of Gene Regulatory Network Based on Local Bayesian Networks.

    PubMed

    Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Wei, Ze-Gang; Chen, Luonan

    2016-08-01

    The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E.coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. 
In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
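
    The CMI-based pruning idea described in this abstract can be illustrated with a minimal sketch. It assumes jointly Gaussian expression values and a single conditioning gene, so that conditional mutual information follows from the partial correlation; the abstract does not state which estimator LBN uses, and the function names and toy data below are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gaussian_mi(x, y):
    """Mutual information I(X;Y) in nats, assuming a bivariate Gaussian."""
    r = pearson(x, y)
    return -0.5 * math.log(1.0 - r * r)


def gaussian_cmi(x, y, z):
    """I(X;Y|Z) in nats, assuming joint Gaussianity and a single Z (assumption)."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    partial = (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
    return -0.5 * math.log(1.0 - partial * partial)


# Toy data (not from the paper): gene z drives both x and y,
# and there is no direct x-y regulation.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(2000)]
x = [v + rng.gauss(0, 1) for v in z]
y = [v + rng.gauss(0, 1) for v in z]

print(gaussian_mi(x, y))      # clearly positive: x and y co-vary through z
print(gaussian_cmi(x, y, z))  # near zero: conditioning on z removes the link
```

    In this toy case an edge between x and y would look plausible from mutual information alone, but would be a candidate for removal once z is conditioned on.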

  2. Isolation and characterization of the 5'-flanking sequence of the human ocular lens MIP gene.

    PubMed

    Wang, X Y; Ohtaka-Maruyama, C; Pisano, M M; Jaworski, C J; Chepelinsky, A B

    1995-12-29

    The MIP (major intrinsic protein) gene, a member of an ancient family of membrane channel genes, encodes the predominant fiber cell membrane protein of the ocular lens. Its specific expression in the lens fibers is temporally and spatially regulated during development. To study the regulation of expression of MIP and delineate the regulatory elements underlying its tissue specificity and ontogenic profile, we have cloned 2840 bp of the human MIP 5'-flanking sequence. The human MIP 5'-flanking sequence contains three complete Alu repetitive elements in tandem at position between nt -1699 and -2684 (nt -1699/-2684). These Alu elements appear to have had a complex evolutionary history with insertions at different times. We have fused DNA fragments containing MIP 5'-flanking sequences to the bacterial cat reporter gene encoding chloramphenicol acetyltransferase and assayed them in primary cultures of chicken lens cells. We have mapped two negative regulatory regions in the human MIP 5'-flanking sequences -1564/-1696 and -948/-1000. We demonstrated that the human MIP 5'-flanking sequence -253/+42 contains a functional promoter in lens cells but is inactive in kidney epithelial cells or mouse fibroblasts, suggesting that this sequence contains regulatory elements responsible for the lens-specific expression of MIP.

  3. Unstable microsatellite repeats facilitate rapid evolution of coding and regulatory sequences.

    PubMed

    Jansen, A; Gemayel, R; Verstrepen, K J

    2012-01-01

    Tandem repeats are intrinsically highly variable sequences since repeat units are often lost or gained during replication or following unequal recombination events. Because of their low complexity and their instability, these repeats, which are also called satellite repeats, are often considered to be useless 'junk' DNA. However, recent findings show that tandem repeats are frequently found within promoters of stress-induced genes and within the coding regions of genes encoding cell-surface and regulatory proteins. Interestingly, frequent changes in these repeats often confer phenotypic variability. Examples include variation in the microbial cell surface, rapid tuning of internal molecular clocks in flies, and enhanced morphological plasticity in mammals. This suggests that instead of being useless junk DNA, some variable tandem repeats are useful functional elements that confer 'evolvability', facilitating swift evolution and rapid adaptation to changing environments. Since changes in repeats are frequent and reversible, repeats provide a unique type of mutation that bridges the gap between rare genetic mutations, such as single nucleotide polymorphisms, and highly unstable but reversible epigenetic inheritance.

  4. Fused Regression for Multi-source Gene Regulatory Network Inference

    PubMed Central

    Lam, Kari Y.; Westrick, Zachary M.; Müller, Christian L.; Christiaen, Lionel; Bonneau, Richard

    2016-01-01

    Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data-types (single platforms) and single cell types. We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships such as orthology incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross species network inference. Last, we demonstrate our method’s utility in learning from data collected on different experimental platforms. PMID:27923054

  5. Regulatory Regions of the Homeotic Gene Proboscipedia Are Sensitive to Chromosomal Pairing

    PubMed Central

    Kapoun, A. M.; Kaufman, T. C.

    1995-01-01

    We have identified regulatory regions of the homeotic gene proboscipedia that are capable of repressing a linked white minigene in a manner that is sensitive to chromosomal pairing. Normally, the eye color of transformants containing white in a P-element vector is affected by the number of copies of the transgene; homozygous flies have darker eyes than heterozygotes. However, we found that flies homozygous for select pb DNA-containing transgenes had lighter eyes than heterozygotes. Several pb DNA fragments are capable of causing this pairing sensitive (PS) negative regulation of white. Two fragments in the upstream DNA of pb, 0.58 and 0.98 kb, are PS; additionally, two PS sites are located in the second intron, including a 0.5-kb region and 49-bp sequence. This phenotype is not observed when two PS sites are located at different chromosomal insertion sites (in trans-heterozygous transgenic animals), indicating that the pb-DNA-mediated repression of white is dependent on the pairing or proximity of the PS regions. The observed phenomenon is similar to transvection in which certain alleles of a gene can complement each other, but only when homologous chromosomes are paired. Interestingly, the intronic PS regions contain positive regulatory sequences for pb, whereas the upstream PS sites contain pb negative regulatory elements. PMID:7498743

  6. Resolution of gene regulatory conflicts caused by combinations of antibiotics

    PubMed Central

    Bollenbach, Tobias; Kishony, Roy

    2011-01-01

    Regulatory conflicts occur when two signals which individually trigger opposite cellular responses are present simultaneously. Here, we investigate regulatory conflicts in the bacterial response to antibiotic combinations. We use an Escherichia coli promoter-GFP library to study the transcriptional response of many promoters to either additive or antagonistic drug pairs at fine two-dimensional resolution of drug concentration. Surprisingly, we find that this dataset can be characterized as a linear sum of only two principal components. Component one, accounting for over 70% of the response, represents the response to growth inhibition by the drugs. Component two describes how regulatory conflicts are resolved. For the additive drug pair, conflicts are resolved by linearly interpolating the single drug responses, while for the antagonistic drug pair, the growth-limiting drug dominates the response. Importantly, for a given drug pair, the same conflict resolution strategy applies to almost all genes. These results provide a recipe for predicting gene expression responses to antibiotic combinations. PMID:21596308
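
    The two conflict-resolution strategies named in this abstract can be written as a toy sketch. The functional forms, the weight parameter, and the use of single-drug growth rates to decide which drug is growth-limiting are all assumptions made for illustration; the abstract gives no formulas, and the numbers are invented.

```python
def interpolate_response(resp_a, resp_b, weight_b):
    """Additive pair (assumed form): linear blend of the single-drug responses."""
    if not 0.0 <= weight_b <= 1.0:
        raise ValueError("weight_b must lie in [0, 1]")
    return (1.0 - weight_b) * resp_a + weight_b * resp_b


def dominant_response(resp_a, resp_b, growth_a, growth_b):
    """Antagonistic pair (assumed form): the more growth-limiting drug wins.

    growth_a and growth_b are hypothetical growth rates under each drug alone;
    the drug with the lower growth rate is treated as growth-limiting.
    """
    return resp_a if growth_a <= growth_b else resp_b


# Hypothetical promoter: induced by drug A (+1.0), repressed by drug B (-0.5).
print(interpolate_response(1.0, -0.5, 0.5))    # 0.25
print(dominant_response(1.0, -0.5, 0.3, 0.8))  # 1.0 (drug A limits growth)
```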

  7. Polymorphism in the bovine BOLA-DRB3 upstream regulatory regions detected through PCR-SSCP and DNA sequencing.

    PubMed

    Ripoli, M V; Peral-García, P; Dulout, F N; Giovambattista, G

    2004-09-15

    In the present work, we describe through polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) and DNA sequencing the polymorphism within the URR-BoLA-DRB3 in 15 cattle breeds. In total, seven PCR-SSCP defined alleles were detected. The alignment of studied sequences showed six polymorphic sites (four transitions, one transversion and one deletion) in the interconsensus regions of the BoLA-DRB3 upstream regulatory region (URR), while the consensus boxes were invariant. Five out of six detected polymorphic sites were of one nucleotide substitution in the interconsensus regions. It is expected that these mutations do not affect significantly the level of expression. In contrast, the deletion observed in the sequence between CCAAT and TATA boxes could have some effect on affinity interactions between the promoter region and the transcription factors. The URR-BoLA-DRB3 DNA analyzed sequences showed moderate level of nucleotide diversity, high level of identity among them and were grouped in the same clade in the phylogenetic tree. In addition, the phylogenetic tree, the similarity analysis and the sequence structure confirmed that the fragment analyzed in this study corresponds to the URR-BoLA-DRB3. The functional role of the observed polymorphic sites among the regulatory motifs in bovine needs to be analyzed and confirmed by means of gene expression assays.
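
    The classification of polymorphic sites used in this abstract (transitions, transversions, deletions) can be sketched in a few lines. The function names and the aligned fragments below are made up for illustration and are not BoLA-DRB3 data; the sketch assumes the alignment contains only A, C, G, T, and the gap character '-'.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_site(ref, alt):
    """Classify one aligned position; '-' marks a gap."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return "identical"
    if ref == "-" or alt == "-":
        return "indel"
    # Purine<->purine or pyrimidine<->pyrimidine is a transition.
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def count_polymorphisms(seq_a, seq_b):
    """Tally polymorphic site classes between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {}
    for a, b in zip(seq_a, seq_b):
        kind = classify_site(a, b)
        if kind != "identical":
            counts[kind] = counts.get(kind, 0) + 1
    return counts


# Invented fragments chosen to give four transitions, one transversion, one indel.
print(count_polymorphisms("ACGTTAGC-A", "GCATCAGTCC"))
```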

  8. Regulatory aspects for translating gene therapy research into the clinic.

    PubMed

    Laurencot, Carolyn M; Ruppel, Sheryl

    2009-01-01

    Gene therapy products are highly regulated; therefore, moving a promising candidate from the laboratory into the clinic can present unique challenges. Success can only be achieved by proper planning and communication within the clinical development team, as well as consultation with the regulatory scientists who will eventually review the clinical plan. Regulators should not be considered as obstacles but rather as collaborators whose advice can significantly expedite the product development. Sound scientific data is required and reviewed by the regulatory agencies to determine whether the potential benefit to the patient population outweighs the risk. Therefore, compliance with Good Manufacturing Practice (GMP) and Good Laboratory Practice (GLP) principles to ensure quality, safety, purity, and potency of the product, and to establish "proof of concept" for efficacy, and for safety information, respectively, is essential. The design and conduct of the clinical trial must adhere to Good Clinical Practice (GCP) principles. The clinical protocol should contain adequate rationale, supported by nonclinical data, to justify the starting dose and regimen, and adequate safety monitoring based on the patient population and the anticipated toxicities. Proper review and approval of gene therapy clinical studies by numerous committees, and regulatory agencies before and throughout the study allows for ongoing risk assessment of these novel and innovative products. The ethical conduct of clinical trials must be a priority for all clinical investigators and sponsors. As history has shown us, only a few fatal mistakes can dramatically alter the regulation of investigational products for all individuals involved in gene therapy clinical research, and further delay the advancement of gene therapy to licensed medicinal products.

  9. The gene regulatory network for breast cancer: integrated regulatory landscape of cancer hallmarks.

    PubMed

    Emmert-Streib, Frank; de Matos Simoes, Ricardo; Mullan, Paul; Haibe-Kains, Benjamin; Dehmer, Matthias

    2014-01-01

    In this study, we infer the breast cancer gene regulatory network from gene expression data. This network is obtained from the application of the BC3Net inference algorithm to a large-scale gene expression data set consisting of 351 patient samples. In order to elucidate the functional relevance of the inferred network, we perform a Gene Ontology (GO) analysis for its structural components. Our analysis reveals that the most significant GO-terms we find for the breast cancer network represent functional modules of biological processes that are described by known cancer hallmarks, including translation, immune response, cell cycle, organelle fission, mitosis, cell adhesion, RNA processing, RNA splicing and response to wounding. Furthermore, by using a curated list of census cancer genes, we find an enrichment in these functional modules. Finally, we study cooperative effects of chromosomes based on information of interacting genes in the breast cancer network. We find that chromosome 21 is most coactive with other chromosomes. To our knowledge this is the first study investigating the genome-scale breast cancer network.

  10. Core cell cycle regulatory genes in rice and their expression profiles across the growth zone of the leaf.

    PubMed

    Pettkó-Szandtner, A; Cserháti, M; Barrôco, R M; Hariharan, S; Dudits, D; Beemster, G T S

    2015-11-01

    Rice (Oryza sativa L.) as a model and crop plant with a sequenced genome offers an outstanding experimental system for discovering and functionally analyzing the major cell cycle control elements in a cereal species. In this study, we identified the core cell cycle genes in the rice genome through a hidden Markov model search and multiple alignments supported with the use of short protein sequence probes. In total we present 55 rice putative cell cycle genes with locus identity, chromosomal location, approximate chromosome position and EST accession number. These cell cycle genes include nine cyclin-dependent kinase (CDK) genes, 27 cyclin genes, one CKS gene, two RBR genes, nine E2F/DP/DEL genes, six KRP genes, and one WEE gene. We also provide characteristic protein sequence signatures encoded by CDK and cyclin gene variants. Promoter analysis by the FootPrinter program discovered several motifs in the regulatory region of the core cell cycle genes. As a first step towards functional characterization we performed transcript analysis by RT-PCR to determine gene specific variation in transcript levels along the rice leaves. The meristematic zone of the leaves where cells are actively dividing was identified based on kinematic analysis and flow cytometry. As expected, expression of the majority of cell cycle genes was exclusively associated with the meristematic region. However, genes such as different D-type cyclins, DEL1, KRP1/3, and RBR2 were also expressed in leaf segments representing the transition zone in which cells start differentiation.
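
    As a quick consistency check, the per-family gene counts listed in this abstract can be summed and compared with the stated total of 55. The dictionary below simply transcribes the numbers from the abstract; the variable names are illustrative.

```python
# Gene counts per family, transcribed from the abstract.
family_counts = {
    "CDK": 9,
    "cyclin": 27,
    "CKS": 1,
    "RBR": 2,
    "E2F/DP/DEL": 9,
    "KRP": 6,
    "WEE": 1,
}

total = sum(family_counts.values())
print(total)  # 55, matching the total reported in the abstract
```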

  11. Partitioning of genetic variation between regulatory and coding gene segments: the predominance of software variation in genes encoding introvert proteins.

    PubMed

    Mitchison, A

    1997-01-01

    In considering genetic variation in eukaryotes, a fundamental distinction can be made between variation in regulatory (software) and coding (hardware) gene segments. For quantitative traits the bulk of variation, particularly that near the population mean, appears to reside in regulatory segments. The main exceptions to this rule concern proteins which handle extrinsic substances, here termed extrovert proteins. The immune system includes an unusually large proportion of this exceptional category, but even so its chief source of variation may well be polymorphism in regulatory gene segments. The main evidence for this view emerges from genome scanning for quantitative trait loci (QTL), which in the case of the immune system points to a major contribution of pro-inflammatory cytokine genes. Further support comes from sequencing of major histocompatibility complex (Mhc) class II promoters, where a high level of polymorphism has been detected. These Mhc promoters appear to act, in part at least, by gating the back-signal from T cells into antigen-presenting cells. Both these forms of polymorphism are likely to be sustained by the need for flexibility in the immune response. Future work on promoter polymorphism is likely to benefit from the input from genome informatics.

  12. The first determination of DNA sequence of a specific gene.

    PubMed

    Inouye, Masayori

    2016-05-10

    How and when was the first DNA sequence of a gene determined? In 1977, F. Sanger came up with an innovative technology to sequence DNA by using chain terminators, and determined the entire DNA sequence of the 5375-base genome of bacteriophage φX 174 (Sanger et al., 1977). While Sanger's achievement has been recognized as the first DNA sequencing of genes, we had determined the DNA sequence of a gene, albeit a partial sequence, 11 years before Sanger's DNA sequence (Okada et al., 1966).

  13. Polymorphism in the upstream regulatory region of DQA1 gene in the Italian population.

    PubMed

    Petronzelli, F; Kimura, A; Ferrante, P; Mazzilli, M C

    1995-04-01

    Polymorphism in the 5'-upstream regulatory region of the DQA1 gene has been recently described. Using the PCR-SSO method and SSCP analysis, we have investigated this polymorphism in a group of 111 Italian blood donors who had been oligotyped for DRB1, DQA1 and DQB1 genes. Eight allelic variants were detected. Looking at the relationships among QAP sequences and DQA1 and DRB1 genes, three alternative situations were found: 1. a one-to-one relation between QAP and DQA1 alleles, independently of the other class II genes; 2. the same QAP allele in association with different DQA1-DRB1 haplotypes; 3. the same DQA1 allele with different QAP sequences according to the DRB1 specificity. No unexpected associations with the DQB1 gene were found. These results must be interpreted considering that DQA1 and DRB1 genes are transcribed in opposite directions so that the promoter region of the DQA1 gene lies between DQA1 and DRB1, close to the former but several hundred kb away from the latter.

  14. Autonomous Boolean modelling of developmental gene regulatory networks

    PubMed Central

    Cheng, Xianrui; Sun, Mengyang; Socolar, Joshua E. S.

    2013-01-01

    During early embryonic development, a network of regulatory interactions among genes dynamically determines a pattern of differentiated tissues. We show that important timing information associated with the interactions can be faithfully represented in autonomous Boolean models in which binary variables representing expression levels are updated in continuous time, and that such models can provide a direct insight into features that are difficult to extract from ordinary differential equation (ODE) models. As an application, we model the experimentally well-studied network controlling fly body segmentation. The Boolean model successfully generates the patterns formed in normal and genetically perturbed fly embryos, permits the derivation of constraints on the time delay parameters, clarifies the logic associated with different ODE parameter sets and provides a platform for studying connectivity and robustness in parameter space. By elucidating the role of regulatory time delays in pattern formation, the results suggest new types of experimental measurements in early embryonic development. PMID:23034351

  15. Optimal finite horizon control in gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Liu, Qiuli

    2013-06-01

    As a paradigm for modeling gene regulatory networks, probabilistic Boolean networks (PBNs) form a subclass of Markov genetic regulatory networks. To date, many different stochastic optimal control approaches have been developed to find therapeutic intervention strategies for PBNs. A PBN is essentially a collection of constituent Boolean networks linked through a probability structure. Most existing work assumes that the probability structure for Boolean network selection is known. Such an assumption cannot be satisfied in practice, since the presence of noise prevents the probability structure from being accurately determined. In this paper, we treat a case in which we lack the governing probability structure for Boolean network selection. Specifically, in the framework of PBNs, the theory of finite horizon Markov decision processes is employed to find optimal constituent Boolean networks with respect to the defined objective functions. To illustrate the validity of our proposed approach, an example is also presented.

  16. Regulatory Factor X (RFX)-mediated transcriptional rewiring of ciliary genes in animals.

    PubMed

    Piasecki, Brian P; Burghoorn, Jan; Swoboda, Peter

    2010-07-20

    Cilia were present in the last eukaryotic common ancestor (LECA) and were retained by most organisms spanning all extant eukaryotic lineages, including organisms in the Unikonta (Amoebozoa, fungi, choanoflagellates, and animals), Archaeplastida, Excavata, Chromalveolata, and Rhizaria. In certain animals, including humans, ciliary gene regulation is mediated by Regulatory Factor X (RFX) transcription factors (TFs). RFX TFs bind X-box promoter motifs and thereby positively regulate >50 ciliary genes. Though RFX-mediated ciliary gene regulation has been studied in several bilaterian animals, little is known about the evolutionary conservation of ciliary gene regulation. Here, we explore the evolutionary relationships between RFX TFs and cilia. By sampling the genome sequences of >120 eukaryotic organisms, we show that RFX TFs are exclusively found in unikont organisms (whether ciliated or not), but are completely absent from the genome sequences of all nonunikont organisms (again, whether ciliated or not). Sampling the promoter sequences of 12 highly conserved ciliary genes from 23 diverse unikont and nonunikont organisms further revealed that phylogenetic footprints of X-box promoter motif sequences are found exclusively in ciliary genes of certain animals. Thus, there is no correlation between cilia/ciliary genes and the presence or absence of RFX TFs and X-box promoter motifs in nonanimal unikont and in nonunikont organisms. These data suggest that RFX TFs originated early in the unikont lineage, distinctly after cilia evolved. The evolutionary model that best explains these observations indicates that the transcriptional rewiring of many ciliary genes by RFX TFs occurred early in the animal lineage.

  17. Nucleotide Sequence of the Akv env Gene

    PubMed Central

    Lenz, Jack; Crowther, Robert; Straceski, Anthony; Haseltine, William

    1982-01-01

    The sequence of 2,191 nucleotides encoding the env gene of murine retrovirus Akv was determined by using a molecular clone of the Akv provirus. Deduction of the encoded amino acid sequence showed that a single open reading frame encodes a 638-amino acid precursor to gp70 and p15E. In addition, there is a typical leader sequence preceding the amino terminus of gp70. The locations of potential glycosylation sites and other structural features indicate that the entire gp70 molecule and most of p15E are located on the outer side of the membrane. Internal cleavage of the env precursor to generate gp70 and p15E occurs immediately adjacent to several basic amino acids at the carboxyl terminus of gp70. This cleavage generates a region of 42 uncharged, relatively hydrophobic amino acids at the amino terminus of p15E, which is located in a position analogous to the hydrophobic membrane fusion sequence of influenza virus hemagglutinin. The mature polypeptides are predicted to associate with the membrane via a region of 30 uncharged, mostly hydrophobic amino acids located near the carboxyl terminus of p15E. Distal to this membrane association region is a sequence of 35 amino acids at the carboxyl terminus of the env precursor, which is predicted to be located on the inner side of the membrane. By analogy to Moloney murine leukemia virus, a proteolytic cleavage in this region removes the terminal 19 amino acids, thus generating the carboxyl terminus of p15E. This leaves 15 amino acids at the carboxyl terminus of p15E on the inner side of the membrane in a position to interact with virion cores during budding. The precise location and order of the large RNase T1-resistant oligonucleotides in the env region were determined and compared with those from several leukemogenic viruses of AKR origin. This permitted a determination of how the differences in the leukemogenic viruses affect the primary structure of the env gene products. PMID:6283170

  18. Strong early seed-specific gene regulatory region

    DOEpatents

    Broun, Pierre; Somerville, Chris

    1999-01-01

    Nucleic acid sequences and methods for their use are described which provide for early seed-specific transcription, in order to modulate or modify expression of foreign or endogenous genes in seeds, particularly embryo cells. The method finds particular use in conjunction with modifying fatty acid production in seed tissue.

  19. Strong early seed-specific gene regulatory region

    SciTech Connect

    Broun, Pierre; Somerville, Chris

    2002-01-01

    Nucleic acid sequences and methods for their use are described which provide for early seed-specific transcription, in order to modulate or modify expression of foreign or endogenous genes in seeds, particularly embryo cells. The method finds particular use in conjunction with modifying fatty acid production in seed tissue.

  20. Sequence and gene expression evolution of paralogous genes in willows

    PubMed Central

    Harikrishnan, Srilakshmy L.; Pucholt, Pascal; Berlin, Sofia

    2015-01-01

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive and hence functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows. PMID:26689951

  1. Sequence and gene expression evolution of paralogous genes in willows.

    PubMed

    Harikrishnan, Srilakshmy L; Pucholt, Pascal; Berlin, Sofia

    2015-12-22

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive and hence functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows.

  2. Reverse Engineering of Genome-wide Gene Regulatory Networks from Gene Expression Data.

    PubMed

    Liu, Zhi-Ping

    2015-02-01

    Transcriptional regulation plays vital roles in many fundamental biological processes. Reverse engineering of genome-wide regulatory networks from high-throughput transcriptomic data provides a promising way to characterize the global scenario of regulatory relationships between regulators and their targets. In this review, we summarize and categorize the main frameworks and methods currently available for inferring transcriptional regulatory networks from microarray gene expression profiling data. We give an overview of each strategy and introduce representative methods for each. Their assumptions, advantages, shortcomings, and possible improvements and extensions are also clarified and discussed.

  3. Reverse Engineering of Genome-wide Gene Regulatory Networks from Gene Expression Data

    PubMed Central

    Liu, Zhi-Ping

    2015-01-01

    Transcriptional regulation plays vital roles in many fundamental biological processes. Reverse engineering of genome-wide regulatory networks from high-throughput transcriptomic data provides a promising way to characterize the global scenario of regulatory relationships between regulators and their targets. In this review, we summarize and categorize the main frameworks and methods currently available for inferring transcriptional regulatory networks from microarray gene expression profiling data. We give an overview of each strategy and introduce representative methods for each. Their assumptions, advantages, shortcomings, and possible improvements and extensions are also clarified and discussed. PMID:25937810

  4. Mouse Vk gene classification by nucleic acid sequence similarity.

    PubMed

    Strohal, R; Helmberg, A; Kroemer, G; Kofler, R

    1989-01-01

    Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.
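
    The family definition above (members sharing more than 80% nucleic acid sequence similarity) can be sketched as a simple grouping procedure. The abstract does not describe how the grouping was carried out, so the single-linkage rule, the assumption of pre-aligned equal-length sequences, and the toy gene names below are all hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def group_families(seqs, threshold=0.80):
    """Single-linkage grouping: link genes whose identity exceeds threshold."""
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if identity(seqs[a], seqs[b]) > threshold:
                parent[find(a)] = find(b)

    families = {}
    for n in names:
        families.setdefault(find(n), []).append(n)
    return sorted(sorted(f) for f in families.values())


# Toy sequences with hypothetical names, not real Vk genes.
toy = {"g1": "ACGTACGTAC", "g2": "ACGTACGTAA", "g3": "TTTTTTTTTT"}
families = group_families(toy)
```

    Single linkage can chain distant members into one family through intermediates, which is one way closely related families such as those described above could blur together; a stricter complete-linkage rule would behave differently.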

  5. Innovation and robustness in complex regulatory gene networks

    PubMed Central

    Ciliberti, S.; Martin, O. C.; Wagner, A.

    2007-01-01

    The history of life involves countless evolutionary innovations, a steady stream of ingenuity that has been flowing for more than 3 billion years. Very little is known about the principles of biological organization that allow such innovation. Here, we examine these principles for evolutionary innovation in gene expression patterns. To this end, we study a model for the transcriptional regulation networks that are at the heart of embryonic development. A genotype corresponds to a regulatory network of a given topology, and a phenotype corresponds to a steady-state gene expression pattern. Networks with the same phenotype form a connected graph in genotype space, where two networks are immediate neighbors if they differ by one regulatory interaction. We show that an evolutionary search on this graph can reach genotypes that are as different from each other as if they were chosen at random in genotype space, allowing evolutionary access to different kinds of innovation while staying close to a viable phenotype. Thus, although robustness to mutations may hinder innovation in the short term, we conclude that long-term innovation in gene expression patterns can only emerge in the presence of the robustness caused by connected genotype graphs. PMID:17690244
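
    The genotype and phenotype definitions above can be made concrete with a small threshold-network sketch. The update rule, the weight values in {-1, 0, 1}, the tie-breaking choice, and all function names are assumptions made for illustration; the abstract itself only states that genotypes are network topologies, phenotypes are steady-state expression patterns, and neighbors differ by one regulatory interaction.

```python
def step(W, s):
    """One synchronous update: s_i becomes the sign of sum_j W[i][j] * s[j].

    A zero input leaves gene i unchanged (an assumed tie-breaking rule).
    """
    out = []
    for i, row in enumerate(W):
        total = sum(w * x for w, x in zip(row, s))
        out.append(1 if total > 0 else -1 if total < 0 else s[i])
    return tuple(out)


def steady_state(W, s0, max_steps=100):
    """Return the fixed point reached from s0, or None if none is reached."""
    s = tuple(s0)
    for _ in range(max_steps):
        nxt = step(W, s)
        if nxt == s:
            return s
        s = nxt
    return None


def one_mutant_neighbors(W):
    """Yield every network that differs from W in exactly one interaction."""
    for i in range(len(W)):
        for j in range(len(W)):
            for v in (-1, 0, 1):
                if v != W[i][j]:
                    M = [list(r) for r in W]
                    M[i][j] = v
                    yield M


def robustness(W, s0):
    """Fraction of one-mutant neighbors that keep the same steady state."""
    target = steady_state(W, s0)
    nbrs = list(one_mutant_neighbors(W))
    same = sum(steady_state(M, s0) == target for M in nbrs)
    return same / len(nbrs)


# Toy two-gene network in which each gene activates itself.
W = [[1, 0], [0, 1]]
r = robustness(W, (1, -1))
```

    Networks with high values of this fraction have many neighbors with the same phenotype, which is the kind of connectivity the abstract refers to when it describes a connected graph in genotype space.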

  6. Cloning and characterization of nif structural and regulatory genes in the purple sulfur bacterium, Halorhodospira halophila.

    PubMed

    Tsuihiji, Hisayoshi; Yamazaki, Yoichi; Kamikubo, Hironari; Imamoto, Yasushi; Kataoka, Mikio

    2006-03-01

    Halorhodospira halophila is a halophilic photosynthetic bacterium classified as a purple sulfur bacterium. We found that H. halophila generates hydrogen gas during photoautotrophic growth as a byproduct of a nitrogenase reaction. In order to consider the applied possibilities of this photobiological hydrogen generation, we cloned and characterized the structural and regulatory genes encoding the nitrogenase, nifH, nifD and nifA, from H. halophila. This is the first description of the nif genes for a purple sulfur bacterium. The amino-acid sequences of NifH and NifD indicated that these proteins are an Fe protein and a part of a MoFe protein, respectively. The important residues are conserved completely. The sequence upstream from the nifH region and sequence similarities of nifH and nifD with those of the other organisms suggest that the regulatory system might be a NifL-NifA system; however, H. halophila lacks nifL. The amino-acid sequence of H. halophila NifA is closer to that of the NifA of the NifL-NifA system than to that of NifA without NifL. H. halophila NifA does not conserve either the residue that interacts with NifL or the important residues involved in NifL-independent regulation. These results suggest the existence of yet another regulatory system, and that the development of functional systems and their molecular counterparts are not necessarily correlated throughout evolution. All of these Nif proteins of H. halophila possess an excess of acidic residues, which acts as a salt-resistance mechanism.

  7. Identifying gene regulatory network rewiring using latent differential graphical models

    PubMed Central

    Tian, Dechao; Gu, Quanquan; Ma, Jian

    2016-01-01

    Gene regulatory networks (GRNs) are highly dynamic among different tissue types. Identifying tissue-specific gene regulation is critically important to understand gene function in a particular cellular context. Graphical models have been used to estimate GRN from gene expression data to distinguish direct interactions from indirect associations. However, most existing methods estimate GRN for a specific cell/tissue type or in a tissue-naive way, or do not specifically focus on network rewiring between different tissues. Here, we describe a new method called Latent Differential Graphical Model (LDGM). The motivation of our method is to estimate the differential network between two tissue types directly without inferring the network for individual tissues, which has the advantage of utilizing much smaller sample size to achieve reliable differential network estimation. Our simulation results demonstrated that LDGM consistently outperforms other Gaussian graphical model based methods. We further evaluated LDGM by applying to the brain and blood gene expression data from the GTEx consortium. We also applied LDGM to identify network rewiring between cancer subtypes using the TCGA breast cancer samples. Our results suggest that LDGM is an effective method to infer differential network using high-throughput gene expression data to identify GRN dynamics among different cellular conditions. PMID:27378774

  8. Engineering nucleases for gene targeting: safety and regulatory considerations.

    PubMed

    Pauwels, Katia; Podevin, Nancy; Breyer, Didier; Carroll, Dana; Herman, Philippe

    2014-01-25

    Nuclease-based gene targeting (NBGT) represents a significant breakthrough in targeted genome editing since it is applicable from single-celled protozoa to human, including several species of economic importance. Along with the fast progress in NBGT and the increasing availability of customized nucleases, more data are available about off-target effects associated with the use of this approach. We discuss how NBGT may offer a new perspective for genetic modification, we address some aspects crucial for a safety improvement of the corresponding techniques and we also briefly relate the use of NBGT applications and products to the regulatory oversight.

  9. Regulatory Architecture of Gene Expression Variation in the Threespine Stickleback Gasterosteus aculeatus

    PubMed Central

    Pritchard, Victoria L.; Viitaniemi, Heidi M.; McCairns, R. J. Scott; Merilä, Juha; Nikinmaa, Mikko; Primmer, Craig R.; Leder, Erica H.

    2016-01-01

    Much adaptive evolutionary change is underlain by mutational variation in regions of the genome that regulate gene expression rather than in the coding regions of the genes themselves. An understanding of the role of gene expression variation in facilitating local adaptation will be aided by an understanding of underlying regulatory networks. Here, we characterize the genetic architecture of gene expression variation in the threespine stickleback (Gasterosteus aculeatus), an important model in the study of adaptive evolution. We collected transcriptomic and genomic data from 60 half-sib families using an expression microarray and genotyping-by-sequencing, and located expression quantitative trait loci (eQTL) underlying the variation in gene expression in liver tissue using an interval mapping approach. We identified eQTL for several thousand expression traits. Expression was influenced by polymorphism in both cis- and trans-regulatory regions. Trans-eQTL clustered into hotspots. We did not identify master transcriptional regulators in hotspot locations: rather, the presence of hotspots may be driven by complex interactions between multiple transcription factors. One observed hotspot colocated with a QTL recently found to underlie salinity tolerance in the threespine stickleback. However, most other observed hotspots did not colocate with regions of the genome known to be involved in adaptive divergence between marine and freshwater habitats. PMID:27836907

  10. LOESS correction for length variation in gene set-based genomic sequence analysis

    PubMed Central

    Aboukhalil, Anton; Bulyk, Martha L.

    2012-01-01

    Motivation: Sequence analysis algorithms are often applied to sets of DNA, RNA or protein sequences to identify common or distinguishing features. Controlling for sequence length variation is critical to properly score sequence features and identify true biological signals rather than length-dependent artifacts. Results: Several cis-regulatory module discovery algorithms exhibit a substantial dependence between DNA sequence score and sequence length. Our newly developed LOESS method is flexible in capturing diverse score-length relationships and is more effective in correcting DNA sequence scores for length-dependent artifacts, compared with four other approaches. Application of this method to genes co-expressed during Drosophila melanogaster embryonic mesoderm development or neural development scored by the Lever motif analysis algorithm resulted in successful recovery of their biologically validated cis-regulatory codes. The LOESS length-correction method is broadly applicable, and may be useful not only for more accurate inference of cis-regulatory codes, but also for detection of other types of patterns in biological sequences. Availability: Source code and compiled code are available from http://thebrain.bwh.harvard.edu/LM_LOESS/ Contact: mlbulyk@receptor.med.harvard.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22492312
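
    The length-correction idea described above can be sketched as follows: fit a smooth score-versus-length trend with a local regression, then keep the residuals as length-corrected scores. This is a minimal pure-Python illustration, not the authors' implementation; the tricube kernel, the degree-1 local fit, the frac parameter, and the function names are assumptions.

```python
def tricube(u):
    """Tricube kernel weight for a scaled distance u."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess_fit(x, y, x0, frac=0.5):
    """Local linear estimate of y at x0 from the nearest frac of the points."""
    n = len(x)
    k = min(n, max(2, int(round(frac * n))))
    dists = sorted(abs(xi - x0) for xi in x)
    # Bandwidth just above the k-th nearest distance, so weights never all vanish.
    h = dists[k - 1] * (1 + 1e-9) + 1e-12
    w = [tricube((xi - x0) / h) for xi in x]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


def length_corrected(lengths, scores, frac=0.5):
    """Residual of each score after removing the fitted length trend."""
    return [s - loess_fit(lengths, scores, l, frac) for l, s in zip(lengths, scores)]


# Toy data: scores grow with length, and one sequence carries extra signal.
lengths = [100 * i for i in range(1, 11)]
scores = [0.01 * l for l in lengths]
scores[4] += 5.0
residuals = length_corrected(lengths, scores)
```

    In this toy example the sequence with the added signal ends up with the largest residual, whereas ranking by raw score would favor the longest sequences.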

  11. Coupled enhancer and coding sequence evolution of a homeobox gene shaped leaf diversity

    PubMed Central

    Vuolo, Francesco; Mentink, Remco A.; Hajheidari, Mohsen; Bailey, C. Donovan; Filatov, Dmitry A.; Tsiantis, Miltos

    2016-01-01

    Here we investigate mechanisms underlying the diversification of biological forms using crucifer leaf shape as an example. We show that evolution of an enhancer element in the homeobox gene REDUCED COMPLEXITY (RCO) altered leaf shape by changing gene expression from the distal leaf blade to its base. A single amino acid substitution evolved together with this regulatory change, which reduced RCO protein stability, preventing pleiotropic effects caused by its altered gene expression. We detected hallmarks of positive selection in these evolved regulatory and coding sequence variants and showed that modulating RCO activity can improve plant physiological performance. Therefore, interplay between enhancer and coding sequence evolution created a potentially adaptive path for morphological evolution. PMID:27852629

  12. Gene regulatory networks governing haematopoietic stem cell development and identity.

    PubMed

    Pimanda, John E; Göttgens, Berthold

    2010-01-01

    Development can be viewed as a dynamic progression through regulatory states which characterise the various cell types within a given differentiation cascade. To understand the progression of regulatory states that define the origin and subsequent development of haematopoietic stem cells, the first imperative is to understand the ontogeny of haematopoiesis. We are fortunate that the ontogeny of blood development is one of the best characterized mammalian developmental systems. However, the field is still in its infancy with regard to the reconstruction of gene regulatory networks and their interactions with cell signalling cascades that drive a mesodermal progenitor to adopt the identity of a haematopoietic stem cell and beyond. Nevertheless, a framework to dissect these networks and comprehend the logic of their circuitry does exist, and although the tools required to achieve this aim may not yet be available, a sense of what they will be is emerging. In this review we cover the fundamentals of network architecture, methods used to reconstruct networks, current knowledge of haematopoietic and related transcriptional networks, current challenges and future outlook.

  13. Selection for distinct gene expression properties favours the evolution of mutational robustness in gene regulatory networks.

    PubMed

    Espinosa-Soto, C

    2016-11-01

    Mutational robustness is a genotype's tendency to keep a phenotypic trait with few or minor changes in the face of mutations. Mutational robustness is both ubiquitous and evolutionarily important as it affects in different ways the probability that new phenotypic variation arises. Understanding the origins of robustness is especially relevant for systems of development that are phylogenetically widespread and that construct phenotypic traits with a strong impact on fitness. Gene regulatory networks are examples of this class of systems. They comprise sets of genes that, through cross-regulation, build the gene activity patterns that define cellular responses, different tissues or distinct cell types. Several empirical observations, such as a greater robustness of wild-type phenotypes, suggest that stabilizing selection underlies the evolution of mutational robustness. However, the role of selection in the evolution of robustness is still under debate. Computer simulations of the dynamics and evolution of gene regulatory networks have shown that selection for any gene activity pattern that is steady and self-sustaining is sufficient to promote the evolution of mutational robustness. Here, I generalize this scenario using a computational model to show that selection for different aspects of a gene activity phenotype increases mutational robustness. Mutational robustness evolves even when selection favours properties that conflict with the stationarity of a gene activity pattern. The results that I present support an important role for stabilizing selection in the evolution of robustness in gene regulatory networks.

  14. In-depth cDNA library sequencing provides quantitative gene expression profiling in cancer biomarker discovery.

    PubMed

    Yang, Wanling; Ying, Dingge; Lau, Yu-Lung

    2009-06-01

    Quantitative gene expression analysis plays an important role in identifying differentially expressed genes in various pathological states, gene expression regulation and co-regulation, shedding light on gene functions. Although microarray is widely used as a powerful tool in this regard, it is suboptimal quantitatively and unable to detect unknown gene variants. Here we demonstrate effective detection of differential expression and co-regulation of certain genes by expressed sequence tag analysis using a selected subset of cDNA libraries. We discuss the issues of sequencing depth and library preparation, and propose that increased sequencing depth and improved preparation procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new-generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting the relative expression ratios of other genes, as transcripts from as few as the 300 most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also offers the unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.
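
    As a rough worked illustration of the depth argument above, assume (this is not stated in the abstract) that reads are sampled in proportion to transcript abundance and that subtraction removes the targeted transcripts completely. With f the subtracted fraction of the transcriptome and a fixed total number of reads, the reads falling on any remaining gene scale as:

```latex
\[
\frac{R_{\text{after}}}{R_{\text{before}}} = \frac{1}{1 - f},
\qquad
f = 0.20 \;\Rightarrow\; \frac{1}{1 - 0.20} = 1.25 .
\]
```

    Under these assumptions, removing the roughly 20% of transcripts contributed by the 300 most abundant genes would give about 25% more reads to every other gene while leaving their relative ratios unchanged.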

  15. Detection and Visualization of Compositionally Similar cis-Regulatory Element Clusters in Orthologous and Coordinately Controlled Genes

    PubMed Central

    Jegga, Anil G.; Sherwood, Shawn P.; Carman, James W.; Pinski, Andrew T.; Phillips, Jerry L.; Pestian, John P.; Aronow, Bruce J.

    2002-01-01

    Evolutionarily conserved noncoding genomic sequences represent a potentially rich source for the discovery of gene regulatory regions. However, detecting and visualizing compositionally similar cis-element clusters in the context of conserved sequences is challenging. We have explored potential solutions and developed an algorithm and visualization method that combines the results of conserved sequence analyses (BLASTZ) with those of transcription factor binding site analyses (MatInspector) (http://trafac.chmcc.org). We define hits as the density of co-occurring cis-element transcription factor (TF)-binding sites measured within a 200-bp moving average window through phylogenetically conserved regions. The results are depicted as a Regulogram, in which the hit count is plotted as a function of position within each of the two genomic regions of the aligned orthologs. Within a high-scoring region, the relative arrangement of shared cis-elements within compositionally similar TF-binding site clusters is depicted in a Trafacgram. On the basis of analyses of several training data sets, the approach also allows for the detection of similarities in composition and relative arrangement of cis-element clusters within nonorthologous genes, promoters, and enhancers that exhibit coordinate regulatory properties. Known functional regulatory regions of nonorthologous and less-conserved orthologous genes frequently showed cis-element shuffling, demonstrating that compositional similarity can be more sensitive than sequence similarity. These results show that combining sequence similarity with cis-element compositional similarity provides a powerful aid for the identification of potential control regions. PMID:12213778
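
    The "hits within a 200-bp moving window" idea can be sketched as follows. This is a minimal illustration, not the TraFaC implementation: it assumes the positions of binding sites shared by both orthologs have already been obtained (for example from BLASTZ and MatInspector output, neither of which is called here), and the function name, the `step` parameter and the toy coordinates are all hypothetical.

```python
from bisect import bisect_left


def window_hit_counts(site_positions, region_length, window=200, step=50):
    """Count shared binding sites in each window [start, start + window).

    site_positions: 0-based coordinates of shared TF-binding sites within
    one conserved region (assumed input, already computed elsewhere).
    Returns a list of (window_start, hit_count) pairs.
    """
    positions = sorted(site_positions)
    last_start = max(region_length - window, 0)
    counts = []
    for start in range(0, last_start + 1, step):
        lo = bisect_left(positions, start)           # first site >= start
        hi = bisect_left(positions, start + window)  # first site >= window end
        counts.append((start, hi - lo))
    return counts


# Toy coordinates, made up for illustration.
sites = [10, 50, 120, 260, 300, 310, 590]
profile = window_hit_counts(sites, region_length=600, window=200, step=100)
print(profile)  # [(0, 3), (100, 2), (200, 3), (300, 2), (400, 1)]
```

    Plotting the second element of each pair against the first gives a profile in the spirit of the Regulogram described in the abstract, one per genomic region of the aligned orthologs.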

  16. Detection and visualization of compositionally similar cis-regulatory element clusters in orthologous and coordinately controlled genes.

    PubMed

    Jegga, Anil G; Sherwood, Shawn P; Carman, James W; Pinski, Andrew T; Phillips, Jerry L; Pestian, John P; Aronow, Bruce J

    2002-09-01

    Evolutionarily conserved noncoding genomic sequences represent a potentially rich source for the discovery of gene regulatory regions. However, detecting and visualizing compositionally similar cis-element clusters in the context of conserved sequences is challenging. We have explored potential solutions and developed an algorithm and visualization method that combines the results of conserved sequence analyses (BLASTZ) with those of transcription factor binding site analyses (MatInspector) (http://trafac.chmcc.org). We define hits as the density of co-occurring cis-element transcription factor (TF)-binding sites measured within a 200-bp moving average window through phylogenetically conserved regions. The results are depicted as a Regulogram, in which the hit count is plotted as a function of position within each of the two genomic regions of the aligned orthologs. Within a high-scoring region, the relative arrangement of shared cis-elements within compositionally similar TF-binding site clusters is depicted in a Trafacgram. On the basis of analyses of several training data sets, the approach also allows for the detection of similarities in composition and relative arrangement of cis-element clusters within nonorthologous genes, promoters, and enhancers that exhibit coordinate regulatory properties. Known functional regulatory regions of nonorthologous and less-conserved orthologous genes frequently showed cis-element shuffling, demonstrating that compositional similarity can be more sensitive than sequence similarity. These results show that combining sequence similarity with cis-element compositional similarity provides a powerful aid for the identification of potential control regions.

  17. A Trans-Acting Regulatory Gene That Inversely Affects the Expression of the White, Brown and Scarlet Loci in Drosophila

    PubMed Central

    Rabinow, L.; Nguyen-Huynh, A. T.; Birchler, J. A.

    1991-01-01

    A trans-acting regulatory gene, Inr-a, that alters the level of expression of the white eye color locus as an inverse function of the number of its functional copies is described. Several independent lines of evidence demonstrate that this regulatory gene interacts with white via the promoter sequences. Among these are the observations that the inverse regulatory effect is conferred to the Adh gene when fused to the white promoter and that cis-regulatory mutants of white fail to respond. The phenotypic response to Inr-a is found in all tissues in which white is expressed, and mutants of the regulator exhibit a recessive lethality during larval periods. Increased white messenger RNA levels in pupal stages are found in Inr-a/+ individuals versus +/+ and a coordinate response is observed for mRNA levels from the brown and scarlet loci. All are structurally related and participate in pigment deposition. These experiments demonstrate that a single regulatory gene can exert an inverse effect on a target structural locus, a situation postulated from segmental aneuploid studies of gene expression and dosage compensation. PMID:1743487

  18. Diverse Gene Expression in Human Regulatory T Cell Subsets Uncovers Connection between Regulatory T Cell Genes and Suppressive Function.

    PubMed

    Hua, Jing; Davis, Scott P; Hill, Jonathan A; Yamagata, Tetsuya

    2015-10-15

    Regulatory T (Treg) cells have a critical role in the control of immunity, and their diverse subpopulations may allow adaptation to different types of immune responses. In this study, we analyzed human Treg cell subpopulations in the peripheral blood by performing genome-wide expression profiling of 40 Treg cell subsets from healthy donors. We found that the human peripheral blood Treg cell population is comprised of five major genomic subgroups, represented by 16 tractable subsets with a particular cell surface phenotype. These subsets possess a range of suppressive function and cytokine secretion and can exert a genomic footprint on target effector T (Teff) cells. Correlation analysis of variability in gene expression in the subsets identified several cell surface molecules associated with Treg suppressive function, and pharmacological interrogation revealed a set of genes having causative effect. The five genomic subgroups of Treg cells imposed a preserved pattern of gene expression on Teff cells, with a varying degree of genes being suppressed or induced. Notably, there was a cluster of genes induced by Treg cells that bolstered an autoinhibitory effect in Teff cells, and this induction appears to be governed by a different set of genes than ones involved in counteracting Teff activation. Our work shows an example of exploiting the diversity within human Treg cell subpopulations to dissect Treg cell biology.

  19. Reverse engineering of gene regulatory networks: a comparative study.

    PubMed

    Hache, Hendrik; Lehrach, Hans; Herwig, Ralf

    2009-01-01

    Reverse engineering of gene regulatory networks has been an intensively studied topic in bioinformatics since it constitutes an intermediate step from explorative to causative gene expression analysis. Many methods have been proposed in recent years, leading to a wide range of mathematical approaches. In practice, different mathematical approaches will generate different network structures; thus, it is very important for users to assess the performance of these algorithms. We have conducted a comparative study with six different reverse engineering methods, including relevance networks, neural networks, and Bayesian networks. Our approach consists of the generation of defined benchmark data, the analysis of these data with the different methods, and the assessment of algorithmic performance by statistical analyses. Performance was judged by network size and noise levels. The results of the comparative study highlight the neural network approach as the best-performing method among those under study.
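
    Of the method families named in this abstract, relevance networks are the simplest to illustrate. The sketch below is a generic version, assuming Pearson correlation as the similarity score and a fixed cutoff of 0.8; the abstract does not state which score or threshold the study used, and the gene names and expression values are made up.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def relevance_network(expression, threshold=0.8):
    """Return undirected edges between genes whose |r| meets the threshold.

    expression: dict mapping gene name -> list of expression values
    measured over the same samples (assumed input format).
    """
    edges = set()
    for g1, g2 in combinations(sorted(expression), 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


# Toy data, made up for illustration.
toy = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8.5],
    "geneC": [4, 3, 2, 1],
    "geneD": [1, -1, 1, -1],
}
print(sorted(relevance_network(toy)))
```

    Note that a relevance network of this kind is undirected and cannot separate direct from indirect associations, which is one reason comparisons against benchmark data with known structure, as described above, are informative.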

  20. Inheritance of gene expression level and selective constraints on trans- and cis-regulatory changes in yeast.

    PubMed

    Schaefke, Bernhard; Emerson, J J; Wang, Tzi-Yuan; Lu, Mei-Yeh Jade; Hsieh, Li-Ching; Li, Wen-Hsiung

    2013-09-01

    Gene expression evolution can be caused by changes in cis- or trans-regulatory elements or both. As cis and trans regulation operate through different molecular mechanisms, cis and trans mutations may show different inheritance patterns and may be subjected to different selective constraints. To investigate these issues, we obtained and analyzed gene expression data from two Saccharomyces cerevisiae strains and their hybrid, using high-throughput sequencing. Our data indicate that compared with other types of genes, those with antagonistic cis-trans interactions are more likely to exhibit over- or underdominant inheritance of expression level. Moreover, in accordance with previous studies, genes with trans variants tend to have a dominant inheritance pattern, whereas cis variants are enriched for additive inheritance. In addition, cis regulatory differences contribute more to expression differences between species than within species, whereas trans regulatory differences show a stronger association between divergence and polymorphism. Our data indicate that in the trans component of gene expression differences genes subjected to weaker selective constraints tend to have an excess of polymorphism over divergence compared with those subjected to stronger selective constraints. In contrast, in the cis component, this difference between genes under stronger and weaker selective constraint is mostly absent. To explain these observations, we propose that purifying selection more strongly shapes trans changes than cis changes and that positive selection may have significantly contributed to cis regulatory divergence.

  1. Maps of open chromatin highlight cell type-restricted patterns of regulatory sequence variation at hematological trait loci.

    PubMed

    Paul, Dirk S; Albers, Cornelis A; Rendon, Augusto; Voss, Katrin; Stephens, Jonathan; van der Harst, Pim; Chambers, John C; Soranzo, Nicole; Ouwehand, Willem H; Deloukas, Panos

    2013-07-01

    Nearly three-quarters of the 143 genetic signals associated with platelet and erythrocyte phenotypes identified by meta-analyses of genome-wide association (GWA) studies are located at non-protein-coding regions. Here, we assessed the role of candidate regulatory variants associated with cell type-restricted, closely related hematological quantitative traits in biologically relevant hematopoietic cell types. We used formaldehyde-assisted isolation of regulatory elements followed by next-generation sequencing (FAIRE-seq) to map regions of open chromatin in three primary human blood cells of the myeloid lineage. In the precursors of platelets and erythrocytes, as well as in monocytes, we found that open chromatin signatures reflect the corresponding hematopoietic lineages of the studied cell types and associate with the cell type-specific gene expression patterns. Dependent on their signal strength, open chromatin regions showed correlation with promoter and enhancer histone marks, distance to the transcription start site, and ontology classes of nearby genes. Cell type-restricted regions of open chromatin were enriched in sequence variants associated with hematological indices. The majority (63.6%) of such candidate functional variants at platelet quantitative trait loci (QTLs) coincided with binding sites of five transcription factors key in regulating megakaryopoiesis. We experimentally tested 13 candidate regulatory variants at 10 platelet QTLs and found that 10 (76.9%) affected protein binding, suggesting that this is a frequent mechanism by which regulatory variants influence quantitative trait levels. Our findings demonstrate that combining large-scale GWA data with open chromatin profiles of relevant cell types can be a powerful means of dissecting the genetic architecture of closely related quantitative traits.

  2. Screening in silico predicted remotely acting NF1 gene regulatory elements for mutations in patients with neurofibromatosis type 1.

    PubMed

    Hamby, Stephen E; Reviriego, Pablo; Cooper, David N; Upadhyaya, Meena; Chuzhanova, Nadia

    2013-08-15

    Neurofibromatosis type 1 (NF1), a neuroectodermal disorder, is caused by germline mutations in the NF1 gene. NF1 affects approximately 1/3,000 individuals worldwide, with about 50% of cases representing de novo mutations. Although the NF1 gene was identified in 1990, the underlying gene mutations still remain undetected in a small but obdurate minority of NF1 patients. We postulated that in these patients, hitherto undetected pathogenic mutations might occur in regulatory elements far upstream of the NF1 gene. In an attempt to identify such remotely acting regulatory elements, we reasoned that some of them might reside within DNA sequences that (1) have the potential to interact at distance with the NF1 gene and (2) lie within a histone H3K27ac-enriched region, a characteristic of active enhancers. Combining Hi-C data, obtained by means of the chromosome conformation capture technique, with data on the location and level of histone H3K27ac enrichment upstream of the NF1 gene, we predicted in silico the presence of two remotely acting regulatory regions, located, respectively, approximately 600 kb and approximately 42 kb upstream of the NF1 gene. These regions were then sequenced in 47 NF1 patients in whom no mutations had been found in either the NF1 or SPRED1 gene regions. Five patients were found to harbour DNA sequence variants in the distal H3K27ac-enriched region. Although these variants are of uncertain pathological significance and still remain to be functionally characterized, this approach promises to be of general utility for the detection of mutations underlying other inherited disorders that may be caused by mutations in remotely acting regulatory elements.

  3. Evolutionary and Topological Properties of Genes and Community Structures in Human Gene Regulatory Networks.

    PubMed

    Szedlak, Anthony; Smith, Nicholas; Liu, Li; Paternostro, Giovanni; Piermarocchi, Carlo

    2016-06-01

    The diverse, specialized genes present in today's lifeforms evolved from a common core of ancient, elementary genes. However, these genes did not evolve individually: gene expression is controlled by a complex network of interactions, and alterations in one gene may drive reciprocal changes in its proteins' binding partners. Like many complex networks, these gene regulatory networks (GRNs) are composed of communities, or clusters of genes with relatively high connectivity. A deep understanding of the relationship between the evolutionary history of single genes and the topological properties of the underlying GRN is integral to evolutionary genetics. Here, we show that the topological properties of an acute myeloid leukemia GRN and a general human GRN are strongly coupled with its genes' evolutionary properties. Slowly evolving ("cold"), old genes tend to interact with each other, as do rapidly evolving ("hot"), young genes. This naturally causes genes to segregate into community structures with relatively homogeneous evolutionary histories. We argue that gene duplication placed old, cold genes and communities at the center of the networks, and young, hot genes and communities at the periphery. We demonstrate this with single-node centrality measures and two new measures of efficiency, the set efficiency and the interset efficiency. We conclude that these methods for studying the relationships between a GRN's community structures and its genes' evolutionary properties provide new perspectives for understanding evolutionary genetics.

  4. Neurogenic gene regulatory pathways in the sea urchin embryo.

    PubMed

    Wei, Zheng; Angerer, Lynne M; Angerer, Robert C

    2016-01-15

    During embryogenesis the sea urchin early pluteus larva differentiates 40-50 neurons marked by expression of the pan-neural marker synaptotagmin B (SynB) that are distributed along the ciliary band, in the apical plate and pharyngeal endoderm, and 4-6 serotonergic neurons that are confined to the apical plate. Development of all neurons has been shown to depend on the function of Six3. Using a combination of molecular screens and tests of gene function by morpholino-mediated knockdown, we identified SoxC and Brn1/2/4, which function sequentially in the neurogenic regulatory pathway and are also required for the differentiation of all neurons. Misexpression of Brn1/2/4 at low dose caused an increase in the number of serotonin-expressing cells and at higher dose converted most of the embryo to a neurogenic epithelial sphere expressing the Hnf6 ciliary band marker. A third factor, Z167, was shown to work downstream of the Six3 and SoxC core factors and to define a branch specific for the differentiation of serotonergic neurons. These results provide a framework for building a gene regulatory network for neurogenesis in the sea urchin embryo.

  5. Toxin-mediated gene regulatory mechanism in Staphylococcus aureus

    PubMed Central

    Joo, Hwang-Soo; Otto, Michael

    2016-01-01

    The dangerous human pathogen Staphylococcus aureus relies heavily on toxins to cause disease, but toxin production can put a strong burden on the bacteria’s energy balance. Thus, controlling the synthesis of proteins solely needed in times of toxin production represents a way for the bacteria to avoid wasting energy. One hypothetical manner to accomplish this sort of regulation is by gene regulatory functions of the toxins themselves. There have been several reports about gene regulation by toxins in S. aureus, but these were never verified on the molecular level. In our study published in MBio [Joo et al., 7(5). pii: e01579-16], we show that phenol-soluble modulins (PSMs), important peptide toxins of S. aureus, release a repressor from the promoter of the operon encoding the toxin export system, thereby enabling toxin secretion. This study describes the first molecular regulatory mechanism exerted by an S. aureus toxin, setting a paradigmatic example of how S. aureus toxins may influence cell functions to adjust them to times of toxin production.

  6. Neurogenic gene regulatory pathways in the sea urchin embryo

    PubMed Central

    Wei, Zheng; Angerer, Lynne M.; Angerer, Robert C.

    2016-01-01

    During embryogenesis the sea urchin early pluteus larva differentiates 40-50 neurons marked by expression of the pan-neural marker synaptotagmin B (SynB) that are distributed along the ciliary band, in the apical plate and pharyngeal endoderm, and 4-6 serotonergic neurons that are confined to the apical plate. Development of all neurons has been shown to depend on the function of Six3. Using a combination of molecular screens and tests of gene function by morpholino-mediated knockdown, we identified SoxC and Brn1/2/4, which function sequentially in the neurogenic regulatory pathway and are also required for the differentiation of all neurons. Misexpression of Brn1/2/4 at low dose caused an increase in the number of serotonin-expressing cells and at higher dose converted most of the embryo to a neurogenic epithelial sphere expressing the Hnf6 ciliary band marker. A third factor, Z167, was shown to work downstream of the Six3 and SoxC core factors and to define a branch specific for the differentiation of serotonergic neurons. These results provide a framework for building a gene regulatory network for neurogenesis in the sea urchin embryo. PMID:26657764

  7. Sequence-Modified Antibiotic Resistance Genes Provide Sustained Plasmid-Mediated Transgene Expression in Mammals.

    PubMed

    Lu, Jiamiao; Zhang, Feijie; Fire, Andrew Z; Kay, Mark A

    2017-03-30

    Conventional plasmid vectors are incapable of achieving sustained levels of transgene expression in vivo even in quiescent mammalian tissues because the transgene expression cassette is silenced. Transcriptional silencing results from the presence of the bacterial plasmid backbone or virtually any DNA sequence of >1 kb in length placed outside of the expression cassette. Here, we show that transcriptional silencing can be substantially forestalled by increasing the An/Tn sequence composition in the plasmid bacterial backbone. Increasing numbers of An/Tn sequences increased sustained transcription of both backbone sequences and adjacent expression cassettes. In order to recapitulate these expression profiles in compact and portable plasmid DNA backbones, we engineered the standard kanamycin or ampicillin antibiotic resistance genes, optimizing the number of An/Tn sequences without altering the encoded amino acids. The resulting vector backbones yield sustained transgene expression from mouse liver, providing generic DNA vectors capable of sustained transgene expression without additional genes or mammalian regulatory elements.

  8. The MYB98 subcircuit of the synergid gene regulatory network includes genes directly and indirectly regulated by MYB98.

    PubMed

    Punwani, Jayson A; Rabiger, David S; Lloyd, Alan; Drews, Gary N

    2008-08-01

    The female gametophyte contains two synergid cells that play a role in many steps of the angiosperm reproductive process, including pollen tube guidance. At their micropylar poles, the synergid cells have a thickened and elaborated cell wall: the filiform apparatus that is thought to play a role in the secretion of the pollen tube attractant(s). MYB98 regulates an important subcircuit of the synergid gene regulatory network (GRN) that functions to activate the expression of genes required for pollen tube guidance and filiform apparatus formation. The MYB98 subcircuit comprises at least 83 downstream genes, including 48 genes within four gene families (CRP810, CRP3700, CRP3730 and CRP3740) that encode Cys-rich proteins. We show that the 11 CRP3700 genes, which include DD11 and DD18, are regulated by a common cis-element, GTAACNT, and that a multimer of this sequence confers MYB98-dependent synergid expression. The GTAACNT element contains the MYB98-binding site identified in vitro, suggesting that the 11 CRP3700 genes are direct targets of MYB98. We also show that five of the CRP810 genes, which include DD2, lack a functional GTAACNT element, suggesting that they are not directly regulated by MYB98. In addition, we show that the five CRP810 genes are regulated by the cis-element AACGT, and that a multimer of this sequence confers synergid expression. Together, these results suggest that the MYB98 branch of the synergid GRN is multi-tiered and, therefore, contains at least one additional downstream transcription factor.
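
    Scanning a promoter for the GTAACNT element described above amounts to a degenerate-motif search in which N matches any nucleotide. The following is a minimal sketch under stated assumptions: it searches the forward strand only, the helper names are hypothetical, and the example sequence is made up rather than taken from any CRP3700 promoter.

```python
import re

# Only the IUPAC symbols needed for this motif; N matches any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def motif_to_regex(motif):
    """Compile a degenerate motif into a regex; the lookahead lets
    overlapping occurrences be reported as well."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif(sequence, motif="GTAACNT"):
    """Return (0-based start, matched text) for each forward-strand hit."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Toy sequence, made up for illustration.
promoter = "ccGTAACGTttGTAACATggGTAACT"
print(find_motif(promoter))  # [(2, 'GTAACGT'), (11, 'GTAACAT')]
```

    The same function could be called with `motif="AACGT"` to look for the second element mentioned in the abstract; a fuller analysis would also scan the reverse complement.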

  9. Gene regulatory effects of disease-associated variation in the NRF2 network.

    PubMed

    Lacher, Sarah E; Slattery, Matthew

    2016-12-01

    Reactive oxygen species (ROS), which are both a natural byproduct of oxidative metabolism and an undesirable byproduct of many environmental stressors, can damage all classes of cellular macromolecules and promote diseases from cancer to neurodegeneration. The actions of ROS are mitigated by the transcription factor NRF2, which regulates expression of antioxidant genes via its interaction with cis-regulatory antioxidant response elements (AREs). However, despite the seemingly straightforward relationship between the opposing forces of ROS and NRF2, regulatory precision in the NRF2 network is essential. Genetic variants that alter NRF2 stability or alter ARE sequences have been linked to a range of diseases. NRF2 hyperactivating mutations are associated with tumorigenesis. On the subtler end of the spectrum, single nucleotide variants (SNVs) that alter individual ARE sequences have been linked to neurodegenerative disorders including progressive supranuclear palsy and Parkinson's disease, as well as other diseases. Although the human health implications of NRF2 dysregulation have been recognized for some time, a systems level view of this regulatory network is beginning to highlight key NRF2-targeted AREs consistently associated with disease.

  10. Graphlet Based Metrics for the Comparison of Gene Regulatory Networks

    PubMed Central

    Martin, Alberto J. M.; Dominguez, Calixto; Contreras-Riquelme, Sebastián; Holmes, David S.; Perez-Acle, Tomas

    2016-01-01

    Understanding the control of gene expression remains one of the main challenges in the post-genomic era. Accordingly, a plethora of methods exists to identify variations in gene expression levels. These variations underlie almost all relevant biological phenomena, including disease and adaptation to environmental conditions. However, computational tools to identify how regulation changes are scarce. Regulation of gene expression is usually depicted in the form of a gene regulatory network (GRN). Structural changes in a GRN over time and conditions represent variations in the regulation of gene expression. Like other biological networks, GRNs are composed of basic building blocks called graphlets. As a consequence, two new metrics based on graphlets are proposed in this work: REConstruction Rate (REC) and REC Graphlet Degree (RGD). REC determines the rate of graphlet similarity between different states of a network and RGD identifies the subset of nodes with the highest topological variation. In other words, RGD discerns how the GRN was rewired. REC and RGD were used to compare the local structure of nodes in condition-specific GRNs obtained from gene expression data of Escherichia coli, forming biofilms and cultured in suspension. According to our results, most of the network local structure remains unaltered in the two compared conditions. Nevertheless, changes reported by RGD necessarily imply that a different cohort of regulators (i.e. transcription factors (TFs)) appear on the scene, shedding light on how the regulation of gene expression occurs when E. coli transits from suspension to biofilm. Consequently, we propose that both metrics REC and RGD should be adopted as a quantitative approach to conduct differential analyses of GRNs. A tool that implements both metrics is available as an on-line web server (http://dlab.cl/loto). PMID:27695050

  11. Graphlet Based Metrics for the Comparison of Gene Regulatory Networks.

    PubMed

    Martin, Alberto J M; Dominguez, Calixto; Contreras-Riquelme, Sebastián; Holmes, David S; Perez-Acle, Tomas

    2016-01-01

    Understanding the control of gene expression remains one of the main challenges in the post-genomic era. Accordingly, a plethora of methods exists to identify variations in gene expression levels. These variations underlie almost all relevant biological phenomena, including disease and adaptation to environmental conditions. However, computational tools to identify how regulation changes are scarce. Regulation of gene expression is usually depicted in the form of a gene regulatory network (GRN). Structural changes in a GRN over time and conditions represent variations in the regulation of gene expression. Like other biological networks, GRNs are composed of basic building blocks called graphlets. As a consequence, two new metrics based on graphlets are proposed in this work: REConstruction Rate (REC) and REC Graphlet Degree (RGD). REC determines the rate of graphlet similarity between different states of a network and RGD identifies the subset of nodes with the highest topological variation. In other words, RGD discerns how the GRN was rewired. REC and RGD were used to compare the local structure of nodes in condition-specific GRNs obtained from gene expression data of Escherichia coli, forming biofilms and cultured in suspension. According to our results, most of the network local structure remains unaltered in the two compared conditions. Nevertheless, changes reported by RGD necessarily imply that a different cohort of regulators (i.e. transcription factors (TFs)) appear on the scene, shedding light on how the regulation of gene expression occurs when E. coli transits from suspension to biofilm. Consequently, we propose that both metrics REC and RGD should be adopted as a quantitative approach to conduct differential analyses of GRNs. A tool that implements both metrics is available as an on-line web server (http://dlab.cl/loto).

  12. Evolution of gene regulatory network architectures: examples of subcircuit conservation and plasticity between classes of echinoderms.

    PubMed

    Hinman, Veronica F; Yankura, Kristen A; McCauley, Brenna S

    2009-04-01

    Developmental gene regulatory networks (GRNs) explain how regulatory states are established in particular cells during development and how these states then determine the final form of the embryo. Evolutionary changes to the sequence of the genome will direct reorganization of GRN architectures, which in turn will lead to the alteration of developmental programs. A comparison of GRN architectures must consequently reveal the molecular basis for the evolution of developmental programs among different organisms. This review highlights some of the important findings that have emerged from the most extensive direct comparison of GRN architectures to date. Comparison of the orthologous GRNs for endomesodermal specification in the sea urchin and sea star provides examples of several discrete, functional GRN subcircuits and shows that they are subject to diverse selective pressures. This demonstrates that different regulatory linkages may be more or less amenable to evolutionary change. One of the more surprising findings from this comparison is that GRN-level functions may be maintained while the factors performing the functions have changed, suggesting that GRNs have a high capacity for compensatory changes involving transcription factor binding to cis-regulatory modules.

  13. Target mimics: an embedded layer of microRNA-involved gene regulatory networks in plants

    PubMed Central

    2012-01-01

    Background: MicroRNAs (miRNAs) play an essential role in gene regulation in plants. At the same time, the expression of miRNA genes is also tightly controlled. Recently, a novel mechanism called “target mimicry” was discovered, providing another layer for modulating miRNA activities. However, except for the artificial target mimics engineered for functional studies on certain miRNA genes, only one example, the IPS1 (Induced by Phosphate Starvation 1)-miR399 pair, has been experimentally confirmed in planta. To date, few analyses for comprehensive identification of natural target mimics have been performed in plants. Thus, limited evidence is available to address the open question of whether target mimicry is widespread in planta and implicated in certain biological processes. Results: In this study, genome-wide computational prediction of endogenous miRNA mimics was performed in Arabidopsis and rice, and dozens of target mimics were identified. In contrast to a recent report, the densities of target mimic sites were found to be much higher within the untranslated regions (UTRs) than within the coding sequences (CDSs) in both plants. Some novel sequence characteristics were observed for the miRNAs that were potentially regulated by the target mimics. GO (Gene Ontology) term enrichment analysis revealed some functional insights into the predicted mimics. After degradome sequencing data-based identification of miRNA targets, the regulatory networks constituted by target mimics, miRNAs and their downstream targets were constructed, and some intriguing subnetworks were further explored. Conclusions: These results together suggest that target mimicry may be widely implicated in regulating miRNA activities in planta, and we hope this study could expand the current understanding of miRNA-involved regulatory networks. PMID:22613869

  14. Implications of Developmental Gene Regulatory Networks Inside and Outside Developmental Biology.

    PubMed

    Peter, Isabelle S; Davidson, Eric H

    2016-01-01

    The insight that the genomic control of developmental process is encoded in the form of gene regulatory networks has profound impacts on many areas of modern bioscience. Most importantly, it affects developmental biology itself, as it means that a causal understanding of development requires knowledge of the architecture of regulatory network interactions. Furthermore, it follows that functional changes in developmental gene regulatory networks have to be considered as a primary mechanism for evolutionary process. We here discuss some of the recent advances in gene regulatory network biology and how they have affected our current understanding of development, evolution, and regulatory genomics.

  15. Gene and translation initiation site prediction in metagenomic sequences

    SciTech Connect

    Hyatt, Philip Douglas; LoCascio, Philip F; Hauser, Loren John; Uberbacher, Edward C

    2012-01-01

    Gene prediction in metagenomic sequences remains a difficult problem. Current sequencing technologies do not achieve sufficient coverage to assemble the individual genomes in a typical sample; consequently, sequencing runs produce a large number of short sequences whose exact origin is unknown. Since these sequences are usually smaller than the average length of a gene, algorithms must make predictions based on very little data. We present MetaProdigal, a metagenomic version of the gene prediction program Prodigal, that can identify genes in short, anonymous coding sequences with a high degree of accuracy. The novel value of the method consists of enhanced translation initiation site identification, ability to identify sequences that use alternate genetic codes and confidence values for each gene call. We compare the results of MetaProdigal with other methods and conclude with a discussion of future improvements.

  16. Enhancing gene regulatory network inference through data integration with markov random fields.

    PubMed

    Banf, Michael; Rhee, Seung Y

    2017-02-01

    A gene regulatory network links transcription factors to their target genes and represents a map of transcriptional regulation. Much progress has been made in deciphering gene regulatory networks computationally. However, gene regulatory network inference for most eukaryotic organisms remains challenging. To improve the accuracy of gene regulatory network inference and facilitate candidate selection for experimentation, we developed an algorithm called GRACE (Gene Regulatory network inference ACcuracy Enhancement). GRACE exploits biological a priori and heterogeneous data integration to generate high-confidence network predictions for eukaryotic organisms using Markov Random Fields in a semi-supervised fashion. GRACE uses a novel optimization scheme to integrate regulatory evidence and biological relevance. It is particularly suited for model learning with sparse regulatory gold standard data. We show GRACE's potential to produce high confidence regulatory networks compared to state of the art approaches using Drosophila melanogaster and Arabidopsis thaliana data. In an A. thaliana developmental gene regulatory network, GRACE recovers cell cycle related regulatory mechanisms and further hypothesizes several novel regulatory links, including a putative control mechanism of vascular structure formation due to modifications in cell proliferation.

  17. Enhancing gene regulatory network inference through data integration with markov random fields

    PubMed Central

    Banf, Michael; Rhee, Seung Y.

    2017-01-01

    A gene regulatory network links transcription factors to their target genes and represents a map of transcriptional regulation. Much progress has been made in deciphering gene regulatory networks computationally. However, gene regulatory network inference for most eukaryotic organisms remains challenging. To improve the accuracy of gene regulatory network inference and facilitate candidate selection for experimentation, we developed an algorithm called GRACE (Gene Regulatory network inference ACcuracy Enhancement). GRACE exploits biological a priori and heterogeneous data integration to generate high-confidence network predictions for eukaryotic organisms using Markov Random Fields in a semi-supervised fashion. GRACE uses a novel optimization scheme to integrate regulatory evidence and biological relevance. It is particularly suited for model learning with sparse regulatory gold standard data. We show GRACE’s potential to produce high confidence regulatory networks compared to state of the art approaches using Drosophila melanogaster and Arabidopsis thaliana data. In an A. thaliana developmental gene regulatory network, GRACE recovers cell cycle related regulatory mechanisms and further hypothesizes several novel regulatory links, including a putative control mechanism of vascular structure formation due to modifications in cell proliferation. PMID:28145456

  18. Enhancing gene regulatory network inference through data integration with markov random fields

    DOE PAGES

    Banf, Michael; Rhee, Seung Y.

    2017-02-01

    Here, a gene regulatory network links transcription factors to their target genes and represents a map of transcriptional regulation. Much progress has been made in deciphering gene regulatory networks computationally. However, gene regulatory network inference for most eukaryotic organisms remains challenging. To improve the accuracy of gene regulatory network inference and facilitate candidate selection for experimentation, we developed an algorithm called GRACE (Gene Regulatory network inference ACcuracy Enhancement). GRACE exploits biological a priori and heterogeneous data integration to generate high-confidence network predictions for eukaryotic organisms using Markov Random Fields in a semi-supervised fashion. GRACE uses a novel optimization scheme to integrate regulatory evidence and biological relevance. It is particularly suited for model learning with sparse regulatory gold standard data. We show GRACE’s potential to produce high confidence regulatory networks compared to state of the art approaches using Drosophila melanogaster and Arabidopsis thaliana data. In an A. thaliana developmental gene regulatory network, GRACE recovers cell cycle related regulatory mechanisms and further hypothesizes several novel regulatory links, including a putative control mechanism of vascular structure formation due to modifications in cell proliferation.

  19. Putative cis-regulatory elements in genes highly expressed in rice sperm cells

    PubMed Central

    2011-01-01

    Background The male germ line in flowering plants is initiated within developing pollen grains via asymmetric division. The smaller cell then becomes totally encased within a much larger vegetative cell, forming a unique "cell within a cell structure". The generative cell subsequently divides to give rise to two non-motile diminutive sperm cells, which take part in double fertilization and lead to the seed set. Sperm cells are difficult to investigate because of their presence within the confines of the larger vegetative cell. However, recently developed techniques for the isolation of rice sperm cells and the fully annotated rice genome sequence have allowed for the characterization of the transcriptional repertoire of sperm cells. Microarray gene expression data has identified a subset of rice genes that show unique or highly preferential expression in sperm cells. This information has led to the identification of cis-regulatory elements (CREs), which are conserved in sperm-expressed genes and are putatively associated with the control of cell-specific expression. Findings We aimed to identify the CREs associated with rice sperm cell-specific gene expression data using in silico prediction tools. We analyzed 1-kb upstream regions of the top 40 sperm cell co-expressed genes for over-represented conserved and novel motifs. Analysis of upstream regions with the SIGNALSCAN program with the PLACE database, MEME and the Mclip tool helped to find combinatorial sets of known transcriptional factor-binding sites along with two novel motifs putatively associated with the co-expression of sperm cell-specific genes. Conclusions Our data shows the occurrence of novel motifs, which are putative CREs and are likely targets of transcriptional factors regulating sperm cell gene expression. These motifs can be used to design the experimental verification of regulatory elements and the identification of transcriptional factors that regulate sperm cell-specific gene expression. 

  20. Metatranscriptomic insights on gene expression and regulatory controls in Candidatus Accumulibacter phosphatis

    DOE PAGES

    Oyserman, Ben O.; Noguera, Daniel R.; del Rio, Tijana Glavina; ...

    2015-11-10

    Previous studies on enhanced biological phosphorus removal (EBPR) have focused on reconstructing genomic blueprints for the model polyphosphate-accumulating organism Candidatus Accumulibacter phosphatis. Here, a time series metatranscriptome generated from enrichment cultures of Accumulibacter was used to gain insight into anaerobic/aerobic metabolism and regulatory mechanisms within an EBPR cycle. Co-expressed gene clusters were identified displaying ecologically relevant trends consistent with batch cycle phases. Transcripts displaying increased abundance during anaerobic acetate contact were functionally enriched in energy production and conversion, including upregulation of both cytoplasmic and membrane-bound hydrogenases demonstrating the importance of transcriptional regulation to manage energy and electron flux during anaerobic acetate contact. We hypothesized and demonstrated hydrogen production after anaerobic acetate contact, a previously unknown strategy for Accumulibacter to maintain redox balance. Genes involved in anaerobic glycine utilization were identified and phosphorus release after anaerobic glycine contact demonstrated, suggesting that Accumulibacter routes diverse carbon sources to acetyl-CoA formation via previously unrecognized pathways. A comparative genomics analysis of sequences upstream of co-expressed genes identified two statistically significant putative regulatory motifs. One palindromic motif was identified upstream of genes involved in PHA synthesis and acetate activation and is hypothesized to be a phaR binding site, hence representing a hypothetical PHA modulon. A second motif was identified ~35 base pairs (bp) upstream of a large and diverse array of genes and hence may represent a sigma factor binding site. As a result, this analysis provides a basis and framework for further investigations into Accumulibacter metabolism and the reconstruction of regulatory networks in uncultured organisms.

  1. Metatranscriptomic insights on gene expression and regulatory controls in Candidatus Accumulibacter phosphatis

    SciTech Connect

    Oyserman, Ben O.; Noguera, Daniel R.; del Rio, Tijana Glavina; Tringe, Susannah G.; McMahon, Katherine D.

    2015-11-10

    Previous studies on enhanced biological phosphorus removal (EBPR) have focused on reconstructing genomic blueprints for the model polyphosphate-accumulating organism Candidatus Accumulibacter phosphatis. Here, a time series metatranscriptome generated from enrichment cultures of Accumulibacter was used to gain insight into anaerobic/aerobic metabolism and regulatory mechanisms within an EBPR cycle. Co-expressed gene clusters were identified displaying ecologically relevant trends consistent with batch cycle phases. Transcripts displaying increased abundance during anaerobic acetate contact were functionally enriched in energy production and conversion, including upregulation of both cytoplasmic and membrane-bound hydrogenases demonstrating the importance of transcriptional regulation to manage energy and electron flux during anaerobic acetate contact. We hypothesized and demonstrated hydrogen production after anaerobic acetate contact, a previously unknown strategy for Accumulibacter to maintain redox balance. Genes involved in anaerobic glycine utilization were identified and phosphorus release after anaerobic glycine contact demonstrated, suggesting that Accumulibacter routes diverse carbon sources to acetyl-CoA formation via previously unrecognized pathways. A comparative genomics analysis of sequences upstream of co-expressed genes identified two statistically significant putative regulatory motifs. One palindromic motif was identified upstream of genes involved in PHA synthesis and acetate activation and is hypothesized to be a phaR binding site, hence representing a hypothetical PHA modulon. A second motif was identified ~35 base pairs (bp) upstream of a large and diverse array of genes and hence may represent a sigma factor binding site. As a result, this analysis provides a basis and framework for further investigations into Accumulibacter metabolism and the reconstruction of regulatory networks in uncultured organisms.

  2. Metatranscriptomic insights on gene expression and regulatory controls in Candidatus Accumulibacter phosphatis

    PubMed Central

    Oyserman, Ben O; Noguera, Daniel R; del Rio, Tijana Glavina; Tringe, Susannah G; McMahon, Katherine D

    2016-01-01

    Previous studies on enhanced biological phosphorus removal (EBPR) have focused on reconstructing genomic blueprints for the model polyphosphate-accumulating organism Candidatus Accumulibacter phosphatis. Here, a time series metatranscriptome generated from enrichment cultures of Accumulibacter was used to gain insight into anaerobic/aerobic metabolism and regulatory mechanisms within an EBPR cycle. Co-expressed gene clusters were identified displaying ecologically relevant trends consistent with batch cycle phases. Transcripts displaying increased abundance during anaerobic acetate contact were functionally enriched in energy production and conversion, including upregulation of both cytoplasmic and membrane-bound hydrogenases demonstrating the importance of transcriptional regulation to manage energy and electron flux during anaerobic acetate contact. We hypothesized and demonstrated hydrogen production after anaerobic acetate contact, a previously unknown strategy for Accumulibacter to maintain redox balance. Genes involved in anaerobic glycine utilization were identified and phosphorus release after anaerobic glycine contact demonstrated, suggesting that Accumulibacter routes diverse carbon sources to acetyl-CoA formation via previously unrecognized pathways. A comparative genomics analysis of sequences upstream of co-expressed genes identified two statistically significant putative regulatory motifs. One palindromic motif was identified upstream of genes involved in PHA synthesis and acetate activation and is hypothesized to be a phaR binding site, hence representing a hypothetical PHA modulon. A second motif was identified ~35 base pairs (bp) upstream of a large and diverse array of genes and hence may represent a sigma factor binding site. This analysis provides a basis and framework for further investigations into Accumulibacter metabolism and the reconstruction of regulatory networks in uncultured organisms. PMID:26555245

  3. The molecular evolution of terminal ear1, a regulatory gene in the genus Zea.

    PubMed Central

    White, S E; Doebley, J F

    1999-01-01

    Nucleotide diversity in the terminal ear1 (te1) gene, a regulatory locus hypothesized to be involved in the morphological evolution of maize (Zea mays ssp. mays), was investigated for evidence of past selection. Nucleotide polymorphism in a 1.4-kb region of te1 was analyzed for a sample of 26 sequences isolated from 12 maize lines, five populations of the maize progenitor, Z. mays ssp. parviglumis, six other Zea populations, and two Tripsacum species. Although nucleotide diversity in te1 in maize is reduced relative to ssp. parviglumis, phylogenetic and statistical analyses of the pattern of polymorphism among these sequences provided no evidence of past selection, indicating that the region of the gene studied was probably not involved in maize evolution. The level of reduction in genetic diversity in te1 in maize relative to its progenitor is comparable to that found in previous reports for isozymes and other neutrally evolving maize genes and is consistent with a genome-wide reduction of genetic diversity resulting from a domestication bottleneck. An estimate of the age (1.2-1.4 million yr) of the maize gene pool based on te1 is roughly consistent with previous estimates based on other neutral genes, but may be biased by the apparently slow synonymous substitution rate at te1. PMID:10545473

  4. An Arabidopsis gene regulatory network for secondary cell wall synthesis

    DOE PAGES

    Taylor-Teeples, M.; Lin, L.; de Lucas, M.; ...

    2014-12-24

    The plant cell wall is an important factor for determining cell shape, function and response to the environment. Secondary cell walls, such as those found in xylem, are composed of cellulose, hemicelluloses and lignin and account for the bulk of plant biomass. The coordination between transcriptional regulation of synthesis for each polymer is complex and vital to cell function. A regulatory hierarchy of developmental switches has been proposed, although the full complement of regulators remains unknown. In this paper, we present a protein–DNA network between Arabidopsis thaliana transcription factors and secondary cell wall metabolic genes with gene expression regulated by a series of feed-forward loops. This model allowed us to develop and validate new hypotheses about secondary wall gene regulation under abiotic stress. Distinct stresses are able to perturb targeted genes to potentially promote functional adaptation. Finally, these interactions will serve as a foundation for understanding the regulation of a complex, integral plant component.

  5. Identification of a gene regulatory network associated with prion replication

    PubMed Central

    Marbiah, Masue M; Harvey, Anna; West, Billy T; Louzolo, Anais; Banerjee, Priya; Alden, Jack; Grigoriadis, Anita; Hummerich, Holger; Kan, Ho-Man; Cai, Ying; Bloom, George S; Jat, Parmjit; Collinge, John; Klöhn, Peter-Christian

    2014-01-01

    Prions consist of aggregates of abnormal conformers of the cellular prion protein (PrPC). They propagate by recruiting host-encoded PrPC although the critical interacting proteins and the reasons for the differences in susceptibility of distinct cell lines and populations are unknown. We derived a lineage of cell lines with markedly differing susceptibilities, unexplained by PrPC expression differences, to identify such factors. Transcriptome analysis of prion-resistant revertants, isolated from highly susceptible cells, revealed a gene expression signature associated with susceptibility and modulated by differentiation. Several of these genes encode proteins with a role in extracellular matrix (ECM) remodelling, a compartment in which disease-related PrP is deposited. Silencing nine of these genes significantly increased susceptibility. Silencing of Papss2 led to undersulphated heparan sulphate and increased PrPC deposition at the ECM, concomitantly with increased prion propagation. Moreover, inhibition of fibronectin 1 binding to integrin α8 by RGD peptide inhibited metalloproteinases (MMP)-2/9 whilst increasing prion propagation. In summary, we have identified a gene regulatory network associated with prion propagation at the ECM and governed by the cellular differentiation state. PMID:24843046

  6. Developmental gene regulatory network evolution: insights from comparative studies in echinoderms.

    PubMed

    Hinman, Veronica F; Cheatle Jarvela, Alys M

    2014-03-01

    One of the central concerns of Evolutionary Developmental biology is to understand how the specification of cell types can change during evolution. In the last decade, developmental biology has progressed toward a systems level understanding of cell specification processes. In particular, the focus has been on determining the regulatory interactions of the repertoire of genes that make up gene regulatory networks (GRNs). Echinoderms provide an extraordinary model system for determining how GRNs evolve. This review highlights the comparative GRN analyses arising from the echinoderm system. This work shows that certain types of GRN subcircuits or motifs, i.e., those involving positive feedback, tend to be conserved and may provide a constraint on development. This conservation may be due to a required arrangement of transcription factor binding sites in cis regulatory modules. The review will also discuss ways in which novelty may arise, in particular through the co-option of regulatory genes and subcircuits. The development of the sea urchin larval skeleton, a novel feature that arose in echinoderms, has provided a model for study of co-option mechanisms. Finally, the types of GRNs that can permit the great diversity in the patterns of ciliary bands and their associated neurons found among these taxa are discussed. The availability of genomic resources is rapidly expanding for echinoderms, including genome sequences not only for multiple species of sea urchins but also a species of sea star, sea cucumber, and brittle star. This will enable echinoderms to become a particularly powerful system for understanding how developmental GRNs evolve.

  7. The influence of assortativity on the robustness and evolvability of gene regulatory networks upon gene birth

    PubMed Central

    Pechenick, Dov A.; Moore, Jason H.; Payne, Joshua L.

    2013-01-01

    Gene regulatory networks (GRNs) represent the interactions between genes and gene products, which drive the gene expression patterns that produce cellular phenotypes. GRNs display a number of characteristics that are beneficial for the development and evolution of organisms. For example, they are often robust to genetic perturbation, such as mutations in regulatory regions or loss of gene function. Simultaneously, GRNs are often evolvable as these genetic perturbations are occasionally exploited to innovate novel regulatory programs. Several topological properties, such as degree distribution, are known to influence the robustness and evolvability of GRNs. Assortativity, which measures the propensity of nodes of similar connectivity to connect to one another, is a separate topological property that has recently been shown to influence the robustness of GRNs to point mutations in cis-regulatory regions. However, it remains to be seen how assortativity may influence the robustness and evolvability of GRNs to other forms of genetic perturbation, such as gene birth via duplication or de novo origination. Here, we employ a computational model of genetic regulation to investigate whether the assortativity of a GRN influences its robustness and evolvability upon gene birth. We find that the robustness of a GRN generally increases with increasing assortativity, while its evolvability generally decreases. However, the rate of change in robustness outpaces that of evolvability, resulting in an increased proportion of assortative GRNs that are simultaneously robust and evolvable. By providing a mechanistic explanation for these observations, this work extends our understanding of how the assortativity of a GRN influences its robustness and evolvability upon gene birth. PMID:23542384
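    As a rough illustration of the assortativity measure named in this abstract, the sketch below computes a degree assortativity coefficient as the Pearson correlation between the degrees found at the two ends of each edge. It is a minimal sketch and not the authors' model: the function name `degree_assortativity` is hypothetical, and treating the network as undirected is a simplifying assumption made here.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge.

    Assumption (not from the abstract): edges are undirected pairs (u, v).
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # List every edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean = sum(xs) / n  # xs and ys hold the same values, so they share a mean
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        return float("nan")  # every node has the same degree: undefined
    return cov / var


# A hub regulating four otherwise unconnected genes: high-degree and
# low-degree nodes are always paired, so the network is disassortative.
star = [("hub", f"g{i}") for i in range(4)]
print(degree_assortativity(star))
```

    Positive values mean that nodes of similar connectivity tend to connect to one another; negative values, as in the star example, mean that hubs mostly connect to sparsely connected nodes.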

  8. cis regulatory requirements for hypodermal cell-specific expression of the Caenorhabditis elegans cuticle collagen gene dpy-7.

    PubMed Central

    Gilleard, J S; Barry, J D; Johnstone, I L

    1997-01-01

    The Caenorhabditis elegans cuticle collagens are encoded by a multigene family of between 50 and 100 members and are the major component of the nematode cuticular exoskeleton. They are synthesized in the hypodermis prior to secretion and incorporation into the cuticle and exhibit complex patterns of spatial and temporal expression. We have investigated the cis regulatory requirements for tissue- and stage-specific expression of the cuticle collagen gene dpy-7 and have identified a compact regulatory element which is sufficient to specify hypodermal cell reporter gene expression. This element appears to be a true tissue-specific promoter element, since it encompasses the dpy-7 transcription initiation sites and functions in an orientation-dependent manner. We have also shown, by interspecies transformation experiments, that the dpy-7 cis regulatory elements are functionally conserved between C. elegans and C. briggsae, and comparative sequence analysis supports the importance of the regulatory sequence that we have identified by reporter gene analysis. All of our data suggest that the spatial expression of the dpy-7 cuticle collagen gene is established essentially by a small tissue-specific promoter element and does not require upstream activator or repressor elements. In addition, we have found the DPY-7 polypeptide is very highly conserved between the two species and that the C. briggsae polypeptide can function appropriately within the C. elegans cuticle. This finding suggests a remarkably high level of conservation of individual cuticle components, and their interactions, between these two nematode species. PMID:9121480

  9. Dose response relationship in anti-stress gene regulatory networks.

    PubMed

    Zhang, Qiang; Andersen, Melvin E

    2007-03-02

    To maintain a stable intracellular environment, cells utilize complex and specialized defense systems against a variety of external perturbations, such as electrophilic stress, heat shock, and hypoxia. Irrespective of the type of stress, many adaptive mechanisms contributing to cellular homeostasis appear to operate through gene regulatory networks that are organized into negative feedback loops. In general, the degree of deviation of the controlled variables, such as electrophiles, misfolded proteins, and O2, is first detected by specialized sensor molecules, then the signal is transduced to specific transcription factors. Transcription factors can regulate the expression of a suite of anti-stress genes, many of which encode enzymes functioning to counteract the perturbed variables. The objective of this study was to explore, using control theory and computational approaches, the theoretical basis that underlies the steady-state dose response relationship between cellular stressors and intracellular biochemical species (controlled variables, transcription factors, and gene products) in these gene regulatory networks. Our work indicated that the shape of dose response curves (linear, superlinear, or sublinear) depends on changes in the specific values of local response coefficients (gains) distributed in the feedback loop. Multimerization of anti-stress enzymes and transcription factors into homodimers, homotrimers, or even higher-order multimers plays a significant role in maintaining robust homeostasis. Moreover, our simulation noted that dose response curves for the controlled variables can transition sequentially through four distinct phases as stressor level increases: initial superlinear with lesser control, superlinear more highly controlled, linear uncontrolled, and sublinear catastrophic. Each phase relies on specific gain-changing events that come into play as stressor level increases. The low-dose region is intrinsically nonlinear, and depending on
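    The idea that feedback gain shapes a steady-state dose response curve can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the model from this study: a stressor produces a controlled variable y at a constant rate, and y is removed by an anti-stress enzyme whose level rises with y as 1 + gain * y**hill. The function name `steady_state` and all parameter names and values are hypothetical.

```python
def steady_state(stressor, k=1.0, gain=1.0, hill=2, iters=200):
    """Steady-state level y of the controlled variable in a toy feedback loop.

    Solves stressor = k * (1 + gain * y**hill) * y for y >= 0 by bisection.
    The removal term increases monotonically with y, so the root is unique.
    """
    lo, hi = 0.0, stressor / k  # y cannot exceed the no-feedback value
    for _ in range(iters):
        mid = (lo + hi) / 2
        removal = k * (1 + gain * mid ** hill) * mid
        if removal < stressor:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Doubling the stressor less than doubles y when feedback is active,
# which is the sublinear shape expected from negative feedback.
for s in (0.5, 1, 2, 4, 8):
    print(s, round(steady_state(s), 4))
```

    Setting `gain=0.0` removes the feedback and makes the response exactly linear (y = stressor / k), which gives a baseline against which the effect of the gain can be compared.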

  10. Gene regulatory network clustering for graph layout based on microarray gene expression data.

    PubMed

    Kojima, Kaname; Imoto, Seiya; Nagasaki, Masao; Miyano, Satoru

    2010-01-01

    We propose a statistical model realizing simultaneous estimation of a gene regulatory network and gene module identification from time series gene expression data from microarray experiments. Under the assumption that genes in the same module are densely connected, the proposed method detects gene modules based on the variational Bayesian technique. The model can also incorporate existing biological prior knowledge such as protein subcellular localization. We apply the proposed model to time series data from a synthetically generated network and verify its effectiveness. The proposed model is also applied to time series microarray data from HeLa cells. The detected gene module information is of great help in drawing the estimated gene network.

  11. Cis- and Trans-Regulatory Mechanisms of Gene Expression in the ASJ Sensory Neuron of Caenorhabditis elegans

    PubMed Central

    González-Barrios, María; Fierro-González, Juan Carlos; Krpelanova, Eva; Mora-Lorca, José Antonio; Pedrajas, José Rafael; Peñate, Xenia; Chavez, Sebastián; Swoboda, Peter; Jansen, Gert; Miranda-Vizuete, Antonio

    2015-01-01

    The identity of a given cell type is determined by the expression of a set of genes sharing common cis-regulatory motifs and being regulated by shared transcription factors. Here, we identify cis and trans regulatory elements that drive gene expression in the bilateral sensory neuron ASJ, located in the head of the nematode Caenorhabditis elegans. For this purpose, we have dissected the promoters of the only two genes so far reported to be exclusively expressed in ASJ, trx-1 and ssu-1. We hereby identify the ASJ motif, a functional cis-regulatory bipartite promoter region composed of two individual 6 bp elements separated by a 3 bp linker. The first element is a 6 bp CG-rich sequence that presumably binds the Sp family member zinc-finger transcription factor SPTF-1. Interestingly, within the C. elegans nervous system SPTF-1 is also found to be expressed only in ASJ neurons where it regulates expression of other genes in these neurons and ASJ cell fate. The second element of the bipartite motif is a 6 bp AT-rich sequence that is predicted to potentially bind a transcription factor of the homeobox family. Together, our findings identify a specific promoter signature and SPTF-1 as a transcription factor that functions as a terminal selector gene to regulate gene expression in C. elegans ASJ sensory neurons. PMID:25769980
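    The bipartite layout described in this abstract (a 6 bp CG-rich element, a 3 bp linker, then a 6 bp AT-rich element) lends itself to a simple sequence scan. The sketch below is illustrative only: the abstract does not give the actual motif sequences, so the function name `find_bipartite_sites`, the `min_rich` threshold for what counts as "rich", and the example sequence are all assumptions.

```python
def find_bipartite_sites(seq, min_rich=5):
    """Find 6 bp CG-rich + 3 bp linker + 6 bp AT-rich windows in a DNA string.

    min_rich is a hypothetical cutoff: the minimum number of C/G (or A/T)
    bases required in each 6 bp element.
    Returns (start, cg_element, linker, at_element) tuples, 0-based start.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 15 + 1):  # the full window spans 6 + 3 + 6 = 15 bp
        first = seq[i:i + 6]
        linker = seq[i + 6:i + 9]
        second = seq[i + 9:i + 15]
        cg = sum(base in "CG" for base in first)
        at = sum(base in "AT" for base in second)
        if cg >= min_rich and at >= min_rich:
            hits.append((i, first, linker, second))
    return hits


# Made-up test sequence, not a real promoter.
example = "AAAACCGCGGTACATTTAAGGGG"
print(find_bipartite_sites(example, min_rich=6))
```

    A composition-based scan like this will report overlapping windows when the threshold is relaxed, so any candidate sites would still need to be checked against the experimentally defined elements.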

  12. Evolutionary and Topological Properties of Genes and Community Structures in Human Gene Regulatory Networks

    PubMed Central

    Szedlak, Anthony; Smith, Nicholas; Liu, Li; Paternostro, Giovanni; Piermarocchi, Carlo

    2016-01-01

    The diverse, specialized genes present in today’s lifeforms evolved from a common core of ancient, elementary genes. However, these genes did not evolve individually: gene expression is controlled by a complex network of interactions, and alterations in one gene may drive reciprocal changes in its proteins’ binding partners. Like many complex networks, these gene regulatory networks (GRNs) are composed of communities, or clusters of genes with relatively high connectivity. A deep understanding of the relationship between the evolutionary history of single genes and the topological properties of the underlying GRN is integral to evolutionary genetics. Here, we show that the topological properties of an acute myeloid leukemia GRN and a general human GRN are strongly coupled with its genes’ evolutionary properties. Slowly evolving (“cold”), old genes tend to interact with each other, as do rapidly evolving (“hot”), young genes. This naturally causes genes to segregate into community structures with relatively homogeneous evolutionary histories. We argue that gene duplication placed old, cold genes and communities at the center of the networks, and young, hot genes and communities at the periphery. We demonstrate this with single-node centrality measures and two new measures of efficiency, the set efficiency and the interset efficiency. We conclude that these methods for studying the relationships between a GRN’s community structures and its genes’ evolutionary properties provide new perspectives for understanding evolutionary genetics. PMID:27359334

  13. Extensive Evolutionary Changes in Regulatory Element Activity during Human Origins Are Associated with Altered Gene Expression and Positive Selection

    PubMed Central

    Fedrigo, Olivier; Babbitt, Courtney C.; Wortham, Matthew; Tewari, Alok K.; London, Darin; Song, Lingyun; Lee, Bum-Kyu; Iyer, Vishwanath R.; Parker, Stephen C. J.; Margulies, Elliott H.; Wray, Gregory A.; Furey, Terrence S.; Crawford, Gregory E.

    2012-01-01

    Understanding the molecular basis for phenotypic differences between humans and other primates remains an outstanding challenge. Mutations in non-coding regulatory DNA that alter gene expression have been hypothesized as a key driver of these phenotypic differences. This has been supported by differential gene expression analyses in general, but not by the identification of specific regulatory elements responsible for changes in transcription and phenotype. To identify the genetic source of regulatory differences, we mapped DNaseI hypersensitive (DHS) sites, which mark all types of active gene regulatory elements, genome-wide in the same cell type isolated from human, chimpanzee, and macaque. Most DHS sites were conserved among all three species, as expected based on their central role in regulating transcription. However, we found evidence that several hundred DHS sites were gained or lost on the lineages leading to modern human and chimpanzee. Species-specific DHS site gains are enriched near differentially expressed genes, are positively correlated with increased transcription, show evidence of branch-specific positive selection, and overlap with active chromatin marks. Species-specific sequence differences in transcription factor motifs found within these DHS sites are linked with species-specific changes in chromatin accessibility. Together, these indicate that the regulatory elements identified here are genetic contributors to transcriptional and phenotypic differences among primate species. PMID:22761590

  14. Evolutionary analysis of the cis-regulatory region of the spicule matrix gene SM50 in strongylocentrotid sea urchins.

    PubMed

    Walters, Jenna; Binkley, Elaine; Haygood, Ralph; Romano, Laura A

    2008-03-15

    An evolutionary analysis of transcriptional regulation is essential to understanding the molecular basis of phenotypic diversity. The sea urchin is an ideal system in which to explore the functional consequence of variation in cis-regulatory sequences. We are particularly interested in the evolution of genes involved in the patterning and synthesis of its larval skeleton. This study focuses on the cis-regulatory region of SM50, which has already been characterized to a considerable extent in the purple sea urchin, Strongylocentrotus purpuratus. We have isolated the cis-regulatory region from 15 individuals of S. purpuratus as well as seven closely related species in the family Strongylocentrotidae. We have performed a variety of statistical tests and present evidence that the cis-regulatory elements upstream of the SM50 gene have been subject to positive selection along the lineage leading to S. purpuratus. In addition, we have performed electrophoretic mobility shift assays (EMSAs) and demonstrate that nucleotide substitutions within Element C affect the ability of nuclear proteins to bind to this cis-regulatory element among members of the family Strongylocentrotidae. We speculate that such changes in SM50 and other genes could accumulate to produce altered patterns of gene expression with functional consequences during skeleton formation.

  15. Sequence evolution and expression regulation of stress-responsive genes in natural populations of wild tomato.

    PubMed

    Fischer, Iris; Steige, Kim A; Stephan, Wolfgang; Mboup, Mamadou

    2013-01-01

    The wild tomato species Solanum chilense and S. peruvianum are a valuable non-model system for studying plant adaptation since they grow in diverse environments facing many abiotic constraints. Here we investigate the sequence evolution of regulatory regions of drought and cold responsive genes and their expression regulation. The coding regions of these genes were previously shown to exhibit signatures of positive selection. Expression profiles and sequence evolution of regulatory regions of members of the Asr (ABA/water stress/ripening induced) gene family and the dehydrin gene pLC30-15 were analyzed in wild tomato populations from contrasting environments. For S. chilense, we found that Asr4 and pLC30-15 appear to respond much faster to drought conditions in accessions from very dry environments than accessions from more mesic locations. Sequence analysis suggests that the promoter of Asr2 and the downstream region of pLC30-15 are under positive selection in some local populations of S. chilense. By investigating gene expression differences at the population level we provide further support of our previous conclusions that Asr2, Asr4, and pLC30-15 are promising candidates for functional studies of adaptation. Our analysis also demonstrates the power of the candidate gene approach in evolutionary biology research and highlights the importance of wild Solanum species as a genetic resource for their cultivated relatives.

  16. Elements of the maize A1 promoter required for transactivation by the anthocyanin B/C1 or phlobaphene P regulatory genes.

    PubMed Central

    Tuerck, J A; Fromm, M E

    1994-01-01

    The extensive genetic and molecular characterization of the flavonoid pathway's structural and regulatory genes has provided some of the most detailed knowledge of gene interactions in plants. In maize flavonoid biosynthesis, the A1 gene is independently regulated in the anthocyanin and phlobaphene pathways. Anthocyanin production requires the expression of the C1 or Pl and R or B regulatory genes, whereas phlobaphene production requires only the P regulatory gene. By deletion analysis of the A1 promoter, we show that the sequences between -123 and -88 are critical for activation by anthocyanin and phlobaphene regulatory genes. Linker-scanner mutations indicated that the -123 to -100 region is more important for transactivation by the P protein. The -98 to -88 region is more important for B/C1 transactivation and shows a strong homology with the region of the Bz1 anthocyanin structural gene promoter shown to be activated by B/C1 and not by P. We identified a 14-bp consensus sequence that is also present in the promoters of three other genes in the anthocyanin pathway, and we propose a model for how the flavonoid regulatory proteins interact with the promoters of the structural genes. PMID:7827497

  17. Stability Depends on Positive Autoregulation in Boolean Gene Regulatory Networks

    PubMed Central

    Pinho, Ricardo; Garcia, Victor; Irimia, Manuel; Feldman, Marcus W.

    2014-01-01

    Network motifs have been identified as building blocks of regulatory networks, including gene regulatory networks (GRNs). The most basic motif, autoregulation, has been associated with bistability (when positive) and with homeostasis and robustness to noise (when negative), but its general importance in network behavior is poorly understood. Moreover, how specific autoregulatory motifs are selected during evolution and how this relates to robustness is largely unknown. Here, we used a class of GRN models, Boolean networks, to investigate the relationship between autoregulation and network stability and robustness under various conditions. We ran evolutionary simulation experiments for different models of selection, including mutation and recombination. Each generation simulated the development of a population of organisms modeled by GRNs. We found that stability and robustness positively correlate with autoregulation; in all investigated scenarios, stable networks had mostly positive autoregulation. Assuming biological networks correspond to stable networks, these results suggest that biological networks should often be dominated by positive autoregulatory loops. This seems to be the case for most studied eukaryotic transcription factor networks, including those in yeast, flies and mammals. PMID:25375153
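
    The association between positive autoregulation and bistability mentioned above can be seen in the smallest possible Boolean network. The sketch below is a toy illustration, not the evolutionary simulation used in the study: the one-gene update rules positive_auto and negative_auto and the helper attractor_from are hypothetical names introduced here.

```python
def attractor_from(state, update):
    """Apply synchronous updates until a state repeats; return the cycle reached."""
    trajectory = []
    while state not in trajectory:
        trajectory.append(state)
        state = update(state)
    # Everything from the first occurrence of the repeated state is the attractor.
    return tuple(trajectory[trajectory.index(state):])


def positive_auto(state):
    # Hypothetical rule: the gene activates itself, so it keeps its current value.
    (x,) = state
    return (x,)


def negative_auto(state):
    # Hypothetical rule: the gene represses itself, so it flips at every step.
    (x,) = state
    return (1 - x,)


on_attractor = attractor_from((1,), positive_auto)
off_attractor = attractor_from((0,), positive_auto)
oscillation = attractor_from((0,), negative_auto)
```

    Under this synchronous scheme the self-activating gene has two fixed points (OFF stays OFF, ON stays ON), while the self-repressing gene has no fixed point and alternates between the two states.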

  18. Topological effects of data incompleteness of gene regulatory networks

    PubMed Central

    2012-01-01

    Background The topological analysis of biological networks has been a prolific topic in network science during the last decade. A persistent problem with this approach is the inherent uncertainty and noisy nature of the data. One of the cases in which this situation is more marked is that of transcriptional regulatory networks (TRNs) in bacteria. The datasets are incomplete because regulatory pathways associated with a relevant fraction of bacterial genes remain unknown. Furthermore, direction, strengths and signs of the links are sometimes unknown or simply overlooked. Finally, the experimental approaches to infer the regulations are highly heterogeneous, in a way that induces the appearance of systematic experimental-topological correlations. And yet, the quality of the available data increases constantly. Results In this work we capitalize on these advances to point out the influence of data (in)completeness and quality on some classical results on topological analysis of TRNs, especially regarding modularity at different levels. Conclusions In doing so, we identify the most relevant factors affecting the validity of previous findings, highlighting important caveats for future topological analyses of prokaryotic TRNs. PMID:22920968

  19. Complex Dynamic Behavior in Simple Gene Regulatory Networks

    NASA Astrophysics Data System (ADS)

    Santillán Zerón, Moisés

    2007-02-01

    Knowing the complete genome of a given species is just a piece of the puzzle. To fully unveil the systems behavior of an organism, an organ, or even a single cell, we need to understand the underlying gene regulatory dynamics. Given the complexity of the whole system, the ultimate goal is unattainable for the moment. But perhaps, by analyzing the simplest genetic systems, we may be able to develop the mathematical techniques and procedures required to tackle more complex genetic networks in the near future. In the present work, the techniques for developing mathematical models of simple bacterial gene networks, like the tryptophan and lactose operons, are introduced. Despite all of the underlying assumptions, such models can provide valuable information regarding gene regulation dynamics. Here, we pay special attention to robustness as an emergent property. These notes are organized as follows. In the first section, the long historical relation between mathematics, physics, and biology is briefly reviewed. Recently, the multidisciplinary work in biology has received great attention in the form of systems biology. The main concepts of this novel science are discussed in the second section. A very slim introduction to the essential concepts of molecular biology is given in the third section. In the fourth section, a brief introduction to chemical kinetics is presented. Finally, in the fifth section, a mathematical model for the lactose operon is developed and analyzed.

  20. Stochastic S-system modeling of gene regulatory network.

    PubMed

    Chowdhury, Ahsan Raja; Chetty, Madhu; Evans, Rob

    2015-10-01

    Microarray gene expression data can provide insights into biological processes at a system-wide level and is commonly used for reverse engineering gene regulatory networks (GRN). Due to the amalgamation of noise from different sources, microarray expression profiles become inherently noisy leading to significant impact on the GRN reconstruction process. Microarray replicates (both biological and technical), generated to increase the reliability of data obtained under noisy conditions, have limited influence in enhancing the accuracy of reconstruction. Therefore, instead of the conventional GRN modeling approaches which are deterministic, stochastic techniques are becoming increasingly necessary for inferring GRN from noisy microarray data. In this paper, we propose a new stochastic GRN model by investigating incorporation of various standard noise measurements in the deterministic S-system model. Experimental evaluations performed for varying sizes of synthetic network, representing different stochastic processes, demonstrate the effect of noise on the accuracy of genetic network modeling and the significance of stochastic modeling for GRN reconstruction. The proposed stochastic model is subsequently applied to infer the regulations among genes in two real life networks: (1) the well-studied IRMA network, a real-life in-vivo synthetic network constructed within the Saccharomyces cerevisiae yeast, and (2) the SOS DNA repair network in Escherichia coli.
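
    For orientation, the deterministic S-system is conventionally written as dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij. The sketch below integrates that form with an Euler-Maruyama step and an additive Gaussian noise term. The additive noise, the parameter name sigma, and the clamping of concentrations to a small positive floor are assumptions made for illustration; the abstract does not specify how noise enters the proposed model.

```python
import math
import random


def s_system_rates(x, alpha, beta, g, h):
    """Deterministic S-system right-hand side: production minus degradation."""
    n = len(x)
    rates = []
    for i in range(n):
        production = alpha[i] * math.prod(x[j] ** g[i][j] for j in range(n))
        degradation = beta[i] * math.prod(x[j] ** h[i][j] for j in range(n))
        rates.append(production - degradation)
    return rates


def simulate(x0, alpha, beta, g, h, sigma=0.0, dt=0.01, steps=2000, seed=0):
    """Euler-Maruyama integration with additive noise of strength sigma (assumed form)."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        drift = s_system_rates(x, alpha, beta, g, h)
        x = [
            # Keep concentrations positive so the power laws stay defined.
            max(xi + d * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0), 1e-9)
            for xi, d in zip(x, drift)
        ]
    return x


# One-gene toy system: constant production (g = 0) and first-order decay (h = 1).
deterministic = simulate([1.0], [2.0], [1.0], [[0.0]], [[1.0]])
noisy = simulate([1.0], [2.0], [1.0], [[0.0]], [[1.0]], sigma=0.2, seed=1)
```

    With sigma set to zero the toy system relaxes to its steady state alpha/beta = 2; a nonzero sigma produces a fluctuating trajectory around that value.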

  1. Using machine learning to predict gene expression and discover sequence motifs

    NASA Astrophysics Data System (ADS)

    Li, Xuejing

    Recently, large amounts of experimental data for complex biological systems have become available. We use tools and algorithms from machine learning to build data-driven predictive models. We first present a novel algorithm to discover gene sequence motifs associated with temporal expression patterns of genes. Our algorithm, which is based on partial least squares (PLS) regression, is able to directly model the flow of information, from gene sequence to gene expression, to learn cis regulatory motifs and characterize associated gene expression patterns. Our algorithm outperforms traditional computational methods, e.g. clustering, in motif discovery. We then present a study of extending a machine learning model for transcriptional regulation predictive of genetic regulatory response to Caenorhabditis elegans. We show meaningful results both in terms of prediction accuracy on the test experiments and biological information extracted from the regulatory program. The model discovers DNA binding sites ab initio. We also present a case study where we detect a signal of lineage-specific regulation. Finally we present a comparative study on learning predictive models for motif discovery, based on different boosting algorithms: Adaptive Boosting (AdaBoost), Linear Programming Boosting (LPBoost) and Totally Corrective Boosting (TotalBoost). We evaluate and compare the performance of the three boosting algorithms via both statistical and biological validation, for hypoxia response in Saccharomyces cerevisiae.

  2. Transactivation of anthocyanin biosynthetic genes following transfer of B regulatory genes into maize tissues.

    PubMed Central

    Goff, S A; Klein, T M; Roth, B A; Fromm, M E; Cone, K C; Radicella, J P; Chandler, V L

    1990-01-01

    The C1, B and R genes regulating the maize anthocyanin biosynthetic pathway encode tissue-specific regulatory proteins with similarities to transcriptional activators. The C1 and R regulatory genes are usually responsible for pigmentation of seed tissues, and the B-Peru allele of B, but not the B-I allele, can substitute for R function in the seed. In this study, members of the B family of regulatory genes were delivered to intact maize tissues by high velocity microprojectiles. In colorless r aleurones or embryos, the introduction of the B-Peru genomic clone or the expressed cDNAs of B-Peru or B-I resulted in anthocyanin-producing cells. Luciferase produced from the Bronze1 anthocyanin structural gene promoter was induced 100-fold when co-introduced with the expressed B-Peru or B-I cDNAs. This quantitative transactivation assay demonstrates that the proteins encoded by these two B alleles are equally able to transactivate the Bronze1 promoter. Analogous results were obtained using embryogenic callus cells. These observations suggest that one major contribution towards tissue-specific anthocyanin synthesis controlled by the various alleles of the B and R genes is the differential expression of functionally similar proteins. PMID:2369901

  3. Reverse engineering of gene regulatory network using restricted gene expression programming.

    PubMed

    Yang, Bin; Liu, Sanrong; Zhang, Wei

    2016-10-01

    Inference of gene regulatory networks has become a major area of interest in the field of systems biology over the past decade. In this paper, we present a novel representation of the S-system model, named restricted gene expression programming (RGEP), to infer gene regulatory networks. A new hybrid evolutionary algorithm based on a structure-based evolutionary algorithm and cuckoo search (CS) is proposed to optimize the architecture and corresponding parameters of the model, respectively. Two synthetic benchmark datasets and one real biological dataset from the SOS DNA repair network in E. coli are used to test the validity of our method. Experimental results demonstrate that our proposed method performs better than previously proposed popular methods.

  4. Algebraic model checking for Boolean gene regulatory networks.

    PubMed

    Tran, Quoc-Nam

    2011-01-01

    We present a computational method in which modular and Groebner bases (GB) computation in Boolean rings are used for solving problems in Boolean gene regulatory networks (BN). In contrast to other known algebraic approaches, the degree of intermediate polynomials during the calculation of Groebner bases using our method never grows, resulting in a significant improvement in running time and memory space consumption. We also show how calculation in temporal logic for model checking can be done by means of our direct and efficient Groebner basis computation in Boolean rings. We present our experimental results in finding attractors and control strategies of Boolean networks to illustrate our theoretical arguments. The results are promising. Our algebraic approach is more efficient than the state-of-the-art model checker NuSMV on BNs. More importantly, our approach finds all solutions for the BN problems.
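
    The algebraic view behind this kind of approach can be sketched briefly: over GF(2), NOT x is 1 + x, x AND y is x*y, and x OR y is x + y + x*y, so the fixed points of a Boolean network are the common zeros of the polynomials f_i(x) + x_i. The example below checks that encoding by exhaustive enumeration rather than by Groebner basis computation, so it illustrates the problem statement only, not the authors' method. The three-gene network is hypothetical.

```python
from itertools import product


# Boolean operators written as polynomials over GF(2).
def NOT(x):
    return (1 + x) % 2


def AND(x, y):
    return (x * y) % 2


def OR(x, y):
    return (x + y + x * y) % 2


def update(state):
    """Hypothetical 3-gene network used only for illustration."""
    a, b, c = state
    return (OR(a, b), AND(a, NOT(c)), NOT(a))


def fixed_points(update_fn, n_genes):
    """States where every polynomial f_i(x) + x_i vanishes modulo 2."""
    return [
        state
        for state in product((0, 1), repeat=n_genes)
        if all((f + x) % 2 == 0 for f, x in zip(update_fn(state), state))
    ]


steady_states = fixed_points(update, 3)
```

    Enumeration costs 2^n evaluations, which is the scaling that symbolic methods such as Groebner bases aim to avoid on larger networks.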

  5. Modeling gene regulatory networks: A network simplification algorithm

    NASA Astrophysics Data System (ADS)

    Ferreira, Luiz Henrique O.; de Castro, Maria Clicia S.; da Silva, Fabricio A. B.

    2016-12-01

    Boolean networks have been used for some time to model Gene Regulatory Networks (GRNs), which describe cell functions. These models can help biologists make predictions and prognoses, and even design specialized treatments, when a disturbance in the GRN leads to disease. However, the amount of information related to a GRN can be huge, making the task of inferring its Boolean network representation quite a challenge. The method shown here takes into account information about the interactome to build a network, where each node represents a protein, and uses the entropy of each node as a key to reduce the size of the network, allowing the subsequent inference process to focus only on the main protein hubs, those with the greatest potential to affect overall network behavior.
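
    One way to picture an entropy-based ranking of nodes is sketched below. The abstract does not say how node entropy is defined, so this example assumes the Shannon entropy of each node's ON/OFF frequency across a set of observed binary states; the function names and the observations are hypothetical, and which end of the ranking to keep is a modelling decision the abstract leaves open.

```python
import math


def binary_entropy(p):
    """Shannon entropy (bits) of a binary variable that is ON with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def rank_nodes_by_entropy(observations):
    """observations maps a node name to a list of 0/1 states (assumed input format)."""
    scores = {
        node: binary_entropy(sum(values) / len(values))
        for node, values in observations.items()
    }
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Made-up observations for three proteins.
ranking, scores = rank_nodes_by_entropy({
    "A": [0, 1, 0, 1],  # ON half the time: maximal entropy
    "B": [1, 1, 1, 1],  # always ON: zero entropy
    "C": [1, 1, 1, 0],  # mostly ON: intermediate entropy
})
```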

  6. Presence of STA gene sequences in brewer's yeast genome.

    PubMed

    Balogh, I; Maráz, A

    1996-06-01

    STA genes are responsible for producing extracellular glucoamylase enzymes in Saccharomyces cerevisiae var. diastaticus. These genes exist in three forms, which are located on three different chromosomes. The nucleotide sequences of the STA genes are highly homologous. A sporulation-specific glucoamylase gene called SGA1 exists in every Saccharomyces cerevisiae strain, and its DNA sequence is partly homologous with the STA genes. In this study S. cerevisiae var. diastaticus and brewer's yeast strains were characterized by pulsed-field gel electrophoresis. In many cases chromosome length polymorphism (CLP) was found. The chromosomes were hybridized with a DNA probe which was homologous with STA genes and the SGA1 gene. Presence of the SGA1 gene was detected in each strain used. Four brewing yeasts were found to have homologous sequences with the STA3 gene on chromosome XIV despite the fact that these strains were not able to produce extracellular glucoamylase enzyme.

  7. Dynamical analysis of regulatory interactions in the gap gene system of Drosophila melanogaster.

    PubMed Central

    Jaeger, Johannes; Blagov, Maxim; Kosman, David; Kozlov, Konstantin N; Manu; Myasnikova, Ekaterina; Surkova, Svetlana; Vanario-Alonso, Carlos E; Samsonova, Maria; Sharp, David H; Reinitz, John

    2004-01-01

    Genetic studies have revealed that segment determination in Drosophila melanogaster is based on hierarchical regulatory interactions among maternal coordinate and zygotic segmentation genes. The gap gene system constitutes the most upstream zygotic layer of this regulatory hierarchy, responsible for the initial interpretation of positional information encoded by maternal gradients. We present a detailed analysis of regulatory interactions involved in gap gene regulation based on gap gene circuits, which are mathematical gene network models used to infer regulatory interactions from quantitative gene expression data. Our models reproduce gap gene expression at high accuracy and temporal resolution. Regulatory interactions found in gap gene circuits provide consistent and sufficient mechanisms for gap gene expression, which largely agree with mechanisms previously inferred from qualitative studies of mutant gene expression patterns. Our models predict activation of Kr by Cad and clarify several other regulatory interactions. Our analysis suggests a central role for repressive feedback loops between complementary gap genes. We observe that repressive interactions among overlapping gap genes show anteroposterior asymmetry with posterior dominance. Finally, our models suggest a correlation between timing of gap domain boundary formation and regulatory contributions from the terminal maternal system. PMID:15342511

  8. Integrated module and gene-specific regulatory inference implicates upstream signaling networks.

    PubMed

    Roy, Sushmita; Lagree, Stephen; Hou, Zhonggang; Thomson, James A; Stewart, Ron; Gasch, Audrey P

    2013-01-01

    Regulatory networks that control gene expression are important in diverse biological contexts including stress response and development. Each gene's regulatory program is determined by module-level regulation (e.g. co-regulation via the same signaling system), as well as gene-specific determinants that can fine-tune expression. We present a novel approach, Modular regulatory network learning with per gene information (MERLIN), that infers regulatory programs for individual genes while probabilistically constraining these programs to reveal module-level organization of regulatory networks. Using edge-, regulator- and module-based comparisons of simulated networks of known ground truth, we find MERLIN reconstructs regulatory programs of individual genes as well or better than existing approaches of network reconstruction, while additionally identifying modular organization of the regulatory networks. We use MERLIN to dissect global transcriptional behavior in two biological contexts: yeast stress response and human embryonic stem cell differentiation. Regulatory modules inferred by MERLIN capture co-regulatory relationships between signaling proteins and downstream transcription factors thereby revealing the upstream signaling systems controlling transcriptional responses. The inferred networks are enriched for regulators with genetic or physical interactions, supporting the inference, and identify modules of functionally related genes bound by the same transcriptional regulators. Our method combines the strengths of per-gene and per-module methods to reveal new insights into transcriptional regulation in stress and development.

  9. Integrated Module and Gene-Specific Regulatory Inference Implicates Upstream Signaling Networks

    PubMed Central

    Roy, Sushmita; Lagree, Stephen; Hou, Zhonggang; Thomson, James A.; Stewart, Ron; Gasch, Audrey P.

    2013-01-01

    Regulatory networks that control gene expression are important in diverse biological contexts including stress response and development. Each gene's regulatory program is determined by module-level regulation (e.g. co-regulation via the same signaling system), as well as gene-specific determinants that can fine-tune expression. We present a novel approach, Modular regulatory network learning with per gene information (MERLIN), that infers regulatory programs for individual genes while probabilistically constraining these programs to reveal module-level organization of regulatory networks. Using edge-, regulator- and module-based comparisons of simulated networks of known ground truth, we find MERLIN reconstructs regulatory programs of individual genes as well or better than existing approaches of network reconstruction, while additionally identifying modular organization of the regulatory networks. We use MERLIN to dissect global transcriptional behavior in two biological contexts: yeast stress response and human embryonic stem cell differentiation. Regulatory modules inferred by MERLIN capture co-regulatory relationships between signaling proteins and downstream transcription factors thereby revealing the upstream signaling systems controlling transcriptional responses. The inferred networks are enriched for regulators with genetic or physical interactions, supporting the inference, and identify modules of functionally related genes bound by the same transcriptional regulators. Our method combines the strengths of per-gene and per-module methods to reveal new insights into transcriptional regulation in stress and development. PMID:24146602

  10. EXONSAMPLER: a computer program for genome-wide and candidate gene exon sampling for targeted next-generation sequencing.

    PubMed

    Cosart, Ted; Beja-Pereira, Albano; Luikart, Gordon

    2014-11-01

    The computer program EXONSAMPLER automates the sampling of thousands of exon sequences from publicly available reference genome sequences and gene annotation databases. It was designed to provide exon sequences for the efficient, next-generation gene sequencing method called exon capture. The exon sequences can be sampled by a list of gene name abbreviations (e.g. IFNG, TLR1), or by sampling exons from genes spaced evenly across chromosomes. It provides a list of genomic coordinates (a bed file), as well as a set of sequences in fasta format. User-adjustable parameters for collecting exon sequences include a minimum and maximum acceptable exon length, maximum number of exonic base pairs (bp) to sample per gene, and maximum total bp for the entire collection. It allows for partial sampling of very large exons. It can preferentially sample upstream (5 prime) exons, downstream (3 prime) exons, both external exons, or all internal exons. It is written in the Python programming language using its free libraries. We describe the use of EXONSAMPLER to collect exon sequences from the domestic cow (Bos taurus) genome for the design of an exon-capture microarray to sequence exons from related species, including the zebu cow and wild bison. We collected ~10% of the exome (~3 million bp), including 155 candidate genes, and ~16,000 exons evenly spaced genomewide. We prioritized the collection of 5 prime exons to facilitate discovery and genotyping of SNPs near upstream gene regulatory DNA sequences, which control gene expression and are often under natural selection.
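
    The user-adjustable limits listed above (minimum and maximum exon length, maximum bp per gene, maximum total bp, partial sampling of very large exons) can be sketched as a simple filter. This is not EXONSAMPLER's code or interface: the function names, default values, and input tuples are hypothetical, and truncating a long exon from its BED start is a simplification that ignores strand.

```python
def sample_exons(exons, min_len=50, max_len=500,
                 max_bp_per_gene=1000, max_total_bp=3000):
    """Select exons under length and budget limits.

    exons: iterable of (chrom, start, end, gene) in 0-based, half-open
    BED coordinates (assumed input format).
    """
    selected = []
    bp_per_gene = {}
    total_bp = 0
    for chrom, start, end, gene in exons:
        length = end - start
        if length < min_len:
            continue  # too short to be useful
        if length > max_len:
            # Partial sampling of a very large exon: keep the first max_len bp.
            end = start + max_len
            length = max_len
        if bp_per_gene.get(gene, 0) + length > max_bp_per_gene:
            continue  # per-gene budget exhausted
        if total_bp + length > max_total_bp:
            continue  # overall budget exhausted
        selected.append((chrom, start, end, gene))
        bp_per_gene[gene] = bp_per_gene.get(gene, 0) + length
        total_bp += length
    return selected


def to_bed(selected):
    """Format the selection as tab-separated BED lines."""
    return "\n".join(f"{c}\t{s}\t{e}\t{g}" for c, s, e, g in selected)


# Made-up coordinates; the gene symbols are the ones named in the abstract.
exons = [
    ("chr1", 100, 130, "IFNG"),   # 30 bp: below min_len
    ("chr1", 200, 400, "IFNG"),   # 200 bp: kept
    ("chr1", 500, 1300, "IFNG"),  # 800 bp: truncated to 500 bp
    ("chr2", 50, 450, "TLR1"),    # 400 bp: kept
]
default_selection = sample_exons(exons)
tight_selection = sample_exons(exons, max_bp_per_gene=600)
bed_text = to_bed(default_selection)
```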

  11. Unraveling gene regulatory networks from time-resolved gene expression data -- a measures comparison study

    PubMed Central

    2011-01-01

    Background Inferring regulatory interactions between genes from transcriptomics time-resolved data, yielding reverse engineered gene regulatory networks, is of paramount importance to systems biology and bioinformatics studies. Accurate methods to address this problem can ultimately provide a deeper insight into the complexity, behavior, and functions of the underlying biological systems. However, the large number of interacting genes coupled with short and often noisy time-resolved read-outs of the system renders the reverse engineering a challenging task. Therefore, the development and assessment of methods which are computationally efficient, robust against noise, applicable to short time series data, and preferably capable of reconstructing the directionality of the regulatory interactions remains a pressing research problem with valuable applications. Results Here we perform the largest systematic analysis of a set of similarity measures and scoring schemes within the scope of the relevance network approach which are commonly used for gene regulatory network reconstruction from time series data. In addition, we define and analyze several novel measures and schemes which are particularly suitable for short transcriptomics time series. We also compare the considered 21 measures and 6 scoring schemes according to their ability to correctly reconstruct such networks from short time series data by calculating summary statistics based on the corresponding specificity and sensitivity. Our results demonstrate that rank and symbol based measures have the highest performance in inferring regulatory interactions. In addition, the proposed scoring scheme by asymmetric weighting has been shown to be valuable in reducing the number of false positive interactions. On the other hand, Granger causality as well as information-theoretic measures, frequently used in inference of regulatory networks, show low performance on the short time series analyzed in this study.
Conclusions Our

  12. Genetic regulatory signatures underlying islet gene expression and type 2 diabetes

    PubMed Central

    Varshney, Arushi; Scott, Laura J.; Welch, Ryan P.; Erdos, Michael R.; Chines, Peter S.; Narisu, Narisu; Albanus, Ricardo D’O.; Orchard, Peter; Wolford, Brooke N.; Kursawe, Romy; Vadlamudi, Swarooparani; Cannon, Maren E.; Didion, John P.; Hensley, John; Kirilusha, Anthony; Bonnycastle, Lori L.; Taylor, D. Leland; Watanabe, Richard; Mohlke, Karen L.; Boehnke, Michael; Collins, Francis S.; Parker, Stephen C. J.; Stitzel, Michael L.

    2017-01-01

    Genome-wide association studies (GWAS) have identified >100 independent SNPs that modulate the risk of type 2 diabetes (T2D) and related traits. However, the pathogenic mechanisms of most of these SNPs remain elusive. Here, we examined genomic, epigenomic, and transcriptomic profiles in human pancreatic islets to understand the links between genetic variation, chromatin landscape, and gene expression in the context of T2D. We first integrated genome and transcriptome variation across 112 islet samples to produce dense cis-expression quantitative trait loci (cis-eQTL) maps. Additional integration with chromatin-state maps for islets and other diverse tissue types revealed that cis-eQTLs for islet-specific genes are specifically and significantly enriched in islet stretch enhancers. High-resolution chromatin accessibility profiling using assay for transposase-accessible chromatin sequencing (ATAC-seq) in two islet samples enabled us to identify specific transcription factor (TF) footprints embedded in active regulatory elements, which are highly enriched for islet cis-eQTL. Aggregate allelic bias signatures in TF footprints enabled us de novo to reconstruct TF binding affinities genetically, which support the high-quality nature of the TF footprint predictions. Interestingly, we found that T2D GWAS loci were strikingly and specifically enriched in islet Regulatory Factor X (RFX) footprints. Remarkably, within and across independent loci, T2D risk alleles that overlap with RFX footprints uniformly disrupt the RFX motifs at high-information content positions. Together, these results suggest that common regulatory variations have shaped islet TF footprints and the transcriptome and that a confluent RFX regulatory grammar plays a significant role in the genetic component of T2D predisposition. PMID:28193859

  13. Genetic regulatory signatures underlying islet gene expression and type 2 diabetes.

    PubMed

    Varshney, Arushi; Scott, Laura J; Welch, Ryan P; Erdos, Michael R; Chines, Peter S; Narisu, Narisu; Albanus, Ricardo D'O; Orchard, Peter; Wolford, Brooke N; Kursawe, Romy; Vadlamudi, Swarooparani; Cannon, Maren E; Didion, John P; Hensley, John; Kirilusha, Anthony; Bonnycastle, Lori L; Taylor, D Leland; Watanabe, Richard; Mohlke, Karen L; Boehnke, Michael; Collins, Francis S; Parker, Stephen C J; Stitzel, Michael L

    2017-02-28

    Genome-wide association studies (GWAS) have identified >100 independent SNPs that modulate the risk of type 2 diabetes (T2D) and related traits. However, the pathogenic mechanisms of most of these SNPs remain elusive. Here, we examined genomic, epigenomic, and transcriptomic profiles in human pancreatic islets to understand the links between genetic variation, chromatin landscape, and gene expression in the context of T2D. We first integrated genome and transcriptome variation across 112 islet samples to produce dense cis-expression quantitative trait loci (cis-eQTL) maps. Additional integration with chromatin-state maps for islets and other diverse tissue types revealed that cis-eQTLs for islet-specific genes are specifically and significantly enriched in islet stretch enhancers. High-resolution chromatin accessibility profiling using assay for transposase-accessible chromatin sequencing (ATAC-seq) in two islet samples enabled us to identify specific transcription factor (TF) footprints embedded in active regulatory elements, which are highly enriched for islet cis-eQTL. Aggregate allelic bias signatures in TF footprints enabled us de novo to reconstruct TF binding affinities genetically, which support the high-quality nature of the TF footprint predictions. Interestingly, we found that T2D GWAS loci were strikingly and specifically enriched in islet Regulatory Factor X (RFX) footprints. Remarkably, within and across independent loci, T2D risk alleles that overlap with RFX footprints uniformly disrupt the RFX motifs at high-information content positions. Together, these results suggest that common regulatory variations have shaped islet TF footprints and the transcriptome and that a confluent RFX regulatory grammar plays a significant role in the genetic component of T2D predisposition.

  14. Regulatory component analysis: a semi-blind extraction approach to infer gene regulatory networks with imperfect biological knowledge

    PubMed Central

    Wang, Chen; Xuan, Jianhua; Shih, Ie-Ming; Clarke, Robert; Wang, Yue

    2011-01-01

    With the advent of high-throughput biotechnology capable of monitoring genomic signals, it becomes increasingly promising to understand molecular cellular mechanisms through systems biology approaches. One of the active research topics in systems biology is to infer gene transcriptional regulatory networks using various genomic data; this inference problem can be formulated as a linear model with latent signals associated with some regulatory proteins called transcription factors (TFs). As common statistical assumptions may not hold for genomic signals, typical latent variable algorithms such as independent component analysis (ICA) are incapable of revealing the underlying true regulatory signals. Liao et al. [1] proposed to perform inference using an approach named network component analysis (NCA), the optimization of which is achieved by a least-squares fitting approach with biological knowledge constraints. However, the incompleteness of biological knowledge and its inconsistency with gene expression data are not considered in the original NCA solution, which could greatly affect the inference accuracy. To overcome these limitations, we propose a linear extraction scheme, namely regulatory component analysis (RCA), to infer underlying regulatory signals even with partial biological knowledge. Numerical simulations show a significant improvement of our proposed RCA over NCA, not only when the signal-to-noise ratio (SNR) is low, but also when the given biological knowledge is incomplete and inconsistent with gene expression data. Furthermore, real biological experiments on E. coli are performed for regulatory network inference in comparison with several typical linear latent variable methods, which again demonstrates the effectiveness and improved performance of the proposed algorithm. PMID:22685363
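
    The linear latent-signal model and the constrained least-squares fit attributed to NCA in this abstract can be written compactly. The notation is an assumption made here for illustration, since the abstract gives no symbols: $E$ is the genes-by-samples expression matrix, $A$ the genes-by-TFs regulatory strength matrix, $P$ the TFs-by-samples activity matrix, $\Gamma$ a noise term, and $Z_0$ the set of matrices whose zero pattern matches the prior connectivity knowledge.

```latex
E = A P + \Gamma, \qquad
\min_{A,\,P}\ \lVert E - A P \rVert_F^2
\quad \text{subject to } A \in Z_0
```

    Under this reading, incomplete or inconsistent biological knowledge corresponds to a misspecified $Z_0$, which is the situation the abstract says RCA is designed to tolerate.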

  15. Characterization of the human glycogenin-1 gene: identification of a muscle-specific regulatory domain.

    PubMed

    van Maanen, M H; Fournier, P A; Palmer, T N; Abraham, L J

    1999-07-08

    The de-novo synthesis of glycogen is now known to involve a novel class of self-glucosylating protein primers. In mammalian skeletal muscle, glycogenin-1 is the protein responsible for this initiation step. Northern blot analysis revealed that glycogenin-1 gene transcription is differentially regulated in the C2C12 mouse muscle cell line. To define the regulatory elements that control expression of the glycogenin-1 gene, we have cloned and characterized the genomic structure of the human glycogenin-1 gene and its promoter region. This gene consists of seven exons and six introns, and spans over 13kb. Transcription of human glycogenin-1 is initiated at two major sites, 80 and 86bp upstream from the initiation of translation codon. Nucleotide sequence analysis of 2.1kb of the 5'-flanking region revealed the proximal promoter contains both a TATA box and two putative Sp1 binding sites located in a CpG island. There are numerous binding sites for developmental and cell-type-specific transcription factors, including AP-1, AP-2, GATA, and several potential Oct 1 binding domains. There are also nine consensus E-boxes that bind the basic helix-loop-helix family of muscle-specific transcription factors. The transcriptional activity of the glycogenin-1 gene was investigated by transient transfection of the 5'-flanking region in HepG2 cells and C2C12 myoblasts and myotubes. These results permitted the definition of a minimal 232bp promoter fragment that is responsible for basal level transcription in a cell-type-independent manner. Furthermore, we have identified a regulatory region located between -2076 and -1736 of the 5'-flanking region of the human glycogenin-1 gene that allows myotube-specific expression in C2C12 cells.

  16. BCIP: a gene-centered platform for identifying potential regulatory genes in breast cancer

    PubMed Central

    Wu, Jiaqi; Hu, Shuofeng; Chen, Yaowen; Li, Zongcheng; Zhang, Jian; Yuan, Hanyu; Shi, Qiang; Shao, Ningsheng; Ying, Xiaomin

    2017-01-01

    Breast cancer is a disease with high heterogeneity. Many issues on tumorigenesis and progression are still elusive. It is critical to identify genes that play important roles in the progression of tumors, especially for tumors with poor prognosis such as basal-like breast cancer and tumors in very young women. To facilitate the identification of potential regulatory or driver genes, we present the Breast Cancer Integrative Platform (BCIP, http://omics.bmi.ac.cn/bcancer/). BCIP maintains multi-omics data selected with strict quality control and processed with uniform normalization methods, including gene expression profiles from 9,005 tumor and 376 normal tissue samples, copy number variation information from 3,035 tumor samples, microRNA-target interactions, co-expressed genes, KEGG pathways, and mammary tissue-specific gene functional networks. This platform provides a user-friendly interface integrating comprehensive and flexible analysis tools on differential gene expression, copy number variation, and survival analysis. The prominent characteristic of BCIP is that users can perform analysis by customizing subgroups with single or combined clinical features, including subtypes, histological grades, pathologic stages, metastasis status, lymph node status, ER/PR/HER2 status, TP53 mutation status, menopause status, age, tumor size, therapy responses, and prognosis. BCIP will help to identify regulatory or driver genes and candidate biomarkers for further research in breast cancer. PMID:28327601

  17. Nucleotide sequence of the coat protein gene of canine parvovirus.

    PubMed Central

    Rhode, S L

    1985-01-01

    The nucleotide sequence of the canine parvovirus (CPV2) from map units 33 to 95 has been determined. This includes the entire coat protein gene and noncoding sequences at the 3' end of the gene, exclusive of the terminal inverted repeat. The predicted capsid protein structures are discussed and compared with those of the rodent parvoviruses H-1 and MVM. PMID:3989914

  18. Identification of single nucleotide polymorphism in protein phosphatase 1 regulatory subunit 11 gene in Murrah bulls

    PubMed Central

    Jain, Varsha; Patel, Brijesh; Umar, Farhat Paul; Ajithakumar, H. M.; Gurjar, Suraj K.; Gupta, I. D.; Verma, Archana

    2017-01-01

    Aim: This study was conducted with the objective to identify single nucleotide polymorphism (SNP) in protein phosphatase 1 regulatory subunit 11 (PPP1R11) gene in Murrah bulls. Materials and Methods: Genomic DNA was isolated by phenol–chloroform extraction method from the frozen semen samples of 65 Murrah bulls maintained at Artificial Breeding Research Centre, ICAR-National Dairy Research Institute, Karnal. The quality and concentration of DNA were checked by spectrophotometer reading and agarose gel electrophoresis. The target region of PPP1R11 gene was amplified using four sets of primers designed based on the Bos taurus reference sequence. The amplified products were sequenced and aligned using Clustal Omega for identification of SNPs. Animals were genotyped by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) using EcoNI restriction enzyme. Results: The sequences in the NCBI accession number NW_005785016.1 for Bubalus bubalis were compared and aligned with the edited sequences of Murrah bulls with Clustal Omega software. A total of 10 SNPs were found, of which 1 was in the 5’UTR, 3 in intron 1, and 6 in intron 2. PCR-RFLP using restriction enzyme EcoNI revealed only AA genotype indicating monomorphism in PPP1R11 gene of all Murrah animals included in the study. Conclusion: A total of 10 SNPs were found. PCR-RFLP revealed only AA genotype indicating monomorphism in PPP1R11 gene of all Murrah animals included in the study, due to which association analysis with conception rate was not feasible. PMID:28344410

  19. Coordinated regulation of biosynthetic and regulatory genes coincides with anthocyanin accumulation in developing eggplant fruit

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Violet to black pigmentation of eggplant (Solanum melongena) fruit is attributed to anthocyanin accumulation. Model systems support the interaction of biosynthetic and regulatory genes for anthocyanin biosynthesis. Anthocyanin structural gene transcription requires the expression of at least one m...

  20. Segregation of cardiac and skeletal muscle-specific regulatory elements of the beta-myosin heavy chain gene.

    PubMed Central

    Rindt, H; Knotts, S; Robbins, J

    1995-01-01

    The beta-myosin heavy chain (beta-MyHC) gene is expressed in cardiac and slow skeletal muscles. To examine the regulatory sequences that are required for the gene's expression in the two compartments in vivo, we analyzed the expression pattern of a transgene consisting of the beta-MyHC gene 5' upstream region linked to the chloramphenicol acetyltransferase reporter gene. By using 5600 bp of 5' upstream region, the transgene was expressed at high levels in the slow skeletal muscles. Decreased levels of thyroid hormone led to the up-regulation of the transgene in both cardiac and skeletal muscles, mimicking the behavior of the endogenous beta-MyHC gene. After deleting the distal 5000 bp, the level of reporter gene expression was strongly reduced. However, decreased levels of thyroid hormone led to an 80-fold skeletal muscle-specific increase in transgene expression, even upon the ablation of a conserved cis-regulatory element termed MCAT, which under normal (euthyroid) conditions abolishes muscle-specific expression. In contrast, cardiac-specific induction was not detected with the deletion construct. These observations indicate that the cardiac and skeletal muscle regulatory elements can be functionally segregated on the beta-MyHC gene promoter. PMID:7878016

  1. Segregation of cardiac and skeletal muscle-specific regulatory elements of the beta-myosin heavy chain gene.

    PubMed

    Rindt, H; Knotts, S; Robbins, J

    1995-02-28

    The beta-myosin heavy chain (beta-MyHC) gene is expressed in cardiac and slow skeletal muscles. To examine the regulatory sequences that are required for the gene's expression in the two compartments in vivo, we analyzed the expression pattern of a transgene consisting of the beta-MyHC gene 5' upstream region linked to the chloramphenicol acetyltransferase reporter gene. By using 5600 bp of 5' upstream region, the transgene was expressed at high levels in the slow skeletal muscles. Decreased levels of thyroid hormone led to the up-regulation of the transgene in both cardiac and skeletal muscles, mimicking the behavior of the endogenous beta-MyHC gene. After deleting the distal 5000 bp, the level of reporter gene expression was strongly reduced. However, decreased levels of thyroid hormone led to an 80-fold skeletal muscle-specific increase in transgene expression, even upon the ablation of a conserved cis-regulatory element termed MCAT, which under normal (euthyroid) conditions abolishes muscle-specific expression. In contrast, cardiac-specific induction was not detected with the deletion construct. These observations indicate that the cardiac and skeletal muscle regulatory elements can be functionally segregated on the beta-MyHC gene promoter.

  2. A regulatory gene network related to the porcine umami taste receptor (TAS1R1/TAS1R3).

    PubMed

    Kim, J M; Ren, D; Reverter, A; Roura, E

    2016-02-01

    Taste perception plays an important role in the mediation of food choices in mammals. The first porcine taste receptor genes identified, sequenced and characterized, TAS1R1 and TAS1R3, were related to the dimeric receptor for umami taste. However, little is known about their regulatory network. The objective of this study was to unfold the genetic network involved in porcine umami taste perception. We performed a meta-analysis of 20 gene expression studies spanning 480 porcine microarray chips and screened 328 taste-related genes by selective mining steps among the available 12,320 genes. A porcine umami taste-specific regulatory network was constructed based on the normalized coexpression data of the 328 genes across 27 tissues. From the network, we revealed the 'taste module' and identified a coexpression cluster for the umami taste according to the first connector with the TAS1R1/TAS1R3 genes. Our findings identify several taste-related regulatory genes and extend previous genetic background of porcine umami taste.

  3. Stochastic models and numerical algorithms for a class of regulatory gene networks.

    PubMed

    Fournier, Thomas; Gabriel, Jean-Pierre; Pasquier, Jerôme; Mazza, Christian; Galbete, José; Mermod, Nicolas

    2009-08-01

    Regulatory gene networks contain generic modules, like those involving feedback loops, which are essential for the regulation of many biological functions (Guido et al. in Nature 439:856-860, 2006). We consider a class of self-regulated genes which are the building blocks of many regulatory gene networks, and study the steady-state distribution of the associated Gillespie algorithm by providing efficient numerical algorithms. We also study a regulatory gene network of interest in gene therapy, using mean-field models with time delays. Convergence of the related time-nonhomogeneous Markov chain is established for a class of linear catalytic networks with feedback loops.
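
    The stochastic simulation of a self-regulated gene mentioned in this abstract can be illustrated with a minimal Gillespie-style sketch. The two-state promoter, the reaction set, and every parameter name and value below (`mu`, `nu`, `k_on`, `k_off`, `feedback`) are assumptions chosen for illustration; they are not taken from the paper, whose contribution is numerical algorithms for the steady-state distribution rather than this plain simulation.

```python
import random


def gillespie_self_regulated(t_end, mu=10.0, nu=1.0, k_on=0.5, k_off=0.5,
                             feedback=0.1, seed=0):
    """Simulate a hypothetical self-regulated gene with the Gillespie algorithm.

    State: protein count n and a promoter that is either on or off.
    Positive feedback is modelled by letting the switch-on rate grow with n.
    Returns a list of (time, n, promoter_on) tuples.
    """
    rng = random.Random(seed)
    t, n, on = 0.0, 0, False
    trajectory = [(t, n, on)]
    while True:
        rates = [
            mu if on else 0.0,                    # protein production
            nu * n,                               # protein degradation
            0.0 if on else k_on + feedback * n,   # promoter switches on
            k_off if on else 0.0,                 # promoter switches off
        ]
        total = sum(rates)
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to its rate.
        r = rng.random() * total
        if r < rates[0]:
            n += 1
        elif r < rates[0] + rates[1]:
            n -= 1
        elif r < rates[0] + rates[1] + rates[2]:
            on = True
        else:
            on = False
        trajectory.append((t, n, on))
    return trajectory
```

    Running many long trajectories and histogramming `n` gives an empirical estimate of the steady-state distribution, which is the quantity the abstract says the authors compute more efficiently.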

  4. A Bayesian Approach to Joint Modeling of Protein-DNA Binding, Gene Expression and Sequence Data

    PubMed Central

    Xie, Yang; Pan, Wei; Jeong, Kyeong S.; Xiao, Guanghua; Khodursky, Arkady B.

    2012-01-01

    The genome-wide DNA-protein binding data, DNA sequence data and gene expression data represent complementary means to deciphering global and local transcriptional regulatory circuits. Combining these different types of data can not only improve the statistical power, but also provide a more comprehensive picture of gene regulation. In this paper, we propose a novel statistical model to augment protein-DNA binding data with gene expression and DNA sequence data when available. We specify a hierarchical Bayes model and use Markov chain Monte Carlo simulations to draw inferences. Both simulation studies and an analysis of an experimental dataset show that the proposed joint modeling method can significantly improve the specificity and sensitivity of identifying target genes as compared to conventional approaches relying on a single data source. PMID:20049751

  5. The impact of gene expression variation on the robustness and evolvability of a developmental gene regulatory network.

    PubMed

    Garfield, David A; Runcie, Daniel E; Babbitt, Courtney C; Haygood, Ralph; Nielsen, William J; Wray, Gregory A

    2013-10-01

    Regulatory interactions buffer development against genetic and environmental perturbations, but adaptation requires phenotypes to change. We investigated the relationship between robustness and evolvability within the gene regulatory network underlying development of the larval skeleton in the sea urchin Strongylocentrotus purpuratus. We find extensive variation in gene expression in this network throughout development in a natural population, some of which has a heritable genetic basis. Switch-like regulatory interactions predominate during early development, buffer expression variation, and may promote the accumulation of cryptic genetic variation affecting early stages. Regulatory interactions during later development are typically more sensitive (linear), allowing variation in expression to affect downstream target genes. Variation in skeletal morphology is associated primarily with expression variation of a few, primarily structural, genes at terminal positions within the network. These results indicate that the position and properties of gene interactions within a network can have important evolutionary consequences independent of their immediate regulatory role.

  6. Comparative analysis identifies exonic splicing regulatory sequences--The complex definition of enhancers and silencers.

    PubMed

    Goren, Amir; Ram, Oren; Amit, Maayan; Keren, Hadas; Lev-Maor, Galit; Vig, Ida; Pupko, Tal; Ast, Gil

    2006-06-23

    Exonic splicing regulatory sequences (ESRs) are cis-acting factor binding sites that regulate constitutive and alternative splicing. A computational method based on the conservation level of wobble positions and the overabundance of sequence motifs between 46,103 human and mouse orthologous exons was developed, identifying 285 putative ESRs. Alternatively spliced exons that are either short in length or contain weak splice sites show the highest conservation level of those ESRs, especially toward the edges of exons. ESRs that are abundant in those subgroups show a different distribution between constitutively and alternatively spliced exons. Representatives of these ESRs and two SR protein binding sites were shown, experimentally, to display variable regulatory effects on alternative splicing, depending on their relative locations in the exon. This finding signifies the delicate positional effect of ESRs on alternative splicing regulation.

  7. Improving gene regulatory network inference using network topology information.

    PubMed

    Nair, Ajay; Chetty, Madhu; Wangikar, Pramod P

    2015-09-01

    Inferring the gene regulatory network (GRN) structure from data is an important problem in computational biology. However, it is a computationally complex problem and approximate methods such as heuristic search techniques, restriction of the maximum-number-of-parents (maxP) for a gene, or an optimal search under special conditions are required. The limitations of a heuristic search are well known but literature on the detailed analysis of the widely used maxP technique is lacking. The optimal search methods require large computational time. We report the theoretical analysis and experimental results of the strengths and limitations of the maxP technique. Further, using an optimal search method, we combine the strengths of the maxP technique and the known GRN topology to propose two novel algorithms. These algorithms are implemented in a Bayesian network framework and tested on biological, realistic, and in silico networks of different sizes and topologies. They overcome the limitations of the maxP technique and show superior computational speed when compared to the current optimal search algorithms.
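
    The effect of the maximum-number-of-parents (maxP) restriction described in this abstract can be made concrete by counting candidate parent sets. The sketch below is only an illustration of the search-space restriction; the function names are hypothetical and it does not reproduce the two algorithms the authors propose.

```python
from itertools import combinations
from math import comb


def candidate_parent_sets(genes, target, max_parents):
    """Yield every parent set for `target` with at most `max_parents` genes."""
    others = [g for g in genes if g != target]
    for k in range(max_parents + 1):
        yield from combinations(others, k)


def count_parent_sets(n_genes, max_parents):
    """Number of candidate parent sets per gene under the maxP restriction."""
    return sum(comb(n_genes - 1, k) for k in range(max_parents + 1))
```

    For 10 genes, `count_parent_sets(10, 9)` gives 512 unrestricted parent sets per gene, while `count_parent_sets(10, 3)` gives 130. The restriction buys speed at the cost of missing any gene whose true regulators outnumber maxP, which is the kind of limitation the abstract says the authors analyze.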

  8. An algebra-based method for inferring gene regulatory networks

    PubMed Central

    2014-01-01

    Background: The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. Results: This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also
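
    The idea of encoding a Boolean network as polynomials, as in the BPDS framework named in this abstract, can be sketched with arithmetic modulo 2. The three-gene network below is a made-up toy example, not a model from the paper, and the sketch covers only the representation and synchronous update, not the evolutionary inference algorithm.

```python
def NOT(x):
    """Boolean NOT as the polynomial 1 + x over the field with two elements."""
    return (1 + x) % 2


def AND(x, y):
    """Boolean AND as the polynomial x * y."""
    return (x * y) % 2


def OR(x, y):
    """Boolean OR as the polynomial x + y + x * y."""
    return (x + y + x * y) % 2


def step(state, rules):
    """Apply every update polynomial to the current state synchronously."""
    return tuple(rule(state) for rule in rules)


def trajectory(state, rules, n_steps):
    """Return the list of states visited over n_steps synchronous updates."""
    states = [state]
    for _ in range(n_steps):
        state = step(state, rules)
        states.append(state)
    return states


# Hypothetical toy network: gene 3 represses gene 1, gene 1 activates gene 2,
# and gene 3 needs both gene 1 and gene 2.
toy_rules = [
    lambda s: NOT(s[2]),
    lambda s: s[0],
    lambda s: AND(s[0], s[1]),
]
```

    Calling `trajectory((1, 0, 0), toy_rules, 5)` walks through a five-state cycle and returns to `(1, 0, 0)`. Time series of this kind are the input the abstract says the inference algorithm takes.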

  9. Identification of a lens-specific regulatory region (LSR) of the murine alpha B-crystallin gene.

    PubMed

    Gopal-Srivastava, R; Piatigorsky, J

    1994-04-11

    Previous studies have shown that the -661/+44 sequence of the murine alpha B-crystallin gene contains a muscle-preferred enhancer (-426/-257) and can drive the bacterial chloramphenicol acetyltransferase (CAT) gene in the lens, skeletal muscle and heart of transgenic mice. Here we show that transgenic mice carrying a truncated -164/+44 fragment of the alpha B-crystallin gene fused to the CAT gene expressed exclusively in the lens; by contrast mice carrying a -426/+44 fragment of the alpha B gene fused to CAT expressed highly in the lens, skeletal muscle and heart, and slightly in the lung, brain, kidney, spleen and liver. DNase I protection experiments indicated that the -147/-118 sequence is protected by nuclear proteins from alpha TN4-1 lens cell line, but not by nuclear proteins from myotubes of the C2C12 cell line. Site directed mutagenesis of this sequence decreased promoter activity in transiently-transfected lens cells, consistent with this sequence being a lens-specific regulatory region (LSR). We conclude that the -426/-257 enhancer is required for expression in skeletal muscle, heart and possibly other tissues, and that the -164/+44 sequence of the alpha B-crystallin gene is sufficient for expression in the lens of transgenic mice.

  10. The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome.

    PubMed

    Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A

    2015-01-01

    A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser.
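
    The description of a TTS as a polypurine stretch suggests a simple scan for runs of A and G. The sketch below is an assumption-laden illustration only: the minimum length of 15 is an arbitrary placeholder, the function name is hypothetical, only the given strand is scanned, and nothing here reflects how the TTSMI database actually defines or scores its sites.

```python
import re


def find_polypurine_runs(sequence, min_len=15):
    """Return (start, end, run) for each run of purines (A/G) of min_len or more.

    Coordinates are 0-based with an exclusive end, as in Python slicing.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [(m.start(), m.end(), m.group())
            for m in pattern.finditer(sequence.upper())]
```

    A genome-scale search would also need to handle the complementary strand and imperfect runs, which this sketch leaves out.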

  11. ZNF143 provides sequence specificity to secure chromatin interactions at gene promoters

    PubMed Central

    Bailey, Swneke D.; Zhang, Xiaoyang; Desai, Kinjal; Aid, Malika; Corradin, Olivia; Cowper-Sal·lari, Richard; Akhtar-Zaidi, Batool; Scacheri, Peter C.; Haibe-Kains, Benjamin; Lupien, Mathieu

    2015-01-01

    Chromatin interactions connect distal regulatory elements to target gene promoters guiding stimulus- and lineage-specific transcription. Few factors securing chromatin interactions have so far been identified. Here by integrating chromatin interaction maps with the large collection of transcription factor binding profiles provided by the ENCODE project, we demonstrate that the zinc-finger protein ZNF143 preferentially occupies anchors of chromatin interactions connecting promoters with distal regulatory elements. It binds directly to promoters and associates with lineage-specific chromatin interactions and gene expression. Silencing ZNF143 or modulating its DNA-binding affinity using single nucleotide polymorphisms (SNPs) as a surrogate of site-directed mutagenesis reveals the sequence dependency of chromatin interactions at gene promoters. We also find that chromatin interactions alone do not regulate gene expression. Together, our results identify ZNF143 as a novel chromatin-looping factor that contributes to the architectural foundation of the genome by providing sequence specificity at promoters connected with distal regulatory elements. PMID:25645053

  12. Sequence determinants of prokaryotic gene expression level under heat stress.

    PubMed

    Xiong, Heng; Yang, Yi; Hu, Xiao-Pan; He, Yi-Ming; Ma, Bin-Guang

    2014-11-01

    Prokaryotic gene expression is environment-dependent and temperature plays an important role in shaping the gene expression profile. Revealing the regulation mechanisms of gene expression pertaining to temperature has attracted tremendous efforts in recent years, particularly owing to the yielding of transcriptome and proteome data by high-throughput techniques. However, most of the previous works concentrated on the characterization of the gene expression profile of individual organisms and little effort has been made to disclose the commonality among organisms, especially for the gene sequence features. In this report, we collected the transcriptome and proteome data measured under heat stress condition from recently published literature and studied the sequence determinants for the expression level of heat-responsive genes on multiple layers. Our results showed that there indeed exist commonalities and consistent patterns of the sequence features among organisms for the differentially expressed genes under heat stress condition. Some features are attributed to the requirement of thermostability while some are dominated by gene function. The revealed sequence determinants of bacterial gene expression level under heat stress complement the knowledge about the regulation factors of prokaryotic gene expression responding to the change of environmental conditions. Furthermore, comparisons to thermophilic adaption have been performed to reveal the similarity and dissimilarity of the sequence determinants for the response to heat stress and for the adaption to high habitat temperature, which elucidates the complex landscape of gene expression related to the same physical factor of temperature.

  13. Searching gene and protein sequence databases.

    PubMed

    Barsalou, T; Brutlag, D L

    1991-01-01

    A large-scale effort to map and sequence the human genome is now under way. Crucial to the success of this research is a group of computer programs that analyze and compare data on molecular sequences. This article describes the classic algorithms for similarity searching and sequence alignment. Because good performance of these algorithms is critical to searching very large and growing databases, we analyze the running times of the algorithms and discuss recent improvements in this area.
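
    One classic alignment algorithm of the kind this abstract refers to is Smith-Waterman local alignment, which also shows where the running-time concern comes from. The abstract does not name the algorithms it covers, so the choice here is an assumption, as are the scoring values (`match`, `mismatch`, `gap`) and the simple linear gap penalty.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Dynamic programming over a len(a) x len(b) grid, keeping only two rows,
    so time is O(len(a) * len(b)) and memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

    Because the cost grows with the product of the two lengths, scanning a query against every entry of a large database this way is slow, which is why faster heuristic search methods matter for growing databases.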

  14. Inferring orthologous gene regulatory networks using interspecies data fusion

    PubMed Central

    Penfold, Christopher A.; Millar, Jonathan B. A.; Wild, David L.

    2015-01-01

    Motivation: The ability to jointly learn gene regulatory networks (GRNs) in, or leverage GRNs between related species would allow the vast amount of legacy data obtained in model organisms to inform the GRNs of more complex, or economically or medically relevant counterparts. Examples include transferring information from Arabidopsis thaliana into related crop species for food security purposes, or from mice into humans for medical applications. Here we develop two related Bayesian approaches to network inference that allow GRNs to be jointly inferred in, or leveraged between, several related species: in one framework, network information is directly propagated between species; in the second hierarchical approach, network information is propagated via an unobserved ‘hypernetwork’. In both frameworks, information about network similarity is captured via graph kernels, with the networks additionally informed by species-specific time series gene expression data, when available, using Gaussian processes to model the dynamics of gene expression. Results: Results on in silico benchmarks demonstrate that joint inference, and leveraging of known networks between species, offers better accuracy than standalone inference. The direct propagation of network information via the non-hierarchical framework is more appropriate when there are relatively few species, while the hierarchical approach is better suited when there are many species. Both methods are robust to small amounts of mislabelling of orthologues. Finally, the use of Saccharomyces cerevisiae data and networks to inform inference of networks in the fission yeast Schizosaccharomyces pombe predicts a novel role in cell cycle regulation for Gas1 (SPAC19B12.02c), a 1,3-beta-glucanosyltransferase. Availability and implementation: MATLAB code is available from http://go.warwick.ac.uk/systemsbiology/software/. Contact: d.l.wild@warwick.ac.uk Supplementary information: Supplementary data are available at Bioinformatics

  15. A new approach for modelling gene regulatory networks using fuzzy petri nets.

    PubMed

    Hamed, Raed I; Ahson, S I; Parveen, R

    2010-02-04

    Gene Regulatory Networks are models of genes and gene interactions at the expression level. The advent of microarray technology has challenged computer scientists to develop better algorithms for modeling the underlying regulatory relationships between genes. Fuzzy systems can search microarray datasets for activator/repressor regulatory relationships. In this paper, we present a fuzzy reasoning model based on the Fuzzy Petri Net. The model considers regulatory triplets by predicting changes in the expression level of the target based on the input expression level. This method eliminates possible false predictions from the classical fuzzy model, thereby allowing a wider search space for inferring regulatory relationships. Through formalization of fuzzy reasoning, we propose an approach to construct a rule-based reasoning system. The experimental results show that the proposed approach is feasible and acceptable for predicting changes in the expression level of the target gene.

  16. Regulatory region in choline acetyltransferase gene directs developmental and tissue-specific expression in transgenic mice.

    PubMed Central

    Lönnerberg, P; Lendahl, U; Funakoshi, H; Arhlund-Richter, L; Persson, H; Ibáñez, C F

    1995-01-01

    Acetylcholine, one of the main neurotransmitters in the nervous system, is synthesized by the enzyme choline acetyltransferase (ChAT; acetyl-CoA:choline O-acetyltransferase, EC 2.3.1.6). The molecular mechanisms controlling the establishment, maintenance, and plasticity of the cholinergic phenotype in vivo are largely unknown. A previous report showed that a 3800-bp, but not a 1450-bp, 5' flanking segment from the rat ChAT gene promoter directed cell type-specific expression of a reporter gene in cholinergic cells in vitro. Now we have characterized a distal regulatory region of the ChAT gene that confers cholinergic specificity on a heterologous downstream promoter in a cholinergic cell line and in transgenic mice. A 2342-bp segment from the 5' flanking region of the ChAT gene behaved as an enhancer in cholinergic cells but as a repressor in noncholinergic cells in an orientation-independent manner. Combined with a heterologous basal promoter, this fragment targeted transgene expression to several cholinergic regions of the central nervous system of transgenic mice, including basal forebrain, cortex, pons, and spinal cord. In eight independent transgenic lines, the pattern of transgene expression paralleled qualitatively and quantitatively that displayed by endogenous ChAT mRNA in various regions of the rat central nervous system. In the lumbar enlargement of the spinal cord, 85-90% of the transgene expression was targeted to the ventral part of the cord, where cholinergic alpha-motor neurons are located. Transgene expression in the spinal cord was developmentally regulated and responded to nerve injury in a similar way as the endogenous ChAT gene, indicating that the 2342-bp regulatory sequence contains elements controlling the plasticity of the cholinergic phenotype in developing and injured neurons. PMID:7732028

  17. Overproduction of lactimidomycin by cross-overexpression of genes encoding Streptomyces antibiotic regulatory proteins.

    PubMed

    Zhang, Bo; Yang, Dong; Yan, Yijun; Pan, Guohui; Xiang, Wensheng; Shen, Ben

    2016-03-01

    The glutarimide-containing polyketides represent a fascinating class of natural products that exhibit a multitude of biological activities. We have recently cloned and sequenced the biosynthetic gene clusters for three members of the glutarimide-containing polyketides: iso-migrastatin (iso-MGS) from Streptomyces platensis NRRL 18993, lactimidomycin (LTM) from Streptomyces amphibiosporus ATCC 53964, and cycloheximide (CHX) from Streptomyces sp. YIM56141. Comparative analysis of the three clusters identified mgsA and chxA, from the mgs and chx gene clusters, respectively, that were predicted to encode the PimR-like Streptomyces antibiotic regulatory proteins (SARPs) but failed to reveal any regulatory gene from the ltm gene cluster. Overexpression of mgsA or chxA in S. platensis NRRL 18993, Streptomyces sp. YIM56141 or SB11024, and a recombinant strain of Streptomyces coelicolor M145 carrying the intact mgs gene cluster had no significant effect on iso-MGS or CHX production, suggesting that MgsA or ChxA regulation may not be rate-limiting for iso-MGS and CHX production in these producers. In contrast, overexpression of mgsA or chxA in S. amphibiosporus ATCC 53964 resulted in a significant increase in LTM production, with LTM titer reaching 106 mg/L, which is five-fold higher than that of the wild-type strain. These results support MgsA and ChxA as members of the SARP family of positive regulators for the iso-MGS and CHX biosynthetic machinery and demonstrate the feasibility of improving glutarimide-containing polyketide production in Streptomyces strains by exploiting common regulators.

  18. Nucleotide sequence of SHV-2 beta-lactamase gene

    SciTech Connect

    Garbarg-Chenon, A.; Godard, V.; Labia, R.; Nicolas, J.C.

    1990-07-01

    The nucleotide sequence of plasmid-mediated beta-lactamase SHV-2 from Salmonella typhimurium (SHV-2pHT1) was determined. The gene was very similar to chromosomally encoded beta-lactamase LEN-1 of Klebsiella pneumoniae. Compared with the sequence of the Escherichia coli SHV-2 enzyme (SHV-2E.coli) obtained by protein sequencing, the deduced amino acid sequence of SHV-2pHT1 differed by three amino acid substitutions.

  19. Optimization of gene sequences under constant mutational pressure and selection

    NASA Astrophysics Data System (ADS)

    Kowalczuk, M.; Gierlik, A.; Mackiewicz, P.; Cebrat, S.; Dudek, M. R.

    1999-12-01

    We have analyzed the influence of constant mutational pressure and selection on the nucleotide composition of DNA sequences of various sizes, which were represented by the genes of the Borrelia burgdorferi genome. With the help of Monte Carlo (MC) simulations we have found that longer DNA sequences accumulate far fewer base substitutions per sequence length than short sequences. This leads us to the conclusion that the accuracy of replication may determine the size of the genome.

  20. Precise cis-regulatory control of spatial and temporal expression of the alx-1 gene in the skeletogenic lineage of S. purpuratus.

    PubMed

    Damle, Sagar; Davidson, Eric H

    2011-09-15

    Deployment of the gene-regulatory network (GRN) responsible for skeletogenesis in the embryo of the sea urchin Strongylocentrotus purpuratus is restricted to the large micromere lineage by a double negative regulatory gate. The gate consists of a GRN subcircuit composed of the pmar1 and hesC genes, which encode repressors and are wired in tandem, plus a set of target regulatory genes under hesC control. The skeletogenic cell state is specified initially by micromere-specific expression of these regulatory genes, viz. alx1, ets1, tbrain and tel, plus the gene encoding the Notch ligand Delta. Here we use a recently developed high throughput methodology for experimental cis-regulatory analysis to elucidate the genomic regulatory system controlling alx1 expression in time and embryonic space. The results entirely confirm the double negative gate control system at the cis-regulatory level, including definition of the functional HesC target sites, and add the crucial new information that the drivers of alx1 expression are initially Ets1, and then Alx1 itself plus Ets1. Cis-regulatory analysis demonstrates that these inputs quantitatively account for the magnitude of alx1 expression. Furthermore, the Alx1 gene product not only performs an auto-regulatory role, promoting a fast rise in alx1 expression, but also, when at high levels, it behaves as an auto-repressor. A synthetic experiment indicates that this behavior is probably due to dimerization. In summary, the results we report provide the sequence level basis for control of alx1 spatial expression by the double negative gate GRN architecture, and explain the rising, then falling temporal expression profile of the alx1 gene in terms of its auto-regulatory genetic wiring.

  1. Sense-antisense gene pairs: sequence, transcription, and structure are not conserved between human and mouse

    PubMed Central

    Wood, Emily J.; Chin-Inmanu, Kwanrutai; Jia, Hui; Lipovich, Leonard

    2013-01-01

    Previous efforts to characterize conservation between the human and mouse genomes focused largely on sequence comparisons. These studies are inherently limited because they don't account for gene structure differences, which may exist despite genomic sequence conservation. Recent high-throughput transcriptome studies have revealed widespread and extensive overlaps between genes, and transcripts, encoded on both strands of the genomic sequence. This overlapping gene organization, which produces sense-antisense (SAS) gene pairs, is capable of effecting regulatory cascades through established mechanisms. We present an evolutionary conservation assessment of SAS pairs, on three levels: genomic, transcriptomic, and structural. From a genome-wide dataset of human SAS pairs, we first identified orthologous loci in the mouse genome, then assessed their transcription in the mouse, and finally compared the genomic structures of SAS pairs expressed in both species. We found that approximately half of human SAS loci have single orthologous locations in the mouse genome; however, only half of those orthologous locations have SAS transcriptional activity in the mouse. This suggests that high human-mouse gene conservation overlooks widespread distinctions in SAS pair incidence and expression. We compared gene structures at orthologous SAS loci, finding frequent differences in gene structure between human and orthologous mouse SAS pair members. Our categorization of human SAS pairs with respect to mouse conservation of expression as well as structure points to limitations of mouse models. Gene structure differences, including at SAS loci, may account for some of the phenotypic distinctions between primates and rodents. Genes in non-conserved SAS pairs may contribute to evolutionary lineage-specific regulatory outcomes. PMID:24133500

  2. Applying Attractor Dynamics to Infer Gene Regulatory Interactions Involved in Cellular Differentiation.

    PubMed

    Ghaffarizadeh, Ahmadreza; Podgorski, Gregory J; Flann, Nicholas S

    2017-02-27

    The dynamics of gene regulatory networks (GRNs) guide cellular differentiation. Determining the ways regulatory genes control expression of their targets is essential to understand and control cellular differentiation. The way a regulatory gene controls its target can be expressed as a gene regulatory function. Manual derivation of these regulatory functions is slow, error-prone and difficult to update as new information arises. Automating this process is a significant challenge and the subject of intensive effort. This work presents a novel approach to discovering biologically plausible gene regulatory interactions that control cellular differentiation. This method integrates known cell type expression data, genetic interactions, and knowledge of the effects of gene knockouts to determine likely GRN regulatory functions. We employ a genetic algorithm to search for candidate GRNs that use a set of transcription factors that control differentiation within a lineage. Nested canalyzing functions are used to constrain the search space to biologically plausible networks. The method identifies an ensemble of GRNs whose dynamics reproduce the gene expression pattern for each cell type within a particular lineage. The method's effectiveness was tested by inferring consensus GRNs for myeloid and pancreatic cell differentiation and comparing the predicted gene regulatory interactions to manually derived interactions. We identified many regulatory interactions reported in the literature and also found differences from published reports. These discrepancies suggest areas for biological studies of myeloid and pancreatic differentiation. We also performed a study that used defined synthetic networks to evaluate the accuracy of the automated search method and found that the search algorithm was able to discover the regulatory interactions in these defined networks with high accuracy. We suggest that the GRN functions derived from the methods described here can be used to fill

  3. Gene regulatory networks and their applications: understanding biological and medical problems in terms of networks

    PubMed Central

    Emmert-Streib, Frank; Dehmer, Matthias; Haibe-Kains, Benjamin

    2014-01-01

    In recent years gene regulatory networks (GRNs) have attracted a lot of interest and many methods have been introduced for their statistical inference from gene expression data. However, despite their popularity, GRNs are widely misunderstood. For this reason, we provide in this paper a general discussion and perspective of gene regulatory networks. Specifically, we discuss their meaning, the consistency among different network inference methods, ensemble methods, the assessment of GRNs, the estimated number of existing GRNs and their usage in different application domains. Furthermore, we discuss open questions and necessary steps in order to utilize gene regulatory networks in a clinical context and for personalized medicine. PMID:25364745

  4. Clinical characteristics and prognosis of acute myeloid leukemia associated with DNA-methylation regulatory gene mutations

    PubMed Central

    Ryotokuji, Takeshi; Yamaguchi, Hiroki; Ueki, Toshimitsu; Usuki, Kensuke; Kurosawa, Saiko; Kobayashi, Yutaka; Kawata, Eri; Tajika, Kenji; Gomi, Seiji; Kanda, Junya; Kobayashi, Anna; Omori, Ikuko; Marumo, Atsushi; Fujiwara, Yusuke; Yui, Shunsuke; Terada, Kazuki; Fukunaga, Keiko; Hirakawa, Tsuneaki; Arai, Kunihito; Kitano, Tomoaki; Kosaka, Fumiko; Tamai, Hayato; Nakayama, Kazutaka; Wakita, Satoshi; Fukuda, Takahiro; Inokuchi, Koiti

    2016-01-01

    In recent years, it has been reported that the frequency of DNA-methylation regulatory gene mutations – mutations of the genes that regulate gene expression through DNA methylation – is high in acute myeloid leukemia. The objective of the present study was to elucidate the clinical characteristics and prognosis of acute myeloid leukemia with associated DNA-methylation regulatory gene mutation. We studied 308 patients with acute myeloid leukemia. DNA-methylation regulatory gene mutations were observed in 135 of the 308 cases (43.8%). Acute myeloid leukemia associated with a DNA-methylation regulatory gene mutation was more frequent in older patients (P<0.0001) and in patients with intermediate cytogenetic risk (P<0.0001) accompanied by a high white blood cell count (P=0.0032). DNA-methylation regulatory gene mutation was an unfavorable prognostic factor for overall survival in the whole cohort (P=0.0018), in patients aged ≤70 years, in patients with intermediate cytogenetic risk, and in FLT3-ITD-negative patients (P=0.0409). Among the patients with DNA-methylation regulatory gene mutations, 26.7% were found to have two or more such mutations and prognosis worsened with increasing number of mutations. In multivariate analysis DNA-methylation regulatory gene mutation was an independent unfavorable prognostic factor for overall survival (P=0.0424). However, patients with a DNA-methylation regulatory gene mutation who underwent allogeneic stem cell transplantation in first remission had a significantly better prognosis than those who did not undergo such transplantation (P=0.0254). Our study establishes that DNA-methylation regulatory gene mutation is an important unfavorable prognostic factor in acute myeloid leukemia. PMID:27247325

  5. Molecular cloning, nucleotide sequence and expression of a Sulfolobus solfataricus gene encoding a class II fumarase.

    PubMed

    Colombo, S; Grisa, M; Tortora, P; Vanoni, M

    1994-01-03

    Fumarase catalyzes the interconversion of L-malate and fumarate. A Sulfolobus solfataricus fumarase gene (fumC) was cloned and sequenced. Typical archaebacterial regulatory sites were identified in the region flanking the fumC open reading frame. The fumC gene encodes a protein of 438 amino acids (47,899 Da) which shows several significant similarities with class II fumarases from both eubacterial and eukaryotic sources as well as with aspartases. S. solfataricus fumarase expressed in Escherichia coli retains enzymatic activity and its thermostability is comparable to that of the enzyme purified from S. solfataricus despite an 11-amino-acid C-terminal deletion.

  6. Creating and validating cis-regulatory maps of tissue-specific gene expression regulation.

    PubMed

    O'Connor, Timothy R; Bailey, Timothy L

    2014-01-01

    Predicting which genomic regions control the transcription of a given gene is a challenge. We present a novel computational approach for creating and validating maps that associate genomic regions (cis-regulatory modules, CRMs) with genes. The method infers regulatory relationships that explain gene expression observed in a test tissue using widely available genomic data for 'other' tissues. To predict the regulatory targets of a CRM, we use cross-tissue correlation between histone modifications present at the CRM and expression at genes within 1 Mbp of it. To validate cis-regulatory maps, we show that they yield more accurate models of gene expression than carefully constructed control maps. These gene expression models predict observed gene expression from transcription factor binding in the CRMs linked to that gene. We show that our maps are able to identify long-range regulatory interactions and improve substantially over maps linking genes and CRMs based on either the control maps or a 'nearest neighbor' heuristic. Our results also show that it is essential to include CRMs predicted in multiple tissues during map-building, that H3K27ac is the most informative histone modification, and that CAGE is the most informative measure of gene expression for creating cis-regulatory maps.
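
The linking step described in this abstract (correlating a histone-mark signal at a CRM with the expression of genes within 1 Mbp, across tissues) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, and the `min_r` cutoff are all assumptions made for illustration, and plain Pearson correlation stands in for whatever correlation measure the paper actually uses.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no correlation information
    return cov / sqrt(var_x * var_y)

def link_crm_to_genes(crm_pos, crm_signal, genes, max_dist=1_000_000, min_r=0.8):
    """Return (gene, r) pairs for genes near the CRM whose expression
    tracks the CRM's histone signal across tissues.

    crm_signal: one value per tissue (e.g. an H3K27ac level).
    genes: {name: (tss_position, [expression per tissue])}.
    max_dist: the 1 Mbp window from the abstract.
    min_r: hypothetical cutoff, not taken from the paper.
    """
    links = []
    for name, (tss, expression) in genes.items():
        if abs(tss - crm_pos) > max_dist:
            continue  # outside the window, never considered
        r = pearson(crm_signal, expression)
        if r >= min_r:
            links.append((name, r))
    return sorted(links, key=lambda pair: -pair[1])

# Toy data for four tissues (made up for illustration).
crm_signal = [1.0, 2.0, 3.0, 4.0]
genes = {
    "geneA": (1_500_000, [2.0, 4.0, 6.0, 8.0]),  # nearby, correlated
    "geneB": (1_200_000, [4.0, 3.0, 2.0, 1.0]),  # nearby, anti-correlated
    "geneC": (3_500_000, [1.0, 2.0, 3.0, 4.0]),  # correlated but 2.5 Mbp away
}
links = link_crm_to_genes(1_000_000, crm_signal, genes)
print(links)
```

In this toy setup only `geneA` is linked: `geneB` fails the correlation cutoff and `geneC` lies outside the distance window, which is the kind of long-range but bounded association a 'nearest neighbor' heuristic would not capture.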

  7. Single molecule targeted sequencing for cancer gene mutation detection

    PubMed Central

    Gao, Yan; Deng, Liwei; Yan, Qin; Gao, Yongqian; Wu, Zengding; Cai, Jinsen; Ji, Daorui; Li, Gailing; Wu, Ping; Jin, Huan; Zhao, Luyang; Liu, Song; Ge, Liangjin; Deem, Michael W.; He, Jiankui

    2016-01-01

    With the rapid decline in the cost of sequencing, it is now affordable to examine multiple genes in a single disease-targeted clinical test using next generation sequencing. Current targeted sequencing methods require a separate step of targeted capture enrichment during sample preparation before sequencing. Although there are fast sample preparation methods available on the market, the library preparation process is still relatively complicated for physicians to use routinely. Here, we introduced an amplification-free Single Molecule Targeted Sequencing (SMTS) technology, which combined targeted capture and sequencing in one step. We demonstrated that this technology can detect low-frequency mutations using artificially synthesized DNA samples. SMTS has several potential advantages, including simple sample preparation, so that no biases or errors are introduced by PCR. SMTS has the potential to be an easy and quick sequencing technology for clinical diagnosis such as cancer gene mutation detection, infectious disease detection, inherited condition screening and noninvasive prenatal diagnosis. PMID:27193446

  8. Assay for transposase-accessible chromatin and circularized chromosome conformation capture, two methods to explore the regulatory landscapes of genes in zebrafish.

    PubMed

    Fernández-Miñán, A; Bessa, J; Tena, J J; Gómez-Skarmeta, J L

    2016-01-01

    Accurate transcriptional control of genes is fundamental for the correct functioning of organs and developmental processes. This control depends on the interplay between the promoter of genes and other noncoding sequences, whose interaction is mediated by 3D chromatin arrangements. Thus, the detailed description of transcriptional regulatory landscapes is essential to understand the mechanisms of transcriptional regulation. However, to achieve that, two important challenges have to be faced: (1) the identification of the noncoding sequences that contribute to gene transcription and (2) the association of these sequences to the respective genes they control. In this chapter, we describe two protocols that allow overcoming these important challenges: the assay for transposase-accessible chromatin using sequencing (ATAC-seq) and circularized chromosome conformation capture (4C-seq). ATAC-seq is a very efficient technique that, using a very low number of cells as starting material, allows the identification of active chromatin regions genome wide, whereas 4C-seq detects the subset of sequences that interact specifically with the promoter of a given gene. When combined, both techniques provide a comprehensive snapshot of the regulatory landscapes of developmental genes. The protocols we present here have been optimized for teleost fish samples, zebrafish and medaka, allowing the in-depth study of transcriptional regulation in these two emerging animal models. Given the amenability and easy genetic manipulation of these two experimental systems, we anticipate that they will be important in revealing general principles of the vertebrate regulatory genome.

  9. The evolutionary origination and diversification of a dimorphic gene regulatory network through parallel innovations in cis and trans.

    PubMed

    Camino, Eric M; Butts, John C; Ordway, Alison; Vellky, Jordan E; Rebeiz, Mark; Williams, Thomas M

    2015-04-01

    The origination and diversification of morphological characteristics represents a key problem in understanding the evolution of development. Morphological traits result from gene regulatory networks (GRNs) that form a web of transcription factors, which regulate multiple cis-regulatory element (CRE) sequences to control the coordinated expression of differentiation genes. The formation and modification of GRNs must ultimately be understood at the level of individual regulatory linkages (i.e., transcription factor binding sites within CREs) that constitute the network. Here, we investigate how elements within a network originated and diversified to generate a broad range of abdominal pigmentation phenotypes among Sophophora fruit flies. Our data indicate that the coordinated expression of two melanin synthesis enzymes, Yellow and Tan, recently evolved through novel CRE activities that respond to the spatial patterning inputs of Hox proteins and the sex-specific input of Bric-à-brac transcription factors. Once established, it seems that these newly evolved activities were repeatedly modified by evolutionary changes in the network's trans-regulators to generate large-scale changes in pigment pattern. By elucidating how yellow and tan are connected to the web of abdominal trans-regulators, we discovered that the yellow and tan abdominal CREs are composed of distinct regulatory inputs that exhibit contrasting responses to the same Hox proteins and Hox cofactors. These results provide an example in which CRE origination underlies a recently evolved novel trait, and highlight how coordinated expression patterns can evolve in parallel through the generation of unique regulatory linkages.

  10. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: Combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance

    SciTech Connect

    Wu, Gang; Nie, Lei; Zhang, Weiwen

    2006-05-26

    The context-dependent expression of genes is at the core of biological activities, and significant attention has been given to the identification of various factors contributing to gene expression at genomic scale. However, so far this type of analysis has focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs or on the correlation between mRNA abundance and non-random features in coding sequences (e.g. codon usage and amino acid usage). In this study multiple regression analyses of the mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal of investigating how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together...

  11. Combinatorial motif analysis of regulatory gene expression in Mafb deficient macrophages

    PubMed Central

    2011-01-01

    Background: Deficiency of the transcription factor MafB, which is normally expressed in macrophages, can underlie cellular dysfunction associated with a range of autoimmune diseases and arteriosclerosis. MafB has important roles in cell differentiation and regulation of target gene expression; however, the mechanisms of this regulation and the identities of other transcription factors with which MafB interacts remain uncertain. Bioinformatics methods provide a valuable approach for elucidating the nature of these interactions with transcriptional regulatory elements from a large number of DNA sequences. In particular, identification of patterns of co-occurrence of regulatory cis-elements (motifs) offers a robust approach. Results: Here, the directional relationships among several functional motifs were evaluated using the Log-linear Graphical Model (LGM) after extraction and search for evolutionarily conserved motifs. This analysis highlighted GATA-1 motifs and 5' AT-rich half Maf recognition elements (MAREs) in promoter regions of 18 genes that were down-regulated in Mafb deficient macrophages. GATA-1 motifs and MafB motifs could regulate expression of these genes in a negative and a positive manner, respectively. The validity of this conclusion was tested with data from a luciferase assay that used a C1qa promoter construct carrying both the GATA-1 motifs and MAREs. GATA-1 was found to inhibit the activity of the C1qa promoter with the GATA-1 motifs and MafB motifs. Conclusions: These observations suggest that both the GATA-1 motifs and MafB motifs are important for lineage specific expression of C1qa. In addition, these findings show that analysis of combinations of evolutionarily conserved motifs can be successfully used to identify patterns of gene regulation. PMID:22784578

  12. Differential DNA sequence recognition is a determinant of specificity in homeotic gene action.

    PubMed Central

    Ekker, S C; von Kessler, D P; Beachy, P A

    1992-01-01

    The homeotic genes of Drosophila encode transcriptional regulatory proteins that specify distinct segment identities. Previous studies have implicated the homeodomain as a major determinant of biological specificity within these proteins, but have not established the physical basis of this specificity. We show here that the homeodomains encoded by the Ultrabithorax and Deformed homeotic genes bind optimally to distinct DNA sequences and have mapped the determinants responsible for differential recognition. We further show that relative transactivation by these two proteins in a simple in vivo system can differ by nearly two orders of magnitude. Such differences in DNA sequence recognition and target activation provide a biochemical basis for at least part of the biological specificity of homeotic gene action. PMID:1356765

  13. Characterization of 5'-regulatory region of human myostatin gene: regulation by dexamethasone in vitro.

    PubMed

    Ma, K; Mallidis, C; Artaza, J; Taylor, W; Gonzalez-Cadavid, N; Bhasin, S

    2001-12-01

    We cloned and characterized a 3.3-kb fragment containing the 5'-regulatory region of the human myostatin gene. The promoter sequence contains putative muscle growth response elements for glucocorticoid, androgen, thyroid hormone, myogenic differentiation factor 1, myocyte enhancer factor 2, peroxisome proliferator-activated receptor, and nuclear factor-kappaB. To identify sites important for myostatin's gene transcription and regulation, eight deletion constructs were placed in C(2)C(12) and L6 skeletal muscle cells. Transcriptional activity of the constructs was found to be significantly higher in myotubes compared with that of myoblasts. To investigate whether glucocorticoids regulate myostatin gene expression, we incubated both cell lines with dexamethasone. On both occasions, dexamethasone dose dependently increased both the promoter's transcriptional activity and the endogenous myostatin expression. The effects of dexamethasone were blocked when the cells were coincubated with the glucocorticoid receptor antagonist RU-486. These findings suggest that glucocorticoids upregulate myostatin expression by inducing gene transcription, possibly through a glucocorticoid receptor-mediated pathway. We speculate that glucocorticoid-associated muscle atrophy might be due in part to the upregulation of myostatin expression.

  14. Identification of polycomb and trithorax group responsive elements in the regulatory region of the Drosophila homeotic gene Sex combs reduced

    SciTech Connect

    Gindhart, J.G. Jr.; Kaufman, T.C.

    1995-02-01

    The Drosophila homeotic gene Sex combs reduced (Scr) is necessary for the establishment and maintenance of the morphological identity of the labial and prothoracic segments. In the early embryo, its expression pattern is established through the activity of several gap and segmentation gene products, as well as other transcription factors. Once established, the Polycomb group (Pc-G) and trithorax group (trx-G) gene products maintain the spatial pattern of Scr expression for the remainder of development. We report the identification of DNA fragments in the Scr regulatory region that may be important for its regulation by Polycomb and trithorax group gene products. When DNA fragments containing these regulatory sequences are subcloned into P-element vectors containing a white minigene, transformants containing these constructs exhibit mosaic patterns of pigmentation in the adult eye, indicating that white minigene expression is repressed in a clonally heritable manner. The size of pigmented and nonpigmented clones in the adult eye suggests that the event determining whether a cell in the eye anlagen will express white occurs at least as early as the first larval instar. The amount of white minigene repression is reduced in some Polycomb group mutants, whereas repression is enhanced in flies mutant for a subset of trithorax group loci. The repressor activity of one fragment, normally located in Scr Intron 2, is increased when it is able to homologously pair, a property consistent with genetic data suggesting that Scr exhibits transvection. Another Scr regulatory fragment, normally located 40 kb upstream of the Scr promoter, silences ectopic expression of an Scr-lacZ fusion gene in the embryo and does so in a Polycomb-dependent manner. We propose that the regulatory sequences located within these DNA fragments may normally mediate the regulation of Scr by proteins encoded by members of Polycomb and trithorax group loci. 98 refs., 6 figs., 4 tabs.

  15. Nucleotide sequence and transcriptional analysis of the type A2 neurotoxin gene cluster in Clostridium botulinum.

    PubMed

    Dineen, Sean S; Bradshaw, Marite; Karasek, Charles E; Johnson, Eric A

    2004-06-01

    The nucleotide sequences of the upstream regions of the botulinum neurotoxin type A1 (BoNT/A1) cluster of Clostridium botulinum strain NCTC 2916 and the BoNT/A2 cluster of strain Kyoto-F were determined. A novel gene, designated orfx3, was identified following the orfx2 gene in both clusters. ORF-X2 and ORF-X3 exhibit similarity to the BoNT cluster associated P-47 protein. The BoNT/A1 and BoNT/A2 clusters share a similar gene arrangement, but exhibit differences in the spacing between certain genes. Sequences with similarity to transposases were identified in these intergenic regions, suggesting that these differences arose from an ancestral insertion event. Transcriptional analysis of the BoNT/A2 cluster revealed that the genes of the cluster are primarily synthesized as three polycistronic transcripts. Two divergent polycistronic transcripts, one encoding the orfx1, orfx2, and orfx3 genes, the second encoding the p47, ntnh, and bont/a2 genes, are transcribed from conserved BoNT cluster promoters. The third polycistronic transcript, expressed at low levels, encodes the positive regulatory botR gene and the orfx genes. This is the first complete analysis of a botulinum toxin A2 cluster.

  16. Specific regulatory motifs predict glucocorticoid responsiveness of hippocampal gene expression.

    PubMed

    Datson, N A; Polman, J A E; de Jonge, R T; van Boheemen, P T M; van Maanen, E M T; Welten, J; McEwen, B S; Meiland, H C; Meijer, O C

    2011-10-01

    The glucocorticoid receptor (GR) is a ubiquitously expressed ligand-activated transcription factor that mediates effects of cortisol in relation to adaptation to stress. In the brain, GR affects the hippocampus to modulate memory processes through direct binding to glucocorticoid response elements (GREs) in the DNA. However, its effects are to a high degree cell specific, and its target genes in different cell types as well as the mechanisms conferring this specificity are largely unknown. To gain insight into hippocampal GR signaling, we characterized which GREs GR binds in the rat hippocampus. Using a position-specific scoring matrix, we identified evolutionarily conserved putative GREs from a microarray-based set of hippocampal target genes. Using chromatin immunoprecipitation, we were able to confirm GR binding to 15 out of a selection of 32 predicted sites (47%). The majority of these 15 GREs are previously undescribed and thus represent novel GREs that bind GR and therefore may be functional in the rat hippocampus. GRE nucleotide composition was not predictive for binding of GR to a GRE. A search for conserved flanking sequences that may predict GR-GRE interaction resulted in the identification of GC-box associated motifs, such as Myc-associated zinc finger protein 1, within 2 kb of GREs with GR binding in the hippocampus. This enrichment was not present around nonbinding GRE sequences nor around proven GR-binding sites from a mesenchymal stem-like cell dataset that we analyzed. GC-binding transcription factors therefore may be unique partners for DNA-bound GR and may in part explain cell-specific transcriptional regulation by glucocorticoids in the context of the hippocampus.
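
    The position-specific scoring matrix scan mentioned in this abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the count matrix is a made-up 4-bp toy motif rather than a real GRE model, and the pseudocount, the uniform background frequency, and the function names `build_pssm` and `scan` are assumptions introduced here.

```python
import math

BASES = "ACGT"

def build_pssm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log-odds scores.

    counts: list of dicts, one per motif position, mapping each base to its count.
    pseudocount and background are illustrative assumptions.
    """
    pssm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + pseudocount * len(BASES)
        pssm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pssm

def scan(sequence, pssm):
    """Return (offset, score) for every window of the motif's width.

    Only the characters A, C, G and T are supported in this sketch.
    """
    width = len(pssm)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        score = sum(pssm[j][base] for j, base in enumerate(window))
        hits.append((i, score))
    return hits

# Hypothetical toy motif "ACGT" (made-up counts, NOT a real GRE matrix).
toy_counts = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 8, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 8},
]
toy_pssm = build_pssm(toy_counts)

# Pick the best-scoring window in a short example sequence.
best_offset, best_score = max(scan("TTACGTAA", toy_pssm), key=lambda h: h[1])
```

    In practice a score threshold would decide which windows count as putative sites; the abstract does not state the threshold used, so none is assumed here.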

  17. A group LASSO-based method for robustly inferring gene regulatory networks from multiple time-course datasets

    PubMed Central

    2014-01-01

    Background As an abstract mapping of the gene regulations in the cell, gene regulatory network is important to both biological research study and practical applications. The reverse engineering of gene regulatory networks from microarray gene expression data is a challenging research problem in systems biology. With the development of biological technologies, multiple time-course gene expression datasets might be collected for a specific gene network under different circumstances. The inference of a gene regulatory network can be improved by integrating these multiple datasets. It is also known that gene expression data may be contaminated with large errors or outliers, which may affect the inference results. Results A novel method, Huber group LASSO, is proposed to infer the same underlying network topology from multiple time-course gene expression datasets as well as to take the robustness to large error or outliers into account. To solve the optimization problem involved in the proposed method, an efficient algorithm which combines the ideas of auxiliary function minimization and block descent is developed. A stability selection method is adapted to our method to find a network topology consisting of edges with scores. The proposed method is applied to both simulation datasets and real experimental datasets. It shows that Huber group LASSO outperforms the group LASSO in terms of both areas under receiver operating characteristic curves and areas under the precision-recall curves. Conclusions The convergence analysis of the algorithm theoretically shows that the sequence generated from the algorithm converges to the optimal solution of the problem. The simulation and real data examples demonstrate the effectiveness of the Huber group LASSO in integrating multiple time-course gene expression datasets and improving the resistance to large errors or outliers. PMID:25350697
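
    The idea of combining a Huber loss with a group penalty across datasets can be sketched as below. This is a generic illustration under stated assumptions, not the paper's exact formulation or its optimization algorithm: the abstract gives no formulas, so the linear model, the parameter names `lam` and `delta`, and the choice of one group per regulator (its coefficients across all datasets) are assumptions made here. Only the objective is evaluated; the auxiliary-function and block-descent solver described in the abstract is not reproduced.

```python
import math

def huber(residual, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)

def objective(datasets, B, lam=0.1, delta=1.0):
    """Huber loss summed over datasets plus a group LASSO penalty.

    datasets: list of (X, y) pairs; X is a list of rows (regulator
              expression values), y the matching target-gene responses.
    B[j][k]:  hypothetical weight of regulator j in dataset k.
    """
    loss = 0.0
    for k, (X, y) in enumerate(datasets):
        for row, target in zip(X, y):
            prediction = sum(x * B[j][k] for j, x in enumerate(row))
            loss += huber(target - prediction, delta)
    # One group per regulator: the L2 norm of its weights across datasets,
    # so an edge tends to be kept or dropped in all datasets together.
    penalty = lam * sum(math.sqrt(sum(b * b for b in group)) for group in B)
    return loss + penalty

# Tiny made-up example: one dataset, two regulators, two samples.
example = [([[1, 0], [0, 1]], [0.5, 3.0])]
value_at_zero = objective(example, [[0.0], [0.0]])
value_at_fit = objective(example, [[0.5], [3.0]])
```

    Because the loss grows only linearly for residuals larger than `delta`, a single outlying measurement influences the fit less than it would under a squared-error loss.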

  18. A spatially dynamic cohort of regulatory genes in the endomesodermal gene network of the sea urchin embryo.

    PubMed

    Smith, Joel; Kraemer, Ebba; Liu, Hongdau; Theodoris, Christina; Davidson, Eric

    2008-01-15

    A gene regulatory network subcircuit comprising the otx, wnt8, and blimp1 genes accounts for a moving torus of gene expression that sweeps concentrically across the vegetal domain of the sea urchin embryo. Here we confirm by mutation the inputs into the blimp1 cis-regulatory module predicted by network analysis. Its essential design feature is that it includes both activation and autorepression sites. The wnt8 gene is functionally linked into the subcircuit in that cells receiving this ligand generate a beta-catenin/Tcf input required for blimp1 expression, while the wnt8 gene in turn requires a Blimp1 input. Their torus-like spatial expression patterns and gene regulatory analysis indicate that the genes even-skipped and hox11/13b are also entrained by this subcircuit. We verify the cis-regulatory inputs of even-skipped predicted by network analysis. These include activation by beta-catenin/Tcf and Blimp1, repression within the torus by Hox11/13b, and repression outside the torus by Tcf in the absence of Wnt8 signal input. Thus even-skipped and hox11/13b, along with blimp1 and wnt8, are members of a cohort of torus genes with similar regulatory inputs and similar, though slightly out-of-phase, expression patterns, which reflect differences in cis-regulatory design.

  19. The regulatory gene areA mediating nitrogen metabolite repression in Aspergillus nidulans. Mutations affecting specificity of gene activation alter a loop residue of a putative zinc finger.

    PubMed Central

    Kudla, B; Caddick, M X; Langdon, T; Martinez-Rossi, N M; Bennett, C F; Sibley, S; Davies, R W; Arst, H N

    1990-01-01

    The regulatory gene areA mediating nitrogen metabolite repression in Aspergillus nidulans has been sequenced and its transcript mapped and orientated. A single ORF can encode a protein of 719 amino acids. A 52 amino acid region including a putative 'zinc finger' strongly resembles putative DNA binding regions of the major regulatory protein of erythroid cells. The derived protein sequence also contains a highly acidic region possibly involved in gene activation and 22 copies of the motif S(T)PXX, abundant in DNA binding proteins. Analysis of chromosomal rearrangements and transformation with deletion clones identified 342 N-terminal and 124 C-terminal residues as inessential and localized a C-terminal region required for nitrogen metabolite repressibility. A -1 frameshift eliminating the inessential 122 C-terminal amino acids is a surprising loss-of-function mutation. Extraordinary basicity of the replacement C terminus might explain its phenotype. Mutant sequencing also identified a polypeptide chain termination and several missense mutations, but most interesting are sequence changes associated with specificity mutations. A mutation elevating expression of some structural genes under areA control whilst reducing or not affecting expression of others is a leucine to valine change in the zinc finger loop. It reverts to a partly reciprocal phenotype by replacing the mutant valine by methionine. PMID:1970293

  20. Regulatory Divergence between Parental Alleles Determines Gene Expression Patterns in Hybrids

    PubMed Central

    Combes, Marie-Christine; Hueber, Yann; Dereeper, Alexis; Rialle, Stéphanie; Herrera, Juan-Carlos; Lashermes, Philippe

    2015-01-01

    Both hybridization and allopolyploidization generate novel phenotypes by conciliating divergent genomes and regulatory networks in the same cellular context. To understand the rewiring of gene expression in hybrids, the total expression of 21,025 genes and the allele-specific expression of over 11,000 genes were quantified in interspecific hybrids and their parental species, Coffea canephora and Coffea eugenioides using RNA-seq technology. Between parental species, cis- and trans-regulatory divergences affected around 32% and 35% of analyzed genes, respectively, with nearly 17% of them showing both. The relative importance of trans-regulatory divergences between both species could be related to their low genetic divergence and perennial habit. In hybrids, among divergently expressed genes between parental species and hybrids, 77% was expressed like one parent (expression level dominance), including 65% like C. eugenioides. Gene expression was shown to result from the expression of both alleles affected by intertwined parental trans-regulatory factors. A strong impact of C. eugenioides trans-regulatory factors on the upregulation of C. canephora alleles was revealed. The gene expression patterns appeared determined by complex combinations of cis- and trans-regulatory divergences. In particular, the observed biased expression level dominance seemed to be derived from the asymmetric effects of trans-regulatory parental factors on regulation of alleles. More generally, this study illustrates the effects of divergent trans-regulatory parental factors on the gene expression pattern in hybrids. The characteristics of the transcriptional response to hybridization appear to be determined by the compatibility of gene regulatory networks and therefore depend on genetic divergences between the parental species and their evolutionary history. PMID:25819221

  1. The nucleosome landscape of Plasmodium falciparum reveals chromatin architecture and dynamics of regulatory sequences.

    PubMed

    Kensche, Philip Reiner; Hoeijmakers, Wieteke Anna Maria; Toenhake, Christa Geeke; Bras, Maaike; Chappell, Lia; Berriman, Matthew; Bártfai, Richárd

    2016-03-18

    In eukaryotes, the chromatin architecture has a pivotal role in regulating all DNA-associated processes and it is central to the control of gene expression. For Plasmodium falciparum, a causative agent of human malaria, the nucleosome positioning profile of regulatory regions deserves particular attention because of their extreme AT-content. With the aid of a highly controlled MNase-seq procedure we reveal how positioning of nucleosomes provides a structural and regulatory framework to the transcriptional unit by demarcating landmark sites (transcription/translation start and end sites). In addition, our analysis provides strong indications for the function of positioned nucleosomes in splice site recognition. Transcription start sites (TSSs) are bordered by a small nucleosome-depleted region, but lack the stereotypic downstream nucleosome arrays, highlighting a key difference in chromatin organization compared to model organisms. Furthermore, we observe transcription-coupled eviction of nucleosomes on strong TSSs during intraerythrocytic development and demonstrate that nucleosome positioning and dynamics can be predictive for the functionality of regulatory DNA elements. Collectively, the strong nucleosome positioning over splice sites and surrounding putative transcription factor binding sites highlights the regulatory capacity of the nucleosome landscape in this deadly human pathogen.

  2. Alu sequence involvement in transcriptional insulation of the keratin 18 gene in transgenic mice.

    PubMed Central

    Thorey, I S; Ceceña, G; Reynolds, W; Oshima, R G

    1993-01-01

    The human keratin 18 (K18) gene is expressed in a variety of adult simple epithelial tissues, including liver, intestine, lung, and kidney, but is not normally found in skin, muscle, heart, spleen, or most of the brain. Transgenic animals derived from the cloned K18 gene express the transgene in appropriate tissues at levels directly proportional to the copy number and independently of the sites of integration. We have investigated in transgenic mice the dependence of K18 gene expression on the distal 5' and 3' flanking sequences and upon the RNA polymerase III promoter of an Alu repetitive DNA transcription unit immediately upstream of the K18 promoter. Integration site-independent expression of tandemly duplicated K18 transgenes requires the presence of either an 825-bp fragment of the 5' flanking sequence or the 3.5-kb 3' flanking sequence. Mutation of the RNA polymerase III promoter of the Alu element within the 825-bp fragment abolishes copy number-dependent expression in kidney but does not abolish integration site-independent expression when assayed in the absence of the 3' flanking sequence of the K18 gene. The characteristics of integration site-independent expression and copy number-dependent expression are separable. In addition, the formation of the chromatin state of the K18 gene, which likely restricts the tissue-specific expression of this gene, is not dependent upon the distal flanking sequences of the 10-kb K18 gene but rather may depend on internal regulatory regions of the gene. PMID:7692231

  3. Deciphering RNA Regulatory Elements Involved in the Developmental and Environmental Gene Regulation of Trypanosoma brucei.

    PubMed

    Gazestani, Vahid H; Salavati, Reza

    2015-01-01

    Trypanosoma brucei is a vector-borne parasite with an intricate life cycle that can cause serious diseases in humans and animals. This pathogen relies on fine regulation of gene expression to respond and adapt to variable environments, with implications in transmission and infectivity. However, the regulatory elements involved and their mechanisms of action are largely unknown. Here, benefiting from a new graph-based approach for finding functional regulatory elements in RNA (GRAFFER), we have predicted 88 new RNA regulatory elements that are potentially involved in the gene regulatory network of T. brucei. We show that many of these newly predicted elements are responsive to both transcriptomic and proteomic changes during the life cycle of the parasite. Moreover, we found that 11 of the predicted elements strikingly resemble previously identified regulatory elements for the parasite. Additionally, comparison with previously predicted motifs on T. brucei suggested the superior performance of our approach based on the current limited knowledge of regulatory elements in T. brucei.

  4. Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates

    PubMed Central

    Kikuta, Hiroshi; Laplante, Mary; Navratilova, Pavla; Komisarczuk, Anna Z.; Engström, Pär G.; Fredman, David; Akalin, Altuna; Caccamo, Mario; Sealy, Ian; Howe, Kerstin; Ghislain, Julien; Pezeron, Guillaume; Mourrain, Philippe; Ellingsen, Staale; Oates, Andrew C.; Thisse, Christine; Thisse, Bernard; Foucher, Isabelle; Adolf, Birgit; Geling, Andrea; Lenhard, Boris; Becker, Thomas S.

    2007-01-01

    We report evidence for a mechanism for the maintenance of long-range conserved synteny across vertebrate genomes. We found the largest mammal-teleost conserved chromosomal segments to be spanned by highly conserved noncoding elements (HCNEs), their developmental regulatory target genes, and phylogenetically and functionally unrelated “bystander” genes. Bystander genes are not specifically under the control of the regulatory elements that drive the target genes and are expressed in patterns that are different from those of the target genes. Reporter insertions distal to zebrafish developmental regulatory genes pax6.1/2, rx3, id1, and fgf8 and miRNA genes mirn9-1 and mirn9-5 recapitulate the expression patterns of these genes even if located inside or beyond bystander genes, suggesting that the regulatory domain of a developmental regulatory gene can extend into and beyond adjacent transcriptional units. We termed these chromosomal segments genomic regulatory blocks (GRBs). After whole genome duplication in teleosts, GRBs, including HCNEs and target genes, were often maintained in both copies, while bystander genes were typically lost from one GRB, strongly suggesting that evolutionary pressure acts to keep the single-copy GRBs of higher vertebrates intact. We show that loss of bystander genes and other mutational events suffered by duplicated GRBs in teleost genomes permits target gene identification and HCNE/target gene assignment. These findings explain the absence of evolutionary breakpoints from large vertebrate chromosomal segments and will aid in the recognition of position effect mutations within human GRBs. PMID:17387144

  5. Exploring the reasons for the large density of triplex-forming oligonucleotide target sequences in the human regulatory regions

    PubMed Central

    Goñi, Josep Ramon; Vaquerizas, Juan Manuel; Dopazo, Joaquin; Orozco, Modesto

    2006-01-01

    Background DNA duplex sequences that can be targets for triplex formation are highly over-represented in the human genome, especially in regulatory regions. Results Here we studied using bioinformatics tools several properties of triplex target sequences in an attempt to determine those that make these sequences so special in the genome. Conclusion Our results strongly suggest that the unique physical properties of these sequences make them particularly suitable as "separators" between protein-recognition sites in the promoter region. PMID:16566817

  6. Mining Gene Regulatory Networks by Neural Modeling of Expression Time-Series.

    PubMed

    Rubiolo, Mariano; Milone, Diego H; Stegmayer, Georgina

    2015-01-01

    Discovering gene regulatory networks from data is one of the most studied topics in recent years. Neural networks can be successfully used to infer an underlying gene network by modeling expression profiles as time series. This work proposes a novel method based on a pool of neural networks for obtaining a gene regulatory network from a gene expression dataset. They are used for modeling each possible interaction between pairs of genes in the dataset, and a set of mining rules is applied to accurately detect the underlying relations among genes. The results obtained on artificial and real datasets confirm the method's effectiveness for discovering regulatory networks from a proper modeling of the temporal dynamics of gene expression profiles.

  7. Pi class glutathione S-transferase genes are regulated by Nrf 2 through an evolutionarily conserved regulatory element in zebrafish

    PubMed Central

    Suzuki, Takafumi; Takagi, Yaeko; Osanai, Hitoshi; Li, Li; Takeuchi, Miki; Katoh, Yasutake; Kobayashi, Makoto; Yamamoto, Masayuki

    2005-01-01

    Pi class GSTs (glutathione S-transferases) are members of the vertebrate GST family of proteins that catalyse the conjugation of GSH to electrophilic compounds. The expression of Pi class GST genes can be induced by exposure to electrophiles. We demonstrated previously that the transcription factor Nrf 2 (NF-E2 p45-related factor 2) mediates this induction, not only in mammals, but also in fish. In the present study, we have isolated the genomic region of zebrafish containing the genes gstp1 and gstp2. The regulatory regions of zebrafish gstp1 and gstp2 have been examined by GFP (green fluorescent protein)-reporter gene analyses using microinjection into zebrafish embryos. Deletion and point-mutation analyses of the gstp1 promoter showed that an ARE (antioxidant-responsive element)-like sequence, which is essential for Nrf 2 transactivation, is located 50 bp upstream of the transcription initiation site. Using EMSA (electrophoretic mobility-shift assay) analysis we showed that zebrafish Nrf 2–MafK heterodimer specifically bound to this sequence. All the vertebrate Pi class GST genes harbour a similar ARE-like sequence in their promoter regions. We propose that this sequence is a conserved target site for Nrf 2 in the Pi class GST genes. PMID:15654768

  8. Multiple positive and negative 5' regulatory elements control the cell-type-specific expression of the embryonic skeletal myosin heavy-chain gene.

    PubMed Central

    Bouvagnet, P F; Strehler, E E; White, G E; Strehler-Page, M A; Nadal-Ginard, B; Mahdavi, V

    1987-01-01

    To identify the DNA sequences that regulate the expression of the sarcomeric myosin heavy-chain (MHC) genes in muscle cells, a series of deletion constructs of the rat embryonic MHC gene was assayed for transient expression after introduction into myogenic and nonmyogenic cells. The sequences in 1.4 kilobases of 5'-flanking DNA were found to be sufficient to direct expression of the MHC gene constructs in a tissue-specific manner (i.e., in differentiated muscle cells but not in undifferentiated muscle and nonmuscle cells). Three main distinct regulatory domains have been identified: (i) the upstream sequences from positions -1413 to -174, which determine the level of expression of the MHC gene and are constituted of three positive regulatory elements and two negative ones; (ii) a muscle-specific regulatory element from positions -173 to -142, which restricts the expression of the MHC gene to muscle cells; and (iii) the promoter region, downstream from position -102, which directs transcription initiation. Introduction of the simian virus 40 enhancer into constructs where subportions of or all of the upstream sequences are deleted (up to position -173) strongly increases the level of expression of such truncated constructs but without changing their muscle specificity. These upstream sequences, which can be substituted for by the simian virus 40 enhancer, function in an orientation-, position-, and promoter-dependent fashion. The muscle-specific element is also promoter specific but does not support efficient expression of the MHC gene. The MHC promoter in itself is not muscle specific. These results underline the importance of the concerted action of multiple regulatory elements that are likely to represent targets for DNA-binding-regulatory proteins. PMID:2830491

  9. Structure and sequence divergence of two archaebacterial genes

    SciTech Connect

    Cue, D.; Beckler, G.S.; Reeve, J.N.; Konisky, J.

    1985-06-01

    The DNA sequences of a region that includes the hisA gene of two related methanogenic archaebacteria, Methanococcus voltae and Methanococcus vannielii, have been compared. Both organisms show a similar genome organization in this region, displaying three open reading frames (ORFs) separated by regions of very high A+T content. Two of the ORFs, including ORFHisA, show significant DNA sequence homology. As might be expected for organisms having a genome that is A+T-rich, there is a high preference for A and U as the third base in codons. A ribosome binding site, G-G-T-G, is located 6 base pairs preceding the ATG translation initiation sequence of both hisA genes. The sequences upstream of the two hisA genes show only limited sequence homology. The M. voltae intergenic region contains four tandemly arranged repetitions of an 11-base-pair sequence, whereas the M. vannielii sequence contains both direct and inverted repetitive sequences. Based on the degree of hisA sequence homology, the authors conclude that M. voltae and M. vannielii are less closely related taxonomically than are members of the enteric group of eubacteria.

  10. GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences.

    PubMed

    Antonov, Ivan; Baranov, Pavel; Borodovsky, Mark

    2013-01-01

    Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively low attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) could be caused by sequencing errors or indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events (that evolved to control expression of some genes). Earlier, we have developed an algorithm and software program GeneTack for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at http://topaz.gatech.edu/GeneTack/db.html) containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206 991 fs-genes from 1106 complete prokaryotic genomes and 45 295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position and frameshift direction (-1, +1). The fs-genes can be retrieved by similarity search to a given query sequence via a web interface, by fs-gene cluster browsing, etc. Clusters of fs-genes are characterized with respect to their likely origin, such as pseudogenization, phase variation, etc. The largest clusters contain fs-genes with programed frameshifts (related to recoding events).

  11. Cis-Regulatory Control of the Nuclear Receptor Coup-TF Gene in the Sea Urchin Paracentrotus lividus Embryo

    PubMed Central

    Kalampoki, Lamprini G.; Flytzanis, Constantin N.

    2014-01-01

    Coup-TF, an orphan member of the nuclear receptor super family, has a fundamental role in the development of metazoan embryos. The study of the gene's regulatory circuit in the sea urchin embryo will facilitate the placement of this transcription factor in the well-studied embryonic Gene Regulatory Network (GRN). The Paracentrotus lividus Coup-TF gene (PlCoup-TF) is expressed throughout embryonic development preferentially in the oral ectoderm of the gastrula and the ciliary band of the pluteus stage. Two overlapping λ genomic clones, containing three exons and upstream sequences of PlCoup-TF, were isolated from a genomic library. The transcription initiation site was determined and 5′ deletions and individual segments of a 1930 bp upstream region were placed ahead of a GFP reporter cassette and injected into fertilized P. lividus eggs. Module a (−532 to −232) was necessary and sufficient to confer ciliary band expression to the reporter. Comparison of P. lividus and Strongylocentrotus purpuratus upstream Coup-TF sequences revealed considerable conservation, but none within module a. 5′ and internal deletions into module a defined a smaller region that confers ciliary band specific expression. Putative regulatory cis-acting elements (RE1, RE2 and RE3) within module a were specifically bound by proteins in sea urchin embryonic nuclear extracts. Site-specific mutagenesis of these elements resulted in loss of reporter activity (RE1) or ectopic expression (RE2, RE3). It is proposed that sea urchin transcription factors, which bind these three regulatory sites, are necessary for spatial and quantitative regulation of the PlCoup-TF gene in pluteus-stage sea urchin embryos. These findings lead to the future identification of these factors and to the hierarchical positioning of PlCoup-TF within the embryonic GRN. PMID:25386650

  12. Cis-regulatory control of the nuclear receptor Coup-TF gene in the sea urchin Paracentrotus lividus embryo.

    PubMed

    Kalampoki, Lamprini G; Flytzanis, Constantin N

    2014-01-01

    Coup-TF, an orphan member of the nuclear receptor super family, has a fundamental role in the development of metazoan embryos. The study of the gene's regulatory circuit in the sea urchin embryo will facilitate the placement of this transcription factor in the well-studied embryonic Gene Regulatory Network (GRN). The Paracentrotus lividus Coup-TF gene (PlCoup-TF) is expressed throughout embryonic development preferentially in the oral ectoderm of the gastrula and the ciliary band of the pluteus stage. Two overlapping λ genomic clones, containing three exons and upstream sequences of PlCoup-TF, were isolated from a genomic library. The transcription initiation site was determined and 5' deletions and individual segments of a 1930 bp upstream region were placed ahead of a GFP reporter cassette and injected into fertilized P. lividus eggs. Module a (-532 to -232) was necessary and sufficient to confer ciliary band expression to the reporter. Comparison of P. lividus and Strongylocentrotus purpuratus upstream Coup-TF sequences revealed considerable conservation, but none within module a. 5' and internal deletions into module a defined a smaller region that confers ciliary band specific expression. Putative regulatory cis-acting elements (RE1, RE2 and RE3) within module a were specifically bound by proteins in sea urchin embryonic nuclear extracts. Site-specific mutagenesis of these elements resulted in loss of reporter activity (RE1) or ectopic expression (RE2, RE3). It is proposed that sea urchin transcription factors, which bind these three regulatory sites, are necessary for spatial and quantitative regulation of the PlCoup-TF gene in pluteus-stage sea urchin embryos. These findings lead to the future identification of these factors and to the hierarchical positioning of PlCoup-TF within the embryonic GRN.

  13. Inferring gene structures in genomic sequences using pattern recognition and expressed sequence tags.

    PubMed

    Xu, Y; Mural, R J; Uberbacher, E C

    1997-01-01

    Computational methods for gene identification in genomic sequences typically have two phases: coding region prediction and gene parsing. While there are many effective methods for predicting coding regions (exons), parsing the predicted exons into proper gene structures, to a large extent, remains an unsolved problem. This paper presents an algorithm for inferring gene structures from predicted exon candidates, based on Expressed Sequence Tags (ESTs) and biological intuition/rules. The algorithm first finds all the related ESTs in the EST database (dbEST) for each predicted exon, and infers the boundaries of one or a series of genes based on the available EST information and biological rules. Then it constructs gene models within each pair of gene boundaries, that are most consistent with the EST information. By exploiting EST information and biological rules, the algorithm can (1) model complicated multiple gene structures, including embedded genes, (2) identify falsely-predicted exons and locate missed exons, and (3) make more accurate exon boundary predictions. The algorithm has been implemented and tested on long genomic sequences with a number of genes. Test results show that very accurate (predicted) gene models can be expected when related ESTs exist for the predicted exons.

  14. Inferring gene structures in genomic sequences using pattern recognition and expressed sequence tags

    SciTech Connect

    Xu, Y.; Mural, R.; Uberbacher, E.

    1997-02-01

    Computational methods for gene identification in genomic sequences typically have two phases: coding region prediction and gene parsing. While there are many effective methods for predicting coding regions (exons), parsing the predicted exons into proper gene structures, to a large extent, remains an unsolved problem. This paper presents an algorithm for inferring gene structures from predicted exon candidates, based on Expressed Sequence Tags (ESTs) and biological intuition/rules. The algorithm first finds all the related ESTs in the EST database (dbEST) for each predicted exon, and infers the boundaries of one or a series of genes based on the available EST information and biological rules. Then it constructs gene models within each pair of gene boundaries, that are most consistent with the EST information. By exploiting EST information and biological rules, the algorithm can (1) model complicated multiple gene structures, including embedded genes, (2) identify falsely-predicted exons and locate missed exons, and (3) make more accurate exon boundary predictions. The algorithm has been implemented and tested on long genomic sequences with a number of genes. Test results show that very accurate (predicted) gene models can be expected when related ESTs exist for the predicted exons.

  15. Stochastic Boolean networks: An efficient approach to modeling gene regulatory networks

    PubMed Central

    2012-01-01

    Background Various computational models have been of interest due to their use in the modelling of gene regulatory networks (GRNs). As a logical model, probabilistic Boolean networks (PBNs) consider molecular and genetic noise, so the study of PBNs provides significant insights into the understanding of the dynamics of GRNs. This will ultimately lead to advances in developing therapeutic methods that intervene in the process of disease development and progression. The applications of PBNs, however, are hindered by the complexities involved in the computation of the state transition matrix and the steady-state distribution of a PBN. For a PBN with n genes and N Boolean networks, the complexity to compute the state transition matrix is O(nN2^{2n}) or O(nN2^n) for a sparse matrix. Results This paper presents a novel implementation of PBNs based on the notions of stochastic logic and stochastic computation. This stochastic implementation of a PBN is referred to as a stochastic Boolean network (SBN). An SBN provides an accurate and efficient simulation of a PBN without and with random gene perturbation. The state transition matrix is computed in an SBN with a complexity of O(nL2^n), where L is a factor related to the stochastic sequence length. Since the minimum sequence length required for obtaining an evaluation accuracy approximately increases in a polynomial order with the number of genes, n, and the number of Boolean networks, N, usually increases exponentially with n, L is typically smaller than N, especially in a network with a large number of genes. Hence, the computational efficiency of an SBN is primarily limited by the number of genes, but not directly by the total possible number of Boolean networks. Furthermore, a time-frame expanded SBN enables an efficient analysis of the steady-state distribution of a PBN. These findings are supported by the simulation results of a simplified p53 network, several randomly generated networks and a network inferred from a T
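
    The cost of the direct state transition matrix computation for a PBN, which the abstract contrasts with the SBN approach, can be illustrated with a small Python sketch. It is an assumption-laden toy and not the paper's SBN method: each candidate Boolean network is represented as a hypothetical Python function chosen with a fixed probability, random gene perturbation is ignored, and the two 2-gene networks `net_a` and `net_b` are invented for illustration. The loop visits every network and every one of the 2^n states, which is where the N and 2^n factors in the stated complexity come from.

```python
from itertools import product

def transition_matrix(networks, probabilities, n):
    """Build the 2^n x 2^n state transition matrix of a toy PBN.

    networks:      list of N update rules; each maps a tuple of n bits
                   to the next tuple of n bits.
    probabilities: selection probability of each network (should sum to 1).
    """
    states = list(product((0, 1), repeat=n))
    index = {s: i for i, s in enumerate(states)}
    size = len(states)  # 2 ** n
    A = [[0.0] * size for _ in range(size)]
    for rule, p in zip(networks, probabilities):
        for s in states:
            # Each network contributes its probability to one target state.
            A[index[s]][index[rule(s)]] += p
    return A

# Hypothetical 2-gene networks, invented for this example.
def net_a(s):
    # The two genes swap values.
    return (s[1], s[0])

def net_b(s):
    # Gene 0 becomes (gene 0 AND gene 1); gene 1 becomes NOT gene 0.
    return (s[0] & s[1], 1 - s[0])

A = transition_matrix([net_a, net_b], [0.7, 0.3], n=2)
```

    Each row of `A` is a probability distribution over next states, so the rows sum to 1; the matrix already has 4^n entries in dense form, which is why the direct approach becomes impractical as n grows.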

  16. Nucleotide sequence of a human tRNA gene heterocluster

    SciTech Connect

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-05-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both (3'-³²P)-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these γ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  17. Updated Sequence Information for TEM β-Lactamase Genes

    PubMed Central

    Goussard, Sylvie; Courvalin, Patrice

    1999-01-01

    The sequences of the promoter regions and of the structural genes for 13 penicillinase, extended-spectrum, and inhibitor-resistant TEM-type β-lactamases have been determined, and an updated blaTEM gene nomenclature is proposed. PMID:9925535

  18. (Gene sequencing by scanning molecular exciton microscopy)

    SciTech Connect

    Not Available

    1991-01-01

    This report details progress made in setting up a laboratory for optical microscopy of genes. The apparatus including a fluorescence microscope, a scanning optical microscope, various spectrometers, and supporting computers is described. Results in developing photon and exciton tips, and in preparing samples are presented. (GHH)

  19. The Association between Infants' Self-Regulatory Behavior and MAOA Gene Polymorphism

    ERIC Educational Resources Information Center

    Zhang, Minghao; Chen, Xinyin; Way, Niobe; Yoshikawa, Hirokazu; Deng, Huihua; Ke, Xiaoyan; Yu, Weiwei; Chen, Ping; He, Chuan; Chi, Xia; Lu, Zuhong

    2011-01-01

    Self-regulatory behavior in early childhood is an important characteristic that has considerable implications for the development of adaptive and maladaptive functioning. The present study investigated the relations between a functional polymorphism in the upstream region of monoamine oxidase A gene (MAOA) and self-regulatory behavior in a sample…

  20. Gene regulation: ancient microRNA target sequences in plants.

    PubMed

    Floyd, Sandra K; Bowman, John L

    2004-04-01

    MicroRNAs are an abundant class of small RNAs that are thought to regulate the expression of protein-coding genes in plants and animals. Here we show that the target sequence of two microRNAs, known to regulate genes in the class-III homeodomain-leucine zipper (HD-Zip) gene family of the flowering plant Arabidopsis, is conserved in homologous sequences from all lineages of land plants, including bryophytes, lycopods, ferns and seed plants. We also find that the messenger RNAs from these genes are cleaved within the same microRNA-binding site in representatives of each land-plant group, as they are in Arabidopsis. Our results indicate not only that microRNAs mediate gene regulation in non-flowering as well as flowering plants, but also that the regulation of this class of plant genes dates back more than 400 million years.

  1. Cell type-selective disease-association of genes under high regulatory load.

    PubMed

    Galhardo, Mafalda; Berninger, Philipp; Nguyen, Thanh-Phuong; Sauter, Thomas; Sinkkonen, Lasse

    2015-10-15

    We previously showed that disease-linked metabolic genes are often under combinatorial regulation. Using the genome-wide ChIP-Seq binding profiles for 93 transcription factors in nine different cell lines, we show that genes under high regulatory load are significantly enriched for disease-association across cell types. We find that transcription factor load correlates with the enhancer load of the genes and thereby allows the identification of genes under high regulatory load by epigenomic mapping of active enhancers. Identification of the high enhancer load genes across 139 samples from 96 different cell and tissue types reveals a consistent enrichment for disease-associated genes in a cell type-selective manner. The underlying genes are not limited to super-enhancer genes and show several types of disease-association evidence beyond genetic variation (such as biomarkers). Interestingly, the high regulatory load genes are involved in more KEGG pathways than expected by chance, exhibit increased betweenness centrality in the interaction network of liver disease genes, and carry longer 3' UTRs with more microRNA (miRNA) binding sites than genes on average, suggesting a role as hubs integrating signals within regulatory networks. In summary, epigenetic mapping of active enhancers presents a promising and unbiased approach for identification of novel disease genes in a cell type-selective manner.

  2. Cell type-selective disease-association of genes under high regulatory load

    PubMed Central

    Galhardo, Mafalda; Berninger, Philipp; Nguyen, Thanh-Phuong; Sauter, Thomas; Sinkkonen, Lasse

    2015-01-01

    We previously showed that disease-linked metabolic genes are often under combinatorial regulation. Using the genome-wide ChIP-Seq binding profiles for 93 transcription factors in nine different cell lines, we show that genes under high regulatory load are significantly enriched for disease-association across cell types. We find that transcription factor load correlates with the enhancer load of the genes and thereby allows the identification of genes under high regulatory load by epigenomic mapping of active enhancers. Identification of the high enhancer load genes across 139 samples from 96 different cell and tissue types reveals a consistent enrichment for disease-associated genes in a cell type-selective manner. The underlying genes are not limited to super-enhancer genes and show several types of disease-association evidence beyond genetic variation (such as biomarkers). Interestingly, the high regulatory load genes are involved in more KEGG pathways than expected by chance, exhibit increased betweenness centrality in the interaction network of liver disease genes, and carry longer 3′ UTRs with more microRNA (miRNA) binding sites than genes on average, suggesting a role as hubs integrating signals within regulatory networks. In summary, epigenetic mapping of active enhancers presents a promising and unbiased approach for identification of novel disease genes in a cell type-selective manner. PMID:26338775

  3. A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks

    PubMed Central

    Santra, Tapesh

    2014-01-01

    Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based methods in some circumstances. PMID:25152886

  4. Gene regulatory network inference using fused LASSO on multiple data sets.

    PubMed

    Omranian, Nooshin; Eloundou-Mbebi, Jeanne M O; Mueller-Roeber, Bernd; Nikoloski, Zoran

    2016-02-11

    Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed into a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from prokaryotic species, Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions.

  5. The regions of sequence variation in caulimovirus gene VI.

    PubMed

    Sanger, M; Daubert, S; Goodman, R M

    1991-06-01

    The sequence of gene VI from figwort mosaic virus (FMV) clone x4 was determined and compared with that previously published for FMV clone DxS. Both clones originated from the same virus isolation, but the virus used to clone DxS was propagated extensively in a host of a different family prior to cloning whereas that used to clone x4 was not. Differences in the amino acid sequence inferred from the DNA sequences occurred in two clusters. An N-terminal conserved region preceded two regions of variation separated by a central conserved region. Variation in cauliflower mosaic virus (CaMV) gene VI sequences, all of which were derived from virus isolates from hosts from one host family, was similar to that seen in the FMV comparison, though the extent of variation was less. Alignment of gene VI domains from FMV and CaMV revealed regions of amino acid sequence identical in both viruses within the conserved regions. The similarity in the pattern of conserved and variable domains of these two viruses suggests common host-interactive functions in caulimovirus gene VI homologues, and possibly an analogy between caulimoviruses and certain animal viruses in the influence of the host on sequence variability of viral genes.

  6. Conserved sequence motifs upstream from the co-ordinately expressed vitellogenin and apoVLDLII genes of chicken.

    PubMed

    van het Schip, F; Strijker, R; Samallo, J; Gruber, M; Geert, A B

    1986-11-11

    The vitellogenin and apoVLDLII yolk protein genes of chicken are transcribed in the liver upon estrogenization. To get information on putative regulatory elements, we compared more than 2 kb of their 5' flanking DNA sequences. Common sequence motifs were found in regions exhibiting estrogen-induced changes in chromatin structure. Stretches of alternating pyrimidines and purines about 30 nucleotides long are present at roughly similar positions. A distinct box of sequence homology in the chicken genes also appears to be present at a similar position in front of the vitellogenin genes of Xenopus laevis, but is absent from the estrogen-responsive egg-white protein genes expressed in the oviduct. In front of the vitellogenin (position -595) and the VLDLII gene (position -548), a DNA element of about 300 base-pairs was found, which possesses structural characteristics of a mobile genetic element and bears homology to the transposon-like Vi element of Xenopus laevis.

  7. Reconstructing differentially co-expressed gene modules and regulatory networks of soybean cells

    PubMed Central

    2012-01-01

    Background Current experimental evidence indicates that functionally related genes show coordinated expression in order to perform their cellular functions. In this way, the cell transcriptional machinery can respond optimally to internal or external stimuli. This provides a research opportunity to identify and study co-expressed gene modules whose transcription is controlled by shared gene regulatory networks. Results We developed and integrated a set of computational methods of differential gene expression analysis, gene clustering, gene network inference, gene function prediction, and DNA motif identification to automatically identify differentially co-expressed gene modules, reconstruct their regulatory networks, and validate their correctness. We tested the methods using microarray data derived from soybean cells grown under various stress conditions. Our methods were able to identify 42 coherent gene modules within which average gene expression correlation coefficients are greater than 0.8 and reconstruct their putative regulatory networks. A total of 32 modules and their regulatory networks were further validated by the coherence of predicted gene functions and the consistency of putative transcription factor binding motifs. Approximately half of the 32 modules were partially supported by the literature, which demonstrates that the bioinformatic methods used can help elucidate the molecular responses of soybean cells upon various environmental stresses. Conclusions The bioinformatics methods and genome-wide data sources for gene expression, clustering, regulation, and function analysis were integrated seamlessly into one modular protocol to systematically analyze and infer modules and networks from only differentially expressed genes in soybean cells grown under stress conditions. Our approach appears to effectively reduce the complexity of the problem, and is sufficiently robust and accurate to generate a rather complete and detailed view of putative soybean

  8. The nucleotide sequence of the human beta-globin gene.

    PubMed

    Lawn, R M; Efstratiadis, A; O'Connell, C; Maniatis, T

    1980-10-01

    We report the complete nucleotide sequence of the human beta-globin gene. The purpose of this study is to obtain information necessary to study the evolutionary relationships between members of the human beta-like globin gene family and to provide the basis for comparing normal beta-globin genes with those obtained from the DNA of individuals with genetic defects in hemoglobin expression.

  9. Coelacanth genome sequence reveals the evolutionary history of vertebrate genes.

    PubMed

    Noonan, James P; Grimwood, Jane; Danke, Joshua; Schmutz, Jeremy; Dickson, Mark; Amemiya, Chris T; Myers, Richard M

    2004-12-01

    The coelacanth is one of the nearest living relatives of tetrapods. However, a teleost species such as zebrafish or Fugu is typically used as the outgroup in current tetrapod comparative sequence analyses. Such studies are complicated by the fact that teleost genomes have undergone a whole-genome duplication event, as well as individual gene-duplication events. Here, we demonstrate the value of coelacanth genome sequence by complete sequencing and analysis of the protocadherin gene cluster of the Indonesian coelacanth, Latimeria menadoensis. We found that coelacanth has 49 protocadherin cluster genes organized in the same three ordered subclusters, alpha, beta, and gamma, as the 54 protocadherin cluster genes in human. In contrast, whole-genome and tandem duplications have generated two zebrafish protocadherin clusters comprised of at least 97 genes. Additionally, zebrafish protocadherins are far more prone to homogenizing gene conversion events than coelacanth protocadherins, suggesting that recombination- and duplication-driven plasticity may be a feature of teleost genomes. Our results indicate that coelacanth provides the ideal outgroup sequence against which tetrapod genomes can be measured. We therefore present L. menadoensis as a candidate for whole-genome sequencing.

  10. Noncoding RNA gene detection using comparative sequence analysis

    PubMed Central

    Rivas, Elena; Eddy, Sean R

    2001-01-01

    Background Noncoding RNA genes produce transcripts that exert their function without ever producing proteins. Noncoding RNA gene sequences do not have strong statistical signals, unlike protein coding genes. A reliable general purpose computational genefinder for noncoding RNA genes has been elusive. Results We describe a comparative sequence analysis algorithm for detecting novel structural RNA genes. The key idea is to test the pattern of substitutions observed in a pairwise alignment of two homologous sequences. A conserved coding region tends to show a pattern of synonymous substitutions, whereas a conserved structural RNA tends to show a pattern of compensatory mutations consistent with some base-paired secondary structure. We formalize this intuition using three probabilistic "pair-grammars": a pair stochastic context free grammar modeling alignments constrained by structural RNA evolution, a pair hidden Markov model modeling alignments constrained by coding sequence evolution, and a pair hidden Markov model modeling a null hypothesis of position-independent evolution. Given an input pairwise sequence alignment (e.g. from a BLASTN comparison of two related genomes) we classify the alignment into the coding, RNA, or null class according to the posterior probability of each class. Conclusions We have implemented this approach as a program, QRNA, which we consider to be a prototype structural noncoding RNA genefinder. Tests suggest that this approach detects noncoding RNA genes with a fair degree of reliability. PMID:11801179
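
    The classification step described in the record above (assigning an alignment to the coding, RNA, or null class by posterior probability) can be sketched as follows. This is only an illustration of Bayes' rule over three competing models, not the QRNA implementation: the function name is hypothetical, the log-likelihood values are invented, and a uniform prior over the classes is assumed.

```python
import math


def classify_alignment(log_likelihoods, priors=None):
    """Return (best_class, posteriors) from per-model log-likelihoods.

    log_likelihoods: dict mapping class name -> log P(alignment | model)
    priors:          dict mapping class name -> prior probability;
                     a uniform prior is assumed when omitted
    """
    classes = list(log_likelihoods)
    if priors is None:
        priors = {c: 1.0 / len(classes) for c in classes}
    log_joint = {c: log_likelihoods[c] + math.log(priors[c]) for c in classes}
    # Normalise with log-sum-exp so very negative scores do not underflow.
    m = max(log_joint.values())
    log_norm = m + math.log(sum(math.exp(v - m) for v in log_joint.values()))
    posteriors = {c: math.exp(log_joint[c] - log_norm) for c in classes}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Invented scores for one pairwise alignment under the three models.
scores = {"coding": -1052.3, "rna": -1047.9, "null": -1060.1}
best, post = classify_alignment(scores)
```

    With a uniform prior the class with the highest likelihood also has the highest posterior; a non-uniform prior can shift the decision when the scores are close.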

  11. A saturation screen for cis-acting regulatory DNA in the Hox genes of Ciona intestinalis

    SciTech Connect

    Keys, David N.; Lee, Byung-in; Di Gregorio, Anna; Harafuji, Naoe; Detter, Chris; Wang, Mei; Kahsai, Orsalem; Ahn, Sylvia; Arellano, Andre; Zhang, Quin; Trong, Stephan; Doyle, Sharon A.; Satoh, Noriyuki; Satou, Yutaka; Saiga, Hidetoshi; Christian, Allen; Rokhsar, Dan; Hawkins, Trevor L.; Levine, Mike; Richardson, Paul

    2005-01-05

    A screen for the systematic identification of cis-regulatory elements within large (>100 kb) genomic domains containing Hox genes was performed by using the basal chordate Ciona intestinalis. Randomly generated DNA fragments from bacterial artificial chromosomes containing two clusters of Hox genes were inserted into a vector upstream of a minimal promoter and lacZ reporter gene. A total of 222 resultant fusion genes were separately electroporated into fertilized eggs, and their regulatory activities were monitored in larvae. In sum, 21 separable cis-regulatory elements were found. These include eight Hox linked domains that drive expression in nested anterior-posterior domains of ectodermally derived tissues. In addition to vertebrate-like CNS regulation, the discovery of cis-regulatory domains that drive epidermal transcription suggests that C. intestinalis has arthropod-like Hox patterning in the epidermis.

  12. Transcriptome Analysis of an Insecticide Resistant Housefly Strain: Insights about SNPs and Regulatory Elements in Cytochrome P450 Genes

    PubMed Central

    Asp, Torben; Kristensen, Michael

    2016-01-01

    Background Insecticide resistance in the housefly, Musca domestica, has been investigated for more than 60 years. It will enter a new era after the recent publication of the housefly genome and the development of multiple next generation sequencing technologies. The genetic background of the xenobiotic response can now be investigated in greater detail. Here, we investigate the 454-pyrosequencing transcriptome of the spinosad-resistant 791spin strain in relation to the housefly genome with focus on P450 genes. Results The de novo assembly of clean reads gave 35,834 contigs consisting of 21,780 sequences of the spinosad resistant strain. The 3,648 sequences were annotated with an enzyme code EC number and were mapped to 124 KEGG pathways with metabolic processes as the most highly represented pathway. One hundred and twenty contigs were annotated as P450s covering 44 different P450 genes of housefly. Eight differentially expressed P450 genes were identified and investigated for SNPs, CpG islands and common regulatory motifs in promoter and coding regions. Functional annotation clustering of metabolic related genes and motif analysis of P450s revealed their association with epigenetic, transcription and gene expression related functions. The sequence variation analysis resulted in 12 SNPs, eight of which were found in cyp6d1. There is variation in location, size and frequency of CpG islands, and specific motifs were also identified in these P450s. Moreover, identified motifs were associated with GO terms and transcription factors using bioinformatic tools. Conclusion Transcriptome data of a spinosad resistant strain provide together with genome data fundamental support for future research to understand evolution of resistance in houseflies. Here, we report for the first time the SNPs, CpG islands and common regulatory motifs in differentially expressed P450s. Taken together our findings will serve as a stepping stone to advance understanding of the mechanism and role of P450s

  13. SxtA gene sequence analysis of dinoflagellate Alexandrium minutum

    NASA Astrophysics Data System (ADS)

    Norshaha, Safida Anira; Latib, Norhidayu Abdul; Usup, Gires; Yusof, Nurul Yuziana Mohd

    2015-09-01

    The dinoflagellate Alexandrium minutum is typically known for the production of potent neurotoxins such as saxitoxin, affecting the health of human seafood consumers via paralytic shellfish poisoning (PSP). This phenomenon is related to harmful algal blooms (HABs), which are believed to be influenced by environmental and nutritional factors. A previous study revealed that sxtA is the starting gene in the saxitoxin production pathway. The aim of this study was to analyse the sequence of the sxtA gene in A. minutum. The dinoflagellate culture was grown at 26°C with a 16:8-hour light:dark photocycle. After the samples were harvested, RNA was extracted, and complementary DNA (cDNA) was synthesised and amplified by polymerase chain reaction (PCR). The PCR products were then purified and cloned before being sequenced. The sxtA sequence obtained was then analysed in order to identify the presence of the sxtA gene in Alexandrium minutum.

  14. Nucleotide sequence of the pyruvate decarboxylase gene from Zymomonas mobilis.

    PubMed

    Neale, A D; Scopes, R K; Wettenhall, R E; Hoogenraad, N J

    1987-02-25

    Pyruvate decarboxylase (EC 4.1.1.1), the penultimate enzyme in the alcoholic fermentation pathway of Zymomonas mobilis, converts pyruvate to acetaldehyde and carbon dioxide. The complete nucleotide sequence of the structural gene encoding pyruvate decarboxylase from Zymomonas mobilis has been determined. The coding region is 1704 nucleotides long and encodes a polypeptide of 567 amino acids with a calculated subunit mass of 60,790 daltons. The amino acid sequence was confirmed by comparison with the amino acid sequence of a selection of tryptic fragments of the enzyme. The amino acid composition obtained from the nucleotide sequence is in good agreement with that obtained experimentally.
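
    The figures quoted in the record above can be cross-checked with simple arithmetic. The short sketch below assumes that the 1704-nucleotide coding region includes one stop codon (an assumption, since the abstract does not say so); under that reading the codon count matches the 567 amino acids, and the stated subunit mass implies a mean residue mass of roughly 107 daltons.

```python
CODING_NT = 1704          # coding region length from the abstract
AMINO_ACIDS = 567         # polypeptide length from the abstract
SUBUNIT_MASS_DA = 60790   # calculated subunit mass from the abstract

# A coding region must be a whole number of codons.
assert CODING_NT % 3 == 0
codons = CODING_NT // 3

# Assumption: exactly one of those codons is the stop codon.
sense_codons = codons - 1

# Average mass contributed by each residue.
mean_residue_mass = SUBUNIT_MASS_DA / AMINO_ACIDS

print(codons, sense_codons, round(mean_residue_mass, 1))
```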

  15. Negative regulatory elements upstream of a novel exon of the neuronal nicotinic acetylcholine receptor alpha 2 subunit gene.

    PubMed Central

    Bessis, A; Savatier, N; Devillers-Thiéry, A; Bejanin, S; Changeux, J P

    1993-01-01

    The expression of the nicotinic acetylcholine receptor alpha 2 subunit gene is highly restricted to the Spiriform lateralis nucleus of the chick diencephalon. As a first step toward understanding the molecular mechanism underlying this regulation, we have investigated the structural and regulatory properties of the 5' sequence of this gene. A strategy based on the ligation of an oligonucleotide to the first strand of the cDNA (SLIC) followed by PCR amplification was used. A new exon was found approximately 3 kb upstream from the first coding exon, and multiple transcription start sites of the gene were mapped. Analysis of the flanking region shows many consensus sequences for the binding of nuclear proteins, suggesting that the 1 kb flanking region contains at least a portion of the promoter of the gene. We have analysed the negative regulatory elements present within this region and found that a silencer region located between nucleotides -144 and +76 is active in fibroblasts as well as in neurons. This silencer is composed of six tandem repeat Oct-like motifs (CCCCATGCAAT), but does not bind any member of the Oct family. Moreover these motifs were found to act as a silencer only when they were tandemly repeated. When two, four or five motifs were deleted, the silencer activity of the motifs unexpectedly became an enhancer activity in all cells we have tested. Images PMID:8502560

  16. Understanding the transcriptional regulation of cervix cancer using microarray gene expression data and promoter sequence analysis of a curated gene set.

    PubMed

    Srivastava, Prashant; Mangal, Manu; Agarwal, Subhash Mohan

    2014-02-10

    Cervical cancer, the malignant neoplasm of the cervix uteri, is the second most common cancer among women worldwide and the top-most cancer in India. Several factors are responsible for causing cervical cancer, which alter the expression of oncogenic genes resulting in up or down-regulation of gene expression and inactivation of tumor-suppressor genes/gene products. Gene expression is regulated by interactions between transcription factors (TFs) and specific regulatory elements in the promoter regions of target genes. Thus, it is important to decipher and analyze TFs that bind to regulatory regions of diseased genes and regulate their expression. In the present study, computational methods involving the combination of gene expression data from microarray experiments and promoter sequence analysis of a curated gene set involved in the cervical cancer causation have been utilized for identifying potential regulatory elements. Consensus predictions of two approaches led to the identification of twelve TFs that might be crucial to the regulation of cervical cancer progression. Subsequently, TF enrichment and oncomine expression analysis suggested that the transcription factor family E2F played an important role in the regulation of genes involved in cervical carcinogenesis. Our results suggest that E2F possesses diagnostic/prognostic value and can act as a potential drug target in cervical cancer.

  17. Reverse engineering and analysis of genome-wide gene regulatory networks from gene expression profiles using high-performance computing.

    PubMed

    Belcastro, Vincenzo; Gregoretti, Francesco; Siciliano, Velia; Santoro, Michele; D'Angelo, Giovanni; Oliva, Gennaro; di Bernardo, Diego

    2012-01-01

    Regulation of gene expression is a carefully regulated phenomenon in the cell. “Reverse-engineering” algorithms try to reconstruct the regulatory interactions among genes from genome-scale measurements of gene expression profiles (microarrays). Mammalian cells express tens of thousands of genes; hence, hundreds of gene expression profiles are necessary in order to have acceptable statistical evidence of interactions between genes. As the number of profiles to be analyzed increases, so do computational costs and memory requirements. In this work, we designed and developed a parallel computing algorithm to reverse-engineer genome-scale gene regulatory networks from thousands of gene expression profiles. The algorithm is based on computing pairwise Mutual Information between each gene-pair. We successfully tested it to reverse engineer the Mus musculus (mouse) gene regulatory network in liver from gene expression profiles collected from a public repository. A parallel hierarchical clustering algorithm was implemented to discover “communities” within the gene network. Network communities are enriched for genes involved in the same biological functions. The inferred network was used to identify two mitochondrial proteins.
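
    The core computation named in the record above, pairwise mutual information between gene expression profiles, can be sketched serially as follows. This is not the authors' parallel algorithm: the equal-width binning, the function names and the toy expression values are all assumptions made for illustration, since the abstract does not state how mutual information was estimated.

```python
import math
from collections import Counter
from itertools import combinations


def discretize(values, bins=3):
    """Map each value to an equal-width bin index in [0, bins)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Mutual information (in bits) of two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def pairwise_mi(profiles, bins=3):
    """profiles: dict gene -> expression values measured on the same samples."""
    binned = {g: discretize(v, bins) for g, v in profiles.items()}
    return {
        (g1, g2): mutual_information(binned[g1], binned[g2])
        for g1, g2 in combinations(sorted(binned), 2)
    }


# Invented toy data: geneB tracks geneA, geneC is unrelated to geneA.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "geneB": [2.1, 4.0, 6.2, 8.1, 9.9, 12.0],
    "geneC": [5.0, 1.0, 5.0, 1.0, 5.0, 1.0],
}
mi = pairwise_mi(profiles)
```

    The number of gene pairs grows quadratically with the number of genes, and each pair can be scored independently, which is what makes this computation a natural target for parallelisation.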

  18. Coupling sequencing by hybridization (SBH) with gel sequencing for an inexpensive analysis of genes and genomes

    SciTech Connect

    Drmanac, S.; Labat, I.; Hauser, B.; Drmanac, R.

    1996-11-01

    The speed and cost of DNA sequencing are bottlenecks in the analysis of genes and genomes. Sequencing by hybridization (SBH) is a versatile method with several applications which can accelerate DNA screening, mapping and sequencing. Requirements, achievements and problems in the development of the SBH format 1 (DNA samples arrayed) are presented and schemes for its synergetic coupling with gel sequencing techniques are discussed. It appears that with one hybridization machine with 24 boxes and four ABI gel sequencers, 100-300 Mb of DNA sequence can be determined per year. Various genetic studies based on computer assisted analysis of large collections of partial or complete DNA sequences ('sequenetics') may be achieved in this century.

  19. Inter-specific sequence conservation and intra-individual sequence variation in a spider silk gene.

    PubMed

    Tai, Pei-Ling; Hwang, Guang-Yuh; Tso, I-Min

    2004-10-01

    Currently, studies on major ampullate spidroin 1 (MaSp1) genes of non-orb weaving spiders are few, and it is not clear whether genes of these organisms exhibit the same characteristics as those of orb-weavers. In addition, many studies have proposed that MaSp1 might be a single gene with allelic variants, but supporting evidence is still lacking. In this study, we compared partial DNA and amino acid sequences of MaSp1 cloned from different spider guilds. We also cloned partial MaSp1 sequences from genomic DNA and cDNA of the same individuals of spiders using the same primer combination to see if different molecular forms existed. In the repetitive region of partial MaSp1 sequences obtained, GGX, GA and poly-A motifs were present in all Araneomorphae and Mygalomorphae species examined. An extreme similarity in MaSp1 non-repetitive portions was found in sequences of ecribellate, cribellate and Mygalomorphae web-builders and such a result suggested that this sequence might exhibit an important function. A comparison of sequences amplified from the same individual showed that substitutions in amino acids occurred in both repetitive and non-repetitive regions, with a much higher variation in the former. These results suggest that the MaSp1 of Araneomorphae spiders exhibits several forms in an individual spider and it might be either a multiple gene or a single gene with a multiple exon/intron organization.

  20. A distinct regulatory region of the Bmp5 locus activates gene expression following adult bone fracture or soft tissue injury.

    PubMed

    Guenther, Catherine A; Wang, Zhen; Li, Emma; Tran, Misha C; Logan, Catriona Y; Nusse, Roel; Pantalena-Filho, Luiz; Yang, George P; Kingsley, David M

    2015-08-01

    Bone morphogenetic proteins (BMPs) are key signaling molecules required for normal development of bones and other tissues. Previous studies have shown that null mutations in the mouse Bmp5 gene alter the size, shape and number of multiple bone and cartilage structures during development. Bmp5 mutations also delay healing of rib fractures in adult mutants, suggesting that the same signals used to pattern embryonic bone and cartilage are also reused during skeletal regeneration and repair. Despite intense interest in BMPs as agents for stimulating bone formation in clinical applications, little is known about the regulatory elements that control developmental or injury-induced BMP expression. To compare the DNA sequences that activate gene expression during embryonic bone formation and following acute injuries in adult animals, we assayed regions surrounding the Bmp5 gene for their ability to stimulate lacZ reporter gene expression in transgenic mice. Multiple genomic fragments, distributed across the Bmp5 locus, collectively coordinate expression in discrete anatomic domains during normal development, including in embryonic ribs. In contrast, a distinct regulatory region activated expression following rib fracture in adult animals. The same injury control region triggered gene expression in mesenchymal cells following tibia fracture, in migrating keratinocytes following dorsal skin wounding, and in regenerating epithelial cells following lung injury. The Bmp5 gene thus contains an "injury response" control region that is distinct from embryonic enhancers, and that is activated by multiple types of injury in adult animals.

  1. Transfer of a large gene regulatory apparatus to a new developmental address in echinoid evolution.

    PubMed

    Gao, Feng; Davidson, Eric H

    2008-04-22

    Of the five echinoderm classes, only the modern sea urchins (euechinoids) generate a precociously specified embryonic micromere lineage that ingresses before gastrulation and then secretes the biomineral embryonic skeleton. The gene regulatory network (GRN) underlying the specification and differentiation of this lineage is now known. Many of the same differentiation genes as are used in the biomineralization of the embryo skeleton are also used to make the similar biomineral of the spines and test plates of the adult body. Here, we determine the components of the regulatory state upstream of these differentiation genes that are shared between embryonic and adult skeletogenesis. An abrupt "break point" in the micromere GRN is thus revealed, on one side of which most of the regulatory genes are used in both, and on the other side of which the regulatory apparatus is entirely micromere-specific. This reveals the specific linkages of the micromere GRN forged in the evolutionary process by which the skeletogenic gene batteries were caused to be activated in the embryonic micromere lineage. We also show, by comparison with adult skeletogenesis in the sea star, a distant echinoderm outgroup, that the regulatory apparatus responsible for driving the skeletogenic differentiation gene batteries is an ancient plesiomorphic aspect of the echinoderm-specific regulatory heritage.

  2. Isolation and sequence analysis of the gene encoding triose phosphate isomerase from Zygosaccharomyces bailii.

    PubMed

    Merico, A; Rodrigues, F; Côrte-Real, M; Porro, D; Ranzi, B M; Compagno, C

    2001-06-30

    The ZbTPI1 gene encoding triose phosphate isomerase (TIM) was cloned from a Zygosaccharomyces bailii genomic library by complementation of the Saccharomyces cerevisiae tpi1 mutant strain. The nucleotide sequence of a 1.5 kb fragment showed an open reading frame (ORF) of 746 bp, encoding a protein of 248 amino acid residues. The deduced amino acid sequence shares a high degree of homology with TIMs from other yeast species, including some highly conserved regions. The analysis of the promoter sequence of the ZbTPI1 revealed the presence of putative motifs known to have regulatory functions in S. cerevisiae. The GenBank Accession No. of ZbTPI1 is AF325852.

  3. Natural selection on coding and noncoding DNA sequences is associated with virulence genes in a plant pathogenic fungus.

    PubMed

    Rech, Gabriel E; Sanz-Martín, José M; Anisimova, Maria; Sukno, Serenella A; Thon, Michael R

    2014-09-04

    Natural selection leaves imprints on DNA, offering the opportunity to identify functionally important regions of the genome. Identifying the genomic regions affected by natural selection within pathogens can aid in the pursuit of effective strategies to control diseases. In this study, we analyzed genome-wide patterns of selection acting on different classes of sequences in a worldwide sample of eight strains of the model plant-pathogenic fungus Colletotrichum graminicola. We found evidence of selective sweeps, balancing selection, and positive selection affecting both protein-coding and noncoding DNA of pathogenicity-related sequences. Genes encoding putative effector proteins and secondary metabolite biosynthetic enzymes show evidence of positive selection acting on the coding sequence, consistent with an Arms Race model of evolution. The 5' untranslated regions (UTRs) of genes coding for effector proteins and genes upregulated during infection show an excess of high-frequency polymorphisms likely the consequence of balancing selection and consistent with the Red Queen hypothesis of evolution acting on these putative regulatory sequences. Based on the findings of this work, we propose that even though adaptive substitutions on coding sequences are important for proteins that interact directly with the host, polymorphisms in the regulatory sequences may confer flexibility of gene expression in the virulence processes of this important plant pathogen.

  4. Reconstruction of the Regulatory Network for Bacillus subtilis and Reconciliation with Gene Expression Data.

    PubMed

    Faria, José P; Overbeek, Ross; Taylor, Ronald C; Conrad, Neal; Vonstein, Veronika; Goelzer, Anne; Fromion, Vincent; Rocha, Miguel; Rocha, Isabel; Henry, Christopher S

    2016-01-01

    We introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of Bacillus subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs, and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches, and small regulatory RNAs. Overall, regulatory information is included in the model for ∼2500 of the ∼4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same "ON" and "OFF" gene expression profiles across multiple samples of experimental data. We show how ARs for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how ARs can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking. During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome expression metadata relating to experimental conditions

  5. Reconstruction of the regulatory network for Bacillus subtilis and reconciliation with gene expression data

    DOE PAGES

    Faria, Jose P.; Overbeek, Ross; Taylor, Ronald C.; ...

    2016-03-18

    Here, we introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of B. subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches and small regulatory RNAs. Overall, regulatory information is included in the model for approximately 2500 of the ~4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same “ON” and “OFF” gene expression profiles across multiple samples of experimental data. We show how atomic regulons for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how atomic regulons can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking. During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome

  6. Reconstruction of the Regulatory Network for Bacillus subtilis and Reconciliation with Gene Expression Data

    PubMed Central

    Faria, José P.; Overbeek, Ross; Taylor, Ronald C.; Conrad, Neal; Vonstein, Veronika; Goelzer, Anne; Fromion, Vincent; Rocha, Miguel; Rocha, Isabel; Henry, Christopher S.

    2016-01-01

    We introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of Bacillus subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs, and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches, and small regulatory RNAs. Overall, regulatory information is included in the model for ∼2500 of the ∼4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same “ON” and “OFF” gene expression profiles across multiple samples of experimental data. We show how ARs for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how ARs can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking. During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome expression metadata relating to experimental

  7. A Collection of Conserved Noncoding Sequences to Study Gene Regulation in Flowering Plants

    PubMed Central

    2016-01-01

    Transcription factors (TFs) regulate gene expression by binding cis-regulatory elements, of which the identification remains an ongoing challenge owing to the prevalence of large numbers of nonfunctional TF binding sites. Powerful comparative genomics methods, such as phylogenetic footprinting, can be used for the detection of conserved noncoding sequences (CNSs), which are functionally constrained and can greatly help in reducing the number of false-positive elements. In this study, we applied a phylogenetic footprinting approach for the identification of CNSs in 10 dicot plants, yielding 1,032,291 CNSs associated with 243,187 genes. To annotate CNSs with TF binding sites, we made use of binding site information for 642 TFs originating from 35 TF families in Arabidopsis (Arabidopsis thaliana). In three species, the identified CNSs were evaluated using TF chromatin immunoprecipitation sequencing data, resulting in significant overlap for the majority of data sets. To identify ultraconserved CNSs, we included genomes of additional plant families and identified 715 binding sites for 501 genes conserved in dicots, monocots, mosses, and green algae. Additionally, we found that genes that are part of conserved mini-regulons have a higher coherence in their expression profile than other divergent gene pairs. All identified CNSs were integrated in the PLAZA 3.0 Dicots comparative genomics platform (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/) together with new functionalities facilitating the exploration of conserved cis-regulatory elements and their associated genes. The availability of this data set in a user-friendly platform enables the exploration of functional noncoding DNA to study gene regulation in a variety of plant species, including crops. PMID:27261064

  8. Sequence diversity of mating-type genes in Phaeosphaeria avenaria.

    PubMed

    Ueng, Peter P; Dai, Qun; Cui, Kai-rong; Czembor, Paweł C; Cunfer, Barry M; Tsang, H; Arseniuk, Edward; Bergstrom, Gary C

    2003-05-01

    Phaeosphaeria avenaria, one of the causal agents of stagonospora leaf blotch diseases in cereals, is composed of two subspecies, P. avenaria f. sp. triticea (Pat) and P. avenaria f. sp. avenaria (Paa). The Pat subspecies was grouped into Pat1-Pat3, based on restriction fragment length polymorphism (RFLP) and ribosomal DNA (rDNA) internal transcribed spacer (ITS) sequences in previous studies. Mating-type genes and their potential use in phylogeny and molecular classification were studied by DNA hybridization and PCR amplification. The majority of Pat1 isolates reported to be homothallic and producing sexual reproduction structures on cultural media had only the MAT1-1 gene. Minor sequence variations were found in the conserved region of MAT1-1 gene in Pat1 isolates. However, both mating-type genes, MAT1-1 and MAT1-2, were identified in P. avenaria isolates represented by ATCC12277 from oats (Paa) and the Pat2 isolates from foxtail barley (Hordeum jubatum L.). Cluster analyses based on mating-type gene conserved regions revealed that cereal Phaeosphaeria is not phylogenetically closely related to other ascomycetes, including Mycosphaerella graminicola (anamorph Septoria tritici). The sequence diversity of mating-type genes in Pat and Paa supports our previous phylogenetic relationship and molecular classification based on RFLP fingerprinting and rDNA ITS sequences.

  9. Widespread contribution of transposable elements to the innovation of gene regulatory networks

    PubMed Central

    Sundaram, Vasavi; Cheng, Yong; Ma, Zhihai; Li, Daofeng; Xing, Xiaoyun; Edge, Peter

    2014-01-01

    Transposable elements (TEs) have been shown to contain functional binding sites for certain transcription factors (TFs). However, the extent to which TEs contribute to the evolution of TF binding sites is not well known. We comprehensively mapped binding sites for 26 pairs of orthologous TFs in two pairs of human and mouse cell lines (representing two cell lineages), along with epigenomic profiles, including DNA methylation and six histone modifications. Overall, we found that 20% of binding sites were embedded within TEs. This number varied across different TFs, ranging from 2% to 40%. We further identified 710 TF–TE relationships in which genomic copies of a TE subfamily contributed a significant number of binding peaks for a TF, and we found that LTR elements dominated these relationships in human. Importantly, TE-derived binding peaks were strongly associated with open and active chromatin signatures, including reduced DNA methylation and increased enhancer-associated histone marks. On average, 66% of TE-derived binding events were cell type-specific with a cell type-specific epigenetic landscape. Most of the binding sites contributed by TEs were species-specific, but we also identified binding sites conserved between human and mouse, the functional relevance of which was supported by a signature of purifying selection on DNA sequences of these TEs. Interestingly, several TFs had significantly expanded binding site landscapes only in one species, which were linked to species-specific gene functions, suggesting that TEs are an important driving force for regulatory innovation. Taken together, our data suggest that TEs have significantly and continuously shaped gene regulatory networks during mammalian evolution. PMID:25319995

  10. Analyses of fugu hoxa2 genes provide evidence for subfunctionalization of neural crest cell and rhombomere cis-regulatory modules during vertebrate evolution.

    PubMed

    McEllin, Jennifer A; Alexander, Tara B; Tümpel, Stefan; Wiedemann, Leanne M; Krumlauf, Robb

    2016-01-15

    The Hoxa2 gene is a primary player in regulation of craniofacial programs of head development in vertebrates. Here we investigate the evolution of a Hoxa2 neural crest enhancer identified originally in mouse by comparing and contrasting the fugu hoxa2a and hoxa2b genes with their orthologous teleost and mammalian sequences. Using sequence analyses in combination with transgenic regulatory assays in zebrafish and mouse embryos, we demonstrate subfunctionalization of regulatory activity for expression in hindbrain segments and neural crest cells between these two fugu co-orthologs. hoxa2a regulatory sequences have retained the ability to mediate expression in neural crest cells while those of hoxa2b include cis-elements that direct expression in rhombomeres. Functional dissection of the neural crest regulatory potential of the fugu hoxa2a and hoxa2b genes identifies the previously unknown cis-element NC5, which is implicated in generating the differential activity of the enhancers from these genes. The NC5 region plays a similar role in the ability of this enhancer to mediate reporter expression in mice, suggesting it is a conserved component involved in control of neural crest expression of Hoxa2 in vertebrate craniofacial development.

  11. Sequences contained within the promoter of the human thymidine kinase gene can direct cell-cycle regulation of heterologous fusion genes.

    PubMed Central

    Kim, Y K; Wells, S; Lau, Y F; Lee, A S

    1988-01-01

    Recent evidence on the transcriptional regulation of the human thymidine kinase (TK) gene raises the possibility that cell-cycle regulatory sequences may be localized within its promoter. A hybrid gene that combines the TK 5' flanking sequence and the coding region of the bacterial neomycin-resistance gene (neo) has been constructed. Upon transfection into a hamster fibroblast cell line K12, the hybrid gene exhibits cell-cycle-dependent expression. Deletion analysis reveals that the region important for cell-cycle regulation is within -441 to -63 nucleotides from the transcriptional initiation site. This region (-441 to -63) also confers cell-cycle regulation to the herpes simplex virus thymidine kinase (HSVtk) promoter, which is not expressed in a cell-cycle manner. We conclude that the -441 to -63 sequence within the human TK promoter is important for cell-cycle-dependent expression. PMID:3413063

  12. Sequences contained within the promoter of the human thymidine kinase gene can direct cell-cycle regulation of heterologous fusion genes

    SciTech Connect

    Kim, Yongkyu; Wells, S.; Lau, Yunfai Chris; Lee, A.S.

    1988-08-01

    Recent evidence on the transcriptional regulation of the human thymidine kinase (TK) gene raises the possibility that cell-cycle regulatory sequences may be localized within its promoter. A hybrid gene that combines the TK 5' flanking sequence and the coding region of the bacterial neomycin-resistance gene (neo) has been constructed. Upon transfection into a hamster fibroblast cell line K12, the hybrid gene exhibits cell-cycle-dependent expression. Deletion analysis reveals that the region important for cell-cycle regulation is within -441 to -63 nucleotides from the transcriptional initiation site. This region (-441 to -63) also confers cell-cycle regulation to the herpes simplex virus thymidine kinase (HSVtk) promoter, which is not expressed in a cell-cycle manner. The authors conclude that the -441 to -63 sequence within the human TK promoter is important for cell-cycle-dependent expression.

  13. An extended Kalman filtering approach to modeling nonlinear dynamic gene regulatory networks via short gene expression time series.

    PubMed

    Wang, Zidong; Liu, Xiaohui; Liu, Yurong; Liang, Jinling; Vinciotti, Veronica

    2009-01-01

    In this paper, the extended Kalman filter (EKF) algorithm is applied to model the gene regulatory network from gene time series data. The gene regulatory network is considered as a nonlinear dynamic stochastic model that consists of the gene measurement equation and the gene regulation equation. After specifying the model structure, we apply the EKF algorithm for identifying both the model parameters and the actual value of gene expression levels. It is shown that the EKF algorithm is an online estimation algorithm that can identify a large number of parameters (including parameters of nonlinear functions) through iterative procedure by using a small number of observations. Four real-world gene expression data sets are employed to demonstrate the effectiveness of the EKF algorithm, and the obtained models are evaluated from the viewpoint of bioinformatics.
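
    The predict/update cycle that an extended Kalman filter repeats for each new observation can be sketched as follows. This is a minimal scalar illustration under stated assumptions, not the model from the paper: the transition function (a linear term plus a tanh term), the identity measurement function, and the parameter names a, b, q and r are all hypothetical, and the parameters are treated as known rather than estimated jointly with the state as the abstract describes.

```python
import math

def ekf_step(x_est, p_est, y, a=0.8, b=0.5, q=0.01, r=0.1):
    """One predict/update cycle of a scalar extended Kalman filter.

    Assumed toy model (hypothetical, not the paper's):
        x[k+1] = a * x[k] + b * tanh(x[k]) + process noise (variance q)
        y[k]   = x[k] + measurement noise (variance r)
    """
    # Predict: propagate the estimate through the nonlinear transition.
    x_pred = a * x_est + b * math.tanh(x_est)
    # Linearize: derivative of the transition at the current estimate.
    f_jac = a + b * (1.0 - math.tanh(x_est) ** 2)
    p_pred = f_jac * p_est * f_jac + q
    # Update: the measurement function is the identity, so its Jacobian is 1.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (y - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new

def ekf_filter(observations, x0=0.0, p0=1.0):
    """Run the filter over a short series and return the state estimates."""
    x, p = x0, p0
    estimates = []
    for y in observations:
        x, p = ekf_step(x, p, y)
        estimates.append(x)
    return estimates
```

    Because each observation is processed once and then discarded, the filter works online, which is the property the abstract highlights for short time series.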

  14. A Boolean Model of the Cardiac Gene Regulatory Network Determining First and Second Heart Field Identity

    PubMed Central

    Zhou, Dao; Kestler, Hans A.; Kühl, Michael

    2012-01-01

    Two types of distinct cardiac progenitor cell populations can be identified during early heart development: the first heart field (FHF) and second heart field (SHF) lineage that later form the mature heart. They can be characterized by differential expression of transcription and signaling factors. These regulatory factors influence each other forming a gene regulatory network. Here, we present a core gene regulatory network for early cardiac development based on published temporal and spatial expression data of genes and their interactions. This gene regulatory network was implemented in a Boolean computational model. Simulations reveal stable states within the network model, which correspond to the regulatory states of the FHF and the SHF lineages. Furthermore, we are able to reproduce the expected temporal expression patterns of early cardiac factors mimicking developmental progression. Additionally, simulations of knock-down experiments within our model resemble published phenotypes of mutant mice. Consequently, this gene regulatory network retraces the early steps and requirements of cardiogenic mesoderm determination in a way appropriate to enhance the understanding of heart development. PMID:23056457
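
    The idea of stable states in a Boolean network can be illustrated with a small sketch. The three nodes and their update rules below are hypothetical and chosen only to show the mechanics (two mutually repressing factors and a shared downstream target); they are not the published cardiac network.

```python
from itertools import product

# Hypothetical three-node toy network; names and rules are illustrative
# assumptions, not the published cardiac model.
NODES = ("A", "B", "C")

def step(state):
    """Synchronously update every node from the current state."""
    a, b, c = state
    return (
        a and not b,  # A stays on unless B represses it
        b and not a,  # B stays on unless A represses it
        a or b,       # C is activated by either A or B
    )

def stable_states():
    """Return all states that map to themselves under one update."""
    return [s for s in product((False, True), repeat=len(NODES)) if step(s) == s]

def knock_down(state, node):
    """Force one node off, then apply one update (a crude knock-down)."""
    forced = tuple(False if n == node else v for n, v in zip(NODES, state))
    return step(forced)
```

    In this toy network, mutual repression between A and B yields two alternative stable states with one factor on and the other off, loosely analogous to two lineage identities, plus an all-off state.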

  15. Gene regulatory evolution and the origin of macroevolutionary novelties: insights from the neural crest.

    PubMed

    Van Otterloo, Eric; Cornell, Robert A; Medeiros, Daniel Meulemans; Garnett, Aaron T

    2013-07-01

    The appearance of novel anatomic structures during evolution is driven by changes to the networks of transcription factors, signaling pathways, and downstream effector genes controlling development. The nature of the changes to these developmental gene regulatory networks (GRNs) is poorly understood. A striking test case is the evolution of the GRN controlling development of the neural crest (NC). NC cells emerge from the neural plate border (NPB) and contribute to multiple adult structures. While all chordates have a NPB, only in vertebrates do NPB cells express all the genes constituting the neural crest GRN (NC-GRN). Interestingly, invertebrate chordates express orthologs of NC-GRN components in other tissues, revealing that during vertebrate evolution new regulatory connections emerged between transcription factors primitively expressed in the NPB and genes primitively expressed in other tissues. Such interactions could have evolved by two mechanisms. First, transcription factors primitively expressed in the NPB may have evolved new DNA and/or cofactor binding properties (protein neofunctionalization). Alternately, cis-regulatory elements driving NPB expression may have evolved near genes primitively expressed in other tissues (cis-regulatory neofunctionalization). Here we discuss how gene duplication can, in principle, promote either form of neofunctionalization. We review recent published examples of interspecies gene-swap, or regulatory-element-swap, experiments that test both models. Such experiments have yielded little evidence to support the importance of protein neofunctionalization in the emergence of the NC-GRN, but do support the importance of novel cis-regulatory elements in this process. The NC-GRN is an excellent model for the study of gene regulatory and macroevolutionary innovation.

  16. Mercuric ion-resistance operons of plasmid R100 and transposon Tn501: the beginning of the operon including the regulatory region and the first two structural genes.

    PubMed Central

    Misra, T K; Brown, N L; Fritzinger, D C; Pridmore, R D; Barnes, W M; Haberstroh, L; Silver, S

    1984-01-01

    The mercuric ion-resistance operons of plasmid R100 (originally from Shigella) and transposon Tn501 (originally from a plasmid isolated in Pseudomonas) have been compared by DNA sequence analysis. The sequences for the first 1340 base pairs of Tn501 are given with the best alignment with the comparable 1319 base pairs of R100. The homology between the two sequences starts at base 58 after the end of the insertion sequence IS-1 of R100. The sequences include the transcriptional regulatory region, and the homology is particularly strong in regions just upstream from potential transcriptional initiation sites. The trans-acting regulatory gene merR consists of 180 base pairs in both cases and codes for a highly basic polypeptide of 60 amino acids, which is also rich in serine. The Tn501 and R100 merR genes differ in 25 of the 180 base positions, and the resulting polypeptides differ in seven amino acids. The regulatory region before the major transcription initiation site contains potential -35 and -10 sequences and dyad symmetrical sequences, which may be the merR binding sites for transcriptional regulation. The first structural gene, merT, encodes a highly hydrophobic polypeptide of 116 amino acids. The R100 and Tn501 merT genes differ in 17% of their positions, leading to 14 (12%) amino acid changes. This region had previously been shown to encode a protein governing membrane transport of mercuric ions. The second structural gene, merC, would give a 91 amino acid polypeptide with a hydrophobic amino-terminal segment. The Tn501 and R100 merC genes differ at 37 base positions, leading to 10 amino acid changes. PMID:6091128
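
    The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated above; the helper name percent_difference is an illustrative choice.

```python
def percent_difference(n_diff, length):
    """Percentage of positions that differ between two aligned sequences."""
    return 100.0 * n_diff / length

# merR: 180 base pairs correspond to 180 / 3 = 60 codons (60 amino acids).
merr_codons = 180 // 3

# merR: 25 of 180 base positions and 7 of 60 amino acids differ.
merr_dna_pct = percent_difference(25, 180)
merr_protein_pct = percent_difference(7, 60)

# merT: 14 amino acid changes in a 116-residue polypeptide.
mert_protein_pct = percent_difference(14, 116)

print(merr_codons, round(merr_dna_pct, 1), round(merr_protein_pct, 1), round(mert_protein_pct, 1))
```

    The merT result rounds to the 12% quoted in the abstract.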

  17. Illumina MiSeq sequencing disfavours a sequence motif in the GFP reporter gene

    PubMed Central

    Van den Hoecke, Silvie; Verhelst, Judith; Saelens, Xavier

    2016-01-01

    Green fluorescent protein (GFP) is one of the most used reporter genes. We have used next-generation sequencing (NGS) to analyse the genetic diversity of a recombinant influenza A virus that expresses GFP and found a remarkable coverage dip in the GFP coding sequence. This coverage dip was present when virus-derived RT-PCR product or the parental plasmid DNA was used as starting material for NGS and regardless of whether Nextera XT transposase or Covaris shearing was used for DNA fragmentation. Therefore, the sequence coverage dip in the GFP coding sequence was not the result of emerging GFP mutant viruses or a bias introduced by Nextera XT fragmentation. Instead, we found that the Illumina MiSeq sequencing method disfavours the ‘CCCGCC’ motif in the GFP coding sequence. PMID:27193250
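
    To relate a coverage dip to a motif such as 'CCCGCC', one first needs the motif positions on both strands. The following is a minimal sketch assuming plain string matching; the function names are illustrative, and any sequence passed in would be the reader's own (no GFP sequence is reproduced here).

```python
def find_motif(sequence, motif="CCCGCC"):
    """Return 0-based start positions of motif in sequence, overlaps included."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions

def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_motif_both_strands(sequence, motif="CCCGCC"):
    """Positions on the forward strand and on the reverse-complement strand."""
    return {
        "forward": find_motif(sequence, motif),
        "reverse": find_motif(reverse_complement(sequence), motif),
    }
```

    The returned positions could then be compared against per-base read depth from an alignment to see whether low-coverage windows coincide with the motif.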

  18. Combining Hi-C data with phylogenetic correlation to predict the target genes of distal regulatory elements in human genome.

    PubMed

    Lu, Yulan; Zhou, Yuanpeng; Tian, Weidong

    2013-12-01

    Defining the target genes of distal regulatory elements (DREs), such as enhancers, repressors and insulators, is a challenging task. The recently developed Hi-C technology is designed to capture chromosome conformation structure by high-throughput sequencing, and can potentially be used to determine the target genes of DREs. However, Hi-C data are noisy, making it difficult to directly use Hi-C data to identify DRE-target gene relationships. In this study, we show that DRE-gene pairs that are confirmed by Hi-C data are strongly phylogenetically correlated, and have thus developed a method that combines Hi-C read counts with phylogenetic correlation to predict long-range DRE-target gene relationships. Analysis of predicted DRE-target gene pairs shows that genes regulated by a large number of DREs tend to have essential functions, and genes regulated by the same DREs tend to be functionally related and co-expressed. In addition, we show with a couple of examples that the predicted target genes of DREs can help explain the causal roles of disease-associated single-nucleotide polymorphisms located in the DREs. As such, these predictions will be of importance not only for our understanding of the function of DREs but also for elucidating the causal roles of disease-associated noncoding single-nucleotide polymorphisms.

  19. Diverse nucleotide compositions and sequence fluctuation in Rubisco protein genes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Dehipawala, S.; Cheung, E.; Bienaime, R.; Ye, J.; Tremberger, G., Jr.; Schneider, P.; Lieberman, D.; Cheung, T.

    2011-10-01

    The Rubisco protein-enzyme is arguably the most abundant protein on Earth. The biology dogma of transcription and translation necessitates the study of the Rubisco genes and Rubisco-like genes in various species. Stronger correlation of fractal dimension of the atomic number fluctuation along a DNA sequence with Shannon entropy has been observed in the studied Rubisco-like gene sequences, suggesting a more diverse evolutionary pressure and constraints in the Rubisco sequences. The strategy of using metal for structural stabilization appears to be an ancient mechanism, with data from the porphobilinogen deaminase gene in Capsaspora owczarzaki and Monosiga brevicollis. Using the chi-square distance probability, our analysis supports the conjecture that the more ancient Rubisco-like sequence in Microcystis aeruginosa would have experienced very different evolutionary pressure and bio-chemical constraint as compared to Bordetella bronchiseptica, the two microbes occupying either end of the correlation graph. Our exploratory study would indicate that high fractal dimension Rubisco sequence would support high carbon dioxide rate via the Michaelis-Menten coefficient; with implication for the control of the whooping cough pathogen Bordetella bronchiseptica, a microbe containing a high fractal dimension Rubisco-like sequence (2.07). Using the internal comparison of chi-square distance probability for 16S rRNA (~ E-22) versus radiation repair Rec-A gene (~ E-05) in high GC content Deinococcus radiodurans, our analysis supports the conjecture that high GC content microbes containing Rubisco-like sequence are likely to include an extra-terrestrial origin, relative to Deinococcus radiodurans. Similar photosynthesis process that could utilize host star radiation would not compete with radiation resistant process from the biology dogma perspective in environments such as Mars and exoplanets.

  20. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species for which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in intron 1. In the 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was observed identically only in ovine, with one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed a large sequence identity. In particular, the octapeptide repeat region was observed in all the species but frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences.
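
    The search for the polyadenylation signals ATTAAA and TTTTTAT, allowing one or two mismatches, can be sketched as a sliding-window scan that counts mismatches against the motif. This is a minimal illustration and not the authors' pipeline; the function names and the example UTR fragment are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_motif(seq: str, motif: str, max_mismatches: int = 0):
    """Return (position, window, mismatches) for every window of `seq`
    within `max_mismatches` of `motif`. Positions are 0-based."""
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        d = hamming(window, motif)
        if d <= max_mismatches:
            hits.append((i, window, d))
    return hits


# Hypothetical 3'-UTR fragment, invented for illustration only.
utr = "GCATTAAAGGCTTTTCATCC"
exact = find_motif(utr, "ATTAAA")                     # exact matches only
near = find_motif(utr, "TTTTTAT", max_mismatches=1)   # allow one mismatch
```

    In this toy fragment the scan reports an exact ATTAAA at position 2 and a one-mismatch variant of TTTTTAT (TTTTCAT) at position 11.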

  1. PTHGRN: unraveling post-translational hierarchical gene regulatory networks using PPI, ChIP-seq and gene expression data.

    PubMed

    Guan, Daogang; Shao, Jiaofang; Zhao, Zhongying; Wang, Panwen; Qin, Jing; Deng, Youping; Boheler, Kenneth R; Wang, Junwen; Yan, Bin

    2014-07-01

    Interactions among transcriptional factors (TFs), cofactors and other proteins or enzymes can affect transcriptional regulatory capabilities of eukaryotic organisms. Post-translational modifications (PTMs) cooperate with TFs and epigenetic alterations to constitute a hierarchical complexity in transcriptional gene regulation. While these mechanisms are clearly implicated in biological processes, our understanding of them is still limited and incomplete. Various online software tools have been proposed for uncovering transcriptional and epigenetic regulatory networks; however, there is a lack of effective web-based software capable of constructing underlying interactive organizations between post-translational and transcriptional regulatory components. Here, we present an open web server, post-translational hierarchical gene regulatory network (PTHGRN), to unravel relationships among PTMs, TFs, epigenetic modifications and gene expression. PTHGRN utilizes a graphical Gaussian model with partial least squares regression-based methodology, and is able to integrate protein-protein interactions, ChIP-seq and gene expression data and to capture essential regulation features behind high-throughput data. The server provides an integrative platform for users to analyze ready-to-use public high-throughput Omics resources or upload their own data for systems biology study. Users can choose various parameters in the method, build network topologies of interest and dissect their associations with biological functions. Application of the software to stem cell and breast cancer demonstrates that it is an effective tool for understanding regulatory mechanisms in complex biological systems. The PTHGRN web server is publicly available at http://www.byanbioinfo.org/pthgrn.
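
    In a graphical Gaussian model, an edge between two variables corresponds to a non-zero partial correlation, that is, a correlation that remains after conditioning on the other variables. The sketch below shows only the simplest case, the first-order partial correlation of x and y given a single third variable z. It is not the PTHGRN implementation, which the abstract describes as based on partial least squares regression; the function name and the numbers are hypothetical.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y given z, computed from
    the three pairwise Pearson correlations."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denom


# Toy correlations, invented for illustration.
# Here r_xy equals r_xz * r_yz, so z fully explains the x-y correlation.
explained = partial_corr(0.48, 0.8, 0.6)
# Here z is uncorrelated with both, so conditioning changes nothing.
unchanged = partial_corr(0.5, 0.0, 0.0)
```

    A partial correlation near zero, as in the first toy case, would argue against drawing a direct edge between x and y even though their marginal correlation is 0.48.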

  2. Sequence and analysis of the gene for bacteriophage T3 RNA polymerase.

    PubMed Central

    McGraw, N J; Bailey, J N; Cleaves, G R; Dembinski, D R; Gocke, C R; Joliffe, L K; MacWright, R S; McAllister, W T

    1985-01-01

    The RNA polymerases encoded by bacteriophages T3 and T7 have similar structures, but exhibit nearly exclusive template specificities. We have determined the nucleotide sequence of the region of T3 DNA that encodes the T3 RNA polymerase (the gene 1.0 region), and have compared this sequence with the corresponding region of T7 DNA. The predicted amino acid sequence of the T3 RNA polymerase exhibits very few changes when compared to the T7 enzyme (82% of the residues are identical). Significant differences appear to cluster in three distinct regions in the amino-terminal half of the protein. Analysis of the data from both enzymes suggests features that may be important for polymerase function. In particular, a region that differs between the T3 and T7 enzymes exhibits significant homology to the bi-helical domain that is common to many sequence-specific DNA binding proteins. The region that flanks the structural gene contains a number of regulatory elements including: a promoter for the E. coli RNA polymerase, a potential processing site for RNase III and a promoter for the T3 polymerase. The promoter for the T3 RNA polymerase is located only 12 base pairs distal to the stop codon for the structural gene. PMID:3903658
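
    A figure such as "82% of the residues are identical" is a percent identity over an alignment of the two protein sequences. The sketch below shows one common way to compute it from two already-aligned sequences. The choice to skip gapped columns in the denominator is an assumption; other conventions exist, and the abstract does not say which was used. The sequences are short invented strings, not T3 or T7 data.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical residues over aligned columns without gaps.

    Assumption: columns where either sequence has a gap are excluded
    from both the numerator and the denominator.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned peptides, invented for illustration only.
pid = percent_identity("MKT-LLVAGA", "MKTALIVAGS")
```

    In the toy pair, 7 of the 9 ungapped columns match, giving about 77.8% identity.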

  3. Identification of the fur-binding site in regulatory region of the vulnibactin-receptor gene in Vibrio vulnificus.

    PubMed

    Lee, Hyun-Jung; Lee, Kyu-Ho

    2012-01-01

    The Vibrio vulnificus vuuA gene, whose expression is repressed by a complex of iron and ferric uptake regulator (Fur), was characterized to localize the Fur-binding site in its upstream regulatory region. In silico analysis suggested the presence of two possible Fur-binding sites; one is a classical Fur-box and the other is a previously reported distinct Fur-binding site. Site-directed mutagenesis and DNase I protection assays revealed the binding site for the iron-Fur complex, which includes an extended inverted repeat containing a homologous sequence to the classical Fur-box.

  4. Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation

    PubMed Central

    Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P. M.; Zhu, Xin-Guang

    2016-01-01

    Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5′UTR, 3′UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5′UTR, 3′UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. PMID:27436282

  5. Sequence analysis of the oxidase/reductase genes upstream of the Rhodococcus erythropolis aldehyde dehydrogenase gene thcA reveals a gene organisation different from Mycobacterium tuberculosis.

    PubMed

    Nagy, I; De Mot, R

    1999-01-01

    The sequence of the DNA region upstream of the thiocarbamate-inducible aldehyde dehydrogenase gene thcA of Rhodococcus erythropolis NI86/21 was determined. Most of the predicted ORFs are related to various oxidases/reductases, including short-chain oxidases/reductases, GMC oxidoreductases, alpha-hydroxy acid oxidases (subfamily 1 flavin oxidases/dehydrogenases), and subfamily 2 flavin oxidases/dehydrogenases. One ORF is related to enzymes involved in biosynthesis of PQQ or molybdopterin cofactors. In addition, a putative member of the TetR family of regulatory proteins was identified. The substantial sequence divergence from functionally characterized enzymes precludes a reliable prediction about the probable function of these proteins at this stage. In Mycobacterium tuberculosis H37Rv, most of these ORFs have homologs that are also clustered in the genome, but some striking differences in gene organization were observed between Rhodococcus and Mycobacterium.

  6. Non-coding-regulatory regions of human brain genes delineated by bacterial artificial chromosome knock-in mice

    PubMed Central

    2013-01-01

    Background: The next big challenge in human genetics is understanding the 98% of the genome that comprises non-coding DNA. Hidden in this DNA are sequences critical for gene regulation, and new experimental strategies are needed to understand the functional role of gene-regulation sequences in health and disease. In this study, we build upon our HuGX ('high-throughput human genes on the X chromosome') strategy to expand our understanding of human gene regulation in vivo. Results: In all, ten human genes known to express in therapeutically important brain regions were chosen for study. For eight of these genes, human bacterial artificial chromosome clones were identified, retrofitted with a reporter, knocked single-copy into the Hprt locus in mouse embryonic stem cells, and mouse strains derived. Five of these human genes expressed in mouse, and all expressed in the adult brain region for which they were chosen. This defined the boundaries of the genomic DNA sufficient for brain expression, and refined our knowledge regarding the complexity of gene regulation. We also characterized for the first time the expression of human MAOA and NR2F2, two genes for which the mouse homologs have been extensively studied in the central nervous system (CNS), and AMOTL1 and NOV, for which roles in CNS have been unclear. Conclusions: We have demonstrated the use of the HuGX strategy to functionally delineate non-coding-regulatory regions of therapeutically important human brain genes. Our results also show that a careful investigation, using publicly available resources and bioinformatics, can lead to accurate predictions of gene expression. PMID:24124870

  7. Genetic Variation of Goat Interferon Regulatory Factor 3 Gene and Its Implication in Goat Evolution

    PubMed Central

    Shu, Liping; Zhang, Yesheng; Wang, Yangzi; Sanni, Timothy M.; Imumorin, Ikhide G.; Peters, Sunday O.; Zhang, Jiajin; Dong, Yang; Wang, Wen

    2016-01-01

    The immune systems are fundamentally vital for evolution and survival of species; as such, selection patterns in innate immune loci are of special interest in molecular evolutionary research. The interferon regulatory factor (IRF) gene family controls many different aspects of the innate and adaptive immune responses in vertebrates. Among these, IRF3 is known to take active part in very many biological processes. We assembled and evaluated 1356 base pairs of the IRF3 gene coding region in domesticated goats from Africa (Nigeria, Ethiopia and South Africa) and Asia (Iran and China) and the wild goat (Capra aegagrus). Five segregating sites with θ value of 0.0009 for this gene demonstrated a low diversity across the goat populations. Fu and Li tests were significantly positive but Tajima’s D test was significantly negative, suggesting its deviation from neutrality. Neighbor joining tree of IRF3 gene in domesticated goats, wild goat and sheep showed that all domesticated goats are more closely related to one another than to the wild goat and sheep. Maximum likelihood tree of the gene showed that different domesticated goats share a common ancestor and suggest single origin. Four unique haplotypes were observed across all the sequences, of which one was particularly common to African goats (MOCH-K14-0425, Poitou and WAD). In assessing the evolution mode of the gene, we found that the codon model dN/dS ratio for all goats was greater than one. Phylogenetic Analysis by Maximum Likelihood (PAML) gave a ω0 (dN/dS) value of 0.067 with LnL value of -6900.3 for the first Model (M1) while ω2 = 1.667 in model M2 with LnL value of -6900.3 with positive selection inferred in 3 codon sites. Mechanistic empirical combination (MEC) model for evaluating adaptive selection pressure on particular codons also confirmed adaptive selection pressure in three codons (207, 358 and 408) in IRF3 gene. Positive diversifying selection inferred with recent evolutionary changes in domesticated goat

  8. Sequence variation in the Tbx4 gene in marine mammals.

    PubMed

    Onbe, Kaori; Nishida, Shin; Sone, Emi; Kanda, Naohisa; Goto, Mutsuo; Pastene, Luis A; Tanabe, Shinsuke; Koike, Hiroko

    2007-05-01

    The amino-acid sequences of the T-domain region of the Tbx4 gene, which is required for hindlimb development, are 100% identical in humans and mice. Cetaceans have lost most of their hindlimb structure, although hindlimb buds are present in very early cetacean embryos. To examine whether the Tbx4 gene has the same function in cetaceans as in other mammals, we analyzed Tbx4 sequences from cetaceans, dugong, artiodactyls and marine carnivores. A total of 39 primers were designed using human and dog Tbx4 nucleotide sequences. Exons 3, 4, 5, 6, 7, and 8 of the Tbx4 genes from cetaceans, artiodactyls, and marine carnivores were sequenced. Non-synonymous substitution sites were detected in the T-domain regions from some cetacean species, but were not detected in those from artiodactyls, the dugong, or the carnivores. The C-terminal regions contained a number of non-synonymous substitutions. Although some indels were present, they were in groups of three nucleotides and therefore did not cause frame shifts. The dN/dS values for the T-domain and C-terminal regions of the cetacean and artiodactylous Tbx4 genes were much lower than 1, indicating that the Tbx4 gene maintains its function in cetaceans, although full expression leading to hindlimb development is suppressed.
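
    The observation that indels occurring in groups of three nucleotides do not cause frame shifts can be checked mechanically on an aligned coding sequence: every run of gap characters must have a length divisible by three. The sketch below is a minimal illustration of that check and not the authors' procedure; the function names and the toy alignments are hypothetical.

```python
import re


def gap_runs(aligned: str, gap: str = "-"):
    """Return (start, length) for each maximal run of gap characters."""
    pattern = re.escape(gap) + "+"
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, aligned)]


def preserves_frame(aligned: str, gap: str = "-") -> bool:
    """True if every indel (gap run) has a length that is a multiple of 3,
    so the downstream reading frame is unchanged."""
    return all(length % 3 == 0 for _, length in gap_runs(aligned, gap))


# Toy aligned coding sequences, invented for illustration only.
in_frame = preserves_frame("ATGGCC---AAATGA")    # one 3-nt deletion
frameshift = preserves_frame("ATGGC--CAAATGA")   # one 2-nt deletion
runs = gap_runs("ATGGCC---AAATGA")
```

    The first toy alignment has a single three-nucleotide gap and keeps the frame; the second has a two-nucleotide gap and would shift it.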

  9. A dual cis-regulatory code links IRF8 to constitutive and inducible gene expression in macrophages.

    PubMed

    Mancino, Alessandra; Termanini, Alberto; Barozzi, Iros; Ghisletti, Serena; Ostuni, Renato; Prosperini, Elena; Ozato, Keiko; Natoli, Gioacchino

    2015-02-15

    The transcription factor (TF) interferon regulatory factor 8 (IRF8) controls both developmental and inflammatory stimulus-inducible genes in macrophages, but the mechanisms underlying these two different functions are largely unknown. One possibility is that these different roles are linked to the ability of IRF8 to bind alternative DNA sequences. We found that IRF8 is recruited to distinct sets of DNA consensus sequences before and after lipopolysaccharide (LPS) stimulation. In resting cells, IRF8 was mainly bound to composite sites together with the master regulator of myeloid development PU.1. Basal IRF8-PU.1 binding maintained the expression of a broad panel of genes essential for macrophage functions (such as microbial recognition and response to purines) and contributed to basal expression of many LPS-inducible genes. After LPS stimulation, increased expression of IRF8, other IRFs, and AP-1 family TFs enabled IRF8 binding to thousands of additional regions containing low-affinity multimerized IRF sites and composite IRF-AP-1 sites, which were not premarked by PU.1 and did not contribute to the basal IRF8 cistrome. While constitutively expressed IRF8-dependent genes contained only sites mediating basal IRF8/PU.1 recruitment, inducible IRF8-dependent genes contained variable combinations of constitutive and inducible sites. Overall, these data show at the genome scale how the same TF can be linked to constitutive and inducible gene regulation via distinct combinations of alternative DNA-binding sites.

  10. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications

    PubMed Central

    Herzog, Michel; Maroteaux, Luc

    1986-01-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage. PMID:16578795

  11. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications.

    PubMed

    Herzog, M; Maroteaux, L

    1986-11-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage.

  12. Novel regulatory cascades controlling expression of nitrogen-fixation genes in Geobacter sulfurreducens.

    PubMed

    Ueki, Toshiyuki; Lovley, Derek R

    2010-11-01

    Geobacter species often play an important role in bioremediation of environments contaminated with metals or organics and show promise for harvesting electricity from waste organic matter in microbial fuel cells. The ability of Geobacter species to fix atmospheric nitrogen is an important metabolic feature for these applications. We identified novel regulatory cascades controlling nitrogen-fixation gene expression in Geobacter sulfurreducens. Unlike the regulatory mechanisms known in other nitrogen-fixing microorganisms, nitrogen-fixation gene regulation in G. sulfurreducens is controlled by two two-component His-Asp phosphorelay systems. One of these systems appears to be the master regulatory system that activates transcription of the majority of nitrogen-fixation genes and represses a gene encoding glutamate dehydrogenase during nitrogen fixation. The other system whose expression is directly activated by the master regulatory system appears to control by antitermination the expression of a subset of the nitrogen-fixation genes whose transcription is activated by the master regulatory system and whose promoter contains transcription termination signals. This study provides a new paradigm for nitrogen-fixation gene regulation.

  13. Novel regulatory cascades controlling expression of nitrogen-fixation genes in Geobacter sulfurreducens

    PubMed Central

    Ueki, Toshiyuki; Lovley, Derek R.

    2010-01-01

    Geobacter species often play an important role in bioremediation of environments contaminated with metals or organics and show promise for harvesting electricity from waste organic matter in microbial fuel cells. The ability of Geobacter species to fix atmospheric nitrogen is an important metabolic feature for these applications. We identified novel regulatory cascades controlling nitrogen-fixation gene expression in Geobacter sulfurreducens. Unlike the regulatory mechanisms known in other nitrogen-fixing microorganisms, nitrogen-fixation gene regulation in G. sulfurreducens is controlled by two two-component His–Asp phosphorelay systems. One of these systems appears to be the master regulatory system that activates transcription of the majority of nitrogen-fixation genes and represses a gene encoding glutamate dehydrogenase during nitrogen fixation. The other system whose expression is directly activated by the master regulatory system appears to control by antitermination the expression of a subset of the nitrogen-fixation genes whose transcription is activated by the master regulatory system and whose promoter contains transcription termination signals. This study provides a new paradigm for nitrogen-fixation gene regulation. PMID:20660485

  14. LmSmdB: an integrated database for metabolic and gene regulatory network in Leishmania major and Schistosoma mansoni.

    PubMed

    Patel, Priyanka; Mandlik, Vineetha; Singh, Shailza

    2016-03-01

    A database that integrates all the information required for biological processing is essential to be stored in one platform. We have attempted to create one such integrated database that can be a one stop shop for the essential features required to fetch valuable result. LmSmdB (L. major and S. mansoni database) is an integrated database that accounts for the biological networks and regulatory pathways computationally determined by integrating the knowledge of the genome sequences of the mentioned organisms. It is the first database of its kind that has together with the network designing showed the simulation pattern of the product. This database intends to create a comprehensive canopy for the regulation of lipid metabolism reaction in the para