DOE Office of Scientific and Technical Information (OSTI.GOV)
Hong, R. L., Hamaguchi, L., Busch, M. A., and Weigel, D.
2003-06-01
OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.
Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters
Santini, Simona; Boore, Jeffrey L.; Meyer, Axel
2003-12-31
Because functional regions in noncoding DNA tend to be highly conserved, comparisons of DNA sequences among evolutionarily distant genomes make it possible to identify them. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.
Banerjee, Poulabi; Bahlo, Melanie; Schwartz, Jody R.
2002-01-01
Genome-wide disease association analysis using SNPs is being explored as a method for dissecting complex genetic traits, and a vast number of SNPs have been generated for this purpose. Because genotyping large numbers of SNPs is limited by cost and throughput, and because running a large number of dependent tests on the same data set raises statistical issues, it has been proposed that SNPs should be prioritized based on likely functional importance to make association analysis practical. The most easily identifiable functional SNPs are coding SNPs (cSNPs), and accordingly cSNPs have been screened in a number of studies. SNPs in gene regulatory sequences embedded in noncoding DNA are another class of SNPs suggested for prioritization due to their predicted quantitative impact on gene expression. The main challenge in evaluating these SNPs, in contrast to cSNPs, is a lack of robust algorithms and databases for recognizing regulatory sequences in noncoding DNA. Approaches that have been previously used to delineate noncoding sequences with gene regulatory activity include cross-species sequence comparisons and the search for sequences recognized by transcription factors. We combined these two methods to sift through mouse and human genomic sequences to identify putative gene regulatory elements and subsequently localized SNPs within these sequences in a 1 Megabase (Mb) region of human chromosome 5q31, orthologous to mouse chromosome 11, containing the Interleukin cluster.
van der Meulen, Sjoerd B; de Jong, Anne; Kok, Jan
2016-01-01
RNA sequencing has revolutionized genome-wide transcriptome analyses, and the identification of non-coding regulatory RNAs in bacteria has thus increased concurrently. Here we reveal the transcriptome map of the lactic acid bacterial paradigm Lactococcus lactis MG1363 by employing differential RNA sequencing (dRNA-seq) and a combination of manual and automated transcriptome mining. This resulted in a high-resolution genome annotation of L. lactis and the identification of 60 cis-encoded antisense RNAs (asRNAs), 186 trans-encoded putative regulatory RNAs (sRNAs) and 134 novel small ORFs. Based on the putative targets of asRNAs, a novel classification is proposed. Several transcription factor DNA binding motifs were identified in the promoter sequences of (a)sRNAs, providing insight in the interplay between lactococcal regulatory RNAs and transcription factors. The presence and lengths of 14 putative sRNAs were experimentally confirmed by differential Northern hybridization, including the abundant RNA 6S that is differentially expressed depending on the available carbon source. For another sRNA, LLMGnc_147, functional analysis revealed that it is involved in carbon uptake and metabolism. L. lactis contains 13% leaderless mRNAs (lmRNAs) that, from an analysis of overrepresentation in GO classes, seem predominantly involved in nucleotide metabolism and DNA/RNA binding. Moreover, an A-rich sequence motif immediately following the start codon was uncovered, which could provide novel insight in the translation of lmRNAs. Altogether, this first experimental genome-wide assessment of the transcriptome landscape of L. lactis and subsequent sRNA studies provide an extensive basis for the investigation of regulatory RNAs in L. lactis and related lactococcal species.
Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong
2016-01-01
Accelerated evolution of regulatory sequences can alter the expression pattern of target genes and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes. PMID:27401230
Vandesteene, Lies; Ramon, Matthew; Le Roy, Katrien; Van Dijck, Patrick; Rolland, Filip
2010-03-01
Higher plants typically do not produce trehalose in large amounts, but their genome sequences reveal large families of putative trehalose metabolism enzymes. An important regulatory role in plant growth and development is also emerging for the metabolic intermediate trehalose-6-P (T6P). Here, we present an update on Arabidopsis trehalose metabolism and a resource for further detailed analyses. In addition, we provide evidence that Arabidopsis encodes a single trehalose-6-P synthase (TPS) next to a family of catalytically inactive TPS-like proteins that might fulfill specific regulatory functions in actively growing tissues.
RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.
Brule, C E; Dean, K M; Grayhack, E J
2016-01-01
The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. © 2016 Elsevier Inc. All rights reserved.
Bai, Wen L; Zhao, Su J; Wang, Ze Y; Zhu, Yu B; Dang, Yun L; Cong, Yu Y; Xue, Hui L; Wang, Wei; Deng, Liang; Guo, Dan; Wang, Shi Q; Zhu, Yan X; Yin, Rong H
2018-07-03
Long noncoding RNAs (lncRNAs) are a novel class of eukaryotic transcripts. They are thought to act as critical regulators of protein-coding gene expression. Herein, we identified and characterized 13 putative lncRNAs from the expressed sequence tags from the secondary hair follicle of Cashmere goat. Furthermore, we investigated their transcriptional pattern in the secondary hair follicle of Liaoning Cashmere goat during telogen and anagen phases. We also generated intracellular regulatory networks of the lncRNAs upregulated at anagen in the Wnt signaling pathway based on bioinformatics analysis. The relative expression of six putative lncRNAs (lncRNA-599618, -599556, -599554, -599547, -599531, and -599509) at the anagen phase is significantly higher than that at telogen. Compared with anagen, the relative expression of four putative lncRNAs (lncRNA-599528, -599518, -599511, and -599497) was found to be significantly upregulated at telogen phase. The network generated showed a rich and complex regulatory relationship between the putative lncRNAs and related miRNAs and their target genes in the Wnt signaling pathway. Our results provide a foundation for further elucidating the functional and regulatory mechanisms of these putative lncRNAs in the development of the secondary hair follicle and cashmere fiber growth of Cashmere goat.
Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong
2016-10-01
Accelerated evolution of regulatory sequences can alter the expression pattern of target genes and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Zhang, Xiaodong; Allan, Andrew C.; Li, Caixia; Wang, Yuanzhong; Yao, Qiuyang
2015-01-01
Gentiana rigescens is an important medicinal herb in China. The main validated medicinal component gentiopicroside is synthesized in shoots, but is mainly found in the plant’s roots. The gentiopicroside biosynthetic pathway and its regulatory control remain to be elucidated. Genome resources of gentian are limited. Next-generation sequencing (NGS) technologies can aid in supplying global gene expression profiles. In this study we present sequence and transcript abundance data for the root and leaf transcriptome of G. rigescens, obtained using the Illumina Hiseq2000. Over fifty million clean reads were obtained from leaf and root libraries. This yields 76,717 unigenes with an average length of 753 bp. Among these, 33,855 unigenes were identified as putative homologs of annotated sequences in public protein and nucleotide databases. Digital abundance analysis identified 3306 unigenes differentially enriched between leaf and root. Unigenes found in both tissues were categorized according to their putative functional categories. Of the differentially expressed genes, over 130 were annotated as related to terpenoid biosynthesis. This work is the first study of global transcriptome analyses in gentian. These sequences and putative functional data comprise a resource for future investigation of terpenoid biosynthesis in Gentianaceae species and annotation of the gentiopicroside biosynthetic pathway and its regulatory mechanisms. PMID:26006235
Genomic dissection of conserved transcriptional regulation in intestinal epithelial cells
Camp, J. Gray; Weiser, Matthew; Cocchiaro, Jordan L.; Kingsley, David M.; Furey, Terrence S.; Sheikh, Shehzad Z.; Rawls, John F.
2017-01-01
The intestinal epithelium serves critical physiologic functions that are shared among all vertebrates. However, it is unknown how the transcriptional regulatory mechanisms underlying these functions have changed over the course of vertebrate evolution. We generated genome-wide mRNA and accessible chromatin data from adult intestinal epithelial cells (IECs) in zebrafish, stickleback, mouse, and human species to determine if conserved IEC functions are achieved through common transcriptional regulation. We found evidence for substantial common regulation and conservation of gene expression regionally along the length of the intestine from fish to mammals and identified a core set of genes comprising a vertebrate IEC signature. We also identified transcriptional start sites and other putative regulatory regions that are differentially accessible in IECs in all 4 species. Although these sites rarely showed sequence conservation from fish to mammals, surprisingly, they drove highly conserved IEC expression in a zebrafish reporter assay. Common putative transcription factor binding sites (TFBS) found at these sites in multiple species indicate that sequence conservation alone is insufficient to identify much of the functionally conserved IEC regulatory information. Among the rare, highly sequence-conserved, IEC-specific regulatory regions, we discovered an ancient enhancer upstream from her6/HES1 that is active in a distinct population of Notch-positive cells in the intestinal epithelium. Together, these results show how combining accessible chromatin and mRNA datasets with TFBS prediction and in vivo reporter assays can reveal tissue-specific regulatory information conserved across 420 million years of vertebrate evolution. We define an IEC transcriptional regulatory network that is shared between fish and mammals and establish an experimental platform for studying how evolutionarily distilled regulatory information commonly controls IEC development and physiology. 
PMID:28850571
Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya
2015-01-01
Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930
Turatsinze, Jean-Valery; Thomas-Chollier, Morgane; Defrance, Matthieu; van Helden, Jacques
2008-01-01
This protocol shows how to detect putative cis-regulatory elements and regions enriched in such elements with the regulatory sequence analysis tools (RSAT) web server (http://rsat.ulb.ac.be/rsat/). The approach applies to known transcription factors, whose binding specificity is represented by position-specific scoring matrices, using the program matrix-scan. The detection of individual binding sites is known to return many false predictions. However, results can be strongly improved by estimating P value, and by searching for combinations of sites (homotypic and heterotypic models). We illustrate the detection of sites and enriched regions with a study case, the upstream sequence of the Drosophila melanogaster gene even-skipped. This protocol is also tested on random control sequences to evaluate the reliability of the predictions. Each task requires a few minutes of computation time on the server. The complete protocol can be executed in about one hour.
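The matrix-based site detection described in the protocol above can be illustrated with a minimal Python sketch. This is not the RSAT `matrix-scan` program: the function names, the pseudocount, the uniform background and the plain score threshold are all assumptions chosen for illustration, and the P-value estimation that the protocol recommends is not implemented here.

```python
import math


def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into a log2-odds scoring matrix.

    counts: list of dicts, one per motif position, e.g. {"A": 10, "C": 0, ...}.
    A uniform background frequency is assumed (hypothetical simplification).
    """
    matrix = []
    for column in counts:
        total = sum(column.get(base, 0) for base in "ACGT") + 4 * pseudocount
        matrix.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Score every window on the forward strand; return hits at or above threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous symbols
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Toy count matrix for the word TATA, built from 10 hypothetical sites.
toy_counts = [
    {"T": 10}, {"A": 10}, {"T": 10}, {"A": 10},
]
toy_matrix = log_odds_matrix(toy_counts)
toy_hits = scan("GGTATAGG", toy_matrix, threshold=5.0)
```

A fixed score threshold like this one is the part most likely to produce the false predictions the abstract warns about; replacing it with a P value derived from a background model is the improvement the protocol describes.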
Kaplan, Oktay I; Berber, Burak; Hekim, Nezih; Doluca, Osman
2016-11-02
Many studies show that short non-coding sequences are widely conserved among regulatory elements. More and more conserved sequences are being discovered since the development of next-generation sequencing technology. A common approach to identify conserved sequences with regulatory roles relies on topological changes such as hairpin formation at the DNA or RNA level. G-quadruplexes, non-canonical nucleic acid topologies with few established biological roles, are increasingly considered for conserved regulatory element discovery. Since the tertiary structure of G-quadruplexes is strongly dependent on the loop sequence, which is disregarded by the generally accepted algorithm, we hypothesized that G-quadruplexes with similar topology and, indirectly, similar interaction patterns, can be determined using phylogenetic clustering based on differences in the loop sequences. Phylogenetic analysis of 52 G-quadruplex forming sequences in the Escherichia coli genome revealed two conserved G-quadruplex motifs with a potential regulatory role. Further analysis revealed that both motifs tend to form hairpins and G-quadruplexes, as supported by circular dichroism studies. The phylogenetic analysis as described in this work can greatly improve the discovery of functional G-quadruplex structures and may explain unknown regulatory patterns. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
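To make the role of the loop sequences concrete, the following Python sketch finds putative G-quadruplex forming sequences and returns their three loops, which is the information a loop-based clustering would start from. The abstract does not name the search pattern; the widely used form of four runs of at least three guanines separated by loops of 1 to 7 nucleotides is assumed here, and the function name is hypothetical. Only the forward strand is searched.

```python
import re

# Assumed pattern: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (loops captured as groups).
G4_PATTERN = re.compile(
    r"G{3,}([ACGT]{1,7}?)G{3,}([ACGT]{1,7}?)G{3,}([ACGT]{1,7}?)G{3,}"
)


def find_g4_loops(sequence):
    """Return (start, (loop1, loop2, loop3)) for each non-overlapping match."""
    sequence = sequence.upper()
    return [(m.start(), m.groups()) for m in G4_PATTERN.finditer(sequence)]


example = find_g4_loops("ATGGGAGGGTTGGGCGGGAT")
```

Two sequences that both match the pattern can differ in every loop, which is why a method that compares loops can separate them even though the pattern itself treats them as equivalent.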
Cis-regulatory landscapes of four cell types of the retina
Hartl, Dominik; Jüttner, Josephine
2017-01-01
The retina is composed of ∼50 cell-types with specific functions for the process of vision. Identification of the cis-regulatory elements active in retinal cell-types is key to elucidate the networks controlling this diversity. Here, we combined transcriptome and epigenome profiling to map the regulatory landscape of four cell-types isolated from mouse retinas including rod and cone photoreceptors as well as rare inter-neuron populations such as horizontal and starburst amacrine cells. Integration of this information reveals sequence determinants and candidate transcription factors for controlling cellular specialization. Additionally, we refined parallel reporter assays to enable studying the transcriptional activity of large collections of sequences in individual cell-types isolated from a tissue. We provide proof of concept for this approach and its scalability by characterizing the transcriptional capacity of several hundred putative regulatory sequences within individual retinal cell-types. This generates a catalogue of cis-regulatory regions active in retinal cell types and we further demonstrate their utility as a potential resource for cellular tagging and manipulation. PMID:29059322
Karakülah, Gökhan
2017-06-28
Novel transcript discovery through RNA sequencing has substantially improved our understanding of the transcriptome dynamics of biological systems. Endogenous target mimicry (eTM) transcripts, a novel class of regulatory molecules, bind to their target microRNAs (miRNAs) by base pairing and block their biological activity. The objective of this study was to provide a computational analysis framework for the prediction of putative eTM sequences in plants, and as an example, to discover previously un-annotated eTMs in the Prunus persica (peach) transcriptome. To this end, two public peach transcriptome libraries downloaded from the Sequence Read Archive (SRA) and a previously published set of long non-coding RNAs (lncRNAs) were investigated with a multi-step analysis pipeline, and 44 putative eTMs were found. Additionally, an eTM-miRNA-mRNA regulatory network module associated with peach fruit organ development was built via integration of the miRNA target information and predicted eTM-miRNA interactions. My findings suggest that one of the most widely expressed miRNA families among diverse plant species, miR156, might be sponged by seven putative eTMs. In addition, the study indicates that eTMs potentially play roles in the regulation of developmental processes in peach fruit by targeting specific miRNAs. In conclusion, by following the step-by-step instructions provided in this study, novel eTMs can be identified and annotated effectively in public plant transcriptome libraries.
Patel, Hardip; Forêt, Sylvain; Karlsen, Bård Ove; Jørgensen, Tor Erik; Hall-Spencer, Jason M
2018-01-01
Cnidarians harbor a variety of small regulatory RNAs that include microRNAs (miRNAs) and PIWI-interacting RNAs (piRNAs), but detailed information is limited. Here, we report the identification and expression of novel miRNAs and putative piRNAs, as well as their genomic loci, in the symbiotic sea anemone Anemonia viridis. We generated a draft assembly of the A. viridis genome with putative size of 313 Mb that appeared to be composed of about 36% repeats, including known transposable elements. We detected approximately equal fractions of DNA transposons and retrotransposons. Deep sequencing of small RNA libraries constructed from A. viridis adults sampled at a natural CO2 gradient off Vulcano Island, Italy, identified 70 distinct miRNAs. Eight were homologous to previously reported miRNAs in cnidarians, whereas 62 appeared novel. Nine miRNAs were recognized as differentially expressed along the natural seawater pH gradient. We found a highly abundant and diverse population of piRNAs, with a substantial fraction showing ping–pong signatures. We identified nearly 22% putative piRNAs potentially targeting transposable elements within the A. viridis genome. The A. viridis genome appeared similar in size to that of other hexacorals with a very high divergence of transposable elements resembling that of the sea anemone genus Exaiptasia. The genome encodes and expresses a high number of small regulatory RNAs, which include novel miRNAs and piRNAs. Differentially expressed small RNAs along the seawater pH gradient indicated regulatory gene responses to environmental stressors. PMID:29385567
Antonini, S R; N'Diaye, N; Baldacchino, V; Hamet, P; Tremblay, J; Lacroix, A
2004-07-01
Gastric inhibitory polypeptide (GIP)-dependent Cushing's syndrome (CS) results from the ectopic expression of non-mutated GIP receptor (hGIPR) in the adrenal cortex. We evaluated whether mutations or polymorphisms in the regulatory region of the GIPR gene could lead to this aberrant expression. We studied 9.0 kb upstream and 1.3 kb downstream of the GIPR gene putative promoter (pProm) by sequencing leukocyte DNA from controls and from adrenal tissues of GIP- and non-GIP-dependent CS patients. The putative proximal promoter region (800 bp) and the first exon and intron of the hGIPR gene were sequenced on adrenal DNA from nine GIP-dependent CS, as well as on leukocyte DNA of nine normal controls. Three variations found in this region were found in all patients and controls; at position -4/-5, an insertion of a T was seen in four out of nine patients and in five out of nine controls. Transient transfection studies conducted in rat GC and mouse Y1 cells showed that the TT allele confers a 40% loss in promoter activity. The analysis of the 8-kb distal pProm region revealed eight distal single nucleotide polymorphisms (SNPs) without probable association with the disease, since frequencies in patients and controls were very similar. In conclusion, mutations or SNPs in the regulatory region of the GIPR gene are unlikely to underlie GIP-dependent CS. Copyright 2004 Elsevier Ltd.
Transcriptional regulation of podoplanin expression by Prox1 in lymphatic endothelial cells.
Pan, Yanfang; Wang, Wen-di; Yago, Tadayuki
2014-07-01
Transcription factor prospero homeobox 1 (Prox-1) and podoplanin (PDPN), a mucin-type transmembrane protein, are both constantly expressed in lymphatic endothelial cells (LECs) and appear to function in an LEC-autonomous manner. Mice globally lacking PDPN (Pdpn(-/-)) develop abnormal and blood-filled lymphatic vessels that highly resemble those in inducible mice lacking Prox-1 (Prox1(-/-)). Prox1 has also been reported to induce PDPN expression in cultured ECs. Thus, we hypothesize that PDPN functions downstream of Prox1 and that its expression is regulated by Prox1 in LECs at the transcriptional level. We first identified four putative binding elements for Prox1 in the 5' upstream regulatory region of the Pdpn gene and found, by chromatin immunoprecipitation assay, that Prox1 directly binds to the 5' regulatory sequence of the Pdpn gene in LECs. DNA pull-down assay confirmed that Prox1 binds to the putative binding element. In addition, luciferase reporter assay indicated that Prox1 binding to the 5' regulatory sequence of Pdpn regulates Pdpn gene expression. We are therefore the first to experimentally demonstrate that Prox1 regulates PDPN expression at the transcriptional level in the lymphatic vascular system. Copyright © 2014 Elsevier Inc. All rights reserved.
Cis-regulatory landscapes of four cell types of the retina.
Hartl, Dominik; Krebs, Arnaud R; Jüttner, Josephine; Roska, Botond; Schübeler, Dirk
2017-11-16
The retina is composed of ∼50 cell-types with specific functions for the process of vision. Identification of the cis-regulatory elements active in retinal cell-types is key to elucidate the networks controlling this diversity. Here, we combined transcriptome and epigenome profiling to map the regulatory landscape of four cell-types isolated from mouse retinas including rod and cone photoreceptors as well as rare inter-neuron populations such as horizontal and starburst amacrine cells. Integration of this information reveals sequence determinants and candidate transcription factors for controlling cellular specialization. Additionally, we refined parallel reporter assays to enable studying the transcriptional activity of large collections of sequences in individual cell-types isolated from a tissue. We provide proof of concept for this approach and its scalability by characterizing the transcriptional capacity of several hundred putative regulatory sequences within individual retinal cell-types. This generates a catalogue of cis-regulatory regions active in retinal cell types and we further demonstrate their utility as a potential resource for cellular tagging and manipulation. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Perkins, J B; Bower, S; Howitt, C L; Yocum, R R; Pero, J
1996-01-01
Northern (RNA) blot analysis of the Bacillus subtilis biotin operon, bioWAFDBIorf2, detected at least two steady-state polycistronic transcripts initiated from a putative vegetative (Pbio) promoter that precedes the operon, i.e., a full-length 7.2-kb transcript covering the entire operon and a more abundant 5.1-kb transcript covering just the first five genes of the operon. Biotin and the B. subtilis birA gene product regulated synthesis of the transcripts. Moreover, replacing the putative Pbio promoter and regulatory sequence with a constitutive SP01 phage promoter resulted in higher-level constitutive synthesis. Removal of a rho-independent terminator-like sequence located between the fifth (bioB) and sixth (bioI) genes prevented accumulation of the 5.1-kb transcript, suggesting that the putative terminator functions to limit expression of bioI, which is thought to be involved in an early step in biotin synthesis. PMID:8892842
Perkins, J B; Bower, S; Howitt, C L; Yocum, R R; Pero, J
1996-11-01
Northern (RNA) blot analysis of the Bacillus subtilis biotin operon, bioWAFDBIorf2, detected at least two steady-state polycistronic transcripts initiated from a putative vegetative (Pbio) promoter that precedes the operon, i.e., a full-length 7.2-kb transcript covering the entire operon and a more abundant 5.1-kb transcript covering just the first five genes of the operon. Biotin and the B. subtilis birA gene product regulated synthesis of the transcripts. Moreover, replacing the putative Pbio promoter and regulatory sequence with a constitutive SP01 phage promoter resulted in higher-level constitutive synthesis. Removal of a rho-independent terminator-like sequence located between the fifth (bioB) and sixth (bioI) genes prevented accumulation of the 5.1-kb transcript, suggesting that the putative terminator functions to limit expression of bioI, which is thought to be involved in an early step in biotin synthesis.
Yusuf, Noor Hydayaty Md; Ong, Wen Dee; Redwan, Raimi Mohamed; Latip, Mariam Abd; Kumar, S Vijay
2015-10-15
MicroRNAs (miRNAs) are a class of small, endogenous non-coding RNAs that negatively regulate gene expression, resulting in the silencing of target mRNA transcripts through mRNA cleavage or translational inhibition. MiRNAs play significant roles in various biological and physiological processes in plants. However, the miRNA-mediated gene regulatory network in pineapple, the model tropical non-climacteric fruit, remains largely unexplored. Here, we report a complete list of pineapple mature miRNAs obtained from high-throughput small RNA sequencing and precursor miRNAs (pre-miRNAs) obtained from ESTs. Two small RNA libraries were constructed from pineapple fruits and leaves, respectively, using Illumina's Solexa technology. Sequence similarity analysis using miRBase revealed 579,179 reads homologous to 153 miRNAs from 41 miRNA families. In addition, a pineapple fruit transcriptome library consisting of approximately 30,000 EST contigs constructed using Solexa sequencing was used for the discovery of pre-miRNAs. In all, four pre-miRNAs were identified (MIR156, MIR399, MIR444 and MIR2673). Furthermore, the same pineapple transcriptome was used to dissect the function of the miRNAs in pineapple by predicting their putative targets in conjunction with their regulatory networks. In total, 23 metabolic pathways were found to be regulated by miRNAs in pineapple. The use of high-throughput sequencing in pineapples to unveil the presence of miRNAs and their regulatory pathways provides insight into the repertoire of miRNA regulation used exclusively in this non-climacteric model plant. Copyright © 2015 Elsevier B.V. All rights reserved.
BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements
De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan
2015-01-01
Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254488
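The branch length score mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the score is the total branch length of the smallest subtree connecting the species in which a motif is found, divided by the total branch length of the tree; the normalisation, the toy tree, its branch lengths and the function names are all hypothetical and are not taken from BLSSpeller itself.

```python
# Hypothetical toy phylogeny (not from the paper): child -> (parent, branch length).
TREE = {
    "anc1": ("root", 0.10),
    "osa": ("anc1", 0.20),
    "bdi": ("anc1", 0.30),
    "anc2": ("root", 0.15),
    "zma": ("anc2", 0.10),
    "sbi": ("anc2", 0.12),
}


def edges_to_root(leaf):
    """Return the edges (named by their child node) on the path from leaf to root."""
    edges = set()
    node = leaf
    while node in TREE:
        edges.add(node)
        node = TREE[node][0]
    return edges


def branch_length_score(species):
    """Fraction of total tree length spanned by the subtree connecting `species`."""
    paths = [edges_to_root(s) for s in species]
    if not paths:
        return 0.0
    # Edges shared by every path lie above the common ancestor and are excluded.
    connecting = set.union(*paths) - set.intersection(*paths)
    total = sum(length for _, length in TREE.values())
    return sum(TREE[edge][1] for edge in connecting) / total


if __name__ == "__main__":
    # A motif found in two distantly related species scores higher than
    # one found only in two close relatives.
    print(branch_length_score(["osa", "zma"]))
    print(branch_length_score(["osa", "bdi"]))
```

In this sketch a motif present in a single species scores zero, and one present in all four species scores one, which matches the intuition that conservation across longer evolutionary distances is stronger evidence of function.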
USDA-ARS's Scientific Manuscript database
A transient in vivo P element excision assay was used to test the regulatory properties of putative repressor-encoding plasmids in Drosophila melanogaster embryos. The somatic expression of an unmodified transposase transcription unit under the control of a heat shock gene promoter (phsn) effectivel...
Fang, Weiguo; Leng, Bo; Xiao, Yuehua; Jin, Kai; Ma, Jincheng; Fan, Yanhua; Feng, Jing; Yang, Xingyong; Zhang, Yongjun; Pei, Yan
2005-01-01
Entomopathogenic fungi can produce a series of chitinases, some of which act synergistically with proteases to degrade insect cuticle. However, chitinase involvement in insect fungus pathogenesis has not been fully characterized. In this paper, an endochitinase, Bbchit1, was purified to homogeneity from liquid cultures of Beauveria bassiana grown in a medium containing colloidal chitin. Bbchit1 had a molecular mass of about 33 kDa and pI of 5.4. Based on the N-terminal amino acid sequence, the chitinase gene, Bbchit1, and its upstream regulatory sequence were cloned. Bbchit1 was intronless, and there was a single copy in B. bassiana. Its regulatory sequence contained putative CreA/Cre1 carbon catabolic repressor binding domains, which was consistent with glucose suppression of Bbchit1. At the amino acid level, Bbchit1 showed significant similarity to a Streptomyces avermitilis putative endochitinase, a Streptomyces coelicolor putative chitinase, and Trichoderma harzianum endochitinase Chit36Y. However, Bbchit1 had very low levels of identity to other chitinase genes previously isolated from entomopathogenic fungi, indicating that Bbchit1 was a novel chitinase gene from an insect-pathogenic fungus. A gpd-Bbchit1 construct, in which Bbchit1 was driven by the Aspergillus nidulans constitutive promoter, was transformed into the genome of B. bassiana, and three transformants that overproduced Bbchit1 were obtained. Insect bioassays revealed that overproduction of Bbchit1 enhanced the virulence of B. bassiana for aphids, as indicated by significantly lower 50% lethal concentrations and 50% lethal times of the transformants compared to the values for the wild-type strain. PMID:15640210
Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.
2005-01-01
We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085
Parallel evolution of chordate cis-regulatory code for development.
Doglio, Laura; Goode, Debbie K; Pelleri, Maria C; Pauls, Stefan; Frabetti, Flavia; Shimeld, Sebastian M; Vavouri, Tanya; Elgar, Greg
2013-11-01
Urochordates are the closest relatives of vertebrates and at the larval stage, possess a characteristic bilateral chordate body plan. In vertebrates, the genes that orchestrate embryonic patterning are in part regulated by highly conserved non-coding elements (CNEs), yet these elements have not been identified in urochordate genomes. Consequently the evolution of the cis-regulatory code for urochordate development remains largely uncharacterised. Here, we use genome-wide comparisons between C. intestinalis and C. savignyi to identify putative urochordate cis-regulatory sequences. Ciona conserved non-coding elements (ciCNEs) are associated with largely the same key regulatory genes as vertebrate CNEs. Furthermore, some of the tested ciCNEs are able to activate reporter gene expression in both zebrafish and Ciona embryos, in a pattern that at least partially overlaps that of the gene they associate with, despite the absence of sequence identity. We also show that the ability of a ciCNE to up-regulate gene expression in vertebrate embryos can in some cases be localised to short sub-sequences, suggesting that functional cross-talk may be defined by small regions of ancestral regulatory logic, although functional sub-sequences may also be dispersed across the whole element. We conclude that the structure and organisation of cis-regulatory modules is very different between vertebrates and urochordates, reflecting their separate evolutionary histories. However, functional cross-talk still exists because the same repertoire of transcription factors has likely guided their parallel evolution, exploiting similar sets of binding sites but in different combinations.
Fungal Genes in Context: Genome Architecture Reflects Regulatory Complexity and Function
Noble, Luke M.; Andrianopoulos, Alex
2013-01-01
Gene context determines gene expression, with local chromosomal environment most influential. Comparative genomic analysis is often limited in scope to conserved or divergent gene and protein families, and fungi are well suited to this approach with low functional redundancy and relatively streamlined genomes. We show here that one aspect of gene context, the amount of potential upstream regulatory sequence maintained through evolution, is highly predictive of both molecular function and biological process in diverse fungi. Orthologs with large upstream intergenic regions (UIRs) are strongly enriched in information processing functions, such as signal transduction and sequence-specific DNA binding, and, in the genus Aspergillus, include the majority of experimentally studied, high-level developmental and metabolic transcriptional regulators. Many uncharacterized genes are also present in this class and, by implication, may be of similar importance. Large intergenic regions also share two novel sequence characteristics, currently of unknown significance: they are enriched for plus-strand polypyrimidine tracts and an information-rich, putative regulatory motif that was present in the last common ancestor of the Pezizomycotina. Systematic consideration of gene UIR in comparative genomics, particularly for poorly characterized species, could help reveal organisms’ regulatory priorities. PMID:23699226
Bhatia, Shipra; Gordon, Christopher T.; Foster, Robert G.; Melin, Lucie; Abadie, Véronique; Baujat, Geneviève; Vazquez, Marie-Paule; Amiel, Jeanne; Lyonnet, Stanislas; van Heyningen, Veronica; Kleinjan, Dirk A.
2015-01-01
Disruption of gene regulation by sequence variation in non-coding regions of the genome is now recognised as a significant cause of human disease and disease susceptibility. Sequence variants in cis-regulatory elements (CREs), the primary determinants of spatio-temporal gene regulation, can alter transcription factor binding sites. While technological advances have led to easy identification of disease-associated CRE variants, robust methods for discerning functional CRE variants from background variation are lacking. Here we describe an efficient dual-colour reporter transgenesis approach in zebrafish, simultaneously allowing detailed in vivo comparison of spatio-temporal differences in regulatory activity between putative CRE variants and assessment of altered transcription factor binding potential of the variant. We validate the method on known disease-associated elements regulating SHH, PAX6 and IRF6 and subsequently characterise novel, ultra-long-range SOX9 enhancers implicated in the craniofacial abnormality Pierre Robin Sequence. The method provides a highly cost-effective, fast and robust approach for simultaneously unravelling in a single assay whether, where and when in embryonic development a disease-associated CRE-variant is affecting its regulatory function. PMID:26030420
Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.
Borodovsky, M; Rudd, K E; Koonin, E V
1994-01-01
The unannotated regions of the Escherichia coli genome DNA sequence from the EcoSeq6 database, totaling 1,278 'intergenic' sequences of the combined length of 359,279 basepairs, were analyzed using computer-assisted methods with the aim of identifying putative unknown genes. The proposed strategy for finding new genes includes two key elements: i) prediction of expressed open reading frames (ORFs) using the GeneMark method based on Markov chain models for coding and non-coding regions of Escherichia coli DNA, and ii) search for protein sequence similarities using programs based on the BLAST algorithm and programs for motif identification. A total of 354 putative expressed ORFs were predicted by GeneMark. Using the BLASTX and TBLASTN programs, it was shown that 208 ORFs located in the unannotated regions of the E. coli chromosome are significantly similar to other protein sequences. Identification of 182 ORFs as probable genes was supported by GeneMark and BLAST, comprising 51.4% of the GeneMark 'hits' and 87.5% of the BLAST 'hits'. 73 putative new genes, comprising 20.6% of the GeneMark predictions, belong to ancient conserved protein families that include both eubacterial and eukaryotic members. This value is close to the overall proportion of highly conserved sequences among eubacterial proteins, indicating that the majority of the putative expressed ORFs that are predicted by GeneMark, but have no significant BLAST hits, nevertheless are likely to be real genes. The majority of the putative genes identified by BLAST search have been described since the release of the EcoSeq6 database, but about 70 genes have not been detected so far. Among these new identifications are genes encoding proteins with a variety of predicted functions including dehydrogenases, kinases, several other metabolic enzymes, ATPases, rRNA methyltransferases, membrane proteins, and different types of regulatory proteins. PMID:7984428
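The percentages quoted in the abstract above follow directly from the reported counts. The short check below recomputes them; the variable and function names are illustrative only, and the counts are those stated in the abstract.

```python
# Counts as reported in the abstract.
GENEMARK_ORFS = 354   # putative expressed ORFs predicted by GeneMark
BLAST_ORFS = 208      # ORFs with significant protein similarity
SUPPORTED_BY_BOTH = 182
ANCIENT_CONSERVED = 73


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    print(percent(SUPPORTED_BY_BOTH, GENEMARK_ORFS))   # share of GeneMark hits
    print(percent(SUPPORTED_BY_BOTH, BLAST_ORFS))      # share of BLAST hits
    print(percent(ANCIENT_CONSERVED, GENEMARK_ORFS))   # ancient conserved families
```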
Ashworth, Justin; Plaisier, Christopher L.; Lo, Fang Yin; Reiss, David J.; Baliga, Nitin S.
2014-01-01
Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer. PMID:25255272
Tenebrio molitor antifreeze protein gene identification and regulation.
Qin, Wensheng; Walker, Virginia K
2006-02-15
The yellow mealworm, Tenebrio molitor, is a freeze susceptible, stored product pest. Its winter survival is facilitated by the accumulation of antifreeze proteins (AFPs), encoded by a small gene family. We have now isolated 11 different AFP genomic clones from 3 genomic libraries. All the clones had a single coding sequence, with no evidence of intervening sequences. Three genomic clones were further characterized. All have putative TATA box sequences upstream of the coding regions and multiple potential poly(A) signal sequences downstream of the coding regions. A TmAFP regulatory region, B1037, conferred transcriptional activity when ligated to a luciferase reporter sequence and after transfection into an insect cell line. A 143 bp core promoter including a TATA box sequence was identified. Its promoter activity was increased 4.4 times by inserting an exotic 245 bp intron into the construct, similar to the enhancement of transgenic expression seen in several other systems. The addition of a duplication of the first 120 bp sequence from the 143 bp core promoter decreased promoter activity by half. Although putative hormonal response sequences were identified, none of the five hormones tested enhanced reporter activity. These studies on the mechanisms of AFP transcriptional control are important for the consideration of any transfer of freeze-resistance phenotypes to beneficial hosts.
Qin, Jin-Hong; Zhang, Qing; Zhang, Zhi-Ming; Zhong, Yi; Yang, Yang; Hu, Bao-Yu; Zhao, Guo-Ping; Guo, Xiao-Kui
2008-06-01
DNA microarray analysis was used to compare the differential gene expression profiles between Leptospira interrogans serovar Lai type strain 56601 and its corresponding attenuated strain IPAV. A 22-kb genomic island covering a cluster of 34 genes (i.e., genes LA0186 to LA0219) was actively expressed in both strains but concomitantly upregulated in strain 56601 in contrast to that of IPAV. Reverse transcription-PCR assays proved that the gene cluster comprised five transcripts. Gene annotation of this cluster revealed characteristics of a putative prophage-like remnant with at least 8 of 34 sequences encoding prophage-like proteins, of which the LA0195 protein is probably a putative prophage CI-like regulator. The transcription initiation activities of putative promoter-regulatory sequences of transcripts I, II, and III, all proximal to the LA0195 gene, were further analyzed in the Escherichia coli promoter probe vector pKK232-8 by assaying the reporter chloramphenicol acetyltransferase (CAT) activities. The strong promoter activities of both transcripts I and II indicated by the E. coli CAT assay were well correlated with the in vitro sequence-specific binding of the recombinant LA0195 protein to the corresponding promoter probes detected by the electrophoretic mobility shift assay. On the other hand, the promoter activity of transcript III was very low in E. coli and failed to show active binding to the LA0195 protein in vitro. These results suggested that the LA0195 protein is likely involved in the transcription of transcripts I and II. However, the identical complete DNA sequences of this prophage remnant from these two strains strongly suggest that possible regulatory factors or signal transduction systems residing outside of this region within the genome may be responsible for the differential expression profiling in these two strains.
Prevalence of transcription promoters within archaeal operons and coding sequences
Koide, Tie; Reiss, David J; Bare, J Christopher; Pang, Wyming Lee; Facciotti, Marc T; Schmid, Amy K; Pan, Min; Marzolf, Bruz; Van, Phu T; Lo, Fang-Yin; Pratap, Abhishek; Deutsch, Eric W; Peterson, Amelia; Martin, Dan; Baliga, Nitin S
2009-01-01
Despite the knowledge of complex prokaryotic-transcription mechanisms, generalized rules, such as the simplified organization of genes into operons with well-defined promoters and terminators, have had a significant role in systems analysis of regulatory logic in both bacteria and archaea. Here, we have investigated the prevalence of alternate regulatory mechanisms through genome-wide characterization of transcript structures of ∼64% of all genes, including putative non-coding RNAs in Halobacterium salinarum NRC-1. Our integrative analysis of transcriptome dynamics and protein–DNA interaction data sets showed widespread environment-dependent modulation of operon architectures, transcription initiation and termination inside coding sequences, and extensive overlap in 3′ ends of transcripts for many convergently transcribed genes. A significant fraction of these alternate transcriptional events correlate to binding locations of 11 transcription factors and regulators (TFs) inside operons and annotated genes—events usually considered spurious or non-functional. Using experimental validation, we illustrate the prevalence of overlapping genomic signals in archaeal transcription, casting doubt on the general perception of rigid boundaries between coding sequences and regulatory elements. PMID:19536208
Molin, William T; Wright, Alice A; Lawton-Rauh, Amy; Saski, Christopher A
2017-01-17
The expanding number and global distributions of herbicide resistant weedy species threaten food, fuel, fiber and bioproduct sustainability and agroecosystem longevity. Amongst the most competitive weeds, Amaranthus palmeri S. Wats has rapidly evolved resistance to glyphosate primarily through massive amplification and insertion of the 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene across the genome. Increased EPSPS gene copy numbers result in higher titers of the EPSPS enzyme, the target of glyphosate, and confer resistance to glyphosate treatment. To understand the genomic unit and mechanism of EPSPS gene copy number proliferation, we developed and used a bacterial artificial chromosome (BAC) library from a highly resistant biotype to sequence the local genomic landscape flanking the EPSPS gene. By sequencing overlapping BACs, a 297 kb sequence was generated, hereafter referred to as the "EPSPS cassette." This region included several putative genes, dense clusters of tandem and inverted repeats, putative helitron and autonomous replication sequences, and regulatory elements. Whole genome shotgun sequencing (WGS) of two biotypes exhibiting high and no resistance to glyphosate was performed to compare genomic representation across the EPSPS cassette. Mapping of sequences for both biotypes to the reference EPSPS cassette revealed significant differences in upstream and downstream sequences relative to EPSPS with regard to both repetitive units and coding content between these biotypes. The differences in sequence may have resulted from a compounded-building mechanism such as repetitive transpositional events. The association of putative helitron sequences with the cassette suggests a possible amplification and distribution mechanism. Flow cytometry revealed that the EPSPS cassette added measurable genomic content.
The adoption of glyphosate resistant cropping systems in major crops such as corn, soybean, cotton and canola coupled with excessive use of glyphosate herbicide has led to evolved glyphosate resistance in several important weeds. In Amaranthus palmeri, the amplification of the EPSPS cassette, characterized by a complex array of repetitive elements and putative helitron sequences, suggests an adaptive structural genomic mechanism that drives amplification and distribution around the genome. The added genomic content not found in glyphosate sensitive plants may be driving evolution through genome expansion.
Regulation of the alpha-glucuronidase-encoding gene (aguA) from Aspergillus niger.
de Vries, R P; van de Vondervoort, P J I; Hendriks, L; van de Belt, M; Visser, J
2002-09-01
The alpha-glucuronidase gene aguA from Aspergillus niger was cloned and characterised. Analysis of the promoter region of aguA revealed the presence of four putative binding sites for the major carbon catabolite repressor protein CREA and one putative binding site for the transcriptional activator XLNR. In addition, a sequence motif was detected which differed only in the last nucleotide from the XLNR consensus site. A construct in which part of the aguA coding region was deleted still resulted in production of a stable mRNA upon transformation of A. niger. The putative XLNR binding sites and two of the putative CREA binding sites were mutated individually in this construct and the effects on expression were examined in A. niger transformants. Northern analysis of the transformants revealed that the consensus XLNR site is not actually functional in the aguA promoter, whereas the sequence that diverges from the consensus at a single position is functional. This indicates that XLNR is also able to bind to the sequence GGCTAG, and the XLNR binding site consensus should therefore be changed to GGCTAR. Both CREA sites are functional, indicating that CREA has a strong influence on aguA expression. A detailed expression analysis of aguA in four genetic backgrounds revealed a second regulatory system involved in activation of aguA gene expression. This system responds to the presence of glucuronic and galacturonic acids, and is not dependent on XLNR.
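The proposed change of the XLNR consensus to GGCTAR uses the IUPAC ambiguity code R (A or G). A minimal sketch of how such a degenerate consensus can be matched against a sequence is given below; the example sequence and the function names are hypothetical, and only the forward strand is scanned.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(consensus, sequence):
    """Return (position, matched site) pairs for a degenerate consensus.

    A lookahead is used so that overlapping sites are also reported.
    """
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    regex = re.compile("(?=(" + pattern + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    promoter = "TTGGCTAACCGGCTAGTT"  # made-up sequence for illustration
    print(find_sites("GGCTAA", promoter))  # original consensus: one site
    print(find_sites("GGCTAR", promoter))  # revised consensus: two sites
```

Scanning the reverse complement as well would be needed for a real promoter search, since binding sites can occur on either strand.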
Sand, Olivier; Thomas-Chollier, Morgane; Vervisch, Eric; van Helden, Jacques
2008-01-01
This protocol shows how to access the Regulatory Sequence Analysis Tools (RSAT) via a programmatic interface in order to automate the analysis of multiple data sets. We describe the steps for writing a Perl client that connects to the RSAT Web services and implements a workflow to discover putative cis-acting elements in promoters of gene clusters. In the presented example, we apply this workflow to lists of transcription factor target genes resulting from ChIP-chip experiments. For each factor, the protocol predicts the binding motifs by detecting significantly overrepresented hexanucleotides in the target promoters and generates a feature map that displays the positions of putative binding sites along the promoter sequences. This protocol is addressed to bioinformaticians and biologists with programming skills (notions of Perl). Running time is approximately 6 min on the example data set.
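The core idea of detecting frequent hexanucleotides in a set of promoters can be sketched as follows. This is not the RSAT Web services client described in the protocol (which is written in Perl); it is a simplified, stand-alone illustration that only counts hexanucleotides and does not apply the background model or significance statistics that a real over-representation test requires. The sequences and the function name are made up.

```python
from collections import Counter


def count_kmers(sequences, k=6):
    """Count every k-mer (overlapping, forward strand) across the sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous or non-nucleotide characters.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


if __name__ == "__main__":
    promoters = ["ACGTACGTAC", "TTACGTACGG"]  # toy promoter sequences
    counts = count_kmers(promoters)
    for kmer, n in counts.most_common(3):
        print(kmer, n)
```

To turn raw counts into a statement about over-representation, each count would have to be compared with its expectation under a background model, for example hexanucleotide frequencies in all promoters of the same genome.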
[Divergence of paralogous growth-hormone-encoding genes and their promoters in Salmonidae].
Kamenskaya, D N; Pankova, M V; Atopkin, D M; Brykov, V A
2017-01-01
In many fish species, including salmonids, growth hormone is encoded by two duplicated paralogous genes, gh1 and gh2. Both genes were already in place at the time of divergence of species in this group. A comparison of the entire sequences of these genes in salmonids has shown that their conserved regions are associated with exons, while their most variable regions correspond to introns. Introns C and D include putative regulatory elements (Pit-1, CRE, and ERE sites) that are also conserved. In chars, the degree of polymorphism of the gh2 gene is 2-3 times as large as that of the gh1 gene. However, a comparison across all Salmonidae species does not extend this observation to other species. In both of these char genes, the promoters are conserved mainly because they correspond to putative regulatory sequences (TATA box, binding sites for the pituitary transcription factor Pit-1 (F1-F4), CRE, GRE and RAR/RXR elements). The promoter of the gh2 gene has a greater degree of polymorphism than the gh1 gene promoter in all investigated species of salmonids. The observed differences in the rates of accumulation of changes in growth hormone encoding paralogs could be explained by differences in the intensity of selection.
Cho, Min Seok; Joh, Kiseong; Ahn, Tae-Young; Park, Dong Suk
2014-09-01
Escherichia coli serotype O157 is still a major global healthcare problem. However, only limited information is currently available on the molecular and serological detection of pathogenic bacteria. Therefore, the development of appropriate strategies for their rapid identification and monitoring is still needed. In general, sequence analysis based on the stx, slt, eae, hlyA, rfb, and fliCh7 genes is widely employed for the identification of E. coli serotype O157; however, this approach has a critical defect for diagnosis and identification, in that these genes are also present in other E. coli serogroups. In this study, NCBI-BLAST searches using the nucleotide sequence of the putative regulatory protein gene from E. coli O157:H7 str. Sakai revealed sequence differences at the serotype level. Specific primers were designed from the putative regulatory protein gene and investigated for their sensitivity and specificity in detecting the pathogen in environmental water samples. The specificity of the primer set was evaluated using genomic DNA from 8 isolates of E. coli serotype O157 and 32 other reference strains. In addition, the sensitivity and specificity of this assay were confirmed by successful identification of E. coli serotype O157 in environmental water samples. In conclusion, this study showed that the newly developed quantitative serotype-specific PCR method is a highly specific and efficient tool for the surveillance and rapid detection of high-risk E. coli serotype O157.
Fritz, David T; Jiang, Shan; Xu, Junwang; Rogers, Melissa B
2006-07-01
The bone morphogenetic protein (BMP)2 gene has been genetically linked to osteoporosis and osteoarthritis. We have shown that the 3'-untranslated regions (UTR) of BMP2 genes from mammals to fishes are extraordinarily conserved. This indicates that the BMP2 3'-UTR is under stringent selective pressure. We present evidence that the conserved region is a strong posttranscriptional regulator of BMP2 expression. Polymorphisms in cis-regulatory elements have been proven to influence susceptibility to a growing number of diseases. A common single nucleotide polymorphism (SNP) disrupts a putative posttranscriptional regulatory motif, an AU-rich element, within the BMP2 3'-UTR. The affinity of specific proteins for the rs15705 SNP sequence differs from their affinity for the normal human sequence. More importantly, the in vitro decay rate of RNAs with the SNP is higher than that of RNAs with the normal sequence. Such changes in mRNA:protein interactions may influence the posttranscriptional mechanisms that control BMP2 gene expression. The consequent alterations in BMP2 protein levels may influence the development or physiology of bone or other BMP2-influenced tissues.
Dostie, Josée; Lemire, Edmond; Bouchard, Philippe; Field, Michael; Jones, Kristie; Lorenz, Birgit; Menten, Björn; Buysse, Karen; Pattyn, Filip; Friedli, Marc; Ucla, Catherine; Rossier, Colette; Wyss, Carine; Speleman, Frank; De Paepe, Anne; Dekker, Job; Antonarakis, Stylianos E.; De Baere, Elfride
2009-01-01
To date, the contribution of disrupted potentially cis-regulatory conserved non-coding sequences (CNCs) to human disease is most likely underestimated, as no systematic screens for putative deleterious variations in CNCs have been conducted. As a model for monogenic disease we studied the involvement of genetic changes of CNCs in the cis-regulatory domain of FOXL2 in blepharophimosis syndrome (BPES). Fifty-seven molecularly unsolved BPES patients underwent high-resolution copy number screening and targeted sequencing of CNCs. Apart from three larger distant deletions, a de novo deletion as small as 7.4 kb was found at 283 kb 5′ to FOXL2. The deletion appeared to be triggered by an H-DNA-induced double-stranded break (DSB). In addition, it disrupts a novel long non-coding RNA (ncRNA) PISRT1 and 8 CNCs. The regulatory potential of the deleted CNCs was substantiated by in vitro luciferase assays. Interestingly, Chromosome Conformation Capture (3C) of a 625 kb region surrounding FOXL2 in expressing cellular systems revealed physical interactions of three upstream fragments and the FOXL2 core promoter. Importantly, one of these contains the 7.4 kb deleted fragment. Overall, this study revealed the smallest distant deletion causing monogenic disease and impacts upon the concept of mutation screening in human disease and developmental disorders in particular. PMID:19543368
Xu, Huayong; Yu, Hui; Tu, Kang; Shi, Qianqian; Wei, Chaochun; Li, Yuan-Yuan; Li, Yi-Xue
2013-01-01
We are witnessing rapid progress in the development of methodologies for building combinatorial gene regulatory networks involving both TFs (transcription factors) and miRNAs (microRNAs). A few tools are available for this purpose, but most of them are not easy to use and not accessible online. A web server is especially needed to allow users to upload experimental expression datasets and build combinatorial regulatory networks corresponding to their particular contexts. In this work, we compiled putative TF-gene, miRNA-gene and TF-miRNA regulatory relationships from forward-engineering pipelines and curated them as built-in data libraries. We streamlined the R code of our two separate forward-and-reverse engineering algorithms for combinatorial gene regulatory network construction and formalized them as two major functional modules. As a result, we released cGRNB (combinatorial Gene Regulatory Networks Builder): a web server for constructing combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. cGRNB provides two major network-building modules, one for MPGE (miRNA-perturbed gene expression) datasets and the other for parallel miRNA/mRNA expression datasets. The output of the first module is a miRNA-centered two-layer combinatorial regulatory cascade, and the output of the second module is a comprehensive genome-wide network involving all three types of combinatorial regulation (TF-gene, TF-miRNA, and miRNA-gene). Since parallel miRNA/mRNA expression datasets are rapidly accumulating with the advance of next-generation sequencing techniques, cGRNB will be a very useful tool for researchers building combinatorial gene regulatory networks from expression datasets. 
The cGRNB web-server is free and available online at http://www.scbit.org/cgrnb.
Proliferating cell nuclear antigen (Pcna) as a direct downstream target gene of Hoxc8
DOE Office of Scientific and Technical Information (OSTI.GOV)
Min, Hyehyun; Lee, Ji-Yeon; Bok, Jinwoong
2010-02-19
Hoxc8 is a member of Hox family transcription factors that play crucial roles in spatiotemporal body patterning during embryogenesis. Hox proteins contain a conserved 61 amino acid homeodomain, which is responsible for recognition and binding of the proteins onto Hox-specific DNA binding motifs and regulates expression of their target genes. Previously, using proteome analysis, we identified Proliferating cell nuclear antigen (Pcna) as one of the putative target genes of Hoxc8. Here, we asked whether Hoxc8 regulates Pcna expression by directly binding to the regulatory sequence of Pcna. In mouse embryos at embryonic day 11.5, the expression pattern of Pcna was similar to that of Hoxc8 along the anteroposterior body axis. Moreover, Pcna transcript levels as well as cell proliferation rate were increased by overexpression of Hoxc8 in C3H10T1/2 mouse embryonic fibroblast cells. Characterization of 2.3 kb genomic sequence upstream of Pcna coding region revealed that the upstream sequence contains several Hox core binding sequences and one Hox-Pbx binding sequence. Direct binding of Hoxc8 proteins to the Pcna regulatory sequence was verified by chromatin immunoprecipitation assay. Taken together, our data suggest that Pcna is a direct downstream target of Hoxc8.
2009-01-01
Background The genome sequence of Geobacter metallireducens is the second to be completed from the metal-respiring genus Geobacter, and is compared in this report to that of Geobacter sulfurreducens in order to understand their metabolic, physiological and regulatory similarities and differences. Results The experimentally observed greater metabolic versatility of G. metallireducens versus G. sulfurreducens is borne out by the presence of more numerous genes for metabolism of organic acids including acetate, propionate, and pyruvate. Although G. metallireducens lacks a dicarboxylic acid transporter, it has acquired a second putative succinate dehydrogenase/fumarate reductase complex, suggesting that respiration of fumarate was important until recently in its evolutionary history. Vestiges of the molybdate (ModE) regulon of G. sulfurreducens can be detected in G. metallireducens, which has lost the global regulatory protein ModE but retained some putative ModE-binding sites and multiplied certain genes of molybdenum cofactor biosynthesis. Several enzymes of amino acid metabolism are of different origin in the two species, but significant patterns of gene organization are conserved. Whereas most Geobacteraceae are predicted to obtain biosynthetic reducing equivalents from electron transfer pathways via a ferredoxin oxidoreductase, G. metallireducens can derive them from the oxidative pentose phosphate pathway. In addition to the evidence of greater metabolic versatility, the G. metallireducens genome is also remarkable for the abundance of multicopy nucleotide sequences found in intergenic regions and even within genes. Conclusion The genomic evidence suggests that metabolism, physiology and regulation of gene expression in G. metallireducens may be dramatically different from other Geobacteraceae. PMID:19473543
Bussemaker, Harmen J.; Li, Hao; Siggia, Eric D.
2000-01-01
The availability of complete genome sequences and mRNA expression data for all genes creates new opportunities and challenges for identifying DNA sequence motifs that control gene expression. An algorithm, “MobyDick,” is presented that decomposes a set of DNA sequences into the most probable dictionary of motifs or words. This method is applicable to any set of DNA sequences: for example, all upstream regions in a genome or all genes expressed under certain conditions. Identification of words is based on a probabilistic segmentation model in which the significance of longer words is deduced from the frequency of shorter ones of various lengths, eliminating the need for a separate set of reference data to define probabilities. We have built a dictionary with 1,200 words for the 6,000 upstream regulatory regions in the yeast genome; the 500 most significant words (some with as few as 10 copies in all of the upstream regions) match 114 of 443 experimentally determined sites (a significance level of 18 standard deviations). When analyzing all of the genes up-regulated during sporulation as a group, we find many motifs in addition to the few previously identified by analyzing the expression subclusters individually. Applying MobyDick to the genes derepressed when the general repressor Tup1 is deleted, we find known as well as putative binding sites for its regulatory partners. PMID:10944202
Network perturbation by recurrent regulatory variants in cancer
Cho, Ara; Lee, Insuk; Choi, Jung Kyoon
2017-01-01
Cancer driving genes have been identified as recurrently affected by variants that alter protein-coding sequences. However, a majority of cancer variants arise in noncoding regions, and some of them are thought to play a critical role through transcriptional perturbation. Here we identified putative transcriptional driver genes based on combinatorial variant recurrence in cis-regulatory regions. The identified genes showed high connectivity in the cancer type-specific transcription regulatory network, with high outdegree and many downstream genes, highlighting their causative role during tumorigenesis. In the protein interactome, the identified transcriptional drivers were not as highly connected as coding driver genes but appeared to form a network module centered on the coding drivers. The coding and regulatory variants associated via these interactions between the coding and transcriptional drivers showed exclusive and complementary occurrence patterns across tumor samples. Transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes. PMID:28333928
Casimiro-Soriguer, Inés; Narbona, Eduardo; Buide, M. L.; del Valle, José C.; Whittall, Justen B.
2016-01-01
Flower color polymorphisms are widely used as model traits from genetics to ecology, yet determining the biochemical and molecular basis can be challenging. Anthocyanin-based flower color variations can be caused by at least 12 structural and three regulatory genes in the anthocyanin biosynthetic pathway (ABP). We use mRNA-Seq to simultaneously sequence and estimate expression of these candidate genes in nine samples of Silene littorea representing three color morphs (dark pink, light pink and white) across three developmental stages in hopes of identifying the cause of flower color variation. We identified 29 putative paralogs for the 15 candidate genes in the ABP. We assembled complete coding sequences for 16 structural loci and nine of ten regulatory loci. Among these 29 putative paralogs, we identified 622 SNPs, yet only nine synonymous SNPs in Ans had allele frequencies that differentiated pigmented petals (dark pink and light pink) from white petals. These Ans allele frequency differences were further investigated with an expanded sequencing survey of 38 individuals, yet no SNPs consistently differentiated the color morphs. We also found one locus, F3h1, with strong differential expression between pigmented and white samples (>42x). This may be caused by decreased expression of Myb1a in white petal buds. Myb1a in S. littorea is a regulatory locus closely related to Subgroup 7 Mybs known to regulate F3h and other loci in the first half of the ABP in model species. We then compare the mRNA-Seq results with petal biochemistry which revealed cyanidin as the primary anthocyanin and five flavonoid intermediates. Concentrations of three of the flavonoid intermediates were significantly lower in white petals than in pigmented petals (rutin, quercetin and isovitexin). The biochemistry results for rutin, quercetin, luteolin and apigenin are consistent with the transcriptome results suggesting a blockage at F3h, possibly caused by downregulation of Myb1a. PMID:26973662
Characterization of noncoding regulatory DNA in the human genome.
Elkon, Ran; Agami, Reuven
2017-08-08
Genetic variants associated with common diseases are usually located in noncoding parts of the human genome. Delineation of the full repertoire of functional noncoding elements, together with efficient methods for probing their biological roles, is therefore of crucial importance. Over the past decade, DNA accessibility and various epigenetic modifications have been associated with regulatory functions. Mapping these features across the genome has enabled researchers to begin to document the full complement of putative regulatory elements. High-throughput reporter assays to probe the functions of regulatory regions have also been developed but these methods separate putative regulatory elements from the chromosome so that any effects of chromatin context and long-range regulatory interactions are lost. Definitive assignment of function(s) to putative cis-regulatory elements requires perturbation of these elements. Genome-editing technologies are now transforming our ability to perturb regulatory elements across entire genomes. Interpretation of high-throughput genetic screens that incorporate genome editors might enable the construction of an unbiased map of functional noncoding elements in the human genome.
Falaleeva, Marina; Zurek, Oliwia W.; Watkins, Robert L.; Reed, Robert W.; Ali, Hadeel; Sumby, Paul; Voyich, Jovanka M.
2014-01-01
The important human pathogen Streptococcus pyogenes (group A Streptococcus [GAS]) produces a hyaluronic acid (HA) capsule that plays critical roles in immune evasion. Previous studies showed that the hasABC operon encoding the capsule biosynthesis enzymes is under the control of a single promoter, P1, which is negatively regulated by the two-component regulatory system CovR/S. In this work, we characterize the sequence upstream of P1 and identify a novel regulatory region controlling transcription of the capsule biosynthesis operon in the M1 serotype strain MGAS2221. This region consists of a promoter, P2, which initiates transcription of a novel small RNA, HasS, an intrinsic transcriptional terminator that inefficiently terminates HasS, permitting read-through transcription of hasABC, and a putative promoter which lies upstream of P2. Electrophoretic mobility shift assays, quantitative reverse transcription-PCR, and transcriptional reporter data identified CovR as a negative regulator of P2. We found that the P1 and P2 promoters are completely repressed by CovR, and capsule expression is regulated by the putative promoter upstream of P2. Deletion of hasS or of the terminator eliminates CovR-binding sequences, relieving repression and increasing read-through, hasA transcription, and capsule production. Sequence analysis of 44 GAS genomes revealed a high level of polymorphism in the HasS sequence region. Most of the HasS variations were located in the terminator sequences, suggesting that this region is under strong selective pressure. We discovered that the terminator deletion mutant is highly resistant to neutrophil-mediated killing and is significantly more virulent in a mouse model of GAS invasive disease than the wild-type strain. Together, these results are consistent with the naturally occurring mutations in this region modulating GAS virulence. PMID:25287924
Itoh, S; Yanagimoto, T; Tagawa, S; Hashimoto, H; Kitamura, R; Nakajima, Y; Okochi, T; Fujimoto, S; Uchino, J; Kamataki, T
1992-03-24
P-450IIIA7 is a form of cytochrome P-450 which was isolated from human fetal livers and termed P-450HFLa. This form has been shown to be expressed specifically during fetal life (Komori, M., Nishio, K., Kitada, M., Shiramatsu, K., Muroya, K., Soma, M., Nagashima, K. and Kamataki, T. (1990) Biochemistry 29, 4430-4433). In the present study, we isolated five independent clones which probably corresponded to the human P-450IIIA7 gene. All exons, exon-intron junctions and the 5' flanking region from the cap site to -869 of these clones were completely sequenced. Although the sequences in the coding region were completely identical to P-450IIIA7, it is possible that the genomic fragments sequenced in this study encode portions of other P-450IIIA7-related genes, since we could not obtain a complete overlapping set of genomic clones. Putative binding sites for several transcriptional regulatory factors were present within the 5' flanking sequence. Among them, a basic transcription element binding factor (BTEB) was shown to actually interact with the 5' flanking region of this gene.
Transcriptome and gene expression analysis during flower blooming in Rosa chinensis 'Pallida'.
Yan, Huijun; Zhang, Hao; Chen, Min; Jian, Hongying; Baudino, Sylvie; Caissard, Jean-Claude; Bendahmane, Mohammed; Li, Shubin; Zhang, Ting; Zhou, Ningning; Qiu, Xianqin; Wang, Qigang; Tang, Kaixue
2014-04-25
Rosa chinensis 'Pallida' (Rosa L.) is one of the most important ancient rose cultivars originating from China. It contributed the 'tea scent' trait to modern roses. However, little information is available on the gene regulatory networks involved in scent biosynthesis and metabolism in Rosa. In this study, the transcriptome of R. chinensis 'Pallida' petals at different developmental stages, from flower buds to senescent flowers, was investigated using Illumina sequencing technology. De novo assembly generated 89,614 clusters with an average length of 428bp. Based on sequence similarity search with known proteins, 62.9% of total clusters were annotated. Out of these annotated transcripts, 25,705 and 37,159 sequences were assigned to gene ontology and clusters of orthologous groups, respectively. The dataset provides information on transcripts putatively associated with known scent metabolic pathways. Digital gene expression (DGE) was obtained using RNA samples from flower bud, open flower and senescent flower stages. Comparative DGE and quantitative real time PCR permitted the identification of five transcripts encoding proteins putatively associated with scent biosynthesis in roses. The study provides a foundation for scent-related gene discovery in roses.
Huang, You-Jun; Liu, Li-Li; Huang, Jian-Qin; Wang, Zheng-Jia; Chen, Fang-Fang; Zhang, Qi-Xiang; Zheng, Bing-Song; Chen, Ming
2013-10-10
Unlike herbaceous plants, woody plants undergo a long vegetative stage before achieving floral transition. They then become seasonal plants, flowering annually. In this study, a preliminary model of gene regulation for seasonal pistillate flowering in hickory (Carya cathayensis) was proposed. The genome-wide dynamic transcriptome was characterized via the joint approach of RNA sequencing and microarray analysis. Differential transcript abundance analysis uncovered the dynamic transcript abundance patterns of flowering-correlated genes and their major functions based on Gene Ontology (GO) analysis. To explore the pistillate flowering mechanism in hickory, a comprehensive flowering gene regulatory network based on Arabidopsis thaliana was constructed by additional literature mining. A total of 114 putative flowering or floral genes, including 31 with differential transcript abundance, were identified in hickory. Their locations, functions and dynamic transcript abundances were analyzed in the gene regulatory networks. A genome-wide co-expression network for the putative flowering or floral genes shows three flowering regulatory modules corresponding to response to light abiotic stimulus, cold stress, and reproductive development process, respectively. In total, 27 potential flowering or floral genes were recruited from the genome-wide co-expression network function module analysis; these help to clarify the hickory-specific seasonal flowering mechanism. The flowering event of the pistillate flower bud in hickory is triggered synchronously by several pathways, including the photoperiod, autonomous, vernalization, gibberellin, and sucrose pathways. Moreover, the analysis provides a potential FLC-like gene-based vernalization pathway and an 'AC' model for pistillate flower development in hickory. 
This work provides a framework for pistillate flower development in hickory, which is significant for insight into the regulation of flowering and floral development in woody plants.
2013-01-01
Background Unlike herbaceous plants, woody plants undergo a long vegetative stage before achieving floral transition. They then become seasonal plants, flowering annually. In this study, a preliminary model of gene regulation for seasonal pistillate flowering in hickory (Carya cathayensis) was proposed. The genome-wide dynamic transcriptome was characterized via the joint approach of RNA sequencing and microarray analysis. Results Differential transcript abundance analysis uncovered the dynamic transcript abundance patterns of flowering-correlated genes and their major functions based on Gene Ontology (GO) analysis. To explore the pistillate flowering mechanism in hickory, a comprehensive flowering gene regulatory network based on Arabidopsis thaliana was constructed by additional literature mining. A total of 114 putative flowering or floral genes, including 31 with differential transcript abundance, were identified in hickory. Their locations, functions and dynamic transcript abundances were analyzed in the gene regulatory networks. A genome-wide co-expression network for the putative flowering or floral genes shows three flowering regulatory modules corresponding to response to light abiotic stimulus, cold stress, and reproductive development process, respectively. In total, 27 potential flowering or floral genes were recruited, which help to clarify the hickory-specific seasonal flowering mechanism. Conclusions The flowering event of the pistillate flower bud in hickory is triggered synchronously by several pathways, including the photoperiod, autonomous, vernalization, gibberellin, and sucrose pathways. In total, 27 potential flowering or floral genes were recruited from the genome-wide co-expression network function module analysis. Moreover, the analysis provides a potential FLC-like gene-based vernalization pathway and an 'AC' model for pistillate flower development in hickory. 
This work provides a framework for pistillate flower development in hickory, which is significant for insight into the regulation of flowering and floral development in woody plants. PMID:24106755
De novo mutations in regulatory elements in neurodevelopmental disorders
Short, Patrick J.; McRae, Jeremy F.; Gallone, Giuseppe; Sifrim, Alejandro; Won, Hyejung; Geschwind, Daniel H.; Wright, Caroline F.; Firth, Helen V; FitzPatrick, David R.; Barrett, Jeffrey C.; Hurles, Matthew E.
2018-01-01
We previously estimated that 42% of patients with severe developmental disorders carry pathogenic de novo mutations in coding sequences. The role of de novo mutations in regulatory elements affecting genes associated with developmental disorders, or other genes, has been essentially unexplored. We identified de novo mutations in three classes of putative regulatory elements in almost 8,000 patients with developmental disorders. Here we show that de novo mutations in highly evolutionarily conserved fetal brain-active elements are significantly and specifically enriched in neurodevelopmental disorders. We identified a significant twofold enrichment of recurrently mutated elements. We estimate that, genome-wide, 1-3% of patients without a diagnostic coding variant carry pathogenic de novo mutations in fetal brain-active regulatory elements and that only 0.15% of all possible mutations within highly conserved fetal brain-active elements cause neurodevelopmental disorders with a dominant mechanism. Our findings represent a robust estimate of the contribution of de novo mutations in regulatory elements to this genetically heterogeneous set of disorders, and emphasize the importance of combining functional and evolutionary evidence to identify regulatory causes of genetic disorders. PMID:29562236
Computational methods in sequence and structure prediction
NASA Astrophysics Data System (ADS)
Lang, Caiyi
This dissertation is organized into two parts. In the first part, we will discuss three computational methods for cis-regulatory element recognition in three different gene regulatory networks, as follows: (a) Using a comprehensive "Phylogenetic Footprinting Comparison" method, we will investigate the promoter sequence structures of three enzymes (PAL, CHS and DFR) that catalyze sequential steps in the pathway from phenylalanine to anthocyanins in plants. Our results show that there exists a putative cis-regulatory element "AC(C/G)TAC(C)" upstream of these enzyme genes. We propose that this cis-regulatory element is responsible for the genetic regulation of these three enzymes and that it might also be the binding site for the MYB class transcription factor PAP1. (b) We will investigate the role of the Arabidopsis gene glutamate receptor 1.1 (AtGLR1.1) in C and N metabolism by utilizing the microarray data we obtained from AtGLR1.1-deficient lines (antiAtGLR1.1). We focus our investigation on the putatively co-regulated transcript profile of 876 genes we have collected in antiAtGLR1.1 lines. By (i) scanning for occurrences of several groups of known abscisic acid (ABA)-related cis-regulatory elements in the upstream regions of the 876 Arabidopsis genes, and (ii) exhaustively scanning for occurrences of all possible 6-10 bp motifs in the upstream regions of the same set of genes, we are able to make a quantitative estimate of the enrichment level of each of the cis-regulatory element candidates. We conclude that one specific cis-regulatory element group, the "ABRE" elements, is statistically highly enriched within the 876-gene group compared with its occurrence within the genome. (c) We will introduce a new general-purpose algorithm, called "fuzzy REDUCE1", which we have developed recently for automated cis-regulatory element identification. In the second part, we will discuss our newly devised protein design framework. 
With this framework we have developed a software package which is capable of designing novel protein structures at the atomic resolution. This software package allows us to perform protein structure design with a flexible backbone. The backbone flexibility includes loop region relaxation as well as a secondary structure collective mode relaxation scheme. (Abstract shortened by UMI.)
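The enrichment estimate described in part (b) of the abstract above can be illustrated with a hypergeometric tail probability: given how many genes in the genome carry a motif upstream, how surprising is the number observed in the selected gene set? This is a minimal sketch under that assumption; the abstract does not state which statistic the dissertation used, and every number below except the set size of 876 is hypothetical.

```python
from math import comb


def hypergeom_tail(population, successes, draws, observed):
    """P(X >= observed) when sampling `draws` genes without replacement.

    population: total number of genes in the genome
    successes:  genes in the genome whose upstream region has the motif
    draws:      size of the selected gene set
    observed:   genes in the selected set whose upstream region has the motif
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 25,000 genes, 2,000 with the motif genome-wide,
# and 150 of the 876 selected genes carrying it.
p_value = hypergeom_tail(25000, 2000, 876, 150)
```

When many candidate motifs are scanned, as in an exhaustive 6-10 bp search, the resulting probabilities would need correction for multiple testing.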
Molecular Structure and Transformation of the Glucose Dehydrogenase Gene in Drosophila Melanogaster
Whetten, R.; Organ, E.; Krasney, P.; Cox-Foster, D.; Cavener, D.
1988-01-01
We have precisely mapped and sequenced the three 5' exons of the Drosophila melanogaster Gld gene and have identified the start sites for transcription and translation. The first exon is composed of 335 nucleotides and does not contain any putative translation start codons. The second exon is separated from the first exon by 8 kb and contains the Gld translation start codon. The inferred amino acid sequence of the amino terminus contains two unusual features: three tandem repeats of serine-alanine, and a relatively high density of cysteine residues. P element-mediated transformation experiments demonstrated that a 17.5-kb genomic fragment contains the functional and regulatory components of the Gld gene. PMID:3143620
NASA Astrophysics Data System (ADS)
Zaghdoudi-Allan, N.; Yarra, T.; Churcher, A.; Felix, R. C.; Cardoso, J.; Clark, M.; Power, D. M.
2016-02-01
With over 90,000 extant species, the Mollusca is one of the most successful and species-rich phyla, comprising 23% of known marine fauna. Common to all molluscs, the mantle is a multi-functional highly muscular tissue that contacts the shell and envelops vital organs. In bivalves, the epithelial cells of the mantle secrete the external shell by a complex network of mechanisms that remain poorly understood. To date, the bulk of the work on Mytilus mantle has focused on two of its features: the mantle edge and the pallial mantle, and relatively little is known about the factors regulating its function. We hypothesize that the mantle edge in Mytilus species is heterogeneous in cellular structure and function and use next generation sequencing to mine for receptors involved in biomineralization. The mantle edge of the Mediterranean mussel (Mytilus galloprovincialis) was sectioned into three parts and sequenced using the Illumina platform. The transcriptome sequences generated assembled into 179,879 transcripts with a 34% GC content, congruent with other bivalve assemblies. The transcriptome was annotated and String analysis (http://www.string-db.org) was used for a preliminary characterisation of biological processes. To test our hypothesis, we compared the transcripts from the 3 mantle segments and the expression levels of putative receptors such as the G-protein coupled receptors (GPCRs) in the sectioned mantle of 6 individuals using qPCR. Candidates were chosen based on their regulatory function and potential involvement in shell formation. Our results show differences in transcript abundance and cellular function amongst the three mantle sections. Combining our transcriptomic study with histological studies of the mantle tissue, we present evidence of both molecular and structural heterogeneity of the mussel mantle and identify several putative regulatory networks.
Pauciullo, Alfredo; Erhardt, Georg
2015-01-01
In the present paper, we report for the first time the characterization of llama (Lama glama) caseins at transcriptomic and genetic level. A total of 288 casein clone transcripts were analysed from two lactating llamas. The most represented mRNA populations were those correctly assembled (85.07%) and they encoded for mature proteins of 215, 217, 187 and 162 amino acids respectively for the CSN1S1, CSN2, CSN1S2 and CSN3 genes. The exonic subdivision evidenced a structure made of 21, 9, 17 and 6 exons for the αs1-, β-, αs2- and κ-casein genes respectively. Exon skipping and duplication events were evidenced. Two variants A and B were identified in the αs1-casein gene as a result of the alternative out-splicing of exon 18. An additional exon coding for a novel hexapeptide was found to be cryptic in the κ-casein gene, whereas one extra exon was found in the αs2-casein gene by comparison with the Camelus dromedarius sequence. A total of 28 putative phosphorylated motifs highlighted a complex heterogeneity and a potential variable degree of post-translational modifications. Ninety-six polymorphic sites were found through the comparison of the llama casein cDNAs with the homologous camel sequences, whereas the first description and characterization of the 5'- and 3'-regulatory regions allowed identification of the main putative consensus sequences involved in casein gene expression, thus opening the way to new investigations so far never achieved in this species. PMID:25923814
Canver, Matthew C; Lessard, Samuel; Pinello, Luca; Wu, Yuxuan; Ilboudo, Yann; Stern, Emily N; Needleman, Austen J; Galactéros, Frédéric; Brugnara, Carlo; Kutlar, Abdullah; McKenzie, Colin; Reid, Marvin; Chen, Diane D; Das, Partha Pratim; A Cole, Mitchel; Zeng, Jing; Kurita, Ryo; Nakamura, Yukio; Yuan, Guo-Cheng; Lettre, Guillaume; Bauer, Daniel E; Orkin, Stuart H
2017-04-01
Cas9-mediated, high-throughput, saturating in situ mutagenesis permits fine-mapping of function across genomic segments. Disease- and trait-associated variants identified in genome-wide association studies largely cluster at regulatory loci. Here we demonstrate the use of multiple designer nucleases and variant-aware library design to interrogate trait-associated regulatory DNA at high resolution. We developed a computational tool for the creation of saturating-mutagenesis libraries with single or multiple nucleases with incorporation of variants. We applied this methodology to the HBS1L-MYB intergenic region, which is associated with red-blood-cell traits, including fetal hemoglobin levels. This approach identified putative regulatory elements that control MYB expression. Analysis of genomic copy number highlighted potential false-positive regions, thus emphasizing the importance of off-target analysis in the design of saturating-mutagenesis experiments. Together, these data establish a widely applicable high-throughput and high-resolution methodology to identify minimal functional sequences within large disease- and trait-associated regions.
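To make the idea of a saturating-mutagenesis library more concrete, the sketch below enumerates every candidate guide in a region. It is not the computational tool described in the abstract: it assumes a single nuclease with an NGG PAM (as for SpCas9) and 20-nt protospacers, ignores variants and off-target scoring, and the function names are hypothetical.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_guides(region: str, guide_len: int = 20):
    """List (strand, offset, protospacer, PAM) for every NGG-adjacent site.

    Offsets are 0-based positions in the scanned strand; for the "-" strand
    they refer to the reverse-complemented region, not the original one.
    """
    region = region.upper()
    guides = []
    for strand, seq in (("+", region), ("-", revcomp(region))):
        # The last valid start leaves room for the protospacer plus a 3-nt PAM.
        for i in range(len(seq) - guide_len - 2):
            pam = seq[i + guide_len : i + guide_len + 3]
            if pam[1:] == "GG":
                guides.append((strand, i, seq[i : i + guide_len], pam))
    return guides
```

Using several nucleases with different PAM requirements, as the abstract describes, would amount to repeating the scan with a different PAM test for each nuclease and merging the results.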
Mapping and analysis of Caenorhabditis elegans transcription factor sequence specificities
Narasimhan, Kamesh; Lambert, Samuel A; Yang, Ally WH; Riddell, Jeremy; Mnaimneh, Sanie; Zheng, Hong; Albu, Mihai; Najafabadi, Hamed S; Reece-Hoyes, John S; Fuxman Bass, Juan I; Walhout, Albertha JM; Weirauch, Matthew T; Hughes, Timothy R
2015-01-01
Caenorhabditis elegans is a powerful model for studying gene regulation, as it has a compact genome and a wealth of genomic tools. However, identification of regulatory elements has been limited, as DNA-binding motifs are known for only 71 of the estimated 763 sequence-specific transcription factors (TFs). To address this problem, we performed protein binding microarray experiments on representatives of canonical TF families in C. elegans, obtaining motifs for 129 TFs. Additionally, we predict motifs for many TFs that have DNA-binding domains similar to those already characterized, increasing coverage of binding specificities to 292 C. elegans TFs (∼40%). These data highlight the diversification of binding motifs for the nuclear hormone receptor and C2H2 zinc finger families and reveal unexpected diversity of motifs for T-box and DM families. Motif enrichment in promoters of functionally related genes is consistent with known biology and also identifies putative regulatory roles for unstudied TFs. DOI: http://dx.doi.org/10.7554/eLife.06967.001 PMID:25905672
Transcript Analysis and Regulative Events during Flower Development in Olive (Olea europaea L.)
Alagna, Fiammetta; Cirilli, Marco; Galla, Giulio; Carbone, Fabrizio; Daddiego, Loretta; Facella, Paolo; Lopez, Loredana; Colao, Chiara; Mariotti, Roberto; Cultrera, Nicolò; Rossi, Martina; Barcaccia, Gianni; Baldoni, Luciana; Muleo, Rosario; Perrotta, Gaetano
2016-01-01
The identification and characterization of transcripts involved in flower organ development, plant reproduction and metabolism represent key steps in plant phenotypic and physiological pathways, and may generate high-quality transcript variants useful for the development of functional markers. This study was aimed at obtaining an extensive characterization of the olive flower transcripts, by providing sound information on the candidate MADS-box genes related to the ABC model of flower development and on the putative genetic and molecular determinants of ovary abortion and pollen-pistil interaction. The overall sequence data, obtained by pyrosequencing of four cDNA libraries from flowers at different developmental stages of three olive varieties with distinct reproductive features (Leccino, Frantoio and Dolce Agogia), included approximately 465,000 ESTs, which gave rise to more than 14,600 contigs and approximately 92,000 singletons. As many as 56,700 unigenes were successfully annotated and provided gene ontology insights into the structural organization and putative molecular function of sequenced transcripts and deduced proteins in the context of their corresponding biological processes. Differentially expressed genes with potential regulatory roles in biosynthetic pathways and metabolic networks during flower development were identified. The gene expression studies allowed us to select the candidate genes that play well-known molecular functions in a number of biosynthetic pathways and specific biological processes that affect olive reproduction. A sound understanding of gene functions and regulatory networks that characterize the olive flower is provided. PMID:27077738
Jeukens, Julie; Bernatchez, Louis
2012-01-01
While gene expression divergence is known to be involved in adaptive phenotypic divergence and speciation, the relative importance of regulatory and structural evolution of genes is poorly understood. A recent next-generation sequencing experiment allowed the identification of candidate genes potentially involved in the ongoing speciation of sympatric dwarf and normal lake whitefish (Coregonus clupeaformis), such as cytosolic malate dehydrogenase (MDH1), which showed both significant expression and sequence divergence. The main goal of this study was to investigate in more detail the signatures of natural selection in the regulatory and coding sequences of MDH1 in lake whitefish and test for parallelism of these signatures with other coregonine species. Sequencing of the two regions in 118 fish from four sympatric pairs of whitefish and two cisco species revealed a total of 35 single nucleotide polymorphisms (SNPs), with more genetic diversity in European compared to North American coregonine species. While the coding region was found to be under purifying selection, an SNP in the proximal promoter exhibited significant allele frequency divergence in a parallel manner among independent sympatric pairs of North American lake whitefish and European whitefish (C. lavaretus). According to transcription factor binding simulation for 22 regulatory haplotypes of MDH1, putative binding profiles were fairly conserved among species, except for the region around this SNP. Moreover, we found evidence for the role of this SNP in the regulation of MDH1 expression level. Overall, these results provide further evidence for the role of natural selection in gene regulation evolution among whitefish species pairs and suggest its possible link with patterns of phenotypic diversity observed in coregonine species. PMID:22408741
The Silkworm (Bombyx mori) microRNAs and Their Expressions in Multiple Developmental Stages
Luo, Qibin; Cai, Yimei; Lin, Wen-chang; Chen, Huan; Yang, Yue; Hu, Songnian; Yu, Jun
2008-01-01
Background: MicroRNAs (miRNAs) play crucial roles in various physiological processes through post-transcriptional regulation of gene expressions and are involved in development, metabolism, and many other important molecular mechanisms and cellular processes. The Bombyx mori genome sequence provides opportunities for a thorough survey for miRNAs as well as comparative analyses with other sequenced insect species. Methodology/Principal Findings: We identified 114 non-redundant conserved miRNAs and 148 novel putative miRNAs from the B. mori genome with an elaborate computational protocol. We also sequenced 6,720 clones from 14 developmental stage-specific small RNA libraries in which we identified 35 unique miRNAs containing 21 conserved miRNAs (including 17 predicted miRNAs) and 14 novel miRNAs (including 11 predicted novel miRNAs). Among the 114 conserved miRNAs, we found six pairs of clusters evolutionarily conserved across insect lineages. Our observations on length heterogeneity at 5′ and/or 3′ ends of nine miRNAs between cloned and predicted sequences, and three mature forms deriving from the same arm of putative pre-miRNAs suggest a mechanism by which miRNAs gain new functions. Analyzing development-related miRNAs expression at 14 developmental stages based on clone-sampling and stem-loop RT PCR, we discovered an unusual abundance of 33 sequences representing 12 different miRNAs and sharply fluctuated expression of miRNAs at larva-molting stage. The potential functions of several stage-biased miRNAs were also analyzed in combination with predicted target genes and silkworm's phenotypic traits; our results indicated that miRNAs may play key regulatory roles in specific developmental stages in the silkworm, such as ecdysis. Conclusions/Significance: Taking a combined approach, we identified 118 conserved miRNAs and 151 novel miRNA candidates from the B. mori genome sequence. Our expression analyses by sampling miRNAs and real-time PCR over multiple developmental stages allowed us to pinpoint molting stages as hotspots of miRNA expression both in sorts and quantities. Based on the analysis of target genes, we hypothesized that miRNAs regulate development through a particular emphasis on complex stages rather than general regulatory mechanisms. PMID:18714353
Acebo, Paloma; Martin-Galiano, Antonio J.; Navarro, Sara; Zaballos, Ángel; Amblar, Mónica
2012-01-01
Streptococcus pneumoniae is the main etiological agent of community-acquired pneumonia and a major cause of mortality and morbidity among children and the elderly. Genome sequencing of several pneumococcal strains revealed valuable information about the potential proteins and genetic diversity of this prevalent human pathogen. However, little is known about its transcriptional regulation and its small regulatory noncoding RNAs. In this study, we performed deep sequencing of the S. pneumoniae TIGR4 strain RNome to identify small regulatory RNA candidates expressed in this pathogen. We discovered 1047 potential small RNAs including intragenic, 5′- and/or 3′-overlapping RNAs and 88 small RNAs encoded in intergenic regions. With this approach, we recovered many of the previously identified intergenic small RNAs and identified 68 novel candidates, most of which are conserved in both sequence and genomic context in other S. pneumoniae strains. We confirmed the independent expression of 17 intergenic small RNAs and predicted putative mRNA targets for six of them using bioinformatics tools. Preliminary results suggest that one of these six is a key player in the regulation of competence development. This study is the biggest catalog of small noncoding RNAs reported to date in S. pneumoniae and provides a highly complete view of the small RNA network in this pathogen. PMID:22274957
Systematic variation in mRNA 3′-processing signals during mouse spermatogenesis
Liu, Donglin; Brockman, J. Michael; Dass, Brinda; Hutchins, Lucie N.; Singh, Priyam; McCarrey, John R.; MacDonald, Clinton C.; Graber, Joel H.
2007-01-01
Gene expression and processing during mouse male germ cell maturation (spermatogenesis) is highly specialized. Previous reports have suggested that there is a high incidence of alternative 3′-processing in male germ cell mRNAs, including reduced usage of the canonical polyadenylation signal, AAUAAA. We used EST libraries generated from mouse testicular cells to identify 3′-processing sites used at various stages of spermatogenesis (spermatogonia, spermatocytes and round spermatids) and testicular somatic Sertoli cells. We assessed differences in 3′-processing characteristics in the testicular samples, compared to control sets of widely used 3′-processing sites. Using a new method for comparison of degenerate regulatory elements between sequence samples, we identified significant changes in the use of putative 3′-processing regulatory sequence elements in all spermatogenic cell types. In addition, we observed a trend towards truncated 3′-untranslated regions (3′-UTRs), with the most significant differences apparent in round spermatids. In contrast, Sertoli cells displayed a much smaller trend towards 3′-UTR truncation and no significant difference in 3′-processing regulatory sequences. Finally, we identified a number of genes encoding mRNAs that were specifically subject to alternative 3′-processing during meiosis and postmeiotic development. Our results highlight developmental differences in polyadenylation site choice and in the elements that likely control them during spermatogenesis. PMID:17158511
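The comparison of canonical polyadenylation signal usage between sequence sets, mentioned in the abstract above, can be sketched very simply. This is not the authors' method for degenerate elements: it only measures how often the exact hexamer AAUAAA occurs near the 3′ end of each sequence, and the 40-nt window and the function name are assumptions chosen for illustration.

```python
def signal_usage(upstream_seqs, signal: str = "AATAAA", window: int = 40) -> float:
    """Fraction of sequences with the signal in the last `window` nucleotides.

    Each sequence is assumed to end at the 3'-processing (cleavage) site.
    RNA input is accepted: U is converted to T before matching.
    """
    if not upstream_seqs:
        return 0.0
    hits = 0
    for seq in upstream_seqs:
        tail = seq.upper().replace("U", "T")[-window:]
        if signal in tail:
            hits += 1
    return hits / len(upstream_seqs)
```

Calling `signal_usage` on, say, a set of spermatid-derived sites and on a control set would give two fractions whose difference reflects reduced or increased use of the canonical signal; assessing significance would need an additional statistical test.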
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lai, Xiaokuang; Davis, F.C.; Ingram, L.O.
1997-02-01
Genomic libraries from nine cellobiose-metabolizing bacteria were screened for cellobiose utilization. Positive clones were recovered from six libraries, all of which encode phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS) proteins. Clones from Bacillus subtilis, Butyrivibrio fibrisolvens, and Klebsiella oxytoca allowed the growth of recombinant Escherichia coli in cellobiose-M9 minimal medium. The K. oxytoca clone, pLOI1906, exhibited an unusually broad substrate range (cellobiose, arbutin, salicin, and methylumbelliferyl derivatives of glucose, cellobiose, mannose, and xylose) and was sequenced. The insert in this plasmid encoded the carboxy-terminal region of a putative regulatory protein, cellobiose permease (single polypeptide), and phospho-β-glucosidase, which appear to form an operon (casRAB). Subclones allowed both casA and casB to be expressed independently, as evidenced by in vitro complementation. An analysis of the translated sequences from the EIIC domains of cellobiose, aryl-β-glucoside, and other disaccharide permeases allowed the identification of a 50-amino-acid conserved region. A disaccharide consensus sequence is proposed for the most conserved segment (13 amino acids), which may represent part of the EIIC active site for binding and phosphorylation. 63 refs., 4 figs., 4 tabs.
The leukocyte common antigen (CD45): a putative receptor-linked protein tyrosine phosphatase.
Charbonneau, H; Tonks, N K; Walsh, K A; Fischer, E H
1988-01-01
A major protein tyrosine phosphatase (PTPase 1B) has been isolated in essentially homogeneous form from the soluble and particulate fractions of human placenta. Unexpectedly, partial amino acid sequences displayed no homology with the primary structures of the protein Ser/Thr phosphatases deduced from cDNA clones. However, the sequence is strikingly similar to the tandem C-terminal homologous domains of the leukocyte common antigen (CD45). A 157-residue segment of PTPase 1B displayed 40% and 33% sequence identity with corresponding regions from cytoplasmic domains I and II of human CD45. Similar degrees of identity have been observed among the catalytic domains of families of regulatory proteins such as protein kinases and cyclic nucleotide phosphodiesterases. On this basis, it is proposed that the CD45 family has protein tyrosine phosphatase activity and may represent a set of cell-surface receptors involved in signal transduction. This suggests that the repertoire of signal transduction mechanisms may include the direct control of an intracellular protein tyrosine phosphatase, offering the possibility of a regulatory balance with those protein tyrosine kinases that act at the internal surface of the membrane. PMID:2845400
Targeting the Mevalonate Pathway to Reduce Mortality from Ovarian Cancer
2017-12-01
at cis-regulatory elements such as enhancers to facilitate gene transcription. CRISPR/Cas9-mediated ablation of a putative Meis1 enhancer carrying...Tables S4 and S5. 10 Cancer Cell 30, 1–16, July 11, 2016 the CRISPR/Cas9-based genomic editing technology. Cas9 and a pair of single guide RNAs (sgRNA... CRISPR/Cas9-mediated deletio sgMeis1, a pair of sgRNAs that target the DMR boundaries. (N) Sequencing of the genomic PCR products from F2/R2 primers shows
Conserved Non-Coding Regulatory Signatures in Arabidopsis Co-Expressed Gene Modules
Spangler, Jacob B.; Ficklin, Stephen P.; Luo, Feng; Freeling, Michael; Feltus, F. Alex
2012-01-01
Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome. PMID:23024789
Rosinski-Chupin, Isabelle; Sauvage, Elisabeth; Sismeiro, Odile; Villain, Adrien; Da Cunha, Violette; Caliot, Marie-Elise; Dillies, Marie-Agnès; Trieu-Cuot, Patrick; Bouloc, Philippe; Lartigue, Marie-Frédérique; Glaser, Philippe
2015-05-30
Streptococcus agalactiae, or Group B Streptococcus, is a leading cause of neonatal infections and an increasing cause of infections in adults with underlying diseases. In an effort to reconstruct the transcriptional networks involved in S. agalactiae physiology and pathogenesis, we performed an extensive and robust characterization of its transcriptome through a combination of differential RNA-sequencing in eight different growth conditions or genetic backgrounds and strand-specific RNA-sequencing. Our study identified 1,210 transcription start sites (TSSs) and 655 transcript ends as well as 39 riboswitches and cis-regulatory regions, 39 cis-antisense non-coding RNAs and 47 small RNAs potentially acting in trans. Among these putative regulatory RNAs, ten were differentially expressed in response to an acid stress and two riboswitches sensed directly or indirectly the pH modification. Strikingly, 15% of the TSSs identified were associated with the incorporation of pseudo-templated nucleotides, showing that reiterative transcription is a pervasive process in S. agalactiae. In particular, 40% of the TSSs upstream of genes involved in nucleotide metabolism show reiterative transcription potentially regulating gene expression, as exemplified for pyrG and thyA encoding the CTP synthase and the thymidylate synthase respectively. This comprehensive map of the transcriptome at single-nucleotide resolution led to the discovery of new regulatory mechanisms in S. agalactiae. It also provides the basis for in-depth analyses of transcriptional networks in S. agalactiae and of the regulatory role of reiterative transcription following variations of intra-cellular nucleotide pools.
Discriminative prediction of mammalian enhancers from DNA sequence
Lee, Dongwon; Karchin, Rachel; Beer, Michael A.
2011-01-01
Accurately predicting regulatory sequences and enhancers in entire genomes is an important but difficult problem, especially in large vertebrate genomes. With the advent of ChIP-seq technology, experimental detection of genome-wide EP300/CREBBP bound regions provides a powerful platform to develop predictive tools for regulatory sequences and to study their sequence properties. Here, we develop a support vector machine (SVM) framework which can accurately identify EP300-bound enhancers using only genomic sequence and an unbiased set of general sequence features. Moreover, we find that the predictive sequence features identified by the SVM classifier reveal biologically relevant sequence elements enriched in the enhancers, but we also identify other features that are significantly depleted in enhancers. The predictive sequence features are evolutionarily conserved and spatially clustered, providing further support of their functional significance. Although our SVM is trained on experimental data, we also predict novel enhancers and show that these putative enhancers are significantly enriched in both ChIP-seq signal and DNase I hypersensitivity signal in the mouse brain and are located near relevant genes. Finally, we present results of comparisons between other EP300/CREBBP data sets using our SVM and uncover sequence elements enriched and/or depleted in the different classes of enhancers. Many of these sequence features play a role in specifying tissue-specific or developmental-stage-specific enhancer activity, but our results indicate that some features operate in a general or tissue-independent manner. In addition to providing a high confidence list of enhancer targets for subsequent experimental investigation, these results contribute to our understanding of the general sequence structure of vertebrate enhancers. PMID:21875935
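As an illustration of what "general sequence features" for such a classifier might look like, the sketch below turns a DNA sequence into a fixed-length vector of normalized k-mer frequencies. The abstract does not specify the feature set, so the choice of k-mer frequencies, the default `k=3`, and the function name are assumptions rather than the authors' implementation.

```python
from collections import Counter
from itertools import product

def kmer_vector(seq: str, k: int = 3):
    """Normalized k-mer frequency vector over the alphabet ACGT.

    The vector has 4**k entries in lexicographic k-mer order (AAA, AAC, ...).
    Windows containing other characters (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i : i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]
```

Vectors computed this way for bound and unbound regions could then be passed to any SVM implementation, for example `sklearn.svm.SVC` in scikit-learn, with the learned weights indicating which k-mers are enriched or depleted in the positive class.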
Erpen, L; Tavano, E C R; Harakava, R; Dutt, M; Grosser, J W; Piedade, S M S; Mendes, B M J; Mourão Filho, F A A
2018-05-23
Regulatory sequences from the citrus constitutive genes cyclophilin (CsCYP), glyceraldehyde-3-phosphate dehydrogenase C2 (CsGAPC2), and elongation factor 1-alpha (CsEF1) were isolated, fused to the uidA gene, and qualitatively and quantitatively evaluated in transgenic sweet orange plants. The 5' upstream region of a gene (the promoter) is the most important component for the initiation and regulation of gene transcription of both native genes and transgenes in plants. The isolation and characterization of gene regulatory sequences are essential to the development of intragenic or cisgenic genetic manipulation strategies, which imply the use of genetic material from the same species or from closely related species. We describe herein the isolation and evaluation of the promoter sequence from three constitutively expressed citrus genes: cyclophilin (CsCYP), glyceraldehyde-3-phosphate dehydrogenase C2 (CsGAPC2), and elongation factor 1-alpha (CsEF1). The functionality of the promoters was confirmed by a histochemical GUS assay in leaves, stems, and roots of stably transformed citrus plants expressing the promoter-uidA construct. Lower uidA mRNA levels were detected when the transgene was under the control of citrus promoters as compared to the expression under the control of the CaMV35S promoter. The association of the uidA gene with the citrus-derived promoters resulted in mRNA levels of up to 60-41.8% of the value obtained with the construct containing CaMV35S driving the uidA gene. Moreover, a lower inter-individual variability in transgene expression was observed amongst the different transgenic lines, where gene constructs containing citrus-derived promoters were used. In silico analysis of the citrus-derived promoter sequences revealed that their activity may be controlled by several putative cis-regulatory elements. These citrus promoters will expand the availability of regulatory sequences for driving gene expression in citrus gene-modification programs.
Davies, Kalina T J; Tsagkogeorga, Georgia; Rossiter, Stephen J
2014-12-19
The majority of DNA contained within vertebrate genomes is non-coding, with a certain proportion of this thought to play regulatory roles during development. Conserved Non-coding Elements (CNEs) are an abundant group of putative regulatory sequences that are highly conserved across divergent groups and thus assumed to be under strong selective constraint. Many CNEs may contain regulatory factor binding sites, and their frequent spatial association with key developmental genes - such as those regulating sensory system development - suggests crucial roles in regulating gene expression and cellular patterning. Yet surprisingly little is known about the molecular evolution of CNEs across diverse mammalian taxa or their role in specific phenotypic adaptations. We examined 3,110 vertebrate-specific and ~82,000 mammalian-specific CNEs across 19 and 9 mammalian orders respectively, and tested for changes in the rate of evolution of CNEs located in the proximity of genes underlying the development or functioning of auditory systems. As we focused on CNEs putatively associated with genes underlying the development/functioning of auditory systems, we incorporated echolocating taxa in our dataset because of their highly specialised and derived auditory systems. Phylogenetic reconstructions of concatenated CNEs broadly recovered accepted mammal relationships despite high levels of sequence conservation. We found that CNE substitution rates were highest in rodents and lowest in primates, consistent with previous findings. Comparisons of CNE substitution rates from several genomic regions containing genes linked to auditory system development and hearing revealed differences between echolocating and non-echolocating taxa. Wider taxonomic sampling of four CNEs associated with the homeobox genes Hmx2 and Hmx3 - which are required for inner ear development - revealed family-wise variation across diverse bat species. 
Specifically, within one family of echolocating bats that utilise frequency-modulated echolocation calls varying widely in frequency and intensity, high levels of sequence divergence were found. Levels of selective constraint acting on CNEs differed both across genomic locations and taxa, with observed variation in substitution rates of CNEs among bat species. More work is needed to determine whether this variation can be linked to echolocation, and wider taxonomic sampling is necessary to fully document levels of conservation in CNEs across diverse taxa.
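As a rough illustration of what comparing CNE divergence between taxa involves, the sketch below computes an uncorrected p-distance (the proportion of differing sites) between two aligned sequences. This is a deliberately simplified stand-in, not the method used in the study above, which would normally rely on phylogenetic substitution models; the function name and the toy sequences are hypothetical.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped. This is an uncorrected distance, used here only as a crude
    proxy for substitution rate.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue  # ignore gapped or ambiguous columns
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


# Toy aligned fragments (made up for illustration, not real CNE data).
reference = "ACGTACGT"
taxon_x = "ACGTACGA"
print(p_distance(reference, taxon_x))
```

Comparing such distances for the same element across lineages gives only a first impression of relative divergence; it does not correct for multiple substitutions or for differences in branch length.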
Burzynski, Grzegorz M.; Reed, Xylena; Taher, Leila; Stine, Zachary E.; Matsui, Takeshi; Ovcharenko, Ivan; McCallion, Andrew S.
2012-01-01
Illuminating the primary sequence encryption of enhancers is central to understanding the regulatory architecture of genomes. We have developed a machine learning approach to decipher motif patterns of hindbrain enhancers and identify 40,000 sequences in the human genome that we predict display regulatory control that includes the hindbrain. Consistent with their roles in hindbrain patterning, MEIS1, NKX6-1, as well as HOX and POU family binding motifs contributed strongly to this enhancer model. Predicted hindbrain enhancers are overrepresented at genes expressed in hindbrain and associated with nervous system development, and primarily reside in the areas of open chromatin. In addition, 77 (0.2%) of these predictions are identified as hindbrain enhancers on the VISTA Enhancer Browser, and 26,000 (60%) overlap enhancer marks (H3K4me1 or H3K27ac). To validate these putative hindbrain enhancers, we selected 55 elements distributed throughout our predictions and six low scoring controls for evaluation in a zebrafish transgenic assay. When assayed in mosaic transgenic embryos, 51/55 elements directed expression in the central nervous system. Furthermore, 30/34 (88%) predicted enhancers analyzed in stable zebrafish transgenic lines directed expression in the larval zebrafish hindbrain. Subsequent analysis of sequence fragments selected based upon motif clustering further confirmed the critical role of the motifs contributing to the classifier. Our results demonstrate the existence of a primary sequence code characteristic to hindbrain enhancers. This code can be accurately extracted using machine-learning approaches and applied successfully for de novo identification of hindbrain enhancers. This study represents a critical step toward the dissection of regulatory control in specific neuronal subtypes. PMID:22759862
Reizer, J.; Hoischen, C.; Reizer, A.; Pham, T. N.; Saier, M. H.
1993-01-01
We have previously reported the overexpression, purification, and biochemical properties of the Bacillus subtilis Enzyme I of the phosphoenolpyruvate: sugar phosphotransferase system (PTS) (Reizer, J., et al., 1992, J. Biol. Chem. 267, 9158-9169). We now report the sequencing of the ptsI gene of B. subtilis encoding Enzyme I (570 amino acids and 63,076 Da). Putative transcriptional regulatory signals are identified, and the pts operon is shown to be subject to carbon source-dependent regulation. Multiple alignments of the B. subtilis Enzyme I with (1) six other sequenced Enzymes I of the PTS from various bacterial species, (2) phosphoenolpyruvate synthase of Escherichia coli, and (3) bacterial and plant pyruvate: phosphate dikinases (PPDKs) revealed regions of sequence similarity as well as divergence. Statistical analyses revealed that these three types of proteins comprise a homologous family, and the phylogenetic tree of the 11 sequenced protein members of this family was constructed. This tree was compared with that of the 12 sequence HPr proteins or protein domains. Antibodies raised against the B. subtilis and E. coli Enzymes I exhibited immunological cross-reactivity with each other as well as with PPDK of Bacteroides symbiosus, providing support for the evolutionary relationships of these proteins suggested from the sequence comparisons. Putative flexible linkers tethering the N-terminal and the C-terminal domains of protein members of the Enzyme I family were identified, and their potential significance with regard to Enzyme I function is discussed. The codon choice pattern of the B. subtilis and E. coli ptsI and ptsH genes was found to exhibit a bias toward optimal codons in these organisms.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:7686067
Vischi Winck, Flavia; Arvidsson, Samuel; Riaño-Pachón, Diego Mauricio; Hempel, Sabrina; Koseska, Aneta; Nikoloski, Zoran; Urbina Gomez, David Alejandro; Rupprecht, Jens; Mueller-Roeber, Bernd
2013-01-01
The unicellular green alga Chlamydomonas reinhardtii is a long-established model organism for studies on photosynthesis and carbon metabolism-related physiology. Under conditions of air-level carbon dioxide concentration [CO2], a carbon concentrating mechanism (CCM) is induced to facilitate cellular carbon uptake. CCM increases the availability of carbon dioxide at the site of cellular carbon fixation. To improve our understanding of the transcriptional control of the CCM, we employed FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements, followed by deep sequencing) to determine nucleosome-depleted chromatin regions of algal cells subjected to carbon deprivation. Our FAIRE data recapitulated the positions of known regulatory elements in the promoter of the periplasmic carbonic anhydrase (Cah1) gene, which is upregulated during CCM induction, and revealed new candidate regulatory elements at a genome-wide scale. In addition, time series expression patterns of 130 transcription factor (TF) and transcription regulator (TR) genes were obtained for cells cultured under photoautotrophic conditions and subjected to a shift from high to low [CO2]. Groups of co-expressed genes were identified and a putative directed gene-regulatory network underlying the CCM was reconstructed from the gene expression data using the recently developed IOTA (inner composition alignment) method. Among the candidate regulatory genes, two members of the MYB-related TF family, Lcr1 (Low-CO2 response regulator 1) and Lcr2 (Low-CO2 response regulator 2), may play an important role in down-regulating the expression of a particular set of TF and TR genes in response to low [CO2]. The results obtained provide new insights into the transcriptional control of the CCM and revealed more than 60 new candidate regulatory genes. Deep sequencing of nucleosome-depleted genomic regions indicated the presence of new, previously unknown regulatory elements in the C. reinhardtii genome. 
Our work can serve as a basis for future functional studies of transcriptional regulator genes and genomic regulatory elements in Chlamydomonas. PMID:24224019
Satheesh, Viswanathan; Jagannadham, P Tej Kumar; Chidambaranathan, Parameswaran; Jain, P K; Srinivasan, R
2014-12-01
The NAC (NAM, ATAF and CUC) proteins are plant-specific transcription factors implicated in development and stress responses. In the present study 88 pigeonpea NAC genes were identified from the recently published draft genome of pigeonpea by using homology based and de novo prediction programmes. These sequences were further subjected to phylogenetic, motif and promoter analyses. In motif analysis, highly conserved motifs were identified in the NAC domain and also in the C-terminal region of the NAC proteins. A phylogenetic reconstruction using pigeonpea, Arabidopsis and soybean NAC genes revealed 33 putative stress-responsive pigeonpea NAC genes. Several stress-responsive cis-elements were identified through in silico analysis of the promoters of these putative stress-responsive genes. This analysis is the first report of NAC gene family in pigeonpea and will be useful for the identification and selection of candidate genes associated with stress tolerance.
Genome-wide identification of Hami melon miRNAs with putative roles during fruit development
Wang, Guangzhi; Ma, Xinli; Li, Meihua; Wu, Haibo; Fu, Qiushi; Zhang, Yi; Yi, Hongping
2017-01-01
MicroRNAs represent a family of small endogenous, non-coding RNAs that play critical regulatory roles in plant growth, development, and environmental stress responses. Hami melon is famous for its attractive flavor and excellent nutritional value; however, the mechanisms underlying fruit development and ripening remain largely unknown. Here, we performed small RNA sequencing to investigate the roles of miRNAs during Hami melon fruit development. Two batches of flesh samples were collected at four fruit development stages. Small RNA sequencing yielded a total of 54,553,424 raw reads from eight libraries. 113 conserved miRNAs belonging to 30 miRNA families and nine novel miRNAs comprising nine miRNA families were identified. The expression of 42 conserved miRNAs and three Hami melon-specific miRNAs significantly changed during fruit development. Furthermore, 484 and 124 melon genes were predicted as putative targets of 29 conserved and nine Hami melon-specific miRNA families, respectively. GO enrichment analysis was performed on the target genes; “transcription, DNA-dependent”, “rRNA processing”, “oxidation reduction”, “signal transduction”, “regulation of transcription, DNA-dependent”, and “metabolic process” were the over-represented biological process terms. Cleavage sites of six target genes were validated using 5’ RACE. Our results present a comprehensive identification and characterization of Hami melon fruit miRNAs and their potential targets, which provide a valuable basis for understanding the regulatory mechanisms in the programmed process of normal Hami melon fruit development and ripening. Specific miRNAs could be selected for further research and applications in breeding practices. PMID:28742088
Suetomi, Yuta; Matsuda, Fuko; Uenoyama, Yoshihisa; Maeda, Kei-ichiro; Tsukamura, Hiroko; Ohkura, Satoshi
2013-10-01
Neurokinin B (NKB), encoded by TAC3, is thought to be an important accelerator of pulsatile gonadotropin-releasing hormone release. This study aimed to clarify the transcriptional regulatory mechanism of goat TAC3. First, we determined the full-length mRNA sequence of goat TAC3 from the hypothalamus to be 820 b, including a 381 b coding region, with the putative transcription start site located 143-b upstream of the start codon. The deduced amino acid sequence of NKB, which is produced from preproNKB, was completely conserved among goat, cattle, and human. Next, we cloned 5'-upstream region of goat TAC3 up to 3400 b from the translation initiation site, and this region was highly homologous with cattle TAC3 (89%). We used this goat TAC3 5'-upstream region to perform luciferase assays. We created a luciferase reporter vector containing DNA constructs from -2706, -1837, -834, -335, or -197 to +166 bp (the putative transcription start site was designated as +1) of goat TAC3 and these were transiently transfected into mouse hypothalamus-derived N7 cells and human neuroblastoma-derived SK-N-AS cells. The luciferase activity gradually increased with the deletion of the 5'-upstream region, suggesting that the transcriptional suppressive region is located between -2706 and -336 bp and that the core promoter exists downstream of -197 bp. Estradiol treatment did not lead to significant suppression of luciferase activity of any constructs, suggesting the existence of other factor(s) that regulate goat TAC3 transcription.
Hafemeister, Christoph; Nicotra, Adrienne B.; Jagadish, S.V. Krishna; Bonneau, Richard; Purugganan, Michael
2016-01-01
Environmental gene regulatory influence networks (EGRINs) coordinate the timing and rate of gene expression in response to environmental signals. EGRINs encompass many layers of regulation, which culminate in changes in accumulated transcript levels. Here, we inferred EGRINs for the response of five tropical Asian rice (Oryza sativa) cultivars to high temperatures, water deficit, and agricultural field conditions by systematically integrating time-series transcriptome data, patterns of nucleosome-free chromatin, and the occurrence of known cis-regulatory elements. First, we identified 5447 putative target genes for 445 transcription factors (TFs) by connecting TFs with genes harboring known cis-regulatory motifs in nucleosome-free regions proximal to their transcriptional start sites. We then used network component analysis to estimate the regulatory activity for each TF based on the expression of its putative target genes. Finally, we inferred an EGRIN using the estimated transcription factor activity (TFA) as the regulator. The EGRINs include regulatory interactions between 4052 target genes regulated by 113 TFs. We resolved distinct regulatory roles for members of the heat shock factor family, including a putative regulatory connection between abiotic stress and the circadian clock. TFA estimation using network component analysis is an effective way of incorporating multiple genome-scale measurements into network inference. PMID:27655842
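The first step described in this abstract, connecting TFs to genes that carry a known motif inside nucleosome-free chromatin near the transcriptional start site, can be sketched as a simple interval-overlap filter. The sketch below is an illustration under stated assumptions, not the authors' pipeline: the `Interval` type, the function names, the window size, and all coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def link_tfs_to_targets(motif_hits, open_regions, tss, window=1000):
    """Return the set of (tf, gene) pairs supported by a motif hit that lies
    in open chromatin and within `window` bases of the gene's TSS.

    motif_hits:   list of (tf_name, Interval) for known cis-regulatory motifs
    open_regions: list of Interval for nucleosome-free regions
    tss:          dict mapping gene name -> (chrom, position)
    window:       assumed definition of "proximal" (not from the abstract)
    """
    edges = set()
    for tf, hit in motif_hits:
        # Discard motif occurrences outside nucleosome-free chromatin.
        if not any(overlaps(hit, region) for region in open_regions):
            continue
        for gene, (chrom, pos) in tss.items():
            proximal = Interval(chrom, max(0, pos - window), pos + window)
            if overlaps(hit, proximal):
                edges.add((tf, gene))
    return edges


# Toy data, made up for illustration.
motif_hits = [
    ("HSF_A", Interval("chr1", 950, 960)),    # open and near gene1
    ("HSF_A", Interval("chr1", 5000, 5010)),  # open but far from any TSS
    ("TF_B", Interval("chr2", 100, 110)),     # near gene2 but not open
]
open_regions = [Interval("chr1", 900, 1000), Interval("chr1", 4990, 5020)]
tss = {"gene1": ("chr1", 1200), "gene2": ("chr2", 150)}

print(link_tfs_to_targets(motif_hits, open_regions, tss, window=500))
```

The resulting TF-to-gene edges are only candidate targets; in the study, regulatory activity was then estimated from the expression of those targets by network component analysis, which this sketch does not attempt.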
Defining Transcriptional Regulatory Mechanisms for Primary let-7 miRNAs
Gaeta, Xavier; Le, Luat; Lin, Ying; Xie, Yuan; Lowry, William E.
2017-01-01
The let-7 family of miRNAs has been shown to control developmental timing in organisms from C. elegans to humans; their function in several essential cell processes throughout development is also well conserved. Numerous studies have defined several steps of post-transcriptional regulation of let-7 production, from pri-miRNA through pre-miRNA to the mature miRNA that targets endogenous mRNAs for degradation or translational inhibition. Less well defined are modes of transcriptional regulation of the pri-miRNAs for let-7. let-7 pri-miRNAs are expressed in polycistronic fashion, in long transcripts newly annotated based on chromatin-associated RNA-sequencing. Upon differentiation, we found that some let-7 pri-miRNAs are regulated at the transcriptional level, while others appear to be constitutively transcribed. Using the Epigenetic Roadmap database, we further annotated regulatory elements of each polycistron and identified putative promoters and enhancers. Probing these regulatory elements for transcription factor binding sites identified factors that regulate transcription of let-7 in both promoter and enhancer regions, and identified novel regulatory mechanisms for this important class of miRNAs. PMID:28052101
2010-01-01
Background Plants trigger and tailor defense responses after perception of the oral secretions (OS) of attacking specialist lepidopteran larvae. Fatty acid-amino acid conjugates (FACs) in the OS of the Manduca sexta larvae are necessary and sufficient to elicit the herbivory-specific responses in Nicotiana attenuata, an annual wild tobacco species. How FACs are perceived and activate signal transduction mechanisms is unknown. Results We used SuperSAGE combined with 454 sequencing to quantify the early transcriptional changes elicited by the FAC N-linolenoyl-glutamic acid (18:3-Glu) and virus induced gene silencing (VIGS) to examine the function of candidate genes in the M. sexta-N. attenuata interaction. The analysis targeted mRNAs encoding regulatory components: rare transcripts with very rapid FAC-elicited kinetics (increases within 60 and declines within 120 min). From 12,744 unique Tag sequences identified (UniTags), 430 and 117 were significantly up- and down-regulated ≥ 2.5-fold, respectively, after 18:3-Glu elicitation compared to wounding. Based on gene ontology classification, more than 25% of the annotated UniTags corresponded to putative regulatory components, including 30 transcriptional regulators and 22 protein kinases. Quantitative PCR analysis was used to analyze the FAC-dependent regulation of a subset of 27 of these UniTags and for most of them a rapid and transient induction was confirmed. Six FAC-regulated genes were functionally characterized by VIGS and two, a putative lipid phosphate phosphatase (LPP) and a protein of unknown function, were identified as important mediators of the M. sexta-N. attenuata interaction. Conclusions The analysis of the early changes in the transcriptome of N. attenuata after FAC elicitation using SuperSAGE/454 has identified regulatory genes involved in insect-specific mediated responses in plants. 
Moreover, it has provided a foundation for the identification of additional novel regulators associated with this process. PMID:20398280
Gilardoni, Paola A; Schuck, Stefan; Jüngling, Ruth; Rotter, Björn; Baldwin, Ian T; Bonaventure, Gustavo
2010-04-14
Plants trigger and tailor defense responses after perception of the oral secretions (OS) of attacking specialist lepidopteran larvae. Fatty acid-amino acid conjugates (FACs) in the OS of the Manduca sexta larvae are necessary and sufficient to elicit the herbivory-specific responses in Nicotiana attenuata, an annual wild tobacco species. How FACs are perceived and activate signal transduction mechanisms is unknown. We used SuperSAGE combined with 454 sequencing to quantify the early transcriptional changes elicited by the FAC N-linolenoyl-glutamic acid (18:3-Glu) and virus-induced gene silencing (VIGS) to examine the function of candidate genes in the M. sexta-N. attenuata interaction. The analysis targeted mRNAs encoding regulatory components: rare transcripts with very rapid FAC-elicited kinetics (increases within 60 and declines within 120 min). From 12,744 unique Tag sequences identified (UniTags), 430 and 117 were significantly up- and down-regulated ≥ 2.5-fold, respectively, after 18:3-Glu elicitation compared to wounding. Based on gene ontology classification, more than 25% of the annotated UniTags corresponded to putative regulatory components, including 30 transcriptional regulators and 22 protein kinases. Quantitative PCR analysis was used to analyze the FAC-dependent regulation of a subset of 27 of these UniTags and for most of them a rapid and transient induction was confirmed. Six FAC-regulated genes were functionally characterized by VIGS and two, a putative lipid phosphate phosphatase (LPP) and a protein of unknown function, were identified as important mediators of the M. sexta-N. attenuata interaction. The analysis of the early changes in the transcriptome of N. attenuata after FAC elicitation using SuperSAGE/454 has identified regulatory genes involved in insect-specific mediated responses in plants. Moreover, it has provided a foundation for the identification of additional novel regulators associated with this process.
Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan
2017-01-01
Comparative genomics has facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from model to non-model organisms and insights can be gained into the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon, as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of introns. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins etc.) for the first time in the model plant Brachypodium. 
Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species.
Diallinas, G; Gorfinkiel, L; Arst, H N; Cecchetto, G; Scazzocchio, C
1995-04-14
In Aspergillus nidulans, loss-of-function mutations in the uapA and azgA genes, encoding the major uric acid-xanthine and hypoxanthine-adenine-guanine permeases, respectively, result in impaired utilization of these purines as sole nitrogen sources. The residual growth of the mutant strains is due to the activity of a broad specificity purine permease. We have identified uapC, the gene coding for this third permease through the isolation of both gain-of-function and loss-of-function mutations. Uptake studies with wild-type and mutant strains confirmed the genetic analysis and showed that the UapC protein contributes 30% and 8-10% to uric acid and hypoxanthine transport rates, respectively. The uapC gene was cloned, its expression studied, its sequence and transcript map established, and the sequence of its putative product analyzed. uapC message accumulation is: (i) weakly induced by 2-thiouric acid; (ii) repressed by ammonium; (iii) dependent on functional uaY and areA regulatory gene products (mediating uric acid induction and nitrogen metabolite repression, respectively); (iv) increased by uapC gain-of-function mutations which specifically, but partially, suppress a leucine to valine mutation in the zinc finger of the protein coded by the areA gene. The putative uapC gene product is a highly hydrophobic protein of 580 amino acids (M(r) = 61,251) including 12-14 putative transmembrane segments. The UapC protein is highly similar (58% identity) to the UapA permease and significantly similar (23-34% identity) to a number of bacterial transporters. Comparisons of the sequences and hydropathy profiles of members of this novel family of transporters yield insights into their structure, functionally important residues, and possible evolutionary relationships.
Non-B-DNA structures on the interferon-beta promoter?
Robbe, K; Bonnefoy, E
1998-01-01
The high mobility group (HMG) I protein intervenes as an essential factor during the virus-induced expression of the interferon-beta (IFN-beta) gene. It is a non-histone chromatin-associated protein that has the dual capacity of binding to a non-B-DNA structure such as cruciform-DNA as well as to AT-rich B-DNA sequences. In this work we compare the binding affinity of HMGI for a synthetic cruciform-DNA to its binding affinity for the HMGI-binding-site present in the positive regulatory domain II (PRDII) of the IFN-beta promoter. Using gel retardation experiments, we show that HMGI protein binds with at least ten times more affinity to the synthetic cruciform-DNA structure than to the PRDII B-DNA sequence. DNA hairpin sequences are present in both the human and the murine PRDII-DNAs. We discuss in this work the presence of as yet putative non-B-DNA structures in the IFN-beta promoter.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kwon, Deug-Nam; Park, Mi-Ryung; Park, Jong-Yi
Highlights: The sequences of -604 to -84 bp of the pUPII promoter contained the region of a putative negative cis-regulatory element. The core promoter was located in the 5F-1 region. Transcription factor HNF4 can directly bind in the pUPII core promoter region, which plays a critical role in controlling promoter activity. These features of the pUPII promoter are fundamental to development of a target-specific vector. Abstract: Uroplakin II (UPII) is one of the integral membrane proteins synthesized as a major differentiation product of mammalian urothelium. UPII gene expression is bladder specific and differentiation dependent, but little is known about its transcription response elements and molecular mechanism. To identify the cis-regulatory elements in the pig UPII (pUPII) gene promoter region, we constructed pUPII 5' upstream region deletion mutants and demonstrated that each of the deletion mutants participates in controlling the expression of the pUPII gene in human bladder carcinoma RT4 cells. We also identified a new core promoter region and putative negative cis-regulatory element within a minimal promoter region. In addition, we showed that hepatocyte nuclear factor 4 (HNF4) can directly bind in the pUPII core promoter (5F-1) region, which plays a critical role in controlling promoter activity. Transient cotransfection experiments showed that HNF4 positively regulates pUPII gene promoter activity. Thus, the binding element and its binding protein, HNF4 transcription factor, may be involved in the mechanism that specifically regulates pUPII gene transcription.
Zhang, Weixiong; Ruan, Jianhua; Ho, Tuan-Hua David; You, Youngsook; Yu, Taotao; Quatrano, Ralph S
2005-07-15
A fundamental problem of computational genomics is identifying the genes that respond to certain endogenous cues and environmental stimuli. This problem can be referred to as targeted gene finding. Since gene regulation is mainly determined by the binding of transcription factors to cis-regulatory DNA sequences, most existing gene annotation methods, which exploit the conservation of open reading frames, are not effective in finding target genes. A viable approach to targeted gene finding is to exploit the cis-regulatory elements that are known to be responsible for the transcription of target genes. Given such cis-elements, putative target genes whose promoters contain the elements can be identified. As a case study, we apply the above approach to predict the genes in the model plant Arabidopsis thaliana that are inducible by a phytohormone, abscisic acid (ABA), and by abiotic stress, such as drought, cold and salinity. We first construct and analyze two ABA-specific cis-elements, the ABA-responsive element (ABRE) and its coupling element (CE), in A. thaliana, based on their conservation in rice and other cereal plants. We then use the ABRE-CE module to identify putative ABA-responsive genes in A. thaliana. Based on RT-PCR verification and results from the literature, this method has an accuracy rate of 67.5% for the top 40 predictions. The cis-element based targeted gene finding approach is expected to be widely applicable since a large number of cis-elements in many species are available.
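The core idea of this abstract, flagging genes whose promoters contain a known cis-element module, can be sketched as a motif scan. The sketch below is a minimal illustration under loud assumptions: the two regular expressions are simplified placeholder patterns rather than the ABRE and CE models constructed in the study, the spacing constraint `max_gap` is invented, and the promoter sequences are toy strings.

```python
import re

# Placeholder patterns for illustration only; the abstract does not give
# the actual ABRE or CE motif definitions used by the authors.
ABRE = re.compile(r"ACGTG[GT]C")  # assumed ABRE-like core
CE = re.compile(r"CGCGTG")        # assumed coupling-element-like core


def has_abre_ce_module(promoter: str, max_gap: int = 50) -> bool:
    """True if an ABRE-like hit and a CE-like hit start within max_gap bases."""
    seq = promoter.upper()
    abre_starts = [m.start() for m in ABRE.finditer(seq)]
    ce_starts = [m.start() for m in CE.finditer(seq)]
    return any(abs(a - c) <= max_gap for a in abre_starts for c in ce_starts)


def find_putative_targets(promoters: dict[str, str], max_gap: int = 50) -> list[str]:
    """Return the sorted names of genes whose promoter carries the module."""
    return sorted(
        gene for gene, seq in promoters.items() if has_abre_ce_module(seq, max_gap)
    )


# Toy promoters (hypothetical gene names and sequences).
toy_promoters = {
    "geneA": "TTTACGTGGCAAAACGCGTGTTT",  # both elements present
    "geneB": "TTTACGTGGCAAAA",           # ABRE-like element only
    "geneC": "GGGGCGCGTGAAAA",           # CE-like element only
}
print(find_putative_targets(toy_promoters))
```

A real application would rank candidates rather than return a yes/no answer; the reported accuracy of 67.5% for the top 40 predictions corresponds to 27 of those 40 being supported.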
USDA-ARS's Scientific Manuscript database
The concept of utilizing putative and unique gene sequences for the design of species specific probes was tested. The abundance profile of assigned functions within the Lactobacillus plantarum genome was used for the identification of the putative and unique gene sequence, csh. The targeted gene (cs...
NASA Astrophysics Data System (ADS)
Omar, Aimi Farehah; Ismail, Ismanizan
2016-11-01
Sesquiterpene synthase (SS) catalyzes the formation of sesquiterpenes from farnesyl diphosphate (FDP) via carbocation intermediates. In this study, the promoter region of sesquiterpene synthase was isolated from Persicaria minor to identify possible cis-acting elements in the promoter. The full-length PmSS promoter of P. minor is 1824 bp long. The sequence was analyzed and several putative cis-acting regulatory elements were identified. Three cis-acting regulatory elements were selected for deletion analysis: a cis-acting element involved in wound responsiveness (WUN), a cis-acting element involved in defense and stress responsiveness (TC), and a cis-acting element involved in ABA responsiveness (ABRE). A series of deletions was made to assess promoter activity, producing three truncated promoter fragments: Prom 2 (1606 bp), Prom 3 (1144 bp), and Prom 4 (921 bp). The full-length promoter and its deletion series were cloned into the pBGWFS7 vector, which contains the β-glucuronidase (GUS) gene and green fluorescent protein (GFP) as reporter genes. All constructs were successfully transformed into Arabidopsis thaliana, as confirmed by PCR of BASTA-resistant plants.
Oliveira, Letícia de C.; Silveira, Aline M. M.; Monteiro, Andréa de S.; dos Santos, Vera L.; Nicoli, Jacques R.; Azevedo, Vasco A. de C.; Soares, Siomar de C.; Dias-Souza, Marcus V.; Nardi, Regina M. D.
2017-01-01
A bacteriocinogenic Lactobacillus rhamnosus L156.4 strain isolated from the feces of NIH mice was identified by 16S rRNA gene sequencing and MALDI-TOF mass spectrometry. The entire genome was sequenced using Illumina, annotated using the PGAAP and RAST servers, and deposited. Conserved genes associated with bacteriocin synthesis were predicted using BAGEL3, leading to the identification of an open reading frame (ORF) that shows homology with the L. rhamnosus GG (ATCC 53103) prebacteriocin gene. The encoded protein contains a conserved protein motif associated with a structural gene of the Enterocin A superfamily. We found ORFs related to the prebacteriocin, immunity protein, ABC transporter proteins, and regulatory genes with 100% identity to those of L. rhamnosus HN001. In this study, we provide evidence of a putative bacteriocin produced by L. rhamnosus L156.4 that was further confirmed by in vitro assays. The antibacterial activity of the substances produced by this strain was evaluated using the deferred agar-spot and spot-on-the-lawn assays, and a wide antimicrobial activity spectrum against human and foodborne pathogens was observed. The physicochemical characterization of the putative bacteriocin indicated that it was sensitive to proteolytic enzymes, heat stable, and maintained its antibacterial activity over a pH range of 3 to 9. The activity against Lactobacillus fermentum, which was used as an indicator strain, was detected during the bacterial logarithmic growth phase, and a positive correlation was confirmed between bacterial growth and production of the putative bacteriocin. After partial purification from cell-free supernatant by salt precipitation, the putative bacteriocin migrated as a diffuse band of approximately 1.0–3.0 kDa by SDS-PAGE. Additional studies are being conducted to explore its use in the food industry for controlling bacterial growth and for probiotic applications. PMID:28579977
Molecular evolution of the HoxA cluster in the three major gnathostome lineages
Chiu, Chi-hua; Amemiya, Chris; Dewar, Ken; Kim, Chang-Bae; Ruddle, Frank H.; Wagner, Günter P.
2002-01-01
The duplication of Hox clusters and their maintenance in a lineage has a prominent but little understood role in chordate evolution. Here we examined how Hox cluster duplication may influence changes in cluster architecture and patterns of noncoding sequence evolution. We sequenced the entire duplicated HoxAa and HoxAb clusters of zebrafish (Danio rerio) and extended the 5′ (posterior) part of the HoxM (HoxA-like) cluster of horn shark (Heterodontus francisci) containing the hoxa11 and hoxa13 orthologs as well as intergenic and flanking noncoding sequences. The duplicated HoxA clusters in zebrafish each house considerably fewer genes and are dramatically shorter than the single HoxA clusters of human and horn shark. We compared the intergenic sequences of the HoxA clusters of human, horn shark, zebrafish (Aa, Ab), and striped bass and found extensive conservation of noncoding sequence motifs, i.e., phylogenetic footprints, between the human and horn shark, representing two of the three gnathostome lineages. These are putative cis-regulatory elements that may play a role in the regulation of the ancestral HoxA cluster. In contrast, homologous regions of the duplicated HoxAa and HoxAb clusters of zebrafish and the HoxA cluster of striped bass revealed a striking loss of conservation of these putative cis-regulatory sequences in the 3′ (anterior) segment of the cluster, where zebrafish only retains single representatives of group 1, 3, 4, and 5 (HoxAa) and group 2 (HoxAb) genes and in the 5′ part of the clusters, where zebrafish retains two copies of the group 13, 11, and 9 genes, i.e., AbdB-like genes. In analyzing patterns of cis-sequence evolution in the 5′ part of the clusters, we explicitly looked for evidence of complementary loss of conserved noncoding sequences, as predicted by the duplication-degeneration-complementation model in which genetic redundancy after gene duplication is resolved because of the fixation of complementary degenerative mutations. 
Our data did not yield evidence supporting this prediction. We conclude that changes in the pattern of cis-sequence conservation after Hox cluster duplication are more consistent with being the outcome of adaptive modification rather than passive mechanisms that erode redundancy created by the duplication event. These results support the view that genome duplications may provide a mechanism whereby master control genes undergo radical modifications conducive to major alterations in body plan. Such genomic revolutions may contribute significantly to the evolutionary process. PMID:11943847
DOE Office of Scientific and Technical Information (OSTI.GOV)
Buchman, A.R.; Kimmerly, W.J.; Rine, J.
1988-01-01
Two DNA-binding factors from Saccharomyces cerevisiae have been characterized, GRFI (general regulatory factor I) and ABFI (ARS-binding factor I), that recognize specific sequences within diverse genetic elements. GRFI bound to sequences at the negative regulatory elements (silencers) of the silent mating type loci HML E and HMR E and to the upstream activating sequence (UAS) required for transcription of the MAT α genes. A putative conserved UAS located at genes involved in translation (RPG box) was also recognized by GRFI. In addition, GRFI bound with high affinity to sequences within the (C1-3A)-repeat region at yeast telomeres. Binding sites for GRFI with the highest affinity appeared to be of the form 5'-(A/G)(A/C)ACCCANNCA(T/C)(T/C)-3', where N is any nucleotide. ABFI-binding sites were located next to autonomously replicating sequences (ARSs) at controlling elements of the silent mating type loci HMR E, HMR I, and HML I and were associated with ARS1, ARS2, and the 2μm plasmid ARS. Two tandem ABFI-binding sites were found between the HIS3 and DED1 genes, several kilobase pairs from any ARS, indicating that ABFI-binding sites are not restricted to ARSs. The sequences recognized by ABFI showed partial dyad symmetry and appeared to be variations of the consensus 5'-TATCATTNNNNACGA-3'. GRFI and ABFI were both abundant DNA-binding factors and did not appear to be encoded by the SIR genes, whose products are required for repression of the silent mating type loci. Together, these results indicate that both GRFI and ABFI play multiple roles within the cell.
Molecular Evolution of the Non-Coding Eosinophil Granule Ontogeny Transcript
Rose, Dominic; Stadler, Peter F.
2011-01-01
Eukaryotic genomes are pervasively transcribed. A large fraction of the transcriptional output consists of long, mRNA-like, non-protein-coding transcripts (mlncRNAs). The evolutionary history of mlncRNAs is still largely uncharted territory. In this contribution, we explore in detail the evolutionary traces of the eosinophil granule ontogeny transcript (EGOT), an experimentally confirmed representative of an abundant class of totally intronic non-coding transcripts (TINs). EGOT is located antisense to an intron of the ITPR1 gene. We computationally identify putative EGOT orthologs in the genomes of 32 different amniotes, including orthologs from primates, rodents, ungulates, carnivores, afrotherians, and xenarthrans, as well as putative candidates from basal amniotes, such as opossum or platypus. We investigate the EGOT gene phylogeny and analyze patterns of sequence conservation and the evolutionary conservation of the EGOT gene structure. We show that EGO-B, the spliced isoform, may be present throughout the placental mammals, but most likely dates back even further. We demonstrate here for the first time that the whole EGOT locus is highly structured, containing several evolutionarily conserved and thermodynamically stable secondary structures. Our analyses allow us to postulate novel functional roles of a hitherto poorly understood region at the intron of EGO-B which is highly conserved at the sequence level. The region contains a novel ITPR1 exon and also conserved RNA secondary structures together with a conserved TATA-like element, which putatively acts as a promoter of an independent regulatory element. PMID:22303364
Dziewit, Lukasz; Oscik, Karolina; Bartosik, Dariusz
2014-01-01
ABSTRACT ΦLM21 is a temperate phage isolated from Sinorhizobium sp. strain LM21 (Alphaproteobacteria). Genomic analysis and electron microscopy suggested that ΦLM21 is a member of the family Siphoviridae. The phage has an isometric head and a long noncontractile tail. The genome of ΦLM21 has 50,827 bp of linear double-stranded DNA encoding 72 putative proteins, including proteins responsible for the assembly of the phage particles, DNA packaging, transcription, replication, and lysis. Virion proteins were characterized using mass spectrometry, leading to the identification of the major capsid and tail components, tape measure, and a putative portal protein. We have confirmed the activity of two gene products, a lytic enzyme (a putative chitinase) and a DNA methyltransferase, sharing sequence specificity with the cell cycle-regulating methyltransferase (CcrM) of the bacterial host. Interestingly, the genome of Sinorhizobium phage ΦLM21 shows very limited similarity to other known phage genome sequences and is thus considered unique. IMPORTANCE Prophages are known to play an important role in the genomic diversification of bacteria via horizontal gene transfer. The influence of prophages on pathogenic bacteria is very well documented. However, our knowledge of the overall impact of prophages on the survival of their lysogenic, nonpathogenic bacterial hosts is still limited. In particular, information on prophages of the agronomically important Sinorhizobium species is scarce. In this study, we describe the isolation and molecular characterization of a novel temperate bacteriophage, ΦLM21, of Sinorhizobium sp. LM21. Since we have not found any similar sequences, we propose that this bacteriophage is a novel species. We conducted a functional analysis of selected proteins. We have demonstrated that the phage DNA methyltransferase has the same sequence specificity as the cell cycle-regulating methyltransferase CcrM of its host. 
We point out that this phenomenon of mimicking the host regulatory mechanisms by viruses is quite common in bacteriophages. PMID:25187538
Hay, Elizabeth Anne; Khalaf, Abdulla Razak; Marini, Pietro; Brown, Andrew; Heath, Karyn; Sheppard, Darrin; MacKenzie, Alasdair
2017-08-01
We have successfully used comparative genomics to identify putative regulatory elements within the human genome that contribute to the tissue-specific expression of neuropeptides such as galanin and receptors such as CB1. However, a previous inability to rapidly delete these elements from the mouse genome has prevented optimal assessment of their function in vivo. This has been solved using CAS9/CRISPR genome editing technology, which uses a bacterial endonuclease called CAS9 that, in combination with specifically designed guide RNA (gRNA) molecules, cuts specific regions of the mouse genome. However, reports of "off-target" effects, whereby the CAS9 endonuclease is able to cut sites other than those targeted, limit the appeal of this technology. We used cytoplasmic microinjection of gRNA and CAS9 mRNA into 1-cell mouse embryos to rapidly generate enhancer knockout mouse lines. The current study describes our analysis of the genomes of these enhancer knockout lines to detect possible off-target effects. Bioinformatic analysis was used to identify the most likely putative off-target sites and to design PCR primers that would amplify these sequences from genomic DNA of founder enhancer deletion mouse lines. Amplified DNA was then sequenced and blasted against the mouse genome sequence to detect off-target effects. Using this approach we were unable to detect any evidence of off-target effects in the genomes of three founder lines using any of the four gRNAs used in the analysis. This study suggests that the problem of off-target effects in transgenic mice has been exaggerated and that CAS9/CRISPR represents a highly effective and accurate method of deleting putative neuropeptide gene enhancer sequences from the mouse genome. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
RF-Phos: A Novel General Phosphorylation Site Prediction Tool Based on Random Forest.
Ismail, Hamid D; Jones, Ahoi; Kim, Jung H; Newman, Robert H; Kc, Dukka B
2016-01-01
Protein phosphorylation is one of the most widespread regulatory mechanisms in eukaryotes. Over the past decade, phosphorylation site prediction has emerged as an important problem in the field of bioinformatics. Here, we report a new method, termed Random Forest-based Phosphosite predictor 2.0 (RF-Phos 2.0), to predict phosphorylation sites given only the primary amino acid sequence of a protein as input. RF-Phos 2.0, which uses random forest with sequence and structural features, is able to identify putative sites of phosphorylation across many protein families. In side-by-side comparisons based on 10-fold cross validation and an independent dataset, RF-Phos 2.0 compares favorably to other popular mammalian phosphosite prediction methods, such as PhosphoSVM, GPS2.1, and Musite.
Prebiotics: why definitions matter
Hutkins, Robert W; Krumbeck, Janina A; Bindels, Laure B; Cani, Patrice D; Fahey, George; Goh, Yong Jun; Hamaker, Bruce; Martens, Eric C; Mills, David A; Rastal, Robert A; Vaughan, Elaine; Sanders, Mary Ellen
2015-01-01
The prebiotic concept was introduced twenty years ago, and despite several revisions to the original definition, the scientific community has continued to debate what it means to be a prebiotic. How prebiotics are defined is important not only for the scientific community, but also for regulatory agencies, the food industry, consumers and healthcare professionals. Recent developments in community-wide sequencing and glycomics have revealed that more complex interactions occur between putative prebiotic substrates and the gut microbiota than previously considered. A consensus among scientists on the most appropriate definition of a prebiotic is necessary to enable continued use of the term. PMID:26431716
2017-10-01
at cis-regulatory elements such as enhancers to facilitate gene transcription. CRISPR/Cas9-mediated ablation of a putative Meis1 enhancer carrying...Tables S4 and S5. 10 Cancer Cell 30, 1–16, July 11, 2016 the CRISPR/Cas9-based genomic editing technology. Cas9 and a pair of single guide RNAs (sgRNA... CRISPR/Cas9-mediated deletio sgMeis1, a pair of sgRNAs that target the DMR boundaries. (N) Sequencing of the genomic PCR products from F2/R2 primers shows
Donaldson, Michael E; Rico, Yessica; Hueffer, Karsten; Rando, Halie M; Kukekova, Anna V; Kyle, Christopher J
2018-01-01
Pathogens are recognized as major drivers of local adaptation in wildlife systems. By determining which gene variants are favored in local interactions among populations with and without disease, spatially explicit adaptive responses to pathogens can be elucidated. Much of our current understanding of host responses to disease comes from a small number of genes associated with an immune response. High-throughput sequencing (HTS) technologies, such as genotype-by-sequencing (GBS), facilitate expanded explorations of genomic variation among populations. Hybridization-based GBS techniques can be leveraged in systems not well characterized for specific variants associated with disease outcome to "capture" specific genes and regulatory regions known to influence expression and disease outcome. We developed a multiplexed, sequence capture assay for red foxes to simultaneously assess ~300-kbp of genomic sequence from 116 adaptive, intrinsic, and innate immunity genes of predicted adaptive significance and their putative upstream regulatory regions along with 23 neutral microsatellite regions to control for demographic effects. The assay was applied to 45 fox DNA samples from Alaska, where three arctic rabies strains are geographically restricted and endemic to coastal tundra regions, yet absent from the boreal interior. The assay provided 61.5% on-target enrichment with relatively even sequence coverage across all targeted loci and samples (mean = 50×), which allowed us to elucidate genetic variation across introns, exons, and potential regulatory regions (4,819 SNPs). Challenges remained in accurately describing microsatellite variation using this technique; however, longer-read HTS technologies should overcome these issues. We used these data to conduct preliminary analyses and detected genetic structure in a subset of red fox immune-related genes between regions with and without endemic arctic rabies. 
This assay provides a template to assess immunogenetic variation in wildlife disease systems.
Evidence of function for conserved noncoding sequences in Arabidopsis thaliana.
Spangler, Jacob B; Subramaniam, Sabarinath; Freeling, Michael; Feltus, F Alex
2012-01-01
• Whole genome duplication events provide a lineage with a large reservoir of genes that can be molded by evolutionary forces into phenotypes that fit alternative environments. A well-studied whole genome duplication, the α-event, occurred in an ancestor of the model plant Arabidopsis thaliana. Retained segments of the α-event have been defined in recent years in the form of duplicate protein coding sequences (α-pairs) and associated conserved noncoding DNA sequences (CNSs). Our aim was to identify any association between CNSs and α-pair co-functionality at the gene expression level. • Here, we tested for correlation between CNS counts and α-pair co-expression and expression intensity across nine expression datasets: aerial tissue, flowers, leaves, roots, rosettes, seedlings, seeds, shoots and whole plants. • We provide evidence for a putative regulatory role of the CNSs. The association of CNSs with α-pair co-expression and expression intensity varied by gene function, subgene position and the presence of transcription factor binding motifs. A range of possible CNS regulatory mechanisms, including intron-mediated enhancement, messenger RNA fold stability and transcriptional regulation, are discussed. • This study provides a framework to understand how CNS motifs are involved in the maintenance of gene expression after a whole genome duplication event. © 2011 The Authors. New Phytologist © 2011 New Phytologist Trust.
Stapf, Christopher; Cartwright, Edward; Bycroft, Mark; Hofmann, Kay; Buchberger, Alexander
2011-01-01
Cellular functions of the essential, ubiquitin-selective AAA ATPase p97/valosin-containing protein (VCP) are controlled by regulatory cofactors determining substrate specificity and fate. Most cofactors bind p97 through a ubiquitin regulatory X (UBX) or UBX-like domain or linear sequence motifs, including the hitherto ill-defined p97/VCP-interacting motif (VIM). Here, we present the new, minimal consensus sequence RX5AAX2R as a general definition of the VIM that unites a novel family of known and putative p97 cofactors, among them UBXD1 and ZNF744/ANKZF1. We demonstrate that this minimal VIM consensus sequence is necessary and sufficient for p97 binding. Using NMR chemical shift mapping, we identified several residues of the p97 N-terminal domain (N domain) that are critical for VIM binding. Importantly, we show that cellular stress resistance conferred by the yeast VIM-containing cofactor Vms1 depends on the physical interaction between its VIM and the critical N domain residues of the yeast p97 homolog, Cdc48. Thus, the VIM-N domain interaction characterized in this study is required for the physiological function of Vms1 and most likely other members of the newly defined VIM family of cofactors. PMID:21896481
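As an illustration only, the minimal VIM consensus quoted above (R, any five residues, AA, any two residues, R) can be written as a regular expression and scanned against a protein sequence. The sketch below is not the authors' method; the function name `find_vim_motifs` and the toy sequence are hypothetical, and a sequence match alone says nothing about actual p97 binding.

```python
import re

# Minimal VIM consensus from the abstract: R-X5-A-A-X2-R (11 residues).
# A lookahead is used so that overlapping candidates are all reported.
VIM_PATTERN = re.compile(r"(?=(R[A-Z]{5}AA[A-Z]{2}R))")

def find_vim_motifs(protein_seq):
    """Return (0-based start, matched segment) for every consensus match."""
    seq = protein_seq.upper()
    return [(m.start(), m.group(1)) for m in VIM_PATTERN.finditer(seq)]

# Hypothetical toy sequence, not a real protein.
toy_sequence = "MKRLSDEQAAGTRPLL"
hits = find_vim_motifs(toy_sequence)
print(hits)
```

Any hit found this way would only be a candidate; the abstract establishes binding experimentally.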
Fernández, Cecilia S; Bruque, Carlos D; Taboas, Melisa; Buzzalino, Noemí D; Espeche, Lucia D; Pasqualini, Titania; Charreau, Eduardo H; Alba, Liliana G; Ghiringhelli, Pablo D; Dain, Liliana
2015-09-01
The aim of the current study was to search for the presence of genetic variants in the CYP21A2 Z promoter regulatory region in patients with congenital adrenal hyperplasia due to 21-hydroxylase deficiency. Screening of the 10 most frequent pseudogene-derived mutations was followed by direct sequencing of the entire coding sequence, the proximal promoter, and a distal regulatory region in DNA samples from patients with at least one non-determined allele. We report three non-classical patients that presented a novel genetic variant, g.15626A>G, within the Z promoter regulatory region. In all the patients, the novel variant was found in cis with the mild, less frequent, p.P482S mutation located in exon 10 of the CYP21A2 gene. The putative pathogenic implication of the novel variant was assessed by in silico analyses and in vitro assays. Topological analyses showed differences in the curvature and bendability of the DNA region bearing the novel variant. By performing functional studies, a significantly decreased activity of a reporter gene placed downstream from the regulatory region was found by the G transition. Our results may suggest that the activity of an allele bearing the p.P482S mutation may be influenced by the misregulated CYP21A2 transcriptional activity exerted by the Z promoter A>G variation.
Analysis of xylem formation in pine by cDNA sequencing
NASA Technical Reports Server (NTRS)
Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.;
1998-01-01
Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.
CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining
Navarro, Carmen; Lopez, Francisco J.; Cano, Carlos; Garcia-Alcalde, Fernando; Blanco, Armando
2014-01-01
Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools can detect significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) their scope is limited to promoter or conserved regions of the genome; 2) they cannot identify combinations involving more than two motifs; 3) they require prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner performs a blind search of CRMs without any prior information about target CRMs and without limitation in the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent-Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accessible at: http://genome2.ugr.es/cisminer.
CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by the user. PMID:25268582
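To make the "motif combinations as itemsets" idea concrete, here is a deliberately simplified sketch. It is not CisMiner and does not implement the Top-Down Fuzzy Frequent-Pattern Tree algorithm: it uses crisp (non-fuzzy) membership and brute-force enumeration, which would not scale genome-wide. The function name, the motif labels M1 to M4, and the support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_motif_sets(regions, min_support, max_size=3):
    """Count motif combinations (itemsets) that co-occur in at least
    `min_support` regions. Each region is a set of motif labels."""
    counts = Counter()
    for motifs in regions:
        items = sorted(set(motifs))
        # Enumerate every combination of 2..max_size motifs in this region.
        for size in range(2, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}

# Hypothetical regions, each listing the motifs found close together.
regions = [
    {"M1", "M2", "M3"},
    {"M1", "M2"},
    {"M1", "M2", "M3"},
    {"M4"},
]
candidate_modules = frequent_motif_sets(regions, min_support=3)
print(candidate_modules)
```

A fuzzy variant would replace the 0/1 membership of a motif in a region with a degree of membership, which is the part of the published approach this sketch leaves out.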
Perdomo-Sabogal, Alvaro; Nowick, Katja; Piccini, Ilaria; Sudbrak, Ralf; Lehrach, Hans; Yaspo, Marie-Laure; Warnatz, Hans-Jörg; Querfurth, Robert
2016-01-01
A substantial fraction of phenotypic differences between closely related species are likely caused by differences in gene regulation. While this was postulated over 30 years ago, only a few examples of evolutionary changes in gene regulation have been verified. Here, we identified and investigated binding sites of the transcription factor GA-binding protein alpha (GABPa) aiming to discover cis-regulatory adaptations on the human lineage. By performing chromatin immunoprecipitation-sequencing experiments in a human cell line, we found 11,619 putative GABPa binding sites. Through sequence comparisons of the human GABPa binding regions with orthologous sequences from 34 mammals, we identified substitutions that have resulted in 224 putative human-specific GABPa binding sites. To experimentally assess the transcriptional impact of those substitutions, we selected four promoters for promoter-reporter gene assays using human and African green monkey cells. We compared the activities of wild-type promoters to mutated forms, where we have introduced one or more substitutions to mimic the ancestral state devoid of the GABPa consensus binding sequence. Similarly, we introduced the human-specific substitutions into chimpanzee and macaque promoter backgrounds. Our results demonstrate that the identified substitutions are functional, both in human and nonhuman promoters. In addition, we performed GABPa knock-down experiments and found 1,215 genes as strong candidates for primary targets. Further analyses of our data sets link GABPa to cognitive disorders, diabetes, KRAB zinc finger (KRAB-ZNF), and human-specific genes. Thus, we propose that differences in GABPa binding sites played important roles in the evolution of human-specific phenotypes. PMID:26814189
Repeated divergent selection on pigmentation genes in a rapid finch radiation
Campagna, Leonardo; Repenning, Márcio; Silveira, Luís Fábio; Fontana, Carla Suertegaray; Tubaro, Pablo L.; Lovette, Irby J.
2017-01-01
Instances of recent and rapid speciation are suitable for associating phenotypes with their causal genotypes, especially if gene flow homogenizes areas of the genome that are not under divergent selection. We study a rapid radiation of nine sympatric bird species known as capuchino seedeaters, which are differentiated in sexually selected characters of male plumage and song. We sequenced the genomes of a phenotypically diverse set of species to search for differentiated genomic regions. Capuchinos show differences in a small proportion of their genomes, yet selection has acted independently on the same targets in different members of this radiation. Many divergent regions contain genes involved in the melanogenesis pathway, with the strongest signal originating from putative regulatory regions. Selection has acted on these same genomic regions in different lineages, likely shaping the evolution of cis-regulatory elements, which control how more conserved genes are expressed and thereby generate diversity in classically sexually selected traits. PMID:28560331
Decoding sORF translation - from small proteins to gene regulation.
Cabrera-Quio, Luis Enrique; Herberg, Sarah; Pauli, Andrea
2016-11-01
Translation is best known as the fundamental mechanism by which the ribosome converts a sequence of nucleotides into a string of amino acids. Extensive research over many years has elucidated the key principles of translation, and the majority of translated regions were thought to be known. The recent discovery of widespread translation outside of annotated protein-coding open reading frames (ORFs) therefore came as a surprise, raising the intriguing possibility that these newly discovered translated regions might have unrecognized protein-coding or gene-regulatory functions. Here, we highlight recent findings that provide evidence that some of these newly discovered translated short ORFs (sORFs) encode functional, previously missed small proteins, while others have regulatory roles. Based on known examples we will also speculate about putative additional roles and the potentially much wider impact that these translated regions might have on cellular homeostasis and gene regulation.
He, Hongjuan; Xiu, Youcheng; Guo, Jing; Liu, Hui; Liu, Qi; Zeng, Tiebo; Chen, Yan; Zhang, Yan; Wu, Qiong
2013-01-01
Long non-coding RNAs (lncRNAs), a key group of non-coding RNAs, have gained wide attention. Though lncRNAs have been functionally annotated and systematically explored in higher mammals, few have undergone systematic identification and annotation. Owing to their expression specificity, known lncRNAs expressed in embryonic brain tissues remain limited. Considering that a large number of lncRNAs are only transcribed in brain tissues, studies of lncRNAs in the developing brain are therefore of special interest. Here, publicly available RNA-sequencing (RNA-seq) data in embryonic brain are integrated to identify thousands of embryonic brain lncRNAs by a customized pipeline. A significant proportion of novel transcripts have not been annotated by available genomic resources. The putative embryonic brain lncRNAs are shorter in length, less spliced and show less conservation than known genes. The expression of putative lncRNAs is on average one tenth that of known coding genes, while comparable with that of known lncRNAs. From chromatin data, putative embryonic brain lncRNAs are associated with active chromatin marks, comparable with known lncRNAs. Embryonic brain-expressed lncRNAs are also indicated to be expressed, though not evidently, in adult brain. Gene Ontology analysis of putative embryonic brain lncRNAs suggests that they are associated with brain development. The putative lncRNAs are shown to be related to possible cis-regulatory roles in imprinting, and some are even themselves deemed to be imprinted lncRNAs. Re-analysis of one knockdown dataset suggests that four regulators are associated with lncRNAs. Taken together, the identification and systematic analysis of putative lncRNAs would provide novel insights into uncharacterized mouse non-coding regions and their relationships with mammalian embryonic brain development. PMID:23967161
Characterization of carotenoid hydroxylase gene promoter in Haematococcus pluvialis.
Meng, C X; Wei, W; Su, Z- L; Qin, S
2006-10-01
Astaxanthin, a high-value ketocarotenoid, is mainly used in fish aquaculture. It also has potential in human health due to its higher antioxidant capacity than beta-carotene and vitamin E. The unicellular green alga Haematococcus pluvialis is known to accumulate astaxanthin in response to environmental stresses, such as high light intensity and salt stress. Carotenoid hydroxylase plays a key role in astaxanthin biosynthesis in H. pluvialis. In this paper, we report the characterization of a promoter-like region (-378 to -22 bp) of the carotenoid hydroxylase gene by cloning, sequence analysis and functional verification of its 919 bp 5'-flanking region in H. pluvialis. The 5'-flanking region was characterized using the micro-particle bombardment method and transient expression of the LacZ reporter gene. Results of sequence analysis showed that the 5'-flanking region might have putative cis-acting elements, such as ABA (abscisic acid)-responsive element (ABRE), C-repeat/dehydration responsive element (C-repeat/DRE), ethylene-responsive element (ERE), heat-shock element (HSE), wound-responsive element (WUN-motif), gibberellin-responsive element (P-box), MYB-binding site (MBS) etc., except for typical TATA and CCAAT boxes. Results of 5' deletion constructs and beta-galactosidase assays revealed that the region of highest promoter-like activity might lie from -378 to -22 bp and that some negative regulatory elements might lie in the region from -919 to -378 bp. Results of site-directed mutagenesis of a putative C-repeat/DRE and an ABRE-like motif in the promoter-like region (-378 to -22 bp) indicated that the putative C-repeat/DRE and ABRE-like motif might be important for expression of the carotenoid hydroxylase gene.
Egan, Sharon A.; Ward, Philip N.; Watson, Michael; Field, Terence R.
2012-01-01
The regulation and control of gene expression in response to differing environmental stimuli is crucial for successful pathogen adaptation and persistence. The regulatory gene vru of Streptococcus uberis encodes a stand-alone response regulator with similarity to the Mga of group A Streptococcus. Mga controls expression of a number of important virulence determinants. Experimental intramammary challenge of dairy cattle with a mutant of S. uberis carrying an inactivating lesion in vru showed reduced ability to colonize the mammary gland and an inability to induce clinical signs of mastitis compared with the wild-type strain. Analysis of transcriptional differences of gene expression in the mutant, determined by microarray analysis, identified a number of coding sequences with altered expression in the absence of Vru. These consisted of known and putative virulence determinants, including Lbp (Sub0145), SclB (Sub1095), PauA (Sub1785) and hasA (Sub1696). PMID:22383474
Honda, Takashi; Morimoto, Daichi; Sako, Yoshihiko; Yoshida, Takashi
2018-05-17
Previously, we showed that DNA replication and cell division in toxic cyanobacterium Microcystis aeruginosa are coordinated by transcriptional regulation of cell division gene ftsZ and that an unknown protein specifically bound upstream of ftsZ (BpFz; DNA-binding protein to an upstream site of ftsZ) during successful DNA replication and cell division. Here, we purified BpFz from M. aeruginosa strain NIES-298 using DNA-affinity chromatography and gel-slicing combined with gel electrophoresis mobility shift assay (EMSA). The N-terminal amino acid sequence of BpFz was identified as TNLESLTQ, which was identical to that of transcription repressor LexA from NIES-843. EMSA analysis using mutant probes showed that the sequence GTACTAN3GTGTTC was important in LexA binding. Comparison of the upstream regions of lexA in the genomes of closely related cyanobacteria suggested that the sequence TASTRNNNNTGTWC could be a putative LexA recognition sequence (LexA box). Searches for TASTRNNNNTGTWC as a transcriptional regulatory site (TRS) in the genome of M. aeruginosa NIES-843 showed that it was present in genes involved in cell division, photosynthesis, and extracellular polysaccharide biosynthesis. Considering that BpFz binds to the TRS of ftsZ during normal cell division, LexA may function as a transcriptional activator of genes related to cell reproduction in M. aeruginosa, including ftsZ. This may be an example of informality in the control of bacterial cell division.
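As a rough illustration of how a degenerate consensus such as TASTRNNNNTGTWC can be scanned against a sequence, the sketch below expands IUPAC ambiguity codes into a regular expression. It is not the authors' search procedure: the function names are hypothetical, the 15-nt test sequence is made up (the half-sites GTACTA and GTGTTC joined by three arbitrary bases), and only the forward strand is scanned.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(motif):
    """Translate an IUPAC motif string into a regex pattern string."""
    return "".join(IUPAC[base] for base in motif.upper())

def find_motif(seq, motif):
    """Return (0-based start, matched text) for every hit on the forward strand.

    A lookahead is used so that overlapping hits are also reported.
    """
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]

# Made-up test sequence: GTACTA + AAA + GTGTTC (assumption for illustration).
hits = find_motif("GTACTAAAAGTGTTC", "TASTRNNNNTGTWC")
print(hits)
```

A genome-wide search would also need to scan the reverse complement and restrict hits to upstream regions; both steps are left out here to keep the sketch short.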
Ordóñez-Baquera, Perla Lucía; González-Rodríguez, Everardo; Aguado-Santacruz, Gerardo Armando; Rascón-Cruz, Quintín; Conesa, Ana; Moreno-Brito, Verónica; Echavarria, Raquel; Dominguez-Viveros, Joel
2017-02-01
MicroRNAs (miRNAs) are small non-coding RNA molecules that regulate signal transduction, development, metabolism, and stress responses in plants through post-transcriptional degradation and/or translational repression of target mRNAs. Several studies have addressed the role of miRNAs in model plant species, but miRNA expression and function in economically important forage crops, such as Bouteloua gracilis (Poaceae), a high-quality and drought-resistant grass distributed in semiarid regions of the United States and northern Mexico remain unknown. We applied high-throughput sequencing technology and bioinformatics analysis and identified 31 conserved miRNA families and 53 novel putative miRNAs with different abundance of reads in chlorophyllic cell cultures derived from B. gracilis. Some conserved miRNA families were highly abundant and possessed predicted targets involved in metabolism, plant growth and development, and stress responses. We also predicted additional identified novel miRNAs with specific targets, including B. gracilis ESTs, which were detected under drought stress conditions. Here we report 31 conserved miRNA families and 53 putative novel miRNAs in B. gracilis. Our results suggested the presence of regulatory miRNAs involved in modulating physiological and stress responses in this grass species. Copyright © 2016 Elsevier Ltd. All rights reserved.
Choi, Younho; Kim, Seongok; Hwang, Hyelyeon; Kim, Kwang-Pyo; Kang, Dong-Hyun
2014-01-01
The aim of this study was to elucidate the function of the plasmid-borne mcp (methyl-accepting chemotaxis protein) gene, which plays pleiotropic roles in Cronobacter sakazakii ATCC 29544. By searching for virulence factors using a random transposon insertion mutant library, we identified and sequenced a new plasmid, pCSA2, in C. sakazakii ATCC 29544. An in silico analysis of pCSA2 revealed that it included six putative open reading frames, and one of them was mcp. The mcp mutant was defective for invasion into and adhesion to epithelial cells, and the virulence of the mcp mutant was attenuated in rat pups. In addition, we demonstrated that putative MCP regulates the motility of C. sakazakii, and the expression of the flagellar genes was enhanced in the absence of a functional mcp gene. Furthermore, a lack of the mcp gene also impaired the ability of C. sakazakii to form a biofilm. Our results demonstrate a regulatory role for MCP in diverse biological processes, including the virulence of C. sakazakii ATCC 29544. To the best of our knowledge, this study is the first to elucidate a potential function of a plasmid-encoded MCP homolog in the C. sakazakii sequence type 8 (ST8) lineage. PMID:25332122
Choi, Younho; Kim, Seongok; Hwang, Hyelyeon; Kim, Kwang-Pyo; Kang, Dong-Hyun; Ryu, Sangryeol
2015-01-01
The aim of this study was to elucidate the function of the plasmid-borne mcp (methyl-accepting chemotaxis protein) gene, which plays pleiotropic roles in Cronobacter sakazakii ATCC 29544. By searching for virulence factors using a random transposon insertion mutant library, we identified and sequenced a new plasmid, pCSA2, in C. sakazakii ATCC 29544. An in silico analysis of pCSA2 revealed that it included six putative open reading frames, and one of them was mcp. The mcp mutant was defective for invasion into and adhesion to epithelial cells, and the virulence of the mcp mutant was attenuated in rat pups. In addition, we demonstrated that putative MCP regulates the motility of C. sakazakii, and the expression of the flagellar genes was enhanced in the absence of a functional mcp gene. Furthermore, a lack of the mcp gene also impaired the ability of C. sakazakii to form a biofilm. Our results demonstrate a regulatory role for MCP in diverse biological processes, including the virulence of C. sakazakii ATCC 29544. To the best of our knowledge, this study is the first to elucidate a potential function of a plasmid-encoded MCP homolog in the C. sakazakii sequence type 8 (ST8) lineage. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Comparative genomics of 9 novel Paenibacillus larvae bacteriophages
Stamereilers, Casey; LeBlanc, Lucy; Yost, Diane; Amy, Penny S.; Tsourkas, Philippos K.
2016-01-01
ABSTRACT American Foulbrood Disease, caused by the bacterium Paenibacillus larvae, is one of the most destructive diseases of the honeybee, Apis mellifera. Our group recently published the sequences of 9 new phages with the ability to infect and lyse P. larvae. Here, we characterize the genomes of these P. larvae phages, compare them to each other and to other sequenced P. larvae phages, and putatively identify protein function. The phage genomes are 38–45 kb in size and contain 68–86 genes, most of which appear to be unique to P. larvae phages. We classify P. larvae phages into 2 main clusters and one singleton based on nucleotide sequence identity. Three of the new phages show sequence similarity to other sequenced P. larvae phages, while the remaining 6 do not. We identified functions for roughly half of the P. larvae phage proteins, including structural, assembly, host lysis, DNA replication/metabolism, regulatory, and host-related functions. Structural and assembly proteins are highly conserved among our phages and are located at the start of the genome. DNA replication/metabolism, regulatory, and host-related proteins are located in the middle and end of the genome, and are not conserved, with many of these genes found in some of our phages but not others. All nine phages code for a conserved N-acetylmuramoyl-L-alanine amidase. Comparative analysis showed the phages use the “cohesive ends with 3′ overhang” DNA packaging strategy. This work is the first in-depth study of P. larvae phage genomics, and serves as a marker for future work in this area. PMID:27738559
Physiological and molecular characterization of genetic competence in Streptococcus sanguinis.
Rodriguez, A M; Callahan, J E; Fawcett, P; Ge, X; Xu, P; Kitten, T
2011-04-01
Streptococcus sanguinis is a major component of the oral flora and an important cause of infective endocarditis. Although S. sanguinis is naturally competent, genome sequencing has suggested significant differences in the S. sanguinis competence system relative to those of other streptococci. An S. sanguinis mutant possessing an in-frame deletion in the comC gene, which encodes competence-stimulating peptide (CSP), was created. Addition of synthetic CSP induced competence in this strain. Gene expression in this strain was monitored by microarray analysis at multiple time-points from 2.5 to 30 min after CSP addition, and verified by quantitative reverse transcription-polymerase chain reaction. Over 200 genes were identified whose expression was altered at least two-fold in at least one time point, with the majority upregulated. The 'late' response was typical of that seen in previous studies. However, comparison of the 'early' response in S. sanguinis with that of other oral streptococci revealed unexpected differences with regard to the number of genes induced, the nature of those genes, and their putative upstream regulatory sequences. Streptococcus sanguinis possesses a comparatively limited early response, which may define a minimal streptococcal competence regulatory circuit. © 2011 John Wiley & Sons A/S.
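The selection rule described above (expression altered at least two-fold in at least one time point) can be sketched as a simple filter. This is only an illustration under stated assumptions: the function name, the gene labels and the expression values are invented for the example, and expression is assumed to be given as linear-scale values relative to a pre-CSP baseline rather than in the format used in the study.

```python
def altered_genes(expr, baseline, threshold=2.0):
    """Select genes whose expression changes >= threshold-fold at any time point.

    expr:     dict mapping gene -> list of expression values, one per time point
    baseline: dict mapping gene -> expression value before CSP addition
    Returns a dict mapping each selected gene to "up" or "down".
    """
    selected = {}
    for gene, values in expr.items():
        ratios = [v / baseline[gene] for v in values]
        up = any(r >= threshold for r in ratios)
        down = any(r <= 1.0 / threshold for r in ratios)
        if up:
            selected[gene] = "up"
        elif down:
            selected[gene] = "down"
    return selected

# Toy values, invented for illustration only.
expr = {
    "geneA": [1.0, 2.5, 4.0],   # reaches four-fold induction
    "geneB": [1.1, 0.9, 1.2],   # never changes two-fold
    "geneC": [0.4, 0.8, 1.0],   # drops below one half at the first time point
}
baseline = {"geneA": 1.0, "geneB": 1.0, "geneC": 1.0}
print(altered_genes(expr, baseline))
```

A gene that is both induced and repressed at different time points is reported as "up" here; that tie-breaking choice is arbitrary.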
Physiological and molecular characterization of genetic competence in Streptococcus sanguinis
Rodriguez, Alejandro Miguel; Callahan, Jill E.; Fawcett, Paul; Ge, Xiuchun; Xu, Ping; Kitten, Todd
2011-01-01
SUMMARY Streptococcus sanguinis is a major component of the oral flora and an important cause of infective endocarditis. Although S. sanguinis is naturally competent, genome sequencing has suggested significant differences in the S. sanguinis competence system relative to those of other streptococci. An S. sanguinis mutant possessing an in-frame deletion in the comC gene, which encodes competence-stimulating peptide (CSP), was created. Addition of synthetic CSP induced competence in this strain. Gene expression in this strain was monitored by microarray analysis at multiple time points from 2.5 to 30 min after CSP addition, and verified by quantitative RT-PCR. Over 200 genes were identified whose expression was altered at least two-fold in at least one time point, with the majority upregulated. The “late” response was typical of that seen in previous studies. However, comparison of the “early” response in S. sanguinis with that of other oral streptococci revealed unexpected differences with regard to the number of genes induced, the nature of these genes, and their putative upstream regulatory sequences. S. sanguinis possesses a comparatively limited early response, which may define a minimal streptococcal competence regulatory circuit. PMID:21375701
Peoples, R J; Cisco, M J; Kaplan, P; Francke, U
1998-01-01
We have identified a novel gene (WBSCR9) within the common Williams-Beuren syndrome (WBS) deletion by interspecies sequence conservation. The WBSCR9 gene encodes a roughly 7-kb transcript with an open reading frame of 1483 amino acids and a predicted protein product size of 170.8 kDa. WBSCR9 is comprised of at least 20 exons extending over 60 kb. The transcript is expressed ubiquitously throughout development and is subject to alternative splicing. Functional motifs identified by sequence homology searches include a bromodomain; a PHD, or C4HC3, finger; several putative nuclear localization signals; four nuclear receptor binding motifs; a polyglutamate stretch and two PEST sequences. Bromodomains, PHD motifs and nuclear receptor binding motifs are cardinal features of proteins that are involved in chromatin remodeling and modulation of transcription. Haploinsufficiency for WBSCR9 gene products may contribute to the complex phenotype of WBS by interacting with tissue-specific regulatory factors during development.
2010-01-01
Background An important focus of genomic science is the discovery and characterization of all functional elements within genomes. In silico methods are used in genome studies to discover putative regulatory genomic elements (called words or motifs). Although a number of methods have been developed for motif discovery, most of them lack the scalability needed to analyze large genomic data sets. Methods This manuscript presents WordSeeker, an enumerative motif discovery toolkit that utilizes multi-core and distributed computational platforms to enable scalable analysis of genomic data. A controller task coordinates activities of worker nodes, each of which (1) enumerates a subset of the DNA word space and (2) scores words with a distributed Markov chain model. Results A comprehensive suite of performance tests was conducted to demonstrate the performance, speedup and efficiency of WordSeeker. The scalability of the toolkit enabled the analysis of the entire genome of Arabidopsis thaliana; the results of the analysis were integrated into The Arabidopsis Gene Regulatory Information Server (AGRIS). A public version of WordSeeker was deployed on the Glenn cluster at the Ohio Supercomputer Center. Conclusion WordSeeker effectively utilizes concurrent computing platforms to enable the identification of putative functional elements in genomic data sets. This capability facilitates the analysis of the large quantity of sequenced genomic data. PMID:21210985
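The two worker steps named above, enumerating DNA words and scoring them against a Markov background, can be sketched on a single machine as follows. This is not WordSeeker's implementation: it assumes an order-0 background (independent base frequencies), scores each word by the ratio of observed to expected count, and uses hypothetical function names throughout.

```python
from collections import Counter
from math import prod

def count_words(seq, k):
    """Count every length-k word made only of A, C, G, T."""
    seq = seq.upper()
    words = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return Counter(w for w in words if set(w) <= set("ACGT"))

def score_words(seq, k):
    """Score each observed word as observed / expected count.

    The expected count assumes an order-0 Markov background: the
    probability of a word is the product of its base frequencies.
    """
    counts = count_words(seq, k)
    n_positions = sum(counts.values())
    bases = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(bases.values())
    freq = {b: bases[b] / total for b in "ACGT"}
    return {
        w: c / (n_positions * prod(freq[b] for b in w))
        for w, c in counts.items()
    }

scores = score_words("ACGTACGTACGT", 2)
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(word, round(score, 3))
```

In a distributed setting, each worker would run the same counting step on its own slice of the word space and the controller would merge the counters; a higher-order background model would replace the product of base frequencies.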
RNA sequencing uncovers antisense RNAs and novel small RNAs in Streptococcus pyogenes.
Le Rhun, Anaïs; Beer, Yan Yan; Reimegård, Johan; Chylinski, Krzysztof; Charpentier, Emmanuelle
2016-01-01
Streptococcus pyogenes is a human pathogen responsible for a wide spectrum of diseases ranging from mild to life-threatening infections. During the infectious process, the temporal and spatial expression of pathogenicity factors is tightly controlled by a complex network of protein and RNA regulators acting in response to various environmental signals. Here, we focus on the class of small RNA regulators (sRNAs) and present the first complete analysis of sRNA sequencing data in S. pyogenes. In the SF370 clinical isolate (M1 serotype), we identified 197 and 428 putative regulatory RNAs by visual inspection and bioinformatics screening of the sequencing data, respectively. Only 35 from the 197 candidates identified by visual screening were assigned a predicted function (T-boxes, ribosomal protein leaders, characterized riboswitches or sRNAs), indicating how little is known about sRNA regulation in S. pyogenes. By comparing our list of predicted sRNAs with previous S. pyogenes sRNA screens using bioinformatics or microarrays, 92 novel sRNAs were revealed, including antisense RNAs that are for the first time shown to be expressed in this pathogen. We experimentally validated the expression of 30 novel sRNAs and antisense RNAs. We show that the expression profile of 9 sRNAs including 2 predicted regulatory elements is affected by the endoribonucleases RNase III and/or RNase Y, highlighting the critical role of these enzymes in sRNA regulation.
Oakley, Brian B; Line, J Eric; Berrang, Mark E; Johnson, Jessica M; Buhr, R Jeff; Cox, Nelson A; Hiett, Kelli L; Seal, Bruce S
2012-02-01
Although Campylobacter is an important food-borne human pathogen, there remains a lack of molecular diagnostic assays that are simple to use, cost-effective, and provide rapid results in research, clinical, or regulatory laboratories. Of the numerous Campylobacter assays that do exist, to our knowledge none has been empirically tested for specificity using high-throughput sequencing. Here we demonstrate the power of next-generation sequencing to determine the specificity of a widely cited Campylobacter-specific polymerase chain reaction (PCR) assay and describe a rapid method for direct cell suspension PCR to quickly and easily screen samples for Campylobacter. We present a specific protocol which eliminates the need for time-consuming and expensive genomic DNA extractions and, using a high-processivity polymerase, demonstrate conclusive screening of samples in <1 h. Pyrosequencing results show the assay to be extremely (>99%) sensitive, and spike-back experiments demonstrated a detection threshold of <10^2 CFU mL^-1. Additionally, we present 2 newly designed broad-range bacterial primer sets targeting the 23S rRNA gene that have wide applicability as internal amplification controls. Empirical testing of putative taxon-specific assays using high-throughput sequencing is an important validation step that is now financially feasible for research, regulatory, or clinical applications. Published by Elsevier Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aklujkar, Muktak; Krushkal, Julia; DiBartolo, Genevieve
Background. The genome sequence of Geobacter metallireducens is the second to be completed from the metal-respiring genus Geobacter, and is compared in this report to that of Geobacter sulfurreducens in order to understand their metabolic, physiological and regulatory similarities and differences. Results. The experimentally observed greater metabolic versatility of G. metallireducens versus G. sulfurreducens is borne out by the presence of more numerous genes for metabolism of organic acids including acetate, propionate, and pyruvate. Although G. metallireducens lacks a dicarboxylic acid transporter, it has acquired a second succinate dehydrogenase/fumarate reductase complex, suggesting that respiration of fumarate was important until recently in its evolutionary history. Vestiges of the molybdate (ModE) regulon of G. sulfurreducens can be detected in G. metallireducens, which has lost the global regulatory protein ModE but retained some putative ModE-binding sites and multiplied certain genes of molybdenum cofactor biosynthesis. Several enzymes of amino acid metabolism are of different origin in the two species, but significant patterns of gene organization are conserved. Whereas most Geobacteraceae are predicted to obtain biosynthetic reducing equivalents from electron transfer pathways via a ferredoxin oxidoreductase, G. metallireducens can derive them from the oxidative pentose phosphate pathway. In addition to the evidence of greater metabolic versatility, the G. metallireducens genome is also remarkable for the abundance of multicopy nucleotide sequences found in intergenic regions and even within genes. Conclusion. The genomic evidence suggests that metabolism, physiology and regulation of gene expression in G. metallireducens may be dramatically different from other Geobacteraceae.
Lü, Peitao; Liu, Jitao; Gao, Junping; Zhang, Changqing
2014-01-01
Plant transcription factors involved in stress responses are generally classified by their involvement in either the abscisic acid (ABA)-dependent or the ABA-independent regulatory pathways. A stress-associated NAC gene from rose (Rosa hybrida), RhNAC3, was previously found to increase dehydration tolerance in both rose and Arabidopsis. However, the regulatory mechanism involved in RhNAC3 action is still not fully understood. In this study, we isolated and analyzed the upstream regulatory sequence of RhNAC3 and found many stress-related cis-elements to be present in the promoter, with five ABA-responsive element (ABRE) motifs being of particular interest. Characterization of Arabidopsis thaliana plants transformed with the putative RhNAC3 promoter sequence fused to the β-glucuronidase (GUS) reporter gene revealed that RhNAC3 is expressed at high basal levels in leaf guard cells and in vascular tissues. Moreover, the ABRE motifs in the RhNAC3 promoter were observed to have a cumulative effect on the transcriptional activity of this gene both in the presence and absence of exogenous ABA. Overexpression of RhNAC3 in A. thaliana resulted in ABA hypersensitivity during seed germination and promoted leaf closure after ABA or drought treatments. Additionally, the expression of 11 ABA-responsive genes was induced to a greater degree by dehydration in the transgenic plants overexpressing RhNAC3 than in control lines transformed with the vector alone. Further analysis revealed that all these genes contain NAC binding cis-elements in their promoter regions, and RhNAC3 was found to partially bind to these putative NAC recognition sites. We further found that of 219 A. thaliana genes previously shown by microarray analysis to be regulated by heterologous overexpression of RhNAC3, 85 are responsive to ABA. In rose, the expression of genes downstream of the ABA-signaling pathways was also repressed in RhNAC3-silenced petals.
Taken together, we propose that the rose RhNAC3 protein could mediate ABA signaling both in rose and in A. thaliana. PMID:25290154
Jiang, Guimei; Jiang, Xinqiang; Lü, Peitao; Liu, Jitao; Gao, Junping; Zhang, Changqing
2014-01-01
Plant transcription factors involved in stress responses are generally classified by their involvement in either the abscisic acid (ABA)-dependent or the ABA-independent regulatory pathways. A stress-associated NAC gene from rose (Rosa hybrida), RhNAC3, was previously found to increase dehydration tolerance in both rose and Arabidopsis. However, the regulatory mechanism involved in RhNAC3 action is still not fully understood. In this study, we isolated and analyzed the upstream regulatory sequence of RhNAC3 and found many stress-related cis-elements to be present in the promoter, with five ABA-responsive element (ABRE) motifs being of particular interest. Characterization of Arabidopsis thaliana plants transformed with the putative RhNAC3 promoter sequence fused to the β-glucuronidase (GUS) reporter gene revealed that RhNAC3 is expressed at high basal levels in leaf guard cells and in vascular tissues. Moreover, the ABRE motifs in the RhNAC3 promoter were observed to have a cumulative effect on the transcriptional activity of this gene both in the presence and absence of exogenous ABA. Overexpression of RhNAC3 in A. thaliana resulted in ABA hypersensitivity during seed germination and promoted leaf closure after ABA or drought treatments. Additionally, the expression of 11 ABA-responsive genes was induced to a greater degree by dehydration in the transgenic plants overexpressing RhNAC3 than in control lines transformed with the vector alone. Further analysis revealed that all these genes contain NAC binding cis-elements in their promoter regions, and RhNAC3 was found to partially bind to these putative NAC recognition sites. We further found that of 219 A. thaliana genes previously shown by microarray analysis to be regulated by heterologous overexpression of RhNAC3, 85 are responsive to ABA. In rose, the expression of genes downstream of the ABA-signaling pathways was also repressed in RhNAC3-silenced petals.
Taken together, we propose that the rose RhNAC3 protein could mediate ABA signaling both in rose and in A. thaliana.
Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis
2012-01-01
Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606
Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan
2017-01-01
Comparative genomics has facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from model to non-model organisms and insights can be gained into the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study aimed to understand the structure, organisation and expression profiles of grass pollen allergens using genomic data from Brachypodium distachyon, which is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of them. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins etc.) for the first time in the model plant Brachypodium.
Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species. PMID:28103252
2009-01-01
Background Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. Results We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. Conclusion We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. 
We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes. PMID:19656416
Hamberger, Björn; Hall, Dawn; Yuen, Mack; Oddy, Claire; Hamberger, Britta; Keeling, Christopher I; Ritland, Carol; Ritland, Kermit; Bohlmann, Jörg
2009-08-06
Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. 
The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes.
Samad, Abdul Fatah A; Nazaruddin, Nazaruddin; Murad, Abdul Munir Abdul; Jani, Jaeyres; Zainal, Zamri; Ismail, Ismanizan
2018-03-01
Currently, the majority of microRNAs (miRNAs) are discovered through computational approaches, which have largely been confined to model plants. Here, for the first time, we describe the identification and characterization of novel miRNAs in a non-model plant, Persicaria minor (P. minor), using a computational approach. Unannotated sequences from deep sequencing were analyzed based on well-established parameters. Around 24 putative novel miRNAs were identified from 6,417,780 reads of the unannotated sequence, which represented 11 unique putative miRNA sequences. The PsRobot target prediction tool was used to identify the target transcripts of the putative novel miRNAs. Most of the predicted target transcripts (mRNAs) are known to be involved in plant development and stress responses. Gene ontology analysis showed that the majority of the putative novel miRNA targets were assigned to cellular component (69.07%), followed by molecular function (30.08%) and biological process (0.85%). Of the 11 unique putative miRNAs, 7 were validated through semi-quantitative PCR. These novel miRNAs discovered in P. minor may expand and update the current public miRNA database.
Bowman, Megan J.; Park, Wonkeun; Bauer, Philip J.; Udall, Joshua A.; Page, Justin T.; Raney, Joshua; Scheffler, Brian E.; Jones, Don. C.; Campbell, B. Todd
2013-01-01
An RNA-Seq experiment was performed using field grown well-watered and naturally rain fed cotton plants to identify differentially expressed transcripts under water-deficit stress. Our work constitutes the first application of the newly published diploid D5 Gossypium raimondii sequence in the study of tetraploid AD1 upland cotton RNA-seq transcriptome analysis. A total of 1,530 transcripts were differentially expressed between well-watered and water-deficit stressed root tissues, in patterns that confirm the accuracy of this technique for future studies in cotton genomics. Additionally, putative sequence based genome localization of differentially expressed transcripts detected A2 genome specific gene expression under water-deficit stress. These data will facilitate efforts to understand the complex responses governing transcriptomic regulatory mechanisms and to identify candidate genes that may benefit applied plant breeding programs. PMID:24324815
de Souza, C R; Aragão, F J; Moreira, E C O; Costa, C N M; Nascimento, S B; Carvalho, L J
2009-03-24
Cassava is one of the most important tropical food crops for more than 600 million people worldwide. Transgenic technologies can be useful for increasing its nutritional value and its resistance to viral diseases and insect pests. However, tissue-specific promoters that guarantee correct expression of transgenes would be necessary. We used inverse polymerase chain reaction to isolate a promoter sequence of the Mec1 gene coding for Pt2L4, a glutamic acid-rich protein differentially expressed in cassava storage roots. In silico analysis revealed putative cis-acting regulatory elements within this promoter sequence, including root-specific elements that may be required for its expression in vascular tissues. Transient expression experiments showed that the Mec1 promoter is functional, since this sequence was able to drive GUS expression in bean embryonic axes. Results from our computational analysis can serve as a guide for functional experiments to identify regions with tissue-specific Mec1 promoter activity. The DNA sequence that we identified is a new promoter that could be a candidate for genetic engineering of cassava roots.
Fukumori, F; Saint, C P
1997-01-01
A 9,233-bp HindIII fragment of the aromatic amine catabolic plasmid pTDN1, isolated from a derivative of Pseudomonas putida mt-2 (UCC22), confers the ability to degrade aniline on P. putida KT2442. The fragment encodes six open reading frames which are arranged in the same direction. Their 5' upstream region is part of the direct-repeat sequence of pTDN1. Nucleotide sequence of 1.8 kb of the repeat sequence revealed only a single base pair change compared to the known sequence of IS1071 which is involved in the transposition of the chlorobenzoate genes (C. Nakatsu, J. Ng, R. Singh, N. Straus, and C. Wyndham, Proc. Natl. Acad. Sci. USA 88:8312-8316, 1991). Four open reading frames encode proteins with considerable homology to proteins found in other aromatic-compound degradation pathways. On the basis of sequence similarity, these genes are proposed to encode the large and small subunits of aniline oxygenase (tdnA1 and tdnA2, respectively), a reductase (tdnB), and a LysR-type regulatory gene (tdnR). The putative large subunit has a conserved [2Fe-2S]R Rieske-type ligand center. Two genes, tdnQ and tdnT, which may be involved in amino group transfer, are localized upstream of the putative oxygenase genes. The tdnQ gene product shares about 30% similarity with glutamine synthetases; however, a pUC-based plasmid carrying tdnQ did not support the growth of an Escherichia coli glnA strain in the absence of glutamine. TdnT possesses domains that are conserved among amidotransferases. The tdnQ, tdnA1, tdnA2, tdnB, and tdnR genes are essential for the conversion of aniline to catechol. PMID:8990291
Transcription Factors Bind Thousands of Active and Inactive Regions in the Drosophila Blastoderm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Xiao-Yong; MacArthur, Stewart; Bourgon, Richard
2008-01-10
Identifying the genomic regions bound by sequence-specific regulatory factors is central both to deciphering the complex DNA cis-regulatory code that controls transcription in metazoans and to determining the range of genes that shape animal morphogenesis. Here, we use whole-genome tiling arrays to map sequences bound in Drosophila melanogaster embryos by the six maternal and gap transcription factors that initiate anterior-posterior patterning. We find that these sequence-specific DNA binding proteins bind with quantitatively different specificities to highly overlapping sets of several thousand genomic regions in blastoderm embryos. Specific high- and moderate-affinity in vitro recognition sequences for each factor are enriched in bound regions. This enrichment, however, is not sufficient to explain the pattern of binding in vivo and varies in a context-dependent manner, demonstrating that higher-order rules must govern targeting of transcription factors. The more highly bound regions include all of the over forty well-characterized enhancers known to respond to these factors as well as several hundred putative new cis-regulatory modules clustered near developmental regulators and other genes with patterned expression at this stage of embryogenesis. The new targets include most of the microRNAs (miRNAs) transcribed in the blastoderm, as well as all major zygotically transcribed dorsal-ventral patterning genes, whose expression we show to be quantitatively modulated by anterior-posterior factors. In addition to these highly bound regions, there are several thousand regions that are reproducibly bound at lower levels. However, these poorly bound regions are, collectively, far more distant from genes transcribed in the blastoderm than highly bound regions; are preferentially found in protein-coding sequences; and are less conserved than highly bound regions.
Together these observations suggest that many of these poorly-bound regions are not involved in early-embryonic transcriptional regulation, and a significant proportion may be nonfunctional. Surprisingly, for five of the six factors, their recognition sites are not unambiguously more constrained evolutionarily than the immediate flanking DNA, even in more highly bound and presumably functional regions, indicating that comparative DNA sequence analysis is limited in its ability to identify functional transcription factor targets.
Khani, Afsaneh; Popp, Nicole; Kreikemeyer, Bernd; Patenge, Nadja
2018-01-01
Regulatory RNAs play important roles in the control of bacterial gene expression. In this study, we investigated gene expression regulation by a putative glycine riboswitch located in the 5'-untranslated region of a sodium:alanine symporter family (SAF) protein gene in the group A Streptococcus pyogenes serotype M49 strain 591. Glycine-dependent gene expression mediated by riboswitch activity was studied using a luciferase reporter gene system. Maximal reporter gene expression was observed in the absence of glycine and in the presence of low glycine concentrations. Differences in glycine-dependent gene expression were not based on differential promoter activity. Expression of the SAF protein gene and the downstream putative cation efflux protein gene was investigated in wild-type bacteria by RT-qPCR transcript analyses. During growth in the presence of glycine (≥1 mM), expression of both genes was downregulated. Northern blot analyses revealed premature transcription termination in the presence of high glycine concentrations. Growth in the presence of 0.1 mM glycine led to the production of a full-length transcript. Furthermore, stability of the SAF protein gene transcript was drastically reduced in the presence of glycine. We conclude that the putative glycine riboswitch in S. pyogenes serotype M49 strain 591 represses expression of the SAF protein gene and the downstream putative cation efflux protein gene in the presence of high glycine concentrations. Sequence and secondary structure comparisons indicated that the streptococcal riboswitch belongs to the class of tandem aptamer glycine riboswitches.
Yerrapragada, Shaila; Shukla, Animesh; Hallsworth-Pepin, Kymberlie; Choi, Kwangmin; Wollam, Aye; Clifton, Sandra; Qin, Xiang; Muzny, Donna; Raghuraman, Sriram; Ashki, Haleh; Uzman, Akif; Highlander, Sarah K.; Fryszczyn, Bartlomiej G.; Fox, George E.; Tirumalai, Madhan R.; Liu, Yamei; Kim, Sun
2015-01-01
Tolypothrix sp. PCC 7601 is a freshwater filamentous cyanobacterium with complex responses to environmental conditions. Here, we present its 9.96-Mbp draft genome sequence, containing 10,065 putative protein-coding sequences, including 305 predicted two-component system proteins and 27 putative phytochrome-class photoreceptors, the most such proteins in any sequenced genome. PMID:25953173
Characterization of HIV Transmission in South-East Austria
Hoenigl, Martin; Chaillon, Antoine; Kessler, Harald H.; Haas, Bernhard; Stelzl, Evelyn; Weninger, Karin; Little, Susan J.; Mehta, Sanjay R.
2016-01-01
To gain deeper insight into the epidemiology of HIV-1 transmission in South-East Austria we performed a retrospective analysis of 259 HIV-1 partial pol sequences obtained from unique individuals newly diagnosed with HIV infection in South-East Austria from 2008 through 2014. After quality filtering, putative transmission linkages were inferred when two sequences were ≤1.5% genetically different. Multiple linkages were resolved into putative transmission clusters. Further phylogenetic analyses were performed using BEAST v1.8.1. Finally, we investigated putative links between the 259 sequences from South-East Austria and all publicly available HIV polymerase sequences in the Los Alamos National Laboratory HIV sequence database. We found that 45.6% (118/259) of the sampled sequences were genetically linked with at least one other sequence from South-East Austria forming putative transmission clusters. Clustering individuals were more likely to be men who have sex with men (MSM; p<0.001), infected with subtype B (p<0.001) or subtype F (p = 0.02). Among clustered males who reported only heterosexual (HSX) sex as an HIV risk, 47% clustered closely with MSM (either as pairs or within larger MSM clusters). One hundred and seven of the 259 sequences (41.3%) from South-East Austria had at least one putative inferred linkage with sequences from a total of 69 other countries. In conclusion, analysis of HIV-1 sequences from newly diagnosed individuals residing in South-East Austria revealed a high degree of national and international clustering mainly within MSM. Interestingly, we found that a high number of heterosexual males clustered within MSM networks, suggesting either linkage between risk groups or misrepresentation of sexual risk behaviors by subjects. PMID:26967154
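The linkage rule in the abstract above (a link when two sequences differ by at most 1.5%, with multiple links merged into clusters) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not say which genetic distance measure was used, so the sketch assumes a simple proportion of mismatched sites over pre-aligned sequences, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences.

    Assumption: a plain mismatch proportion stands in for whatever distance
    measure the study actually used.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Only compare sites where both sequences carry an unambiguous base.
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def transmission_clusters(seqs, threshold=0.015):
    """Link sequence pairs at or below the threshold, then merge links into clusters."""
    parent = {name: name for name in seqs}

    def find(n):
        # Union-find lookup with path compression.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    # Unlinked sequences are not clusters.
    return [g for g in groups.values() if len(g) > 1]


# Hypothetical toy alignment: s1-s2 and s2-s3 are linked, s1-s3 is not,
# yet all three end up in one cluster; s4 stays unlinked.
toy = {
    "s1": "A" * 200,
    "s2": "A" * 198 + "CC",
    "s3": "A" * 196 + "CCCC",
    "s4": "G" * 200,
}
clusters = transmission_clusters(toy)
```

Merging links through connected components means two sequences can share a cluster without being directly linked, which matches the abstract's description of resolving multiple linkages into clusters.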
Schrider, Daniel R.; Kern, Andrew D.
2015-01-01
The comparative genomics revolution of the past decade has enabled the discovery of functional elements in the human genome via sequence comparison. While that is so, an important class of elements, those specific to humans, is entirely missed by searching for sequence conservation across species. Here we present an analysis based on variation data among human genomes that utilizes a supervised machine learning approach for the identification of human-specific purifying selection in the genome. Using only allele frequency information from the complete low-coverage 1000 Genomes Project data set in conjunction with a support vector machine trained from known functional and nonfunctional portions of the genome, we are able to accurately identify portions of the genome constrained by purifying selection. Our method identifies previously known human-specific gains or losses of function and uncovers many novel candidates. Candidate targets for gain and loss of function along the human lineage include numerous putative regulatory regions of genes essential for normal development of the central nervous system, including a significant enrichment of gain of function events near neurotransmitter receptor genes. These results are consistent with regulatory turnover being a key mechanism in the evolution of human-specific characteristics of brain development. Finally, we show that the majority of the genome is unconstrained by natural selection currently, in agreement with what has been estimated from phylogenetic methods but in sharp contrast to estimates based on transcriptomics or other high-throughput functional methods. PMID:26590212
Regulation of expression of transgenes in developing fish.
Moav, B; Liu, Z; Caldovic, L D; Gross, M L; Faras, A J; Hackett, P B
1993-05-01
The transcriptional regulatory elements of the beta-actin gene of carp (Cyprinus carpio) have been examined in zebrafish and goldfish harbouring transgenes. The high sequence conservation of the putative regulatory elements in the beta-actin genes of animals suggested that their function would be conserved, so that transgenic constructs with the same transcriptional control elements would promote similar levels of transgene expression in different species of transgenic animals. To test this assumption, we analysed the temporal expression of a reporter gene under the control of transcriptional control sequences from the carp beta-actin gene in zebrafish (Brachydanio rerio) and goldfish (Carrasius auratus). Our results indicated that, contrary to expectations, combinations of different transcriptional control elements affected the level, duration, and onset of gene expression differently in developing zebrafish and goldfish. The major differences in expression of beta-actin/CAT (chloramphenicol acetyltransferase) constructs in zebrafish and goldfish were: (1) overall expression was almost 100-fold higher in goldfish than in zebrafish embryos, (2) the first intron had an enhancing effect on gene expression in zebrafish but not in goldfish, and (3) the serum-responsive/CArG-containing regulatory element in the proximal promoter was not always required for maximal CAT activity in goldfish, but was required in zebrafish. These results suggest that in the zebrafish, but not in the goldfish, there may be interactions between motifs in the proximal promoter and the first intron which appear to be required for maximal enhancement of transcription.
Evolution of Hsp70 Gene Expression: A Role for Changes in AT-Richness within Promoters
Ma, Ronghui; Zhang, Bo; Kang, Le
2011-01-01
In disparate organisms adaptation to thermal stress has been linked to changes in the expression of genes encoding heat-shock proteins (Hsp). The underlying genetics, however, remain elusive. We show here that two AT-rich sequence elements in the promoter region of the hsp70 gene of the fly Liriomyza sativae that are absent in the congeneric species, Liriomyza huidobrensis, have marked cis-regulatory consequences. We studied the cis-regulatory consequences of these elements (called ATRS1 and ATRS2) by measuring the constitutive and heat-shock-induced luciferase luminescence that they drive in cells transfected with constructs carrying them modified, deleted, or intact, in the hsp70 promoter fused to the luciferase gene. The elements affected expression level markedly and in different ways: Deleting ATRS1 augmented both the constitutive and the heat-shock-induced luminescence, suggesting that this element represses transcription. Interestingly, replacing the element with random sequences of the same length and A+T content delivered the wild-type luminescence pattern, proving that the element's high A+T content is crucial for its effects. Deleting ATRS2 decreased luminescence dramatically and almost abolished heat-shock inducibility and so did replacing the element with random sequences matching the element's length and A+T content, suggesting that ATRS2's effects on transcription and heat-shock inducibility involve a common mechanism requiring at least in part the element's specific primary structure. Finally, constitutive and heat-shock luminescence were reduced strongly when two putative binding sites for the Zeste transcription factor identified within ATRS2 were altered through site-directed mutagenesis, and the heat-shock-induced luminescence increased when Zeste was over-expressed, indicating that Zeste participates in the effects mapped to ATRS2 at least in part. 
AT-rich sequences are common in promoters and our results suggest that they should play important roles in regulatory evolution since they can affect expression markedly and constrain promoter DNA in at least two different ways. PMID:21655251
Beysen, D; Raes, J; Leroy, B P; Lucassen, A; Yates, J R W; Clayton-Smith, J; Ilyina, H; Brooks, S Sklower; Christin-Maitre, S; Fellous, M; Fryns, J P; Kim, J R; Lapunzina, P; Lemyre, E; Meire, F; Messiaen, L M; Oley, C; Splitt, M; Thomson, J; Van de Peer, Y; Veitia, R A; De Paepe, A; De Baere, E
2005-08-01
The expression of a gene requires not only a normal coding sequence but also intact regulatory regions, which can be located at large distances from the target genes, as demonstrated for an increasing number of developmental genes. In previous mutation studies of the role of FOXL2 in blepharophimosis syndrome (BPES), we identified intragenic mutations in 70% of our patients. Three translocation breakpoints upstream of FOXL2 in patients with BPES suggested a position effect. Here, we identified novel microdeletions outside of FOXL2 in cases of sporadic and familial BPES. Specifically, four rearrangements, with an overlap of 126 kb, are located 230 kb upstream of FOXL2, telomeric to the reported translocation breakpoints. Moreover, the shortest region of deletion overlap (SRO) contains several conserved nongenic sequences (CNGs) harboring putative transcription-factor binding sites and representing potential long-range cis-regulatory elements. Interestingly, the human region orthologous to the 12-kb sequence deleted in the polled intersex syndrome in goat, which is an animal model for BPES, is contained in this SRO, providing evidence of human-goat conservation of FOXL2 expression and of the mutational mechanism. Surprisingly, in a fifth family with BPES, one rearrangement was found downstream of FOXL2. In addition, we report nine novel rearrangements encompassing FOXL2 that range from partial gene deletions to submicroscopic deletions. Overall, genomic rearrangements encompassing or outside of FOXL2 account for 16% of all molecular defects found in our families with BPES. In summary, this is the first report of extragenic deletions in BPES, providing further evidence of potential long-range cis-regulatory elements regulating FOXL2 expression. It contributes to the enlarging group of developmental diseases caused by defective distant regulation of gene expression. 
Finally, we demonstrate that CNGs are candidate regions for genomic rearrangements in developmental genes. PMID:15962237
CisMapper: predicting regulatory interactions from transcription factor ChIP-seq data
O'Connor, Timothy; Bodén, Mikael
2017-01-01
Identifying the genomic regions and regulatory factors that control the transcription of genes is an important, unsolved problem. The current method of choice predicts transcription factor (TF) binding sites using chromatin immunoprecipitation followed by sequencing (ChIP-seq), and then links the binding sites to putative target genes solely on the basis of the genomic distance between them. Evidence from chromatin conformation capture experiments shows that this approach is inadequate due to long-distance regulation via chromatin looping. We present CisMapper, which predicts the regulatory targets of a TF using the correlation between a histone mark at the TF's bound sites and the expression of each gene across a panel of tissues. Using both chromatin conformation capture and differential expression data, we show that CisMapper is more accurate at predicting the target genes of a TF than the distance-based approaches currently used, and is particularly advantageous for predicting the long-range regulatory interactions typical of tissue-specific gene expression. CisMapper also predicts which TF binding sites regulate a given gene more accurately than using genomic distance. Unlike distance-based methods, CisMapper can predict which transcription start site of a gene is regulated by a particular binding site of the TF. PMID:28204599
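The core idea described in the abstract above, scoring a binding site against candidate genes by correlating a histone-mark signal with gene expression across tissues, can be illustrated with a small sketch. This is not CisMapper's implementation: the abstract does not specify the correlation statistic, any distance limits, or significance testing, so the use of Pearson correlation, the function names, and the toy values are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_targets(site_signal, gene_expression):
    """Rank candidate genes by correlation with the histone signal at one site.

    site_signal: histone-mark level at the bound site, one value per tissue.
    gene_expression: gene name -> expression level, one value per tissue,
    in the same tissue order as site_signal.
    """
    scores = {gene: pearson(site_signal, expr)
              for gene, expr in gene_expression.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical four-tissue panel.
site = [1.0, 2.0, 3.0, 4.0]
genes = {
    "geneA": [2.0, 4.0, 6.0, 8.0],   # tracks the histone signal
    "geneB": [5.0, 5.0, 5.0, 5.0],   # constant expression
    "geneC": [8.0, 6.0, 4.0, 2.0],   # anti-correlated
}
ranked = rank_targets(site, genes)
```

Because the score depends on co-variation across tissues rather than on genomic distance, a distant gene can outrank a nearby one, which is the contrast with distance-based linking that the abstract draws.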
Yerrapragada, Shaila; Shukla, Animesh; Hallsworth-Pepin, Kymberlie; Choi, Kwangmin; Wollam, Aye; Clifton, Sandra; Qin, Xiang; Muzny, Donna; Raghuraman, Sriram; Ashki, Haleh; Uzman, Akif; Highlander, Sarah K; Fryszczyn, Bartlomiej G; Fox, George E; Tirumalai, Madhan R; Liu, Yamei; Kim, Sun; Kehoe, David M; Weinstock, George M
2015-05-07
Tolypothrix sp. PCC 7601 is a freshwater filamentous cyanobacterium with complex responses to environmental conditions. Here, we present its 9.96-Mbp draft genome sequence, containing 10,065 putative protein-coding sequences, including 305 predicted two-component system proteins and 27 putative phytochrome-class photoreceptors, the most such proteins in any sequenced genome. Copyright © 2015 Yerrapragada et al.
Renaudin, Pauline; Janin, Alexandre; Millat, Gilles; Chevalier, Philippe
2018-04-01
Hypertrophic cardiomyopathy (HCM), a common and clinically heterogeneous disease characterized by unexplained ventricular myocardial hypertrophy, is mostly caused by mutations in sarcomeric genes. Identifying the genetic cause is important for management, therapy, and genetic counseling. A molecular diagnosis was performed on a 51-year-old woman diagnosed with HCM using a next-generation sequencing workflow based on a panel designed for sequencing the most prevalent cardiomyopathy-causing genes. Segregation analysis was performed on the woman's family. A novel myosin regulatory light chain (MYL2) missense variant, NM_000432.3:c.485G>A, p.Gly162Glu, was identified and initially considered a putative pathogenic mutation. Among the 27 family members tested, 16 were carriers of the MYL2-p.Gly162Glu mutation, of whom 12 were phenotype-positive. None of the 11 family members without the mutation had cardiomyopathy. Genetic analysis combined with a segregation study allowed us to classify this novel MYL2 variation, p.Gly162Glu, as a novel pathogenic mutation leading to a familial form of HCM. Due to the absence of fast in vitro approaches to evaluate the functional impact of missense variants on HCM-causing genes, segregation studies remain, when possible, the easiest approach to evaluate the putative pathogenicity of novel gene variants, particularly missense ones.
PUTATIVE GENE PROMOTER SEQUENCES IN THE CHLORELLA VIRUSES
Fitzgerald, Lisa A.; Boucher, Philip T.; Yanai-Balser, Giane; Suhre, Karsten; Graves, Michael V.; Van Etten, James L.
2008-01-01
Three short (7 to 9 nucleotides) highly conserved nucleotide sequences were identified in the putative promoter regions (150 bp upstream and 50 bp downstream of the ATG translation start site) of three members of the genus Chlorovirus, family Phycodnaviridae. Most of these sequences occurred in similar locations within the defined promoter regions. The sequence and location of the motifs were often conserved among homologous ORFs within the Chlorovirus family. One of these conserved sequences (AATGACA) is predominately associated with genes expressed early in virus replication. PMID:18768195
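The promoter-region definition in the abstract above (150 bp upstream and 50 bp downstream of the ATG start site) and the search for a conserved motif such as AATGACA can be sketched as below. This is only an illustration under stated assumptions: it handles forward-strand genes only, counts the 50 downstream bases starting at the A of the ATG, and uses hypothetical function names and a synthetic sequence; the authors' actual motif-discovery method is not described in the abstract.

```python
def promoter_window(genome, atg_index, upstream=150, downstream=50):
    """Return (start, sequence) for the putative promoter region around an ATG.

    atg_index is the 0-based index of the A in the ATG start codon.
    Assumption: forward strand only, window clipped at the sequence ends.
    """
    start = max(0, atg_index - upstream)
    end = min(len(genome), atg_index + downstream)
    return start, genome[start:end]


def motif_positions(genome, atg_index, motif="AATGACA"):
    """Positions of motif hits relative to the ATG (negative means upstream)."""
    start, window = promoter_window(genome, atg_index)
    hits = []
    i = window.find(motif)
    while i != -1:
        hits.append(start + i - atg_index)
        i = window.find(motif, i + 1)  # step by one to allow overlapping hits
    return hits


# Synthetic example: the motif sits 50 bp upstream of an ATG at index 150.
toy_genome = "C" * 100 + "AATGACA" + "C" * 43 + "ATG" + "C" * 60
hits = motif_positions(toy_genome, atg_index=150)

# A motif more than 150 bp upstream falls outside the window.
far_genome = "AATGACA" + "C" * 200 + "ATG" + "C" * 60
far_hits = motif_positions(far_genome, atg_index=207)
```

Reporting hit positions relative to the ATG makes it straightforward to check whether a motif occurs in similar locations across homologous ORFs, as the abstract reports.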
Lee, Younghee; Han, Seonggyun; Kim, Dongwook; Kim, Dokyoon; Horgousluoglu, Emrin; Risacher, Shannon L; Saykin, Andrew J; Nho, Kwangsik
2018-01-01
Genetic variation in cis-regulatory elements related to splicing machinery and splicing regulatory elements (SREs) results in exon skipping and undesired protein products. We developed a splicing decision model to identify actionable loci among common SNPs for gene regulation. The splicing decision model identified SNPs affecting exon skipping by analyzing sequence-driven alternative splicing (AS) models and by scanning the genome for regions with putative SRE motifs. We used data from non-Hispanic Caucasian participants with neuroimaging and fluid biomarkers for Alzheimer's disease (AD) and identified 17,088 common exonic SNPs affecting exon skipping. GWAS identified one SNP (rs1140317) in HLA-DQB1 as significantly associated with entorhinal cortical thickness, an AD neuroimaging biomarker, after controlling for multiple testing. Further analysis revealed that rs1140317 was significantly associated with brain amyloid-β deposition (PET and CSF). HLA-DQB1 is an essential immune gene and may regulate AS, thereby contributing to AD pathology. SREs may hold potential as novel therapeutic targets for AD.
Catania, Francesco; Lynch, Michael
2010-05-04
In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.
Enguita, Francisco J.; Costa, Marina C.; Fusco-Almeida, Ana Marisa; Mendes-Giannini, Maria José; Leitão, Ana Lúcia
2016-01-01
Invasive fungal infections are an increasing health problem. The intrinsic complexity of pathogenic fungi and the unmet clinical need for new and more effective treatments require a detailed knowledge of the infection process. During infection, fungal pathogens are able to trigger a specific transcriptional program in their host cells. The detailed knowledge of this transcriptional program will allow for a better understanding of the infection process and consequently will help in the future design of more efficient therapeutic strategies. Simultaneous transcriptomic study of pathogen and host by high-throughput sequencing (dual RNA-seq) is an unbiased protocol to understand the intricate regulatory networks underlying the infectious process. This protocol is starting to be applied to the study of the interactions between fungal pathogens and their hosts. To date, our knowledge of the molecular basis of infection for fungal pathogens is still very limited, and the putative role of regulatory players such as non-coding RNAs or epigenetic factors remains elusive. The wider application of high-throughput transcriptomics in the near future will help to understand the fungal mechanisms for colonization and survival, as well as to characterize the molecular responses of the host cell against a fungal infection. PMID:29376924
Hamada, K; Gleason, S L; Levi, B Z; Hirschfeld, S; Appella, E; Ozato, K
1989-11-01
Transcription of major histocompatibility complex (MHC) class I genes is regulated by the conserved MHC class I regulatory element (CRE). The CRE has two factor-binding sites, region I and region II, both of which elicit enhancer function. By screening a mouse lambda gt 11 library with the CRE as a probe, we isolated a cDNA clone that encodes a protein capable of binding to region II of the CRE. This protein, H-2RIIBP (H-2 region II binding protein), bound to the native region II sequence, but not to other MHC cis-acting sequences or to mutant region II sequences, similar to the naturally occurring region II factor in mouse cells. The deduced amino acid sequence of H-2RIIBP revealed two putative zinc fingers homologous to the DNA-binding domain of steroid/thyroid hormone receptors. Although sequence similarity in other regions was minimal, H-2RIIBP has apparent modular domains characteristic of the nuclear hormone receptors. Further analyses showed that both H-2RIIBP and the natural region II factor bind to the estrogen response element (ERE) of the vitellogenin A2 gene. The ERE is composed of a palindrome, and half of this palindrome resembles the region II binding site of the MHC CRE. These results indicate that H-2RIIBP (i) is a member of the superfamily of nuclear hormone receptors and (ii) may regulate not only MHC class I genes but also genes containing the ERE and related sequences. Sequences homologous to the H-2RIIBP gene are widely conserved in the animal kingdom. H-2RIIBP mRNA is expressed in many mouse tissues, in agreement with the distribution of the natural region II factor.
RNA sequencing uncovers antisense RNAs and novel small RNAs in Streptococcus pyogenes
Le Rhun, Anaïs; Beer, Yan Yan; Reimegård, Johan; Chylinski, Krzysztof; Charpentier, Emmanuelle
2016-01-01
ABSTRACT Streptococcus pyogenes is a human pathogen responsible for a wide spectrum of diseases ranging from mild to life-threatening infections. During the infectious process, the temporal and spatial expression of pathogenicity factors is tightly controlled by a complex network of protein and RNA regulators acting in response to various environmental signals. Here, we focus on the class of small RNA regulators (sRNAs) and present the first complete analysis of sRNA sequencing data in S. pyogenes. In the SF370 clinical isolate (M1 serotype), we identified 197 and 428 putative regulatory RNAs by visual inspection and bioinformatics screening of the sequencing data, respectively. Only 35 from the 197 candidates identified by visual screening were assigned a predicted function (T-boxes, ribosomal protein leaders, characterized riboswitches or sRNAs), indicating how little is known about sRNA regulation in S. pyogenes. By comparing our list of predicted sRNAs with previous S. pyogenes sRNA screens using bioinformatics or microarrays, 92 novel sRNAs were revealed, including antisense RNAs that are for the first time shown to be expressed in this pathogen. We experimentally validated the expression of 30 novel sRNAs and antisense RNAs. We show that the expression profile of 9 sRNAs including 2 predicted regulatory elements is affected by the endoribonucleases RNase III and/or RNase Y, highlighting the critical role of these enzymes in sRNA regulation. PMID:26580233
Bousquet, François; Nojima, Tetsuya; Houot, Benjamin; Chauvel, Isabelle; Chaudy, Sylvie; Dupas, Stéphane; Yamamoto, Daisuke; Ferveur, Jean-François
2012-01-01
Animals often use sex pheromones for mate choice and reproduction. As for other signals, the genetic control of the emission and perception of sex pheromones must be tightly coadapted, and yet we still have no worked-out example of how these two aspects interact. Most models suggest that emission and perception rely on separate genetic control. We have identified a Drosophila melanogaster gene, desat1, that is involved in both the emission and the perception of sex pheromones. To explore the mechanism whereby these two aspects of communication interact, we investigated the relationship between the molecular structure, tissue-specific expression, and pheromonal phenotypes of desat1. We characterized the five desat1 transcripts—all of which yielded the same desaturase protein—and constructed transgenes with the different desat1 putative regulatory regions. Each region was used to target reporter transgenes with either (i) the fluorescent GFP marker to reveal desat1 tissue expression, or (ii) the desat1 RNAi sequence to determine the effects of genetic down-regulation on pheromonal phenotypes. We found that desat1 is expressed in a variety of neural and nonneural tissues, most of which are involved in reproductive functions. Our results suggest that distinct desat1 putative regulatory regions independently drive the expression in nonneural and in neural cells, such that the emission and perception of sex pheromones are precisely coordinated in this species. PMID:22114190
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Kyung-Mi; Yun, Ji Ho; Lee, Dong Hwa
2015-04-17
We demonstrate that chikusetsusaponin IVa methyl ester (CME), a triterpenoid saponin from the root of Achyranthes japonica, has an anticancer activity. We investigate its molecular mechanism in depth in HCT116 cells. CME reduces the amount of β-catenin in the nucleus and inhibits the binding of β-catenin to specific DNA sequences (TCF binding elements, TBE) in target gene promoters. Thus, CME appears to decrease the expression of cell cycle regulatory proteins such as Cyclin D1, as a representative target for β-catenin, as well as CDK2 and CDK4. As a result of the decrease of the cell cycle regulatory proteins, CME inhibits cell proliferation by arresting the cell cycle at the G0/G1 phase. Therefore, we suggest that CME as a novel Wnt/β-catenin inhibitor can be a putative agent for the treatment of colorectal cancers. Highlights: • CME inhibits cell proliferation in HCT116 cells. • CME increases cell cycle arrest at G0/G1 phase and apoptosis. • CME attenuates cyclin D1 and regulates cell cycle regulatory proteins. • CME inhibits β-catenin translocation to nucleus.
Flot, Jean-François; Tillier, Simon
2007-10-15
The complete mitochondrial genomes of two individuals attributed to different morphospecies of the scleractinian coral genus Pocillopora have been sequenced. Both genomes, respectively 17,415 and 17,422 nt long, share the presence of a previously undescribed ORF encoding a putative protein made up of 302 amino acids and of unknown function. Surprisingly, this ORF turns out to be the second most variable region of the mitochondrial genome (1% nucleotide sequence difference between the two individuals) after the putative control region (1.5% sequence difference). Except for the presence of this ORF and for the location of the putative control region, the mitochondrial genome of Pocillopora is organized in a fashion similar to the other scleractinian coral genomes published to date. For the first time in a cnidarian, a putative second origin of replication is described based on its secondary structure similar to the stem-loop structure of O(L), the origin of L-strand replication in vertebrates.
Sánchez-García, Ana Belén; Ibáñez, Sergio; Cano, Antonio; Acosta, Manuel; Pérez-Pérez, José Manuel
2018-01-01
Understanding the functional basis of auxin homeostasis requires knowledge about auxin biosynthesis, auxin transport and auxin catabolism genes, which is not always directly available despite the recent whole-genome sequencing of many plant species. Through sequence homology searches and phylogenetic analyses on a selection of 11 plant species with high-quality genome annotation, we identified the putative gene homologs involved in auxin biosynthesis, auxin catabolism and auxin transport pathways in carnation (Dianthus caryophyllus L.). To deepen our knowledge of the regulatory events underlying auxin-mediated adventitious root formation in carnation stem cuttings, we used RNA-sequencing data to confirm the expression profiles of some auxin homeostasis genes during the rooting of two carnation cultivars with different rooting behaviors. We also confirmed the presence of several auxin-related metabolites in the stem cutting tissues. Our findings offer a comprehensive overview of auxin homeostasis genes in carnation and provide a solid foundation for further experiments investigating the role of auxin homeostasis in the regulation of adventitious root formation in carnation.
PMID:29709027
Complete Genome Sequence of a Putative Densovirus of the Asian Citrus Psyllid, Diaphorina citri.
Nigg, Jared C; Nouri, Shahideh; Falk, Bryce W
2016-07-28
Here, we report the complete genome sequence of a putative densovirus of the Asian citrus psyllid, Diaphorina citri. Diaphorina citri densovirus (DcDNV) was originally identified through metagenomics, and here, we obtained the complete nucleotide sequence using PCR-based approaches. Phylogenetic analysis places DcDNV between viruses of the Ambidensovirus and Iteradensovirus genera.
Kirm, Benjamin; Magdevska, Vasilka; Tome, Miha; Horvat, Marinka; Karničar, Katarina; Petek, Marko; Vidmar, Robert; Baebler, Spela; Jamnik, Polona; Fujs, Štefan; Horvat, Jaka; Fonovič, Marko; Turk, Boris; Gruden, Kristina; Petković, Hrvoje; Kosec, Gregor
2013-12-17
Erythromycin is a medically important antibiotic, biosynthesized by the actinomycete Saccharopolyspora erythraea. Genes encoding erythromycin biosynthesis are organized in a gene cluster, spanning over 60 kbp of DNA. Most often, gene clusters encoding biosynthesis of secondary metabolites contain regulatory genes. In contrast, the erythromycin gene cluster does not contain regulatory genes and regulation of its biosynthesis has therefore remained poorly understood, which has for a long time limited genetic engineering approaches for erythromycin yield improvement. We used a comparative proteomic approach to screen for potential regulatory proteins involved in erythromycin biosynthesis. We have identified a putative regulatory protein SACE_5599 which shows significantly higher levels of expression in an erythromycin high-producing strain, compared to the wild type S. erythraea strain. SACE_5599 is a member of an uncharacterized family of putative regulatory genes, located in several actinomycete biosynthetic gene clusters. Importantly, increased expression of SACE_5599 was observed in the complex fermentation medium and at controlled bioprocess conditions, simulating a high-yield industrial fermentation process in the bioreactor. Inactivation of SACE_5599 in the high-producing strain significantly reduced erythromycin yield, in addition to drastically decreasing sporulation intensity of the SACE_5599-inactivated strains when cultivated on ABSM4 agar medium. In contrast, constitutive overexpression of SACE_5599 in the wild type NRRL23338 strain resulted in an increase of erythromycin yield by 32%. Similar yield increase was also observed when we overexpressed the bldD gene, a previously identified regulator of erythromycin biosynthesis, thereby for the first time revealing its potential for improving erythromycin biosynthesis. SACE_5599 is the second putative regulatory gene to be identified in S. erythraea which has positive influence on erythromycin yield. 
Like bldD, SACE_5599 is involved in morphological development of S. erythraea, suggesting a very close relationship between secondary metabolite biosynthesis and morphological differentiation in this organism. While the mode of action of SACE_5599 remains to be elucidated, the manipulation of this gene clearly shows potential for improvement of erythromycin production in S. erythraea in industrial setting. We have also demonstrated the applicability of the comparative proteomics approach for identifying new regulatory elements involved in biosynthesis of secondary metabolites in industrial conditions.
PMID:24341557
Lange, Karen I.; Heinrichs, Jeffrey; Cheung, Karen; Srayko, Martin
2013-01-01
Summary Protein phosphorylation and dephosphorylation is a key mechanism for the spatial and temporal regulation of many essential developmental processes and is especially prominent during mitosis. The multi-subunit protein phosphatase 2A (PP2A) enzyme plays an important, yet poorly characterized role in dephosphorylating proteins during mitosis. PP2As are heterotrimeric complexes comprising a catalytic, structural, and regulatory subunit. Regulatory subunits are mutually exclusive and determine subcellular localization and substrate specificity of PP2A. At least 3 different classes of regulatory subunits exist (termed B, B′, B″) but there is no obvious similarity in primary sequence between these classes. Therefore, it is not known how these diverse regulatory subunits interact with the same holoenzyme to facilitate specific PP2A functions in vivo. The B″ family of regulatory subunits is the least understood because these proteins lack conserved structural domains. RSA-1 (regulator of spindle assembly) is a regulatory B″ subunit required for mitotic spindle assembly in Caenorhabditis elegans. In order to address how B″ subunits interact with the PP2A core enzyme, we focused on a conditional allele, rsa-1(or598ts), and determined that this mutation specifically disrupts the protein interaction between RSA-1 and the PP2A structural subunit, PAA-1. Through genetic screening, we identified a putative interface on the PAA-1 structural subunit that interacts with a defined region of RSA-1/B″. In the context of previously published results, these data propose a mechanism of how different PP2A B-regulatory subunit families can bind the same holoenzyme in a mutually exclusive manner, to perform specific tasks in vivo. PMID:23336080
Hutchins, Elizabeth D; Eckalbar, Walter L; Wolter, Justin M; Mangone, Marco; Kusumi, Kenro
2016-05-05
Lizards are evolutionarily the most closely related vertebrates to humans that can lose and regrow an entire appendage. Regeneration in lizards involves differential expression of hundreds of genes that regulate wound healing, musculoskeletal development, hormonal response, and embryonic morphogenesis. While microRNAs are able to regulate large groups of genes, their role in lizard regeneration has not been investigated. MicroRNA sequencing of green anole lizard (Anolis carolinensis) regenerating tail and associated tissues revealed 350 putative novel and 196 known microRNA precursors. Eleven microRNAs were differentially expressed between the regenerating tail tip and base during maximum outgrowth (25 days post autotomy), including miR-133a, miR-133b, and miR-206, which have been reported to regulate regeneration and stem cell proliferation in other model systems. Three putative novel differentially expressed microRNAs were identified in the regenerating tail tip. Differentially expressed microRNAs were identified in the regenerating lizard tail, including known regulators of stem cell proliferation. The identification of 3 putative novel microRNAs suggests that regulatory networks, either conserved in vertebrates and previously uncharacterized or specific to lizards, are involved in regeneration. These findings suggest that differential regulation of microRNAs may play a role in coordinating the timing and expression of hundreds of genes involved in regeneration.
Selection of Streptococcus lactis Mutants Defective in Malolactic Fermentation
Renault, Pierre P.; Heslot, Henri
1987-01-01
An enrichment medium and a new sensitive medium were developed to detect malolactic variants in different strains of lactic bacteria. Factors such as the concentration of glucose and l-malate, pH level, and the type of indicator dye used are discussed with regard to the kinetics of malic acid conversion to lactic acid. Use of these media allowed a rapid and easier screening of mutagenized streptococcal cells unable to ferment l-malate. A collection of malolactic-negative mutants of Streptococcus lactis induced by UV, nitrosoguanidine, or transposonal mutagenesis were characterized. The results showed that several mutants were apparently defective in the structural gene of malolactic enzyme, whereas others contained mutations which may either inactivate a putative permease or affect a regulatory sequence. PMID:16347282
Sri, Tanu; Mayee, Pratiksha; Singh, Anandita
2015-09-01
Whole genome sequence analyses allow unravelling of the evolutionary consequences of the meso-triplication event in Brassicaceae (∼14-20 million years ago (MYA)), such as differential gene fractionation and diversification in homeologous sub-genomes. This study presents a simple gene-centric approach involving microsynteny and natural genetic variation analysis for understanding SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) homeolog evolution in Brassica. Analysis of microsynteny in Brassica rapa homeologous regions containing SOC1 revealed differential gene fractionation correlating to the reported fractionation status of the sub-genomes of origin, viz. least fractionated (LF), moderately fractionated 1 (MF1) and most fractionated (MF2), respectively. Screening 18 cultivars of 6 Brassica species led to the identification of 8 genomic and 27 transcript variants of SOC1, including splice-forms. Co-occurrence of both interrupted and intronless SOC1 genes was detected in a few Brassica species. In silico analysis characterised Brassica SOC1 as a MADS intervening, K-box, C-terminal (MIKC(C)) transcription factor, with highly conserved MADS and I domains relative to the K-box and C-terminal domains. Phylogenetic analyses and multiple sequence alignments depicting a shared pattern of silent/non-silent mutations assigned Brassica SOC1 homologs into groups based on shared diploid base genome. In addition, a sub-genome structure in uncharacterised Brassica genomes was inferred. Expression analysis of putative MF2 and LF (Brassica diploid base genome A (AA)) sub-genome-specific SOC1 homeologs of Brassica juncea revealed a near identical expression pattern. However, the MF2-specific homeolog exhibited significantly higher expression, implying regulatory diversification. In conclusion, evidence for polyploidy-induced sequence and regulatory evolution in Brassica SOC1 is presented, with differential homeolog expression implicated in functional diversification.
Regulatory role of XynR (YagI) in catabolism of xylonate in Escherichia coli K-12.
Shimada, Tomohiro; Momiyama, Eri; Yamanaka, Yuki; Watanabe, Hiroki; Yamamoto, Kaneyoshi; Ishihama, Akira
2017-12-01
The genome of Escherichia coli K-12 contains ten cryptic phages, altogether constituting about 3.6% of the genome in sequence. Among more than 200 predicted genes in these cryptic phages, 14 putative transcription factor (TF) genes exist, but their regulatory functions remain unidentified. As an initial attempt to make a breakthrough for understanding the regulatory roles of cryptic phage-encoded TFs, we tried to identify the regulatory function of CP4-6 cryptic prophage-encoded YagI with unknown function. After SELEX screening, YagI was found to bind mainly at a single site within the spacer of bidirectional transcription units, yagA (encoding another uncharacterized TF) and yagEF (encoding 2-keto-3-deoxy gluconate aldolase, and dehydratase, respectively) within this prophage region. YagEF enzymes are involved in the catabolism of xylose downstream from xylonate. We then designated YagI as XynR (regulator of xylonate catabolism), one of the rare single-target TFs. In agreement with this predicted regulatory function, the activity of XynR was suggested to be controlled by xylonate. Even though low-affinity binding sites of XynR were identified in the E. coli K-12 genome, they all were inside open reading frames, implying that the regulation network of XynR is still fixed within the CP4-6 prophage without significant influence over the host E. coli K-12.
Moskvin, Oleg V; Bolotin, Dmitry; Wang, Andrew; Ivanov, Pavel S; Gomelsky, Mark
2011-02-01
We present Rhodobase, a web-based meta-analytical tool for analysis of transcriptional regulation in a model anoxygenic photosynthetic bacterium, Rhodobacter sphaeroides. The gene association meta-analysis is based on the pooled data from 100 R. sphaeroides whole-genome DNA microarrays. Gene-centric regulatory networks were visualized using the StarNet approach (Jupiter, D.C., VanBuren, V., 2008. A visual data mining tool that facilitates reconstruction of transcription regulatory networks. PLoS ONE 3, e1717) with several modifications. We developed a means to identify and visualize operons and superoperons. We designed a framework for the cross-genome search for transcription factor binding sites that takes into account the high GC-content and oligonucleotide usage profile characteristic of the R. sphaeroides genome. To facilitate reconstruction of directional relationships between co-regulated genes, we screened upstream sequences (-400 to +20 bp from start codons) of all genes for putative binding sites of bacterial transcription factors using a self-optimizing search method developed here. To test performance of the meta-analysis tools and transcription factor site predictions, we reconstructed selected nodes of the R. sphaeroides transcription factor-centric regulatory matrix. The test revealed regulatory relationships that correlate well with the experimentally derived data. The database of transcriptional profile correlations, the network visualization engine and the optimized search engine for transcription factor binding sites analysis are available at http://rhodobase.org.
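The upstream window mentioned in this abstract (-400 to +20 bp from the start codon) can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the Rhodobase implementation: the function names are hypothetical, coordinates are assumed to be 0-based indices on the forward strand, and the self-optimizing binding-site search itself is not reproduced.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_window(genome, start, strand, upstream=400, downstream=20):
    """Extract the region from -upstream to +downstream around a start codon.

    `start` is the 0-based forward-strand index of the first base of the
    start codon. The result is returned in the gene's own orientation and
    is clipped at the ends of the sequence.
    """
    if strand == "+":
        lo = max(0, start - upstream)
        return genome[lo:start + downstream]
    if strand == "-":
        # On the minus strand, "upstream" lies at higher forward coordinates.
        lo = max(0, start - downstream + 1)
        return reverse_complement(genome[lo:start + upstream + 1])
    raise ValueError("strand must be '+' or '-'")


# Toy sequence (hypothetical): 400 bases, then a start codon, then filler.
toy_genome = "A" * 400 + "ATG" + "C" * 50
plus_window = upstream_window(toy_genome, 400, "+")
print(len(plus_window), plus_window[400:403])
```

Returning minus-strand windows already reverse-complemented means a single motif scanner can treat every window the same way, regardless of the gene's orientation.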
2011-01-01
Background Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii, the diploid source of the wheat D genome, with a genome size of 4.02 Gb, of which 90% is repetitive sequence. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 were sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome.
To assess the false positive SNP discovery rate, DNA containing putative SNPs was amplified by PCR from AL8/78 and AS75 and resequenced with the ABI 3730 xl. In a sample of 302 randomly selected putative SNPs, 84.0% in gene regions, 88.0% in repeat junctions, and 81.3% in uncharacterized regions were validated. Conclusion An annotation-based genome-wide SNP discovery pipeline for NGS platforms was developed. The pipeline is suitable for SNP discovery in genomic libraries of complex genomes and does not require a reference genome sequence. The pipeline is applicable to all current NGS platforms, provided that at least one such platform generates relatively long reads. The pipeline package, AGSNP, and the discovered 497,118 Ae. tauschii SNPs can be accessed at (http://avena.pw.usda.gov/wheatD/agsnp.shtml). PMID:21266061
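The SNP counts and validation rates reported in this abstract can be cross-checked with a few lines of Python. The three category counts do sum to the 497,118 SNPs quoted. The weighted validation estimate below is only an illustrative extrapolation: it assumes the per-category rates from the 302-SNP sample apply to each whole category, which the abstract does not claim.

```python
# Putative SNP counts per category, as reported in the abstract.
snp_counts = {
    "gene sequences": 195_631,
    "uncharacterized single-copy regions": 155_580,
    "repeat junctions": 145_907,
}

# Validation rates from the resequenced sample, as reported in the abstract.
validated_fraction = {
    "gene sequences": 0.840,
    "uncharacterized single-copy regions": 0.813,
    "repeat junctions": 0.880,
}

total_snps = sum(snp_counts.values())

# Assumption: sample rates generalize to each category (illustration only).
expected_valid = sum(snp_counts[k] * validated_fraction[k] for k in snp_counts)
weighted_rate = expected_valid / total_snps

print(total_snps)
print(round(weighted_rate, 3))
```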
2012-01-01
Background Salmonella enterica serotype Typhimurium produces surface-associated fimbriae that facilitate adherence of the bacteria to a variety of cells and tissues. Type 1 fimbriae with binding specificity to mannose residues are the most commonly found fimbrial type. In vitro, static-broth culture favors the growth of S. Typhimurium with type 1 fimbriae, whereas non-type 1 fimbriate bacteria are obtained by culture on solid-agar media. Previous studies demonstrated that the phenotypic expression of type 1 fimbriae is the result of the interaction and cooperation of the regulatory genes fimZ, fimY, fimW, and fimU within the fim gene cluster. Genome sequencing revealed a novel gene, stm0551, located between fimY and fimW that encodes an 11.4-kDa putative phosphodiesterase specific for the bacterial second messenger cyclic-diguanylate monophosphate (c-di-GMP). The role of stm0551 in the regulation of type 1 fimbriae in S. Typhimurium remains unclear. Results A stm0551-deleted strain constructed by allelic exchange constitutively produced type 1 fimbriae in both static-broth and solid-agar medium conditions. Quantitative RT-PCR revealed that expression of the fimbrial major subunit gene, fimA, and one of the regulatory genes, fimZ, were comparably increased in the stm0551-deleted strain compared with those of the parental strain when grown on the solid-agar medium, a condition that normally inhibits expression of type 1 fimbriae. Following transformation with a plasmid possessing the coding sequence of stm0551, expression of fimA and fimZ decreased in the stm0551 mutant strain in both culture conditions, whereas transformation with the control vector pACYC184 relieved this repression. A purified STM0551 protein exhibited a phosphodiesterase activity in vitro while a point mutation in the putative EAL domain, substituting glutamic acid (E) with alanine (A), of STM0551 or a FimY protein abolished this activity.
Conclusions The finding that the stm0551 gene plays a negative regulatory role in the regulation of type 1 fimbriae in S. Typhimurium has not been reported previously. The possibility that degradation of c-di-GMP is a key step in the regulation of type 1 fimbriae warrants further investigation. PMID:22716649
Gut Microbiome and Putative Resistome of Inca and Italian Nobility Mummies
Santiago-Rodriguez, Tasha M.; Luciani, Stefania; Toranzos, Gary A.; Marota, Isolina; Giuffra, Valentina; Cano, Raul J.
2017-01-01
Little is still known about the microbiome resulting from the process of mummification of the human gut. In the present study, the gut microbiota, genes associated with metabolism, and putative resistome of Inca and Italian nobility mummies were characterized by using high-throughput sequencing. The Italian nobility mummies exhibited a higher bacterial diversity as compared to the Inca mummies when using 16S ribosomal RNA (rRNA) gene amplicon sequencing, but both groups showed bacterial and fungal taxa when using shotgun metagenomic sequencing that may resemble both the thanatomicrobiome and extant human gut microbiomes. Identification of sequences associated with plants, animals, and carbohydrate-active enzymes (CAZymes) may provide further insights into the dietary habits of Inca and Italian nobility mummies. Putative antibiotic-resistance genes in the Inca and Italian nobility mummies support a human gut resistome prior to the antibiotic therapy era. The higher proportion of putative antibiotic-resistance genes in the Inca compared to Italian nobility mummies may support the hypothesis that a greater exposure to the environment may result in a greater acquisition of antibiotic-resistance genes. The present study adds knowledge of the microbiome resulting from the process of mummification of the human gut, insights into ancient dietary habits, and the preserved putative human gut resistome prior to the antibiotic therapy era. PMID:29112136
Gut Microbiome and Putative Resistome of Inca and Italian Nobility Mummies.
Santiago-Rodriguez, Tasha M; Fornaciari, Gino; Luciani, Stefania; Toranzos, Gary A; Marota, Isolina; Giuffra, Valentina; Cano, Raul J
2017-11-07
Little is still known about the microbiome resulting from the process of mummification of the human gut. In the present study, the gut microbiota, genes associated with metabolism, and putative resistome of Inca and Italian nobility mummies were characterized by using high-throughput sequencing. The Italian nobility mummies exhibited a higher bacterial diversity as compared to the Inca mummies when using 16S ribosomal RNA (rRNA) gene amplicon sequencing, but both groups showed bacterial and fungal taxa when using shotgun metagenomic sequencing that may resemble both the thanatomicrobiome and extant human gut microbiomes. Identification of sequences associated with plants, animals, and carbohydrate-active enzymes (CAZymes) may provide further insights into the dietary habits of Inca and Italian nobility mummies. Putative antibiotic-resistance genes in the Inca and Italian nobility mummies support a human gut resistome prior to the antibiotic therapy era. The higher proportion of putative antibiotic-resistance genes in the Inca compared to Italian nobility mummies may support the hypothesis that a greater exposure to the environment may result in a greater acquisition of antibiotic-resistance genes. The present study adds knowledge of the microbiome resulting from the process of mummification of the human gut, insights into ancient dietary habits, and the preserved putative human gut resistome prior to the antibiotic therapy era.
Evolution of UCP1 Transcriptional Regulatory Elements Across the Mammalian Phylogeny
Gaudry, Michael J.; Campbell, Kevin L.
2017-01-01
Uncoupling protein 1 (UCP1) permits non-shivering thermogenesis (NST) when highly expressed in brown adipose tissue (BAT) mitochondria. Exclusive to placental mammals, BAT has commonly been regarded to be advantageous for thermoregulation in hibernators, small-bodied species, and the neonates of larger species. While numerous regulatory control motifs associated with UCP1 transcription have been proposed for murid rodents, it remains unclear whether these are conserved across the eutherian mammal phylogeny and hence essential for UCP1 expression. To address this shortcoming, we conducted a broad comparative survey of putative UCP1 transcriptional regulatory elements in 139 mammals (135 eutherians). We find no evidence for presence of a UCP1 enhancer in monotremes and marsupials, supporting the hypothesis that this control region evolved in a stem eutherian ancestor. We additionally reveal that several putative promoter elements (e.g., CRE-4, CCAAT) identified in murid rodents are not conserved among BAT-expressing eutherians, and together with the putative regulatory region (PRR) and CpG island do not appear to be crucial for UCP1 expression. The specificity and importance of the upTRE, dnTRE, URE1, CRE-2, RARE-2, NBRE, BRE-1, and BRE-2 enhancer elements first described from rats and mice are moreover uncertain as these motifs differ substantially—but generally remain highly conserved—in other BAT-expressing eutherians. Other UCP1 enhancer motifs (CRE-3, PPRE, and RARE-3) as well as the TATA box are also highly conserved in nearly all eutherian lineages with an intact UCP1. While these transcriptional regulatory motifs are generally also maintained in species where this gene is pseudogenized, the loss or degeneration of key basal promoter (e.g., TATA box) and enhancer elements in other UCP1-lacking lineages make it unlikely that the enhancer region is pleiotropic (i.e., co-regulates additional genes). 
Importantly, differential losses of (or mutations within) putative regulatory elements among the eutherian lineages with an intact UCP1 suggest that the transcriptional control of gene expression is not highly conserved in this mammalian clade. PMID:28979209
Systematic analysis and evolution of 5S ribosomal DNA in metazoans.
Vierna, J; Wehner, S; Höner zu Siederdissen, C; Martínez-Lage, A; Marz, M
2013-11-01
Several studies on 5S ribosomal DNA (5S rDNA) have been focused on a subset of the following features in mostly one organism: number of copies, pseudogenes, secondary structure, promoter and terminator characteristics, genomic arrangements, types of non-transcribed spacers and evolution. In this work, we systematically analyzed 5S rDNA sequence diversity in available metazoan genomes, and showed organism-specific and evolutionary-conserved features. Putatively functional sequences (12,766) from 97 organisms allowed us to identify general features of this multigene family in animals. Interestingly, we show that each mammal species has a highly conserved (housekeeping) 5S rRNA type and many variable ones. The genomic organization of 5S rDNA is still under debate. Here, we report the occurrence of several paralog 5S rRNA sequences in 58 of the examined species, and a flexible genome organization of 5S rDNA in animals. We found heterogeneous 5S rDNA clusters in several species, supporting the hypothesis of an exchange of 5S rDNA from one locus to another. A rather high degree of variation of upstream, internal and downstream putative regulatory regions appears to characterize metazoan 5S rDNA. We systematically studied the internal promoters and described three different types of termination signals, as well as variable distances between the coding region and the typical termination signal. Finally, we present a statistical method for detection of linkage among noncoding RNA (ncRNA) gene families. This method showed no evolutionary-conserved linkage among 5S rDNAs and any other ncRNA genes within Metazoa, even though we found 5S rDNA to be linked to various ncRNAs in several clades.
Systematic analysis and evolution of 5S ribosomal DNA in metazoans
Vierna, J; Wehner, S; Höner zu Siederdissen, C; Martínez-Lage, A; Marz, M
2013-01-01
Several studies on 5S ribosomal DNA (5S rDNA) have been focused on a subset of the following features in mostly one organism: number of copies, pseudogenes, secondary structure, promoter and terminator characteristics, genomic arrangements, types of non-transcribed spacers and evolution. In this work, we systematically analyzed 5S rDNA sequence diversity in available metazoan genomes, and showed organism-specific and evolutionary-conserved features. Putatively functional sequences (12 766) from 97 organisms allowed us to identify general features of this multigene family in animals. Interestingly, we show that each mammal species has a highly conserved (housekeeping) 5S rRNA type and many variable ones. The genomic organization of 5S rDNA is still under debate. Here, we report the occurrence of several paralog 5S rRNA sequences in 58 of the examined species, and a flexible genome organization of 5S rDNA in animals. We found heterogeneous 5S rDNA clusters in several species, supporting the hypothesis of an exchange of 5S rDNA from one locus to another. A rather high degree of variation of upstream, internal and downstream putative regulatory regions appears to characterize metazoan 5S rDNA. We systematically studied the internal promoters and described three different types of termination signals, as well as variable distances between the coding region and the typical termination signal. Finally, we present a statistical method for detection of linkage among noncoding RNA (ncRNA) gene families. This method showed no evolutionary-conserved linkage among 5S rDNAs and any other ncRNA genes within Metazoa, even though we found 5S rDNA to be linked to various ncRNAs in several clades. PMID:23838690
Paal, Jürgen; Henselewski, Heike; Muth, Jost; Meksem, Khalid; Menéndez, Cristina M; Salamini, Francesco; Ballvora, Agim; Gebhardt, Christiane
2004-04-01
The endoparasitic root cyst nematode Globodera rostochiensis causes considerable damage in potato cultivation. In the past, major genes for nematode resistance have been introgressed from related potato species into cultivars. Elucidating the molecular basis of resistance will contribute to the understanding of nematode-plant interactions and assist in breeding nematode-resistant cultivars. The Gro1 resistance locus to G. rostochiensis on potato chromosome VII co-localized with a resistance-gene-like (RGL) DNA marker. This marker was used to isolate from genomic libraries 15 members of a closely related candidate gene family. Analysis of inheritance, linkage mapping, and sequencing reduced the number of candidate genes to three. Complementation analysis by stable potato transformation showed that the gene Gro1-4 conferred resistance to G. rostochiensis pathotype Ro1. Gro1-4 encodes a protein of 1136 amino acids that contains Toll-interleukin 1 receptor (TIR), nucleotide-binding (NB), leucine-rich repeat (LRR) homology domains and a C-terminal domain with unknown function. The deduced Gro1-4 protein differed by 29 amino acid changes from susceptible members of the Gro1 gene family. Sequence characterization of 13 members of the Gro1 gene family revealed putative regulatory elements and a variable microsatellite in the promoter region, insertion of a retrotransposon-like element in the first intron, and a stop codon in the NB coding region of some genes. Sequence analysis of RT-PCR products showed that Gro1-4 is expressed, among other members of the family including putative pseudogenes, in non-infected roots of nematode-resistant plants. RT-PCR also demonstrated that members of the Gro1 gene family are expressed in most potato tissues.
Ewulonu, U K; Snyder, L; Silver, L M; Schimenti, J C
1996-03-01
Transgenic mice were generated to localize essential promoter elements in the mouse testis-expressed Tcp-10 genes. These genes are expressed exclusively in male germ cells, and exhibit a diffuse range of transcriptional start sites, possibly due to the absence of a TATA box. A series of transgene constructs containing different amounts of 5' flanking DNA revealed that all sequences necessary for appropriate temporal and tissue-specific transcription of Tcp-10 reside between positions -1 to -973. All transgenic animals containing these sequences expressed a chimeric transgene at high levels, in a pattern that paralleled the endogenous genes. These experiments further defined a 227 bp fragment from -746 to -973 that was absolutely essential for expression. In a gel-shift assay, this 227-bp fragment bound nuclear protein from testis, but not other tissues, to yield two retarded bands. Sequence analysis of this fragment revealed a half-site for the AP-2 transcription factor recognition sequence. Gel shift assays using native or mutant oligonucleotides demonstrated that the putative AP-2 recognition sequence was essential for generating the retarded bands. Since the binding activity is testis-specific, but AP-2 expression is not exclusive to male germ cells, it is possible that transcription of Tcp-10 requires interaction between AP-2 and a germ cell-specific transcription factor.
Ward, Lucas D; Kellis, Manolis
2016-01-04
More than 90% of common variants associated with complex traits do not affect proteins directly, but instead the circuits that control gene expression. This has increased the urgency of understanding the regulatory genome as a key component for translating genetic results into mechanistic insights and ultimately therapeutics. To address this challenge, we developed HaploReg (http://compbio.mit.edu/HaploReg) to aid the functional dissection of genome-wide association study (GWAS) results, the prediction of putative causal variants in haplotype blocks, the prediction of likely cell types of action, and the prediction of candidate target genes by systematic mining of comparative, epigenomic and regulatory annotations. Since first launching the website in 2011, we have greatly expanded HaploReg, increasing the number of chromatin state maps to 127 reference epigenomes from ENCODE 2012 and Roadmap Epigenomics, incorporating regulator binding data, expanding regulatory motif disruption annotations, and integrating expression quantitative trait locus (eQTL) variants and their tissue-specific target genes from GTEx, Geuvadis, and other recent studies. We present these updates as HaploReg v4, and illustrate a use case of HaploReg for attention deficit hyperactivity disorder (ADHD)-associated SNPs with putative brain regulatory mechanisms. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Systems analysis identifies miR-29b regulation of invasiveness in melanoma.
Andrews, Miles C; Cursons, Joseph; Hurley, Daniel G; Anaka, Matthew; Cebon, Jonathan S; Behren, Andreas; Crampin, Edmund J
2016-11-16
In many cancers, microRNAs (miRs) contribute to metastatic progression by modulating phenotypic reprogramming processes such as epithelial-mesenchymal plasticity. This can be driven by miRs targeting multiple mRNA transcripts, inducing regulated changes across large sets of genes. The miR-target databases TargetScan and DIANA-microT predict putative relationships by examining sequence complementarity between miRs and mRNAs. However, it remains a challenge to identify which miR-mRNA interactions are active at endogenous expression levels, and of biological consequence. We developed a workflow to integrate TargetScan and DIANA-microT predictions into the analysis of data-driven associations calculated from transcript abundance (RNASeq) data, specifically the mutual information and Pearson's correlation metrics. We use this workflow to identify putative relationships of miR-mediated mRNA repression with strong support from both lines of evidence. Applying this approach systematically to a large, published collection of unique melanoma cell lines - the Ludwig Melbourne melanoma (LM-MEL) cell line panel - we identified putative miR-mRNA interactions that may contribute to invasiveness. This guided the selection of interactions of interest for further in vitro validation studies. Several miR-mRNA regulatory relationships supported by TargetScan and DIANA-microT demonstrated differential activity across cell lines of varying matrigel invasiveness. Strong negative statistical associations for these putative regulatory relationships were consistent with target mRNA inhibition by the miR, and suggest that differential activity of such miR-mRNA relationships contribute to differences in melanoma invasiveness. Many of these relationships were reflected across the skin cutaneous melanoma TCGA dataset, indicating that these observations also show graded activity across clinical samples. 
Several of these miRs are implicated in cancer progression (miR-211, -340, -125b, -221, and -29b). The specific role for miR-29b-3p in melanoma has not been well studied. We experimentally validated the predicted miR-29b-3p regulation of LAMC1 and PPIC and LASP1, and show that dysregulation of miR-29b-3p or these mRNA targets can influence cellular invasiveness in vitro. This analytic strategy provides a comprehensive, systems-level approach to identify miR-mRNA regulation in high-throughput cancer data, identifies novel putative interactions with functional phenotypic relevance, and can be used to direct experimental resources for subsequent experimental validation. Computational scripts are available: http://github.com/uomsystemsbiology/LMMEL-miR-miner.
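The two lines of evidence described in this abstract (sequence-based target predictions plus mutual information and Pearson's correlation computed from transcript abundance) can be illustrated with a minimal sketch. It is not the authors' published workflow: the function names, the equal-width binning used for the mutual-information estimate, the thresholds `r_max` and `mi_min`, and the toy identifiers `miR-x`, `GENE1`, and `GENE2` are all assumptions made for illustration.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mutual_information(x, y, bins=2):
    """Plug-in mutual information (in bits) after equal-width binning."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against constant profiles
        return [min(int((v - lo) / width), bins - 1) for v in values]

    dx, dy = discretize(x), discretize(y)
    n = len(x)
    pxy, px, py = Counter(zip(dx, dy)), Counter(dx), Counter(dy)
    return sum(
        (c / n) * math.log2((c / n) / ((px[i] / n) * (py[j] / n)))
        for (i, j), c in pxy.items()
    )


def supported_pairs(predicted, expr_mir, expr_mrna, r_max=-0.5, mi_min=0.5):
    """Keep predicted miR-mRNA pairs that also show a strong negative
    association in the abundance data (hypothetical thresholds)."""
    kept = []
    for mir, mrna in predicted:
        r = pearson(expr_mir[mir], expr_mrna[mrna])
        mi = mutual_information(expr_mir[mir], expr_mrna[mrna])
        if r <= r_max and mi >= mi_min:
            kept.append((mir, mrna, r, mi))
    return kept


# Toy abundance profiles across six cell lines (made-up numbers).
expr_mir = {"miR-x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}
expr_mrna = {
    "GENE1": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],   # falls as miR-x rises
    "GENE2": [3.0, 3.1, 2.9, 3.0, 3.1, 2.9],   # essentially flat
}
predicted = [("miR-x", "GENE1"), ("miR-x", "GENE2")]  # stand-in for database predictions
kept = supported_pairs(predicted, expr_mir, expr_mrna)
```

In this toy case only the `miR-x`/`GENE1` pair survives, because it is both a predicted target and strongly anti-correlated; a real analysis would need a more careful mutual-information estimator and multiple-testing control.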
Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium
2010-01-01
Background In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties in generating reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. Results By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes. PMID:20441586
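The screening step in this abstract (checking 3' UTRs for a predefined list of 8-mers) can be sketched as follows. This is only an illustrative sketch, not the authors' procedure: the function names, the toy UTR sequences, and the two candidate 8-mers are hypothetical, and the abstract does not state which motif was found.

```python
from collections import Counter


def kmers(seq, k=8):
    """Yield every overlapping k-mer of a DNA sequence (upper-cased)."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def screen_utrs(utrs, candidates, k=8):
    """Count, for each candidate k-mer, how many UTRs contain it at least once."""
    hits = Counter()
    for seq in utrs.values():
        present = set(kmers(seq, k)) & candidates
        hits.update(present)
    return hits


# Toy 3' UTR sequences keyed by gene name (made-up data).
utrs = {
    "geneA": "AAATGTAAATATTTT",
    "geneB": "CCTGTAAATAGG",
    "geneC": "GGGGGGGGGGGG",
}
# Hypothetical candidate 8-mers standing in for the mammalian-derived list.
candidates = {"TGTAAATA", "AATAAAGG"}

hits = screen_utrs(utrs, candidates)
```

Counting each UTR at most once per 8-mer keeps long UTRs with repeated copies from dominating the tally; comparing the resulting counts between gene sets (for example ribosomal versus other genes) would be the next step.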
Identification and subcellular localization of porcine deltacoronavirus accessory protein NS6
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fang, Puxian; Fang, Liurong; Liu, Xiaorong
Porcine deltacoronavirus (PDCoV) is an emerging swine enteric coronavirus. Accessory proteins are genus-specific for coronavirus, and two putative accessory proteins, NS6 and NS7, are predicted to be encoded by PDCoV; however, this remains to be confirmed experimentally. Here, we identified the leader-body junction sites of NS6 subgenomic RNA (sgRNA) and found that the actual transcription regulatory sequence (TRS) utilized by NS6 is non-canonical and is located upstream of the predicted TRS. Using the purified NS6 from an Escherichia coli expression system, we obtained two anti-NS6 monoclonal antibodies that could detect the predicted NS6 in cells infected with PDCoV or transfected with NS6-expressing plasmids. Further studies revealed that NS6 is always localized in the cytoplasm of PDCoV-infected cells, mainly co-localizing with the endoplasmic reticulum (ER) and ER-Golgi intermediate compartments, as well as partially with the Golgi apparatus. Together, our results identify the NS6 sgRNA and demonstrate its expression in PDCoV-infected cells. -- Highlights: •The leader-body fusion site of NS6 sgRNA is identified. •NS6 sgRNA uses a non-canonical transcription regulatory sequence (TRS). •NS6 can be expressed in PDCoV-infected cells. •NS6 predominantly localizes to the ER complex and ER-Golgi intermediate compartment.
Two novel genes, fanA and fanB, involved in the biogenesis of K99 fimbriae.
Roosendaal, E; Boots, M; de Graaf, F K
1987-08-11
The nucleotide sequence of the region located transcriptionally upstream of the K99 fimbrial subunit gene (fanC) was determined. Several putative transcription signals and two open reading frames, designated fanA and fanB, became apparent. Frameshift mutations in fanA and fanB reduced K99 fimbriae expression 8-fold and 16-fold, respectively. Complementation of the mutants in trans restored the K99 expression to about 75% of the wild type level, indicating that fanA and fanB code for transacting polypeptides involved in the biogenesis of K99 fimbriae. The fanA and fanB gene products FanA and FanB were not detectable in minicell preparations, indicating that both polypeptides are synthesized in very small amounts. However, in an in vitro DNA directed translation system FanA and FanB could be identified. The deduced amino acid sequences of FanA and FanB showed that both polypeptides contain no signal peptides, indicating a cytoplasmic location. Furthermore, the polypeptides are very hydrophilic, mainly basic, and exhibit remarkable homology to each other and to a regulatory protein (papB) encoded by the pap-operon (1). Some of these features are characteristics of nucleic acid binding proteins, which suggests that FanA and FanB have a regulatory function in the synthesis of FanC and the auxiliary polypeptides FanD-H.
Ogah, Danlami Moses; Iannaccone, Marco; Erhardt, Georg; Di Stasio, Liliana; Cosenza, Gianfranco
2018-01-01
Oxytocin is a neurohypophysial peptide linked to a wide range of biological functions, including milk ejection, temperament and reproduction. Aims of the present study were a) the characterization of the OXT (Oxytocin-neurophysin I) gene and its regulatory regions in Old and New world camelids; b) the investigation of the genetic diversity and the discovery of markers potentially affecting the gene regulation. On average, the gene extends over 814 bp, ranging between 825 bp in dromedary, 811 bp in Bactrian and 810 bp in llama and alpaca. Such difference in size is due to a duplication event of 21 bp in dromedary. The main regulatory elements, including the composite hormone response elements (CHREs), were identified in the promoter, whereas the presence of mature microRNAs binding sequences in the 3’UTR improves the knowledge on the factors putatively involved in the OXT gene regulation, although their specific biological effect still needs to be elucidated. The sequencing of genomic DNA allowed the identification of 17 intraspecific polymorphisms and 69 nucleotide differences among the four species. One of these (MF464535:g.622C>G) is responsible, in alpaca, for the loss of a consensus sequence for the transcription factor SP1. Furthermore, the same SNP falls within a CpG island and it creates a new methylation site, thus opening future possibilities of investigation to verify the influence of the novel allelic variant in the OXT gene regulation. A PCR-RFLP method was set up for the genotyping and the frequency of the allele C was 0.93 in a population of 71 alpacas. The obtained data clarify the structure of OXT gene in domestic camelids and add knowledge to the genetic variability of a genomic region, which has received little investigation so far. These findings open the opportunity for new investigations, including association studies with productive and reproductive traits. PMID:29608621
Pauciullo, Alfredo; Ogah, Danlami Moses; Iannaccone, Marco; Erhardt, Georg; Di Stasio, Liliana; Cosenza, Gianfranco
2018-01-01
Oxytocin is a neurohypophysial peptide linked to a wide range of biological functions, including milk ejection, temperament and reproduction. Aims of the present study were a) the characterization of the OXT (Oxytocin-neurophysin I) gene and its regulatory regions in Old and New world camelids; b) the investigation of the genetic diversity and the discovery of markers potentially affecting the gene regulation. On average, the gene extends over 814 bp, ranging between 825 bp in dromedary, 811 bp in Bactrian and 810 bp in llama and alpaca. Such difference in size is due to a duplication event of 21 bp in dromedary. The main regulatory elements, including the composite hormone response elements (CHREs), were identified in the promoter, whereas the presence of mature microRNAs binding sequences in the 3'UTR improves the knowledge on the factors putatively involved in the OXT gene regulation, although their specific biological effect still needs to be elucidated. The sequencing of genomic DNA allowed the identification of 17 intraspecific polymorphisms and 69 nucleotide differences among the four species. One of these (MF464535:g.622C>G) is responsible, in alpaca, for the loss of a consensus sequence for the transcription factor SP1. Furthermore, the same SNP falls within a CpG island and it creates a new methylation site, thus opening future possibilities of investigation to verify the influence of the novel allelic variant in the OXT gene regulation. A PCR-RFLP method was set up for the genotyping and the frequency of the allele C was 0.93 in a population of 71 alpacas. The obtained data clarify the structure of OXT gene in domestic camelids and add knowledge to the genetic variability of a genomic region, which has received little investigation so far. These findings open the opportunity for new investigations, including association studies with productive and reproductive traits.
Melendrez, Melanie C.; Lange, Rachel K.; Cohan, Frederick M.; Ward, David M.
2011-01-01
Previous research has shown that sequences of 16S rRNA genes and 16S-23S rRNA internal transcribed spacer regions may not have enough genetic resolution to define all ecologically distinct Synechococcus populations (ecotypes) inhabiting alkaline, siliceous hot spring microbial mats. To achieve higher molecular resolution, we studied sequence variation in three protein-encoding loci sampled by PCR from 60°C and 65°C sites in the Mushroom Spring mat (Yellowstone National Park, WY). Sequences were analyzed using the ecotype simulation (ES) and AdaptML algorithms to identify putative ecotypes. Between 4 and 14 times more putative ecotypes were predicted from variation in protein-encoding locus sequences than from variation in 16S rRNA and 16S-23S rRNA internal transcribed spacer sequences. The number of putative ecotypes predicted depended on the number of sequences sampled and the molecular resolution of the locus. Chao estimates of diversity indicated that few rare ecotypes were missed. Many ecotypes hypothesized by sequence analyses were different in their habitat specificities, suggesting different adaptations to temperature or other parameters that vary along the flow channel. PMID:21169433
Nouri, Shahideh; Salem, Nidà; Falk, Bryce W
2016-07-21
We present here the complete nucleotide sequence and genome organization of a novel putative RNA virus identified in field populations of the Asian citrus psyllid, Diaphorina citri, through sequencing of the transcriptome followed by reverse transcription-PCR (RT-PCR). We tentatively named this virus Diaphorina citri-associated C virus (DcACV). DcACV is an unclassified positive-sense RNA virus. Copyright © 2016 Nouri et al.
Complete genome sequence of an avian paramyxovirus representative of putative new serotype 13
USDA-ARS's Scientific Manuscript database
Here, we report the complete genome sequence of a virus of a putative new serotype of avian paramyxovirus (APMV). The virus was isolated from a white-fronted goose in Ukraine in 2011 and designated white-fronted goose/Ukraine/Askania-Nova/48-15-02/2011. The genomic characterization of the isolate s...
Badr, Eman; ElHefnawi, Mahmoud; Heath, Lenwood S
2016-01-01
Alternative splicing is a vital process for regulating gene expression and promoting proteomic diversity. It plays a key role in tissue-specific expressed genes. This specificity is mainly regulated by splicing factors that bind to specific sequences called splicing regulatory elements (SREs). Here, we report a genome-wide analysis to study alternative splicing on multiple tissues, including brain, heart, liver, and muscle. We propose a pipeline to identify differential exons across tissues and hence tissue-specific SREs. In our pipeline, we utilize the DEXSeq package along with our previously reported algorithms. Utilizing the publicly available RNA-Seq data set from the Human BodyMap project, we identified 28,100 differentially used exons across the four tissues. We identified tissue-specific exonic splicing enhancers that overlap with various previously published experimental and computational databases. A complicated exonic enhancer regulatory network was revealed, where multiple exonic enhancers were found across multiple tissues while some were found only in specific tissues. Putative combinatorial exonic enhancers and silencers were discovered as well, which may be responsible for exon inclusion or exclusion across tissues. Some of the exonic enhancers are found to be co-occurring with multiple exonic silencers and vice versa, which demonstrates a complicated relationship between tissue-specific exonic enhancers and silencers.
Analysis of Ribosome Stalling and Translation Elongation Dynamics by Deep Learning.
Zhang, Sai; Hu, Hailin; Zhou, Jingtian; He, Xuan; Jiang, Tao; Zeng, Jianyang
2017-09-27
Ribosome stalling is manifested by the local accumulation of ribosomes at specific codon positions of mRNAs. Here, we present ROSE, a deep learning framework to analyze high-throughput ribosome profiling data and estimate the probability of a ribosome stalling event occurring at each genomic location. Extensive validation tests on independent data demonstrated that ROSE possessed higher prediction accuracy than conventional prediction models, with an increase in the area under the receiver operating characteristic curve by up to 18.4%. In addition, genome-wide statistical analyses showed that ROSE predictions can be well correlated with diverse putative regulatory factors of ribosome stalling. Moreover, the genome-wide ribosome stalling landscapes of both human and yeast computed by ROSE recovered the functional interplays between ribosome stalling and cotranslational events in protein biogenesis, including protein targeting by the signal recognition particles and protein secondary structure formation. Overall, our study provides a novel method to complement the ribosome profiling techniques and further decipher the complex regulatory mechanisms underlying translation elongation dynamics encoded in the mRNA sequence. Copyright © 2017 Elsevier Inc. All rights reserved.
Coutinho, Pedro M; Andersen, Mikael R; Kolenova, Katarina; vanKuyk, Patricia A; Benoit, Isabelle; Gruben, Birgit S; Trejo-Aguilar, Blanca; Visser, Hans; van Solingen, Piet; Pakula, Tiina; Seiboth, Bernard; Battaglia, Evy; Aguilar-Osorio, Guillermo; de Jong, Jan F; Ohm, Robin A; Aguilar, Mariana; Henrissat, Bernard; Nielsen, Jens; Stålbrand, Henrik; de Vries, Ronald P
2009-03-01
The plant polysaccharide degradative potential of Aspergillus nidulans was analysed in detail and compared to that of Aspergillus niger and Aspergillus oryzae using a combination of bioinformatics, physiology and transcriptomics. Manual verification indicated that 28.4% of the A. nidulans ORFs analysed in this study do not contain a secretion signal, of which 40% may be secreted through a non-classical method. While significant differences were found between the species in the numbers of ORFs assigned to the relevant CAZy families, no significant difference was observed in growth on polysaccharides. Growth differences were observed between the Aspergilli and Podospora anserina, which has a more divergent genomic potential for polysaccharide degradation, suggesting that large genomic differences are required to cause growth differences on polysaccharides. Differences were also detected between the Aspergilli in the presence of putative regulatory sequences in the promoters of the ORFs of this study, and a correlation between the presence of putative XlnR binding sites and induction by xylose was detected for A. niger. These data demonstrate differences in genome content, substrate specificity of the enzymes and gene regulation in these three Aspergilli, which likely reflect their individual adaptation to their natural biotope.
Robust dynamics in minimal hybrid models of genetic networks
Perkins, Theodore J.; Wilds, Roy; Glass, Leon
2010-01-01
Many gene-regulatory networks necessarily display robust dynamics that are insensitive to noise and stable under evolution. We propose that a class of hybrid systems can be used to relate the structure of these networks to their dynamics and provide insight into the origin of robustness. In these systems, the genes are represented by logical functions, and the controlling transcription factor protein molecules are real variables, which are produced and destroyed. As the transcription factor concentrations cross thresholds, they control the production of other transcription factors. We discuss mathematical analysis of these systems and show how the concepts of robustness and minimality can be used to generate putative logical organizations based on observed symbolic sequences. We apply the methods to control of the cell cycle in yeast. PMID:20921006
Robust dynamics in minimal hybrid models of genetic networks.
Perkins, Theodore J; Wilds, Roy; Glass, Leon
2010-11-13
Many gene-regulatory networks necessarily display robust dynamics that are insensitive to noise and stable under evolution. We propose that a class of hybrid systems can be used to relate the structure of these networks to their dynamics and provide insight into the origin of robustness. In these systems, the genes are represented by logical functions, and the controlling transcription factor protein molecules are real variables, which are produced and destroyed. As the transcription factor concentrations cross thresholds, they control the production of other transcription factors. We discuss mathematical analysis of these systems and show how the concepts of robustness and minimality can be used to generate putative logical organizations based on observed symbolic sequences. We apply the methods to control of the cell cycle in yeast.
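The hybrid scheme described in this abstract (logical gene functions driving real-valued transcription factor concentrations that are produced and destroyed) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the three-gene repressor ring, the unit production and decay rates, the threshold of 0.5, and the Euler integration. It is not the authors' yeast cell-cycle model. The function records the sequence of logical states visited, which is the kind of symbolic sequence the abstract refers to.

```python
def simulate(logic, x0, theta=0.5, dt=0.01, steps=2000):
    """Euler-integrate dx_i/dt = F_i(X) - x_i, where X_i = 1 if x_i > theta else 0.

    Returns the list of distinct logical states in the order they are visited.
    """
    x = list(x0)
    states = []
    for _ in range(steps):
        # Logical state: which concentrations are above threshold.
        X = tuple(int(v > theta) for v in x)
        if not states or states[-1] != X:
            states.append(X)
        # Production rates are set by the logical functions; decay is linear.
        production = logic(X)
        x = [v + dt * (p - v) for v, p in zip(x, production)]
    return states


def repressor_ring(X):
    """Hypothetical 3-gene ring: each gene is ON only while its repressor is OFF."""
    a, b, c = X
    return (1 - c, 1 - a, 1 - b)


symbolic_sequence = simulate(repressor_ring, x0=(0.9, 0.1, 0.3))
print(symbolic_sequence[:7])
```

In this toy network each threshold crossing flips exactly one logical variable, so the symbolic sequence is a walk along the edges of a cube; going the other way, from an observed symbolic sequence to candidate logical functions, is the inference problem the abstract describes.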
A novel paired domain DNA recognition motif can mediate Pax2 repression of gene transcription.
Håvik, B; Ragnhildstveit, E; Lorens, J B; Saelemyr, K; Fauske, O; Knudsen, L K; Fjose, A
1999-12-20
The paired domain (PD) is an evolutionarily conserved DNA-binding domain encoded by the Pax gene family of developmental regulators. The Pax proteins are transcription factors and are involved in a variety of processes such as brain development, patterning of the central nervous system (CNS), and B-cell development. In this report we demonstrate that the zebrafish Pax2 PD can interact with a novel type of DNA sequence in vitro, the triple-A motif, consisting of a heptameric nucleotide sequence G/CAAACA/TC with an invariant core of three adjacent adenosines. This recognition sequence was found to be conserved in known natural Pax5 repressor elements involved in controlling the expression of the p53 and J-chain genes. By identifying similar high affinity binding sites in potential target genes of the Pax2 protein, including the pax2 gene itself, we obtained further evidence that the triple-A sites are biologically significant. The putative natural target sites also provide a basis for defining an extended consensus recognition sequence. In addition, we observed in transformation assays a direct correlation between Pax2 repressor activity and the presence of triple-A sites. The results suggest that a transcriptional regulatory function of Pax proteins can be modulated by PD binding to different categories of target sequences. Copyright 1999 Academic Press.
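As an illustration of how the heptamer quoted above could be searched for in a sequence, the following sketch reads G/CAAACA/TC as the pattern [GC]AAAC[AT]C and scans both strands. The function names and the example sequence are hypothetical, and a plain pattern match is only a stand-in for the binding assays reported in the abstract.

```python
import re

# G/CAAACA/TC written as a regular expression; the lookahead lets
# overlapping occurrences be reported as well.
TRIPLE_A = re.compile(r"(?=([GC]AAAC[AT]C))")
MOTIF_LENGTH = 7
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_triple_a(seq):
    """Return sorted (start, strand, motif) hits; start is 0-based on the given strand."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in TRIPLE_A.finditer(seq)]
    rc = reverse_complement(seq)
    for m in TRIPLE_A.finditer(rc):
        # Convert the position on the reverse strand back to forward coordinates.
        start = len(seq) - (m.start() + MOTIF_LENGTH)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


example_hits = find_triple_a("TTGAAACACGGGTGTTTGAA")  # made-up sequence
print(example_hits)
```

Each hit contains the invariant AAA core by construction, since the pattern fixes positions 2 to 4 of the heptamer.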
Salehipour, Pouya; Nematzadeh, Mahsa; Mobasheri, Maryam Beigom; Afsharpad, Mandana; Mansouri, Kamran; Modarressi, Mohammad Hossein
2017-09-01
Testis specific gene antigen 10 (TSGA10) is a cancer testis antigen involved in the process of spermatogenesis. TSGA10 could also play an important role in the inhibition of angiogenesis by preventing nuclear localization of HIF-1α. Although it has been shown that TSGA10 messenger RNA (mRNA) is mainly expressed in testis and some tumors, the transcription pattern and regulatory mechanisms of this gene remain largely unknown. Here, we report that human TSGA10 comprises at least 22 exons and generates four different transcript variants. We found that these transcript variants are produced by the use of two distinct promoters and by splicing of exons 4 and 7; they share the same coding sequence but differ in the sequence of the 5' untranslated region (5'UTR). This is significant because conserved regulatory RNA elements such as upstream open reading frames (uORFs) and a putative internal ribosome entry site (IRES) were found in this region in different combinations in each transcript variant, which may influence their translational efficiency in normal or unusual environmental conditions like hypoxia. To determine the transcription pattern of TSGA10 in breast cancer, expression of the identified transcript variants was analyzed in 62 breast cancer samples. We found that TSGA10 tends to express variants with shorter 5'UTR and fewer uORF elements in breast cancer tissues. Our study demonstrates for the first time the expression of different TSGA10 transcript variants in testis and breast cancer tissues and provides a first clue to a role of TSGA10 5'UTR in regulation of translation in unusual environmental conditions like hypoxia. Copyright © 2017. Published by Elsevier B.V.
Devaney, Joseph M; Tosi, Laura L; Fritz, David T; Gordish-Dressman, Heather A; Jiang, Shan; Orkunoglu-Suer, Funda E; Gordon, Andrew H; Harmon, Brennan T; Thompson, Paul D; Clarkson, Priscilla M; Angelopoulos, Theodore J; Gordon, Paul M; Moyna, Niall M; Pescatello, Linda S; Visich, Paul S; Zoeller, Robert F; Brandoli, Cinzia; Hoffman, Eric P; Rogers, Melissa B
2009-08-15
A classic morphogen, bone morphogenetic protein 2 (BMP2) regulates the differentiation of pluripotent mesenchymal cells. High BMP2 levels promote osteogenesis or chondrogenesis and low levels promote adipogenesis. BMP2 inhibits myogenesis. Thus, BMP2 synthesis is tightly controlled. Several hundred nucleotides within the 3' untranslated regions of BMP2 genes are conserved from mammals to fishes indicating that the region is under stringent selective pressure. Our analyses indicate that this region controls BMP2 synthesis by post-transcriptional mechanisms. A common A to C single nucleotide polymorphism (SNP) in the BMP2 gene (rs15705, +A1123C) disrupts a putative post-transcriptional regulatory motif within the human ultra-conserved sequence. In vitro studies indicate that RNAs bearing the A or C alleles have different protein binding characteristics in extracts from mesenchymal cells. Reporter genes with the C allele of the ultra-conserved sequence were differentially expressed in mesenchymal cells. Finally, we analyzed MRI data from the upper arm of 517 healthy individuals aged 18-41 years. Individuals with the C/C genotype were associated with lower baseline subcutaneous fat volumes (P = 0.0030) and an increased gain in skeletal muscle volume (P = 0.0060) following resistance training in a cohort of young males. The rs15705 SNP explained 2-4% of inter-individual variability in the measured parameters. The rs15705 variant is one of the first genetic markers that may be exploited to facilitate early diagnosis, treatment, and/or prevention of diseases associated with poor fitness. Furthermore, understanding the mechanisms by which regulatory polymorphisms influence BMP2 synthesis will reveal novel pharmaceutical targets for these disabling conditions. (c) 2009 Wiley-Liss, Inc.
Devaney, Joseph M.; Tosi, Laura L.; Fritz, David T.; Gordish-Dressman, Heather A.; Jiang, Shan; Orkunoglu-Suer, Funda E.; Gordon, Andrew H.; Harmon, Brennan T.; Thompson, Paul D.; Clarkson, Priscilla M.; Angelopoulos, Theodore J.; Gordon, Paul M.; Moyna, Niall M.; Pescatello, Linda S.; Visich, Paul S.; Zoeller, Robert F.; Brandoli, Cinzia; Hoffman, Eric P.; Rogers, Melissa B.
2014-01-01
A classic morphogen, bone morphogenetic protein 2 (BMP2) regulates the differentiation of pluripotent mesenchymal cells. High BMP2 levels promote osteogenesis or chondrogenesis and low levels promote adipogenesis. BMP2 inhibits myogenesis. Thus, BMP2 synthesis is tightly controlled. Several hundred nucleotides within the 3′ untranslated regions of BMP2 genes are conserved from mammals to fishes indicating that the region is under stringent selective pressure. Our analyses indicate that this region controls BMP2 synthesis by post-transcriptional mechanisms. A common A to C single nucleotide polymorphism (SNP) in the BMP2 gene (rs15705, +A1123C) disrupts a putative post-transcriptional regulatory motif within the human ultra-conserved sequence. In vitro studies indicate that RNAs bearing the A or C alleles have different protein binding characteristics in extracts from mesenchymal cells. Reporter genes with the C allele of the ultra-conserved sequence were differentially expressed in mesenchymal cells. Finally, we analyzed MRI data from the upper arm of 517 healthy individuals aged 18–41 years. Individuals with the C/C genotype were associated with lower baseline subcutaneous fat volumes (P = 0.0030) and an increased gain in skeletal muscle volume (P = 0.0060) following resistance training in a cohort of young males. The rs15705 SNP explained 2–4% of inter-individual variability in the measured parameters. The rs15705 variant is one of the first genetic markers that may be exploited to facilitate early diagnosis, treatment, and/or prevention of diseases associated with poor fitness. Furthermore, understanding the mechanisms by which regulatory polymorphisms influence BMP2 synthesis will reveal novel pharmaceutical targets for these disabling conditions. PMID:19492344
A deep learning method for lincRNA detection using auto-encoder algorithm.
Yu, Ning; Yu, Zeng; Pan, Yi
2017-12-06
RNA sequencing (RNA-seq) enables scientists to develop novel data-driven methods for discovering previously unidentified lincRNAs. Meanwhile, knowledge-based technologies are experiencing a potential revolution ignited by new deep learning methods. By scanning newly available RNA-seq data sets, scientists have found that: (1) the expression of lincRNAs appears to be regulated, that is, relevance exists along the DNA sequences; (2) lincRNAs contain some conserved patterns/motifs tethered together by non-conserved regions. These two observations motivate the adoption of knowledge-based deep learning methods for lincRNA detection. Similar to coding region transcription, non-coding regions are split at transcriptional sites. However, regulatory RNAs rather than messenger RNAs are generated. That is, the transcribed RNAs participate in biological processes as regulatory units instead of generating proteins. Identifying these transcriptional regions from non-coding regions is the first step towards lincRNA recognition. The auto-encoder method achieves 100% and 92.4% prediction accuracy on transcription sites over the putative data sets. The experimental results also show the excellent performance of the predictive deep neural network on the lincRNA data sets compared with a support vector machine and a traditional neural network. In addition, the method is validated on the newly discovered lincRNA data set, and one unreported transcription site is found by feeding the whole annotated sequences through the deep learning machine, which indicates that the deep learning method has extensive ability for lincRNA prediction. The transcriptional sequences of lincRNAs are collected from the annotated human DNA genome data. Subsequently, a two-layer deep neural network is developed for lincRNA detection, which adopts the auto-encoder algorithm and utilizes different encoding schemes to obtain the best performance over intergenic DNA sequence data.
Driven by those newly annotated lincRNA data, deep learning methods based on the auto-encoder algorithm can exert their capability in knowledge learning in order to capture the useful features and the information correlation along DNA genome sequences for lincRNA detection. To our knowledge, this is the first application of deep learning techniques to identifying lincRNA transcription sequences.
Transcriptional regulation by retinoic acid of interleukin-2 alpha receptors in human B cells.
Bhatti, L; Sidell, N
1994-01-01
In this study, we demonstrated that retinoic acid (RA) up-regulated interleukin-2 receptor-alpha (IL-2R alpha) expression on two human B-cell lines, IE8.6 and SKW6.4. Deleted forms of the human IL-2R alpha promoter linked to the bacterial chloramphenicol acetyltransferase reporter gene were transfected into IE8.6 cells in order to define RA-responsive regulatory domains. Experiments using the -1.6 kb construct, which contains all known regulatory regions in the IL-2R alpha promoter, indicated that RA could induce IL-2R alpha promoter activity. The basal activity of the -471 construct was initially low, but was markedly enhanced by the addition of RA. Deletion of promoter sequences between -471 and -317 resulted in a significant augmentation of basal promoter activity and abolished promoter induction by RA. This finding revealed a requirement for sequences 5' of base -317 for RA-induced promoter activation, raising the possibility of the presence of both a RA response element and a negative regulatory element (NRE) upstream of base -317. Transfection studies with internal deletion mutants with the putative NRE removed resulted in increases in basal promoter activity and unresponsiveness to RA similar to the -317 construct. In contrast, an internal deletion mutant with the NRE intact had low basal activity and was inducible by RA similar to the -471 construct. Taken together, our results suggested that RA-induced activation of the IL-2R alpha promoter was through changes in the function of a NRE present between bases -400 and -368. This 31-base pair element may interact with an adjacent RA-responsive regulatory site as well as being responsible for down-regulation of basal IL-2R alpha expression under certain conditions. PMID:8157276
Mars, Ruben A T; Nicolas, Pierre; Denham, Emma L; van Dijl, Jan Maarten
2016-12-01
Bacteria can employ widely diverse RNA molecules to regulate their gene expression. Such molecules include trans-acting small regulatory RNAs, antisense RNAs, and a variety of transcriptional attenuation mechanisms in the 5' untranslated region. Thus far, most regulatory RNA research has focused on Gram-negative bacteria, such as Escherichia coli and Salmonella. Hence, there is uncertainty about whether the resulting insights can be extrapolated directly to other bacteria, such as the Gram-positive soil bacterium Bacillus subtilis. A recent study identified 1,583 putative regulatory RNAs in B. subtilis, whose expression was assessed across 104 conditions. Here, we review the current understanding of RNA-based regulation in B. subtilis, and we categorize the newly identified putative regulatory RNAs on the basis of their conservation in other bacilli and the stability of their predicted secondary structures. Our present evaluation of the publicly available data indicates that RNA-mediated gene regulation in B. subtilis mostly involves elements at the 5' ends of mRNA molecules. These can include 5' secondary structure elements and metabolite-, tRNA-, or protein-binding sites. Importantly, sense-independent segments are identified as the most conserved and structured potential regulatory RNAs in B. subtilis. Altogether, the present survey provides many leads for the identification of new regulatory RNA functions in B. subtilis. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Mars, Ruben A. T.; Nicolas, Pierre; Denham, Emma L.
2016-01-01
SUMMARY Bacteria can employ widely diverse RNA molecules to regulate their gene expression. Such molecules include trans-acting small regulatory RNAs, antisense RNAs, and a variety of transcriptional attenuation mechanisms in the 5′ untranslated region. Thus far, most regulatory RNA research has focused on Gram-negative bacteria, such as Escherichia coli and Salmonella. Hence, there is uncertainty about whether the resulting insights can be extrapolated directly to other bacteria, such as the Gram-positive soil bacterium Bacillus subtilis. A recent study identified 1,583 putative regulatory RNAs in B. subtilis, whose expression was assessed across 104 conditions. Here, we review the current understanding of RNA-based regulation in B. subtilis, and we categorize the newly identified putative regulatory RNAs on the basis of their conservation in other bacilli and the stability of their predicted secondary structures. Our present evaluation of the publicly available data indicates that RNA-mediated gene regulation in B. subtilis mostly involves elements at the 5′ ends of mRNA molecules. These can include 5′ secondary structure elements and metabolite-, tRNA-, or protein-binding sites. Importantly, sense-independent segments are identified as the most conserved and structured potential regulatory RNAs in B. subtilis. Altogether, the present survey provides many leads for the identification of new regulatory RNA functions in B. subtilis. PMID:27784798
Comparative Genome Sequence Analysis of the Bpa/Str Region in Mouse and Man
Mallon, A.-M.; Platzer, M.; Bate, R.; Gloeckner, G.; Botcherby, M.R.M.; Nordsiek, G.; Strivens, M.A.; Kioschis, P.; Dangel, A.; Cunningham, D.; Straw, R.N.A.; Weston, P.; Gilbert, M.; Fernando, S.; Goodall, K.; Hunter, G.; Greystrong, J.S.; Clarke, D.; Kimberley, C.; Goerdes, M.; Blechschmidt, K.; Rump, A.; Hinzmann, B.; Mundy, C.R.; Miller, W.; Poustka, A.; Herman, G.E.; Rhodes, M.; Denny, P.; Rosenthal, A.; Brown, S.D.M.
2000-01-01
The progress of human and mouse genome sequencing programs presages the possibility of systematic cross-species comparison of the two genomes as a powerful tool for gene and regulatory element identification. As the opportunities to perform comparative sequence analysis emerge, it is important to develop parameters for such analyses and to examine the outcomes of cross-species comparison. Our analysis used gene prediction and a database search of 430 kb of genomic sequence covering the Bpa/Str region of the mouse X chromosome, and 745 kb of genomic sequence from the homologous human X chromosome region. We identified 11 genes in mouse and 13 genes and two pseudogenes in human. In addition, we compared the mouse and human sequences using pairwise alignment and searches for evolutionarily conserved regions (ECRs) exceeding a defined threshold of sequence identity. This approach aided the identification of at least four further putative conserved genes in the region. Comparative sequencing revealed that this region is a mosaic in evolutionary terms, with considerably more rearrangement between the two species than realized previously from comparative mapping studies. Surprisingly, this region showed an extremely high LINE and low SINE content, low G+C content, and yet a relatively high gene density, in contrast to the low gene density usually associated with such regions. [The sequence data described in this paper have been submitted to EMBL under the following accession nos.: Mouse Genomic Sequence: Mouse contig A (AL021127), Mouse contig B (AL049866), BAC41M10 (AL136328), PAC303O11(AL136329). Human Genomic Sequence: Human contig 1 (U82671, U82670), Human contig 2 (U82695).] PMID:10854409
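The search for conserved regions "exceeding a defined threshold of sequence identity" can be sketched as a sliding-window scan over a pairwise alignment. The abstract does not state the window size or threshold used, so the values below are placeholders, and the function name and toy alignment are hypothetical. Gap columns are counted as mismatches, which is one possible convention among several.

```python
def conserved_windows(aln_a, aln_b, window=10, min_identity=0.8):
    """Return merged half-open (start, end) alignment intervals whose
    windowed identity is at least min_identity."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # 1 where the two rows agree on a nucleotide, 0 for mismatches and gaps.
    match = [int(a == b and a != "-") for a, b in zip(aln_a, aln_b)]
    regions = []
    running = sum(match[:window])
    for start in range(len(match) - window + 1):
        if start > 0:
            # Slide the window by one column.
            running += match[start + window - 1] - match[start - 1]
        if running / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)  # extend the previous region
            else:
                regions.append((start, end))
    return regions


# Toy alignment: 12 identical columns followed by 12 mismatched columns.
toy_regions = conserved_windows("ACGTACGTACGT" + "T" * 12, "ACGTACGTACGT" + "A" * 12)
print(toy_regions)
```

Note that a reported region can extend slightly past the last identical column, because a window is accepted as long as its overall identity stays at or above the threshold.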
Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal
2013-01-01
We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface.
Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal
2013-01-01
We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface. Database URL: http://www.nencki-genomics.org. PMID:24089456
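To make "computing intersections between regulatory features" concrete, here is a small in-memory sketch. It is not the database's stored procedures, whose names and signatures are not given in the abstract. The Feature record, the half-open coordinate convention, and the example features are all assumptions for illustration.

```python
from collections import namedtuple

# Hypothetical feature record; coordinates are half-open [start, end).
Feature = namedtuple("Feature", "chrom start end name")


def intersect(features_a, features_b):
    """Return (name_a, name_b) for every pair of features that overlap
    on the same chromosome."""
    by_chrom = {}
    for f in features_b:
        by_chrom.setdefault(f.chrom, []).append(f)
    pairs = []
    for a in features_a:
        for b in by_chrom.get(a.chrom, []):
            # Two half-open intervals overlap when each starts before the other ends.
            if a.start < b.end and b.start < a.end:
                pairs.append((a.name, b.name))
    return pairs


conserved = [Feature("chr1", 100, 200, "cns1"), Feature("chr2", 50, 80, "cns2")]
motifs = [Feature("chr1", 150, 160, "motifA"), Feature("chr1", 200, 210, "motifB")]
overlaps = intersect(conserved, motifs)
print(overlaps)
```

This version compares every pair within a chromosome, which is fine for small inputs; at genome scale a sorted sweep or an interval index would be the usual choice, and pre-computing the results, as the abstract describes, avoids repeating the work.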
Matsumura, Emilyn E; Coletta-Filho, Helvecio D; Nouri, Shahideh; Falk, Bryce W; Nerva, Luca; Oliveira, Tiago S; Dorta, Silvia O; Machado, Marcos A
2017-04-24
Citrus sudden death (CSD) has caused the death of approximately four million orange trees in a very important citrus region in Brazil. Although its etiology is still not completely clear, symptoms and distribution of affected plants indicate a viral disease. In a search for viruses associated with CSD, we have performed a comparative high-throughput sequencing analysis of the transcriptome and small RNAs from CSD-symptomatic and -asymptomatic plants using the Illumina platform. The data revealed mixed infections that included Citrus tristeza virus (CTV) as the predominant virus, followed by the Citrus sudden death-associated virus (CSDaV), Citrus endogenous pararetrovirus (CitPRV) and two putative novel viruses tentatively named Citrus jingmen-like virus (CJLV) and Citrus virga-like virus (CVLV). The deep sequencing analyses were sensitive enough to differentiate two genotypes of both viruses previously associated with CSD-affected plants: CTV and CSDaV. Our data also showed a putative association of the CSD-symptomatic plants with a specific CSDaV genotype and a likely association with CitPRV as well, whereas the two putative novel viruses appeared to be more associated with CSD-asymptomatic plants. This is the first high-throughput sequencing-based study of the viral sequences present in CSD-affected citrus plants, and it generated valuable information for further CSD studies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Biaoyang; Nasir, J.; Kalchman, M.A.
1995-02-10
We have previously cloned and characterized the murine homologue of the Huntington disease (HD) gene and shown that it maps to mouse chromosome 5 within a region of conserved synteny with human chromosome 4p16.3. Here we present a detailed comparison of the sequence of the putative promoter and the organization of the 5' genomic region of the murine (Hdh) and human HD genes encompassing the first five exons. We show that in this region these two genes share identical exon boundaries, but have different-size introns. Two dinucleotide (CT) and one trinucleotide intronic polymorphism in Hdh and an intronic CA polymorphism in the HD gene were identified. Comparison of 940 bp of sequence 5' to the putative translation start site reveals a highly conserved region (78.8% nucleotide identity) between Hdh and the HD gene from nucleotide -56 to -206 (of Hdh). Neither Hdh nor the HD gene has typical TATA or CCAAT elements, but both show one putative AP2 binding site and numerous potential Sp1 binding sites. The high sequence identity between Hdh and the HD gene for approximately 200 bp 5' to the putative translation start site indicates that these sequences may play a role in regulating expression of the Huntington disease gene. 30 refs., 4 figs., 2 tabs.
Huang, Lin; Li, Guiyang; Mo, Zhaolan; Xiao, Peng; Li, Jie; Huang, Jie
2015-01-01
Background Japanese flounder (Paralichthys olivaceus) is an economically important marine fish in Asia and has suffered from disease outbreaks caused by various pathogens, which calls for more genome-level information on immune-relevant genes. However, genomic and transcriptomic data for Japanese flounder remain scarce, which limits studies on the immune system of this species. In this study, we characterized the Japanese flounder spleen transcriptome using an Illumina paired-end sequencing platform to identify putative genes involved in immunity. Methodology/Principal Findings A cDNA library from the spleen of P. olivaceus was constructed and randomly sequenced using an Illumina technique. The removal of low quality reads generated 12,196,968 trimmed reads, which assembled into 96,627 unigenes. A total of 21,391 unigenes (22.14%) were annotated in the NCBI Nr database, and only 1.1% of the BLASTx top-hits matched P. olivaceus protein sequences. Approximately 12,503 (58.45%) unigenes were categorized into three Gene Ontology groups, 19,547 (91.38%) were classified into 26 Cluster of Orthologous Groups, and 10,649 (49.78%) were assigned to six Kyoto Encyclopedia of Genes and Genomes pathways. Furthermore, 40,928 putative simple sequence repeats and 47,362 putative single nucleotide polymorphisms were identified. Importantly, we identified 1,563 putative immune-associated unigenes that mapped to 15 immune signaling pathways. Conclusions/Significance The P. olivaceus transcriptome data provide a rich source to discover and identify new genes, and the immune-relevant sequences identified here will facilitate our understanding of the mechanisms involved in the immune response. Furthermore, the plentiful potential SSRs and SNPs found in this study are important resources with respect to future development of a linkage map or marker assisted breeding programs for the flounder. PMID:25723398
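The percentages in this abstract use two different denominators, which a short arithmetic check makes explicit: the 22.14% figure is relative to all 96,627 unigenes, whereas the GO, COG and KEGG percentages reproduce only when taken relative to the 21,391 annotated unigenes. The variable names below are illustrative; the counts are those quoted in the abstract.

```python
total_unigenes = 96_627
annotated = 21_391
category_counts = {"GO": 12_503, "COG": 19_547, "KEGG": 10_649}


def pct(part, whole):
    """Percentage of whole represented by part, rounded to two decimals."""
    return round(100 * part / whole, 2)


annotated_pct = pct(annotated, total_unigenes)
category_pct = {name: pct(n, annotated) for name, n in category_counts.items()}

print(annotated_pct)   # share of all unigenes with an Nr annotation
print(category_pct)    # shares of the annotated unigenes per classification
```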
Comparative analyses of putative toxin gene homologs from an Old World viper, Daboia russelii
Krishnan, Neeraja M.
2017-01-01
Availability of snake genome sequences has opened up exciting areas of research on comparative genomics and gene diversity. One of the challenges in studying snake genomes is the acquisition of biological material from live animals, especially from the venomous ones, making the process cumbersome and time-consuming. Here, we report comparative sequence analyses of putative toxin gene homologs from Russell’s viper (Daboia russelii) using whole-genome sequencing data obtained from shed skin. When compared with the major venom proteins in Russell’s viper studied previously, we found 45–100% sequence similarity between the venom proteins and their putative homologs in the skin. Additionally, comparative analyses of 20 putative toxin gene family homologs provided evidence of unique sequence motifs in nerve growth factor (NGF), platelet derived growth factor (PDGF), Kunitz/Bovine pancreatic trypsin inhibitor (Kunitz BPTI), cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) and cysteine-rich secretory protein (CRISP). In those derived proteins, we identified V11 and T35 in the NGF domain; F23 and A29 in the PDGF domain; N69, K2 and A5 in the CAP domain; and Q17 in the CRISP domain to be responsible for differences in the largest pockets across the protein domain structures in crotalines, viperines and elapids from the in silico structure-based analysis. Similarly, residues F10, Y11 and E20 appear to play an important role in the protein structures across the kunitz protein domain of viperids and elapids. Our study highlights the usefulness of shed skin in obtaining good quality high-molecular weight DNA for comparative genomic studies, and provides evidence towards the unique features and evolution of putative venom gene homologs in vipers. PMID:29230357
Ramos-González, Pedro Luis; Chabi-Jesus, Camila; Banguela-Castillo, Alexander; Tassi, Aline Daniele; Rodrigues, Mariane da Costa; Kitajima, Elliot Watanabe; Harakava, Ricardo; Freitas-Astúa, Juliana
2018-06-04
The genus Dichorhavirus includes plant-infecting rhabdoviruses with bisegmented genomes that are horizontally transmitted by false spider mites of the genus Brevipalpus. The complete genome sequences of three isolates of the putative dichorhavirus clerodendrum chlorotic spot virus were determined using next-generation sequencing (Illumina) and traditional RT-PCR. Their genome organization, sequence similarity and phylogenetic relationship to other viruses, and transmissibility by Brevipalpus yothersi mites support the assignment of these viruses to a new species of dichorhavirus, as suggested previously. New data are discussed stressing the reliability of the current rules for species demarcation and taxonomic status criteria within the genus Dichorhavirus.
Class I KNOX genes are associated with organogenesis during bulbil formation in Agave tequilana.
Abraham-Juárez, María Jazmín; Martínez-Hernández, Aída; Leyva-González, Marco Antonio; Herrera-Estrella, Luis; Simpson, June
2010-09-01
Bulbil formation in Agave tequilana was analysed with the objective of understanding this phenomenon at the molecular and cellular levels. Bulbils formed 14-45 d after induction and were associated with rearrangements in tissue structure and accelerated cell multiplication. Changes at the cellular level during bulbil development were documented by histological analysis. In addition, several cDNA libraries produced from different stages of bulbil development were generated and partially sequenced. Sequence analysis led to the identification of candidate genes potentially involved in the initiation and development of bulbils in Agave, including two putative class I KNOX genes. Real-time reverse transcription-PCR and in situ hybridization revealed that expression of the putative Agave KNOXI genes occurs at bulbil initiation and specifically in tissue where meristems will develop. Functional analysis of Agave KNOXI genes in Arabidopsis thaliana showed the characteristic lobed phenotype of KNOXI ectopic expression in leaves, although a slightly different phenotype was observed for each of the two Agave genes. An Arabidopsis KNOXI (knat1) mutant line (CS30) was successfully complemented with one of the Agave KNOX genes and partially complemented by the other. Analysis of the expression of the endogenous Arabidopsis genes KNAT1, KNAT6, and AS1 in the transformed lines ectopically expressing or complemented by the Agave KNOX genes again showed different regulatory patterns for each Agave gene. These results show that Agave KNOX genes are functionally similar to class I KNOX genes and suggest that spatial and temporal control of their expression is essential during bulbil formation in A. tequilana.
Armas, Pablo; Margarit, Ezequiel; Mouguelar, Valeria S; Allende, Miguel L; Calcaterra, Nora B
2013-01-01
CNBP is a nucleic acid chaperone implicated in vertebrate craniofacial development, as well as in myotonic dystrophy type 2 (DM2) and sporadic inclusion body myositis (sIBM) human muscle diseases. CNBP is highly conserved among vertebrates and has been implicated in transcriptional regulation; however, its DNA binding sites and molecular targets remain elusive. The main goal of this work was to identify CNBP DNA binding sites that might reveal target genes involved in vertebrate embryonic development. To accomplish this, we used a recently described yeast one-hybrid assay to identify DNA sequences bound in vivo by CNBP. Bioinformatic analyses revealed that these sequences are G-enriched and show a high frequency of putative G-quadruplex DNA secondary structure. Moreover, an in silico approach enabled us to establish the CNBP DNA-binding site and to predict putative CNBP targets based on gene ontology terms and synexpression with CNBP. The direct interaction between CNBP and candidate genes was confirmed by EMSA and ChIP assays. In addition, the role of CNBP in regulating the identified genes was validated in loss-of-function experiments in developing zebrafish. We successfully confirmed that CNBP up-regulates tbx2b and smarca5, and down-regulates wnt5b gene expression. The highly stringent strategy used in this work allowed us to identify new CNBP target genes functionally important in different contexts of vertebrate embryonic development. Furthermore, it represents a novel approach toward understanding the biological function and regulatory networks involving CNBP in the biology of vertebrates. PMID:23667590
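The abstract above reports that the bound sequences are G-enriched and frequently contain putative G-quadruplex structures, but it does not say which prediction tool was used. As a rough illustration only, the sketch below applies a widely used heuristic (four runs of at least three guanines separated by loops of 1 to 7 nucleotides). The pattern, the function name `find_putative_g4`, and the example sequence are assumptions for illustration, not the authors' method.

```python
import re

# Heuristic pattern (an assumption, not taken from the abstract):
# four G-runs of length >= 3, separated by loops of 1-7 nucleotides.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def find_putative_g4(seq):
    """Return (start, end, motif) for each non-overlapping match.

    Coordinates are 0-based and the end index is exclusive.
    Only the given strand is scanned.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


# Hypothetical sequence used purely as an example.
example = "ATGGGAGGGTGGGCGGGTTA"
hits = find_putative_g4(example)
print(hits)
```

A real analysis would also scan the reverse complement (C-runs on the given strand) and compare the hit frequency against a suitable background set.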
Reid, S D; Green, N M; Buss, J K; Lei, B; Musser, J M
2001-06-19
Species of pathogenic microbes are composed of an array of evolutionarily distinct chromosomal genotypes characterized by diversity in gene content and sequence (allelic variation). The occurrence of substantial genetic diversity has hindered progress in developing a comprehensive understanding of the molecular basis of virulence and new therapeutics such as vaccines. To provide new information that bears on these issues, 11 genes encoding extracellular proteins in the human bacterial pathogen group A Streptococcus identified by analysis of four genomes were studied. Eight of the 11 genes encode proteins with a LPXTG(L) motif that covalently links Gram-positive virulence factors to the bacterial cell surface. Sequence analysis of the 11 genes in 37 geographically and phylogenetically diverse group A Streptococcus strains cultured from patients with different infection types found that recent horizontal gene transfer has contributed substantially to chromosomal diversity. Regions of the inferred proteins likely to interact with the host were identified by molecular population genetic analysis, and Western immunoblot analysis with sera from infected patients confirmed that they were antigenic. Real-time reverse transcriptase-PCR (TaqMan) assays found that transcription of six of the 11 genes was substantially up-regulated in the stationary phase. In addition, transcription of many genes was influenced by the covR and mga trans-acting gene regulatory loci. Multilocus investigation of putative virulence genes by the integrated approach described herein provides an important strategy to aid microbial pathogenesis research and rapidly identify new targets for therapeutics research.
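The LPXTG motif mentioned above is simple enough to locate with a pattern search. The following is a minimal sketch, assuming protein sequences are available as one-letter amino acid strings; the function name `find_lpxtg` and the example peptide are hypothetical, and the abstract does not describe how the authors identified the motif.

```python
import re

# "X" in LPXTG stands for any residue, so it becomes "." in the pattern.
LPXTG = re.compile(r"LP.TG")


def find_lpxtg(protein):
    """Return (0-based start, matched motif) for each LPXTG occurrence."""
    return [(m.start(), m.group()) for m in LPXTG.finditer(protein.upper())]


# Hypothetical peptide, not a real group A Streptococcus protein.
peptide = "MKKLPSTGLAA"
matches = find_lpxtg(peptide)
print(matches)
```

In practice the motif is usually sought near the C-terminus, so a position filter would likely be added.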
Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P
1988-02-01
Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than to the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with inflammatory mediators. PMID:3257578
Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E
1996-10-03
We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Pyrococcus furiosus (Pf), a hyperthermophilic archaeon. Sequence analysis of the Pf gene indicated an open reading frame specifying a protein of 485 amino acids (aa) with a calculated M(r) of 52900. Canonical archaeal promoter elements, Box A and Box B, are located -49 and -17 nucleotides (nt), respectively, upstream of the putative start codon. The sequence of the putative active-site region conforms to the IMPDH signature motif and contains a putative active-site cysteine. Phylogenetic relationships derived by using all available IMPDH sequences are consistent with trees developed for other molecules; they do not precisely resolve the history of Pf IMPDH but indicate a close similarity to bacterial IMPDH proteins. The phylogenetic analysis indicates that a gene duplication occurred prior to the division between rodents and humans, accounting for the Type I and II isoforms identified in mice and humans.
Regulatory sequence analysis tools.
van Helden, Jacques
2003-07-01
The web resource Regulatory Sequence Analysis Tools (RSAT) (http://rsat.ulb.ac.be/rsat) offers a collection of software tools dedicated to the prediction of regulatory sites in non-coding DNA sequences. These tools include sequence retrieval, pattern discovery, pattern matching, genome-scale pattern matching, feature-map drawing, random sequence generation and other utilities. Alternative formats are supported for the representation of regulatory motifs (strings or position-specific scoring matrices) and several algorithms are proposed for pattern discovery. RSAT currently holds >100 fully sequenced genomes and these data are regularly updated from GenBank.
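The abstract notes that regulatory motifs can be represented either as strings or as position-specific scoring matrices (PSSMs). The sketch below illustrates the general PSSM idea only: it builds a log-odds matrix from a few aligned sites and scores every window of a sequence. It is not RSAT's implementation or API; the function names, the pseudocount, the uniform background, and the example sites are all assumptions.

```python
import math


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PSSM from equal-length aligned sites.

    Assumes a uniform background frequency and adds a pseudocount
    to every base so unseen bases do not give log(0).
    """
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pssm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pssm


def scan(sequence, pssm):
    """Score every window of the sequence; return (start, score) pairs."""
    width = len(pssm)
    return [
        (start, sum(pssm[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


# Hypothetical aligned binding sites and target sequence.
sites = ["TGACTC", "TGACTA", "TGAGTC"]
pssm = build_pssm(sites)
scores = scan("AATGACTCGG", pssm)
best_start, best_score = max(scores, key=lambda pair: pair[1])
print(best_start, round(best_score, 2))
```

A string representation would match only exact or degenerate-code occurrences, whereas the matrix assigns a graded score to every window, which is why a score threshold has to be chosen when matching with a PSSM.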
Iskow, Rebecca C.; Austermann, Christian; Scharer, Christopher D.; Raj, Towfique; Boss, Jeremy M.; Sunyaev, Shamil; Price, Alkes; Stranger, Barbara; Simon, Viviana; Lee, Charles
2013-01-01
Ancient population structure shaping contemporary genetic variation has been recently appreciated and has important implications regarding our understanding of the structure of modern human genomes. We identified a ∼36-kb DNA segment in the human genome that displays an ancient substructure. The variation at this locus exists primarily as two highly divergent haplogroups. One of these haplogroups (the NE1 haplogroup) aligns with the Neandertal haplotype and contains a 4.6-kb deletion polymorphism in perfect linkage disequilibrium with 12 single nucleotide polymorphisms (SNPs) across diverse populations. The other haplogroup, which does not contain the 4.6-kb deletion, aligns with the chimpanzee haplotype and is likely ancestral. Africans have higher overall pairwise differences with the Neandertal haplotype than Eurasians do for this NE1 locus (p < 10^−15). Moreover, the nucleotide diversity at this locus is higher in Eurasians than in Africans. These results mimic signatures of recent Neandertal admixture contributing to this locus. However, an in-depth assessment of the variation in this region across multiple populations reveals that African NE1 haplotypes, albeit rare, harbor more sequence variation than NE1 haplotypes found in Europeans, indicating an ancient African origin of this haplogroup and refuting recent Neandertal admixture. Population genetic analyses of the SNPs within each of these haplogroups, along with genome-wide comparisons revealed significant FST (p = 0.00003) and positive Tajima's D (p = 0.00285) statistics, pointing to non-neutral evolution of this locus. The NE1 locus harbors no protein-coding genes, but contains transcribed sequences as well as sequences with putative regulatory function based on bioinformatic predictions and in vitro experiments. We postulate that the variation observed at this locus predates Human–Neandertal divergence and is evolving under balancing selection, especially among European populations. PMID:23593015
Understanding Neurodevelopmental Disorders: The Promise of Regulatory Variation in the 3'UTRome.
Wanke, Kai A; Devanna, Paolo; Vernes, Sonja C
2018-04-01
Neurodevelopmental disorders have a strong genetic component, but despite widespread efforts, the specific genetic factors underlying these disorders remain undefined for a large proportion of affected individuals. Given the accessibility of exome sequencing, this problem has thus far been addressed from a protein-centric standpoint; however, protein-coding regions only make up ∼1% to 2% of the human genome. With the advent of whole genome sequencing we are in the midst of a paradigm shift as it is now possible to interrogate the entire sequence of the human genome (coding and noncoding) to fill in the missing heritability of complex disorders. These new technologies bring new challenges, as the number of noncoding variants identified per individual can be overwhelming, making it prudent to focus on noncoding regions of known function, for which the effects of variation can be predicted and directly tested to assess pathogenicity. The 3'UTRome is a region of the noncoding genome that perfectly fulfills these criteria and is of high interest when searching for pathogenic variation related to complex neurodevelopmental disorders. Herein, we review the regulatory roles of the 3'UTRome as binding sites for microRNAs or RNA binding proteins, or during alternative polyadenylation. We detail existing evidence that these regions contribute to neurodevelopmental disorders and outline strategies for identification and validation of novel putatively pathogenic variation in these regions. This evidence suggests that studying the 3'UTRome will lead to the identification of new risk factors, new candidate disease genes, and a better understanding of the molecular mechanisms contributing to neurodevelopmental disorders. Copyright © 2017 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Zhukova, Anna; Fernandes, Luis Guilherme; Hugon, Perrine; Pappas, Christopher J.; Sismeiro, Odile; Coppée, Jean-Yves; Becavin, Christophe; Malabat, Christophe; Eshghi, Azad; Zhang, Jun-Jie; Yang, Frank X.; Picardeau, Mathieu
2017-01-01
Leptospira are emerging zoonotic pathogens transmitted from animals to humans typically through contaminated environmental sources of water and soil. The regulatory pathways of pathogenic Leptospira spp. underlying the adaptive response to different hosts and environmental conditions remain elusive. In this study, we provide the first global Transcriptional Start Site (TSS) map of a Leptospira species. RNA was obtained from the pathogen Leptospira interrogans grown at 30°C (optimal in vitro temperature) and 37°C (host temperature) and selectively enriched for 5′ ends of native transcripts. A total of 2865 and 2866 primary TSS (pTSS) were predicted in the genome of L. interrogans at 30 and 37°C, respectively. The majority of the pTSSs were located between 0 and 10 nucleotides from the translational start site, suggesting that leaderless transcripts are a common feature of the leptospiral translational landscape. Comparative differential RNA-sequencing (dRNA-seq) analysis revealed conservation of most pTSS at 30 and 37°C. Promoter prediction algorithms allowed the identification of the binding sites of the alternative sigma factor sigma 54. However, other motifs were not identified, indicating that Leptospira consensus promoter sequences are inherently different from the Escherichia coli model. RNA sequencing also identified 277 and 226 putative small regulatory RNAs (sRNAs) at 30 and 37°C, respectively, including eight sRNAs validated by Northern blots. These results provide the first global view of TSS and the repertoire of sRNAs in L. interrogans. These data will establish a foundation for future experimental work on gene regulation under various environmental conditions including those in the host. PMID:28154810
Chen, Muyan; Zhang, Xiumei; Liu, Jianning; Storey, Kenneth B.
2013-01-01
The regulatory role of miRNA in gene expression is an emerging topic in the control of hypometabolism. Sea cucumber aestivation is a complicated physiological process that includes obvious hypometabolism as evidenced by a decrease in the rates of oxygen consumption and ammonia nitrogen excretion, as well as a serious degeneration of the intestine into a very tiny filament. To determine whether miRNAs play regulatory roles in this process, the present study analyzed profiles of miRNA expression in the intestine of the sea cucumber (Apostichopus japonicus), using Solexa deep sequencing technology. We identified 308 sea cucumber miRNAs, including 18 novel miRNAs specific to sea cucumber. Animals sampled during deep aestivation (DA), after at least 15 days of continuous torpor, were compared with animals from a non-aestivation (NA) state (animals that had passed through aestivation and returned to the active state). We identified 42 differentially expressed miRNAs [RPM (reads per million) >10, |FC| (|fold change|) ≥1, FDR (false discovery rate) <0.01] during aestivation, which were validated by two other miRNA profiling methods: miRNA microarray and real-time PCR. Among the most prominent miRNA species, miR-200-3p, miR-2004, miR-2010, miR-22, miR-252a, miR-252a-3p and miR-92 were significantly over-expressed during deep aestivation compared with non-aestivation animals. Preliminary analyses of their putative target genes and GO analysis suggest that these miRNAs could play important roles in global transcriptional depression and cell differentiation during aestivation. High-throughput sequencing data and microarray data have been submitted to the GEO database. PMID:24143179
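The three selection criteria quoted above (RPM > 10, |FC| ≥ 1, FDR < 0.01) amount to a simple filter. The sketch below shows one way to express it, assuming that |FC| refers to an absolute log2 fold change, which the abstract does not state explicitly. The function name, record layout, and all values are hypothetical.

```python
def is_differentially_expressed(rpm, log2_fc, fdr,
                                min_rpm=10.0, min_abs_fc=1.0, max_fdr=0.01):
    """Apply the three thresholds quoted in the abstract.

    Assumption: the fold-change threshold is applied to log2 values.
    """
    return rpm > min_rpm and abs(log2_fc) >= min_abs_fc and fdr < max_fdr


# Hypothetical records; names and numbers are made up for illustration.
records = [
    {"name": "miR-a", "rpm": 250.0, "log2_fc": 2.3, "fdr": 0.001},
    {"name": "miR-b", "rpm": 8.0, "log2_fc": 3.0, "fdr": 0.0001},   # fails RPM
    {"name": "miR-c", "rpm": 40.0, "log2_fc": -1.0, "fdr": 0.005},  # down-regulated
    {"name": "miR-d", "rpm": 90.0, "log2_fc": 0.4, "fdr": 0.0002},  # fails |FC|
]

selected = [
    r["name"] for r in records
    if is_differentially_expressed(r["rpm"], r["log2_fc"], r["fdr"])
]
print(selected)
```

Using the absolute value keeps both up- and down-regulated miRNAs, which matches the abstract's use of |FC|.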
Święcicka, Magdalena; Skowron, Waldemar; Cieszyński, Piotr; Dąbrowska-Bronk, Joanna; Matuszkiewicz, Mateusz; Filipecki, Marcin; Koter, Marek Daniel
2017-04-01
Potato cyst nematode Globodera rostochiensis is an obligate parasite of solanaceous plants, triggering metabolic and morphological changes in roots which may result in substantial crop yield losses. Previously, we used cDNA-AFLP to study the transcriptional dynamics in nematode-infected tomato roots. Here, we present a rescreening of the previously published dataset of upregulated transcript-derived fragments using the most current tomato transcriptome sequences. Our reanalysis allowed us to add 54 novel genes to the 135 already found to be upregulated in tomato roots upon G. rostochiensis infection (189 in total). We also created a completely new catalogue of downregulated sequences, leading to the discovery of 76 novel genes. Functional classification of candidates showed that the 'wound, stress and defence response' category was enriched in the downregulated genes. We confirmed the transcriptional dynamics of six genes by qRT-PCR. To place our results in a broader context, we compared the tomato data with Arabidopsis thaliana, revealing similar proportions of upregulated and downregulated genes as well as similar enrichment of defence-related transcripts in the downregulated group. Since transcript suppression is quite common in plant-nematode interactions, we assessed the possibility of miRNA-mediated inverse correlation on several tomato sequences belonging to NB-LRR and receptor-like kinase families. The qRT-PCR of miRNAs and putative target transcripts showed an opposite expression pattern in 9 cases. These results, together with in silico analyses of potential miRNA targeting of the full repertoire of tomato R-genes, show that miRNA-mediated gene suppression may be a key regulatory mechanism during nematode parasitism. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Complete Genome Sequence of an Avian Paramyxovirus Representative of Putative New Serotype 13
Goraichuk, Iryna; Sharma, Poonam; Stegniy, Borys; Muzyka, Denys; Pantin-Jackwood, Mary J.; Gerilovych, Anton; Solodiankin, Olexii; Bolotin, Vitaliy; Miller, Patti J.; Dimitrov, Kiril M.
2016-01-01
Here, we report the complete genome sequence of a virus of a putative new serotype of avian paramyxovirus (APMV). The virus was isolated from a white-fronted goose in Ukraine in 2011 and designated white-fronted goose/Ukraine/Askania-Nova/48-15-02/2011. The genomic characterization of the isolate suggests that it represents the novel avian paramyxovirus group APMV 13. PMID:27469958
Hamilton, Natasha A.; Tammen, Imke; Raadsma, Herman W.
2013-01-01
Angiotensin converting enzyme (ACE) is essential for control of blood pressure. The human ACE gene contains an intronic Alu indel (I/D) polymorphism that has been associated with variation in serum enzyme levels, although the functional mechanism has not been identified. The polymorphism has also been associated with cardiovascular disease, type II diabetes, renal disease and elite athleticism. We have characterized the ACE gene in horses of breeds selected for differing physical abilities. The equine gene has a similar structure to that of all known mammalian ACE genes. Nine common single nucleotide polymorphisms (SNPs) discovered in pooled DNA were found to be inherited in nine haplotypes. Three of these SNPs were located in intron 16, homologous to that containing the Alu polymorphism in the human. A highly conserved 18 bp sequence, also within that intron, was identified as being a potential binding site for the transcription factors Oct-1, HFH-1 and HNF-3β, and lies within a larger area of higher than normal homology. This putative regulatory element may contribute to regulation of the documented inter-individual variation in human circulating enzyme levels, for which a functional mechanism is yet to be defined. Two equine SNPs occurred within the conserved area in intron 16, although neither of them disrupted the putative binding site. We propose a possible regulatory mechanism of the ACE gene in mammalian species which was previously unknown. This advance will allow further analysis leading to a better understanding of the mechanisms underpinning the associations seen between the human Alu polymorphism and enzyme levels, cardiovascular disease states and elite athleticism. PMID:23408978
Guo, Chuanyu; Cui, Huachun; Ni, Songwei; Yan, Yang; Qin, Qiwei
2015-10-01
microRNAs (miRNAs) are an evolutionarily conserved class of non-coding RNA molecules that participate in various biological processes. The use of high-throughput screening strategies has greatly advanced the investigation and profiling of miRNAs in diverse species. In recent years, grouper (Epinephelus spp.) aquaculture was severely affected by iridoviral diseases. However, knowledge regarding the host immune responses to viral infection, especially the miRNA-mediated immune regulatory roles, is rather limited. In this study, using a Solexa deep sequencing approach, we identified 116 grouper miRNAs from grouper spleen-derived cells (GS). As expected, these miRNAs shared high sequence similarity with miRNAs identified in zebrafish (Danio rerio), pufferfish (Fugu rubripes), and other higher vertebrates. In the process of Singapore grouper iridovirus (SGIV) infection, 45 and 43 miRNAs with altered expression (>1.5-fold) were identified by miRNA microarray assays in grouper spleen tissues and GS cells, respectively. Furthermore, target prediction revealed 189 putative targets of these grouper miRNAs. Copyright © 2015 Elsevier Ltd. All rights reserved.
Discovery and molecular characterization of a novel enamovirus, Grapevine enamovirus-1.
Silva, João Marcos Fagundes; Al Rwahnih, Maher; Blawid, Rosana; Nagata, Tatsuya; Fajardo, Thor Vinícius Martins
2017-08-01
In this study, we describe a novel putative Enamovirus member, Grapevine enamovirus-1 (GEV-1), discovered by high-throughput sequencing (HTS). A limited survey using HTS of 17 grapevines (Vitis spp.) from the south, southeast, and northeast regions of Brazil led to the detection of GEV-1 exclusively on southern plants, infecting four grapevine cultivars (Cabernet Sauvignon, Semillon, CG 90450, and Cabernet franc) with a remarkable identity of around 99% at the nucleotide level. This novel virus was only detected in multiple-virus infected plants exhibiting viral-like symptoms. GEV-1 was also detected on a cv. Malvasia Longa by RT-PCR. We performed graft-transmissibility assays on GEV-1. The organization, products, and cis-acting regulatory elements of GEV-1 genome are also discussed here. The near complete genome sequence of GEV-1 was obtained during the course of this study, lacking only part of the 3' untranslated terminal region. This is the first report of a virus in the family Luteoviridae infecting grapevines. Based on its genomic properties and phylogenetic analyses, GEV-1 should be classified as a new member of the genus Enamovirus.
Yang, Zhirong; Patra, Barunava; Li, Runzhi; Pattanaik, Sitakanta; Yuan, Ling
2013-12-01
WRKY transcription factors (TFs) are emerging as an important group of regulators of plant secondary metabolism. However, the cis-regulatory elements associated with their regulation have not been well characterized. We have previously demonstrated that CrWRKY1, a member of subgroup III of the WRKY TF family, regulates biosynthesis of terpenoid indole alkaloids (TIAs) in the ornamental and medicinal plant, Catharanthus roseus. Here, we report the isolation and functional characterization of the CrWRKY1 promoter. In silico analysis of the promoter sequence reveals the presence of several potential TF binding motifs, indicating the involvement of additional TFs in the regulation of the TIA pathway. The CrWRKY1 promoter can drive the expression of a β-glucuronidase (GUS) reporter gene in native (C. roseus protoplasts and transgenic hairy roots) and heterologous (transgenic tobacco seedlings) systems. Analysis of 5'- or 3'-end deletions indicates that the sequences located between positions -140 and -93 bp and between -3 and +113 bp, relative to the transcription start site, are critical for promoter activity. Mutation analysis shows that two overlapping as-1 elements and a CT-rich motif contribute significantly to promoter activity. The CrWRKY1 promoter is induced in response to methyl jasmonate (MJ) treatment and the promoter region between -230 and -93 bp contains a putative MJ-responsive element. The CrWRKY1 promoter can potentially be used as a tool to isolate novel TFs involved in the regulation of the TIA pathway.
Binding of the Ras activator son of sevenless to insulin receptor substrate-1 signaling complexes.
Baltensperger, K; Kozma, L M; Cherniack, A D; Klarlund, J K; Chawla, A; Banerjee, U; Czech, M P
1993-06-25
Signal transmission by insulin involves tyrosine phosphorylation of a major insulin receptor substrate (IRS-1) and exchange of Ras-bound guanosine diphosphate for guanosine triphosphate. Proteins containing Src homology 2 and 3 (SH2 and SH3) domains, such as the p85 regulatory subunit of phosphatidylinositol-3 kinase and growth factor receptor-bound protein 2 (GRB2), bind tyrosine phosphate sites on IRS-1 through their SH2 regions. Such complexes in COS cells were found to contain the heterologously expressed putative guanine nucleotide exchange factor encoded by the Drosophila son of sevenless gene (dSos). Thus, GRB2, p85, or other proteins with SH2-SH3 adapter sequences may link Sos proteins to IRS-1 signaling complexes as part of the mechanism by which insulin activates Ras.
Arrhythmogenic KCNE gene variants: current knowledge and future challenges
Crump, Shawn M.; Abbott, Geoffrey W.
2014-01-01
There are twenty-five known inherited cardiac arrhythmia susceptibility genes, all of which encode either ion channel pore-forming subunits or proteins that regulate aspects of ion channel biology such as function, trafficking, and localization. The human KCNE gene family comprises five potassium channel regulatory subunits, sequence variants in each of which are associated with cardiac arrhythmias. KCNE gene products exhibit promiscuous partnering and in some cases ubiquitous expression, hampering efforts to unequivocally correlate each gene to specific native potassium currents. Likewise, deducing the molecular etiology of cardiac arrhythmias in individuals harboring rare KCNE gene variants, or more common KCNE polymorphisms, can be challenging. In this review we provide an update on putative arrhythmia-causing KCNE gene variants, and discuss current thinking and future challenges in the study of molecular mechanisms of KCNE-associated cardiac rhythm disturbances. PMID:24478792
SNP discovery by high-throughput sequencing in soybean
2010-01-01
Background: With the advance of new massively parallel genotyping technologies, quantitative trait loci (QTL) fine mapping and map-based cloning become more achievable in identifying genes for important and complex traits. Development of high-density genetic markers in the QTL regions of specific mapping populations is essential for fine-mapping and map-based cloning of economically important genes. Single nucleotide polymorphisms (SNPs) are the most abundant form of genetic variation existing between any diverse genotypes that are usually used for QTL mapping studies. The massively parallel sequencing technologies (Roche GS/454, Illumina GA/Solexa, and ABI/SOLiD) have been widely applied to identify genome-wide sequence variations. However, it remains unclear whether sequence data at a low sequencing depth are enough to detect the variations existing in any QTL regions of interest in a crop genome, and how to prepare sequencing samples for a complex genome such as soybean. Therefore, with the aims of identifying SNP markers in a cost-effective way for fine-mapping several QTL regions, and testing the validation rate of the putative SNPs predicted with Solexa short sequence reads at a low sequencing depth, we evaluated a pooled DNA fragment reduced representation library and SNP detection methods applied to short read sequences generated by Solexa high-throughput sequencing technology. Results: A total of 39,022 putative SNPs were identified by the Illumina/Solexa sequencing system using a reduced representation DNA library of two parental lines of a mapping population. The validation rates of these putative SNPs predicted with low and high stringency were 72% and 85%, respectively. One hundred sixty-four SNP markers resulted from the validation of putative SNPs and have been selectively chosen to target a known QTL, thereby increasing the marker density of the targeted region to one marker per 42 kbp.
Conclusions: We have demonstrated how to quickly identify large numbers of SNPs for fine mapping of QTL regions by applying massively parallel sequencing combined with genome complexity reduction techniques. This SNP discovery approach is more efficient for targeting multiple QTL regions in the same genetic population, and it can be applied to other crops. PMID:20701770
Sequence variability of Campylobacter temperate bacteriophages
Clark, Clifford G; Ng, Lai-King
2008-01-01
Background Prophages integrated within the chromosomes of Campylobacter jejuni isolates have been demonstrated very recently. Prior work with Campylobacter temperate bacteriophages, as well as evidence from prophages in other enteric bacteria, suggests these prophages might have a role in the biology and virulence of the organism. However, very little is known about the genetic variability of Campylobacter prophages which, if present, could lead to differential phenotypes in isolates carrying the phages versus those that do not. As a first step in the characterization of C. jejuni prophages, we investigated the distribution of prophage DNA within a C. jejuni population and assessed the DNA and protein sequence variability within a subset of the putative prophages found. Results Southern blotting of C. jejuni DNA using probes from genes within the three putative prophages of the C. jejuni sequenced strain RM 1221 demonstrated the presence of at least one prophage gene in a large proportion (27/35) of isolates tested. Of these, 15 were positive for 5 or more of the 7 Campylobacter Mu-like phage 1 (CMLP 1, also designated Campylobacter jejuni integrated element 1, or CJIE 1) genes tested. Twelve of these putative prophages were chosen for further analysis. DNA sequencing of a 9,000 to 11,000 nucleotide region of each prophage demonstrated a close homology with CMLP 1 in both gene order and nucleotide sequence. Structural and sequence variability, including short insertions, deletions, and allele replacements, were found within the prophage genomes, some of which would alter the protein products of the ORFs involved. No insertions of novel genes were detected within the sequenced regions. The 12 prophages and RM 1221 had a % G+C very similar to C. jejuni sequenced strains, as well as promoter regions characteristic of C. jejuni. 
None of the putative prophages were successfully induced and propagated, so it is not known if they were functional or if they represented remnant prophage DNA in the bacterial chromosomes. Conclusion These putative prophages form a family of phages with conserved sequences, and appear to be adapted to Campylobacter. There was evidence for recombination among groups of prophages, suggesting that the prophages had a mosaic structure. In many of these properties, the Mu-like CMLP 1 homologs characterized in this study resemble temperate bacteriophages of enteric bacteria that are responsible for contributions to virulence and host adaptation. PMID:18366706
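The % G+C comparison mentioned in the abstract above can be illustrated with a minimal sketch. The function name `gc_percent` and the two toy sequences are hypothetical and are not taken from the study; the sketch only shows how a G+C percentage is conventionally computed over unambiguous bases.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy sequences for illustration only (not real prophage or host data).
prophage_region = "ATTATAGCAATTTTAGGCATAAATTCATTA"
host_region = "ATGAATAAAGCTTTAATTGCAAATCTTAAA"

print(f"prophage region: {gc_percent(prophage_region):.1f}% G+C")
print(f"host region:     {gc_percent(host_region):.1f}% G+C")
```

A similar G+C percentage in a prophage and its host is consistent with, but does not by itself prove, long-term adaptation of the phage to that host.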
Glinsky, Gennadi V.
2015-01-01
Despite significant progress in the structural and functional characterization of the human genome, understanding of the mechanisms underlying the genetic basis of human phenotypic uniqueness remains limited. Here, I report that transposable element-derived sequences, most notably LTR7/HERV-H, LTR5_Hs, and L1HS, harbor 99.8% of the candidate human-specific regulatory loci (HSRL) with putative transcription factor-binding sites in the genome of human embryonic stem cells (hESC). A total of 4,094 candidate HSRL display selective and site-specific binding of critical regulators (NANOG [Nanog homeobox], POU5F1 [POU class 5 homeobox 1], CCCTC-binding factor [CTCF], Lamin B1), and are preferentially located within the matrix of transcriptionally active DNA segments that are hypermethylated in hESC. hESC-specific NANOG-binding sites are enriched near the protein-coding genes regulating brain size, pluripotency long noncoding RNAs, hESC enhancers, and 5-hydroxymethylcytosine-harboring regions immediately adjacent to binding sites. Sequences of only 4.3% of hESC-specific NANOG-binding sites are present in Neanderthals’ genome, suggesting that a majority of these regulatory elements emerged in Modern Humans. Comparisons of estimated creation rates of novel TF-binding sites revealed that there was 49.7-fold acceleration of creation rates of NANOG-binding sites in genomes of Chimpanzees compared with the mouse genomes and further 5.7-fold acceleration in genomes of Modern Humans compared with the Chimpanzees genomes. Preliminary estimates suggest that emergence of one novel NANOG-binding site detectable in hESC required 466 years of evolution. 
Pathway analysis of coding genes that have hESC-specific NANOG-binding sites within gene bodies or near gene boundaries revealed their association with physiological development and functions of nervous and cardiovascular systems, embryonic development, behavior, as well as development of a diverse spectrum of pathological conditions such as cancer, diseases of cardiovascular and reproductive systems, metabolic diseases, multiple neurological and psychological disorders. A proximity placement model is proposed explaining how a 33–47% excess of NANOG, CTCF, and POU5F1 proteins immobilized on a DNA scaffold may play a functional role at distal regulatory elements. PMID:25956794
Basu, Swaraj; Larsson, Erik
2018-05-31
Antisense transcripts and other long non-coding RNAs are pervasive in mammalian cells, and some of these molecules have been proposed to regulate proximal protein-coding genes in cis. For example, non-coding transcription can contribute to inactivation of tumor suppressor genes in cancer, and antisense transcripts have been implicated in the epigenetic inactivation of imprinted genes. However, our knowledge is still limited and more such regulatory interactions likely await discovery. Here, we make use of available gene expression data from a large compendium of human tumors to generate hypotheses regarding non-coding-to-coding cis-regulatory relationships with emphasis on negative associations, as these are less likely to arise for reasons other than cis-regulation. We document a large number of possible regulatory interactions, including 193 coding/non-coding pairs that show expression patterns compatible with negative cis-regulation. Importantly, by this approach we capture several known cases, and many of the involved coding genes have known roles in cancer. Our study provides a large catalog of putative non-coding/coding cis-regulatory pairs that may serve as a basis for further experimental validation and characterization. Copyright © 2018 Basu and Larsson.
Insights into the innate immunity of the Mediterranean mussel Mytilus galloprovincialis
2011-01-01
Background Sessile bivalves of the genus Mytilus are suspension feeders relatively tolerant to a wide range of environmental changes, used as sentinels in ecotoxicological investigations and marketed worldwide as seafood. Mortality events caused by infective agents and parasites apparently occur less in mussels than in other bivalves but the molecular basis of such evidence is unknown. The arrangement of Mytibase, an interactive catalogue of 7,112 transcripts of M. galloprovincialis, offered us the opportunity to look for gene sequences relevant to the host defences, in particular the innate immunity related genes. Results We have explored and described the Mytibase sequence clusters and singletons having a putative role in recognition, intracellular signalling, and neutralization of potential pathogens in M. galloprovincialis. Automatically assisted searches of protein signatures and manually curated sequence analysis confirmed the molecular diversity of recognition/effector molecules such as the antimicrobial peptides and many carbohydrate binding proteins. Molecular motifs identifying complement C1q, C-type lectins and fibrinogen-like transcripts emerged as the most abundant in the Mytibase collection whereas, conversely, sequence motifs denoting the regulatory cytokine MIF and cytokine-related transcripts represent singular and unexpected findings. Using a cross-search strategy, 1,820 putatively immune-related sequences were selected to design oligonucleotide probes and define a species-specific Immunochip (DNA microarray). The Immunochip performance was tested with hemolymph RNAs from mussels injected with Vibrio splendidus at 3 and 48 hours post-treatment. A total of 143 and 262 differentially expressed genes exemplify the early and late hemocyte response of the Vibrio-challenged mussels, respectively, with AMP trends confirmed by qPCR and clear modulation of interrelated signalling pathways. 
Conclusions The Mytibase collection is rich in gene transcripts modulated in response to antigenic stimuli and represents an interesting window for looking at the mussel immunome (transcriptomes mediating the mussel response to non-self or abnormal antigens). On this basis, we have defined a new microarray platform, a mussel Immunochip, as a flexible tool for the experimental validation of immune-candidate sequences, and tested its performance on Vibrio-activated mussel hemocytes. The microarray platform and related expression data can be regarded as a step forward in the study of the adaptive response of the Mytilus species to an evolving microbial world. PMID:21269501
Motohashi, Hozumi; O'Connor, Tania; Katsuoka, Fumiki; Engel, James Douglas; Yamamoto, Masayuki
2002-07-10
Recent progress in the analysis of transcriptional regulation has revealed the presence of an exquisite functional network comprising the Maf and Cap 'n' collar (CNC) families of regulatory proteins, many of which have been isolated. Among Maf factors, large Maf proteins are important in the regulation of embryonic development and cell differentiation, whereas small Maf proteins serve as obligatory heterodimeric partner molecules for members of the CNC family. Both Maf homodimers and CNC-small Maf heterodimers bind to the Maf recognition element (MARE). Since the MARE contains a consensus TRE sequence recognized by AP-1, Jun and Fos family members may act to compete or interfere with the function of CNC-small Maf heterodimers. Overall then, the quantitative balance of transcription factors interacting with the MARE determines its transcriptional activity. Many putative MARE-dependent target genes such as those induced by antioxidants and oxidative stress are under concerted regulation by the CNC family member Nrf2, as clearly proven by mouse germline mutagenesis. Since these genes represent a vital aspect of the cellular defense mechanism against oxidative stress, Nrf2-null mutant mice are highly sensitive to xenobiotic and oxidative insults. Deciphering the molecular basis of the regulatory network composed of Maf and CNC families of transcription factors will undoubtedly lead to a new paradigm for the cooperative function of transcription factors.
Complete Genome Sequence of an Avian Paramyxovirus Representative of Putative New Serotype 13.
Goraichuk, Iryna; Sharma, Poonam; Stegniy, Borys; Muzyka, Denys; Pantin-Jackwood, Mary J; Gerilovych, Anton; Solodiankin, Olexii; Bolotin, Vitaliy; Miller, Patti J; Dimitrov, Kiril M; Afonso, Claudio L
2016-07-28
Here, we report the complete genome sequence of a virus of a putative new serotype of avian paramyxovirus (APMV). The virus was isolated from a white-fronted goose in Ukraine in 2011 and designated white-fronted goose/Ukraine/Askania-Nova/48-15-02/2011. The genomic characterization of the isolate suggests that it represents the novel avian paramyxovirus group APMV 13. Copyright © 2016 Goraichuk et al.
Ciok, Anna; Adamczuk, Marcin; Bartosik, Dariusz; Dziewit, Lukasz
2016-11-28
Pseudomonas strains isolated from the heavily contaminated Lubin copper mine and Zelazny Most post-flotation waste reservoir in Poland were screened for the presence of integrons. This analysis revealed that two strains carried homologous DNA regions composed of a gene encoding a DNA_BRE_C domain-containing tyrosine recombinase (with no significant sequence similarity to other integrases of integrons) plus a three-component array of putative integron gene cassettes. The predicted gene cassettes encode three putative polypeptides with homology to (i) transmembrane proteins, (ii) GCN5 family acetyltransferases, and (iii) hypothetical proteins of unknown function (homologous proteins are encoded by the gene cassettes of several class 1 integrons). Comparative sequence analyses identified three structural variants of these novel integron-like elements within the sequenced bacterial genomes. Analysis of their distribution revealed that they are found exclusively in strains of the genus Pseudomonas.
Urano, Y; Kominami, R; Mishima, Y; Muramatsu, M
1980-01-01
Approximately one kilobase pair surrounding and upstream of the transcription initiation site of a cloned ribosomal DNA (rDNA) of the mouse was sequenced. The putative transcription initiation site was determined by two independent methods: nuclease S1 protection and reverse transcriptase elongation mapping, using isolated 45S ribosomal RNA precursor (45S RNA) and appropriate restriction fragments of rDNA. Both methods gave an identical result; 45S RNA had a structure starting from ACTCTTAG---. Characteristically, mouse rDNA had many T clusters (≥5 consecutive T's) upstream of the initiation site, the longest being 21 consecutive T's. A pentadecanucleotide, TGCCTCCCGAGTGCA, appeared twice within 260 nucleotides upstream of the putative initiation site. No such characteristic sequences were found downstream of this site. Little similarity was found upstream of the transcription initiation site between the mouse, Xenopus laevis and Saccharomyces cerevisiae rDNA. PMID:6162156
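The two sequence features described in the abstract above (runs of at least five T's and a repeated 15-mer) are straightforward to scan for. The following is a minimal sketch; the helper names `t_clusters` and `motif_positions` and the toy input are hypothetical, and only the 15-mer TGCCTCCCGAGTGCA comes from the abstract.

```python
import re


def t_clusters(seq, min_len=5):
    """Return (start, length) for each maximal run of T's at least min_len long."""
    pattern = r"T{%d,}" % min_len
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, seq.upper())]


def motif_positions(seq, motif):
    """Return 0-based start positions of motif in seq, including overlapping hits."""
    # A lookahead lets the regex engine report overlapping occurrences.
    pattern = "(?=%s)" % re.escape(motif.upper())
    return [m.start() for m in re.finditer(pattern, seq.upper())]


# Toy upstream region (hypothetical, not the mouse rDNA sequence).
MOTIF = "TGCCTCCCGAGTGCA"  # pentadecanucleotide reported in the abstract
toy_upstream = "GATTTTTTC" + MOTIF + "ACGTTTTTTTTAC" + MOTIF + "GG"

print(t_clusters(toy_upstream))
print(motif_positions(toy_upstream, MOTIF))
```

The regular expression is greedy, so each reported run is maximal; a run of 21 T's would be returned as a single cluster rather than as several overlapping ones.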
Doszpoly, Andor; Papp, Melitta; Deákné, Petra P; Glávits, Róbert; Ursu, Krisztina; Dán, Ádám
2015-05-01
In the early summer of 2014, mass mortality of sichel (Pelecus cultratus) was observed in Lake Balaton, Hungary. Histological examination revealed degenerative changes within the tubular epithelium, mainly in the distal tubules and collecting ducts in the kidneys and multifocal vacuolisation in the brain stem and cerebellum. Routine molecular investigations showed the presence of the DNA of an unknown alloherpesvirus in some specimens. Subsequently, three genes of the putative herpesviral genome (DNA polymerase, terminase, and helicase) were amplified and partially sequenced. A phylogenetic tree reconstruction based on the concatenated sequence of these three conserved genes implied that the virus belongs to the genus Cyprinivirus within the family Alloherpesviridae. The sequences of the sichel herpesvirus differ markedly from those of the cypriniviruses CyHV-1, CyHV-2 and CyHV-3, putatively representing a fifth species in the genus.
Cloning, sequencing, and characterization of the Bacillus subtilis biotin biosynthetic operon.
Bower, S; Perkins, J B; Yocum, R R; Howitt, C L; Rahaim, P; Pero, J
1996-07-01
A 10-kb region of the Bacillus subtilis genome that contains genes involved in biotin-biosynthesis was cloned and sequenced. DNA sequence analysis indicated that B. subtilis contains homologs of the Escherichia coli and Bacillus sphaericus bioA, bioB, bioD, and bioF genes. These four genes and a homolog of the B. sphaericus bioW gene are arranged in a single operon in the order bioWAFDR and are followed by two additional genes, bioI and orf2. bioI and orf2 show no similarity to any other known biotin biosynthetic genes. The bioI gene encodes a protein with similarity to cytochrome P-450s and was able to complement mutations in either bioC or bioH of E. coli. Mutations in bioI caused B. subtilis to grow poorly in the absence of biotin. The bradytroph phenotype of bioI mutants was overcome by pimelic acid, suggesting that the product of bioI functions at a step prior to pimelic acid synthesis. The B. subtilis bio operon is preceded by a putative vegetative promoter sequence and contains just downstream a region of dyad symmetry with homology to the bio regulatory region of B. sphaericus. Analysis of a bioW-lacZ translational fusion indicated that expression of the biotin operon is regulated by biotin and the B. subtilis birA gene. PMID:8763940
Evidence of birth-and-death evolution of 5S rRNA gene in Channa species (Teleostei, Perciformes).
Barman, Anindya Sundar; Singh, Mamta; Singh, Rajeev Kumar; Lal, Kuldeep Kumar
2016-12-01
In higher eukaryotes, the minor rDNA family codes for 5S rRNA; it is arranged in tandem arrays and comprises a highly conserved 120 bp coding sequence with a variable non-transcribed spacer (NTS). Initially, 5S rDNA repeats were considered to have evolved by concerted evolution. However, some recent reports, including studies of teleost fishes, suggest that the evolution of 5S rDNA repeats does not fit the concerted evolution model and may instead be explained by a birth-and-death evolution model. In order to study the mode of evolution of 5S rDNA repeats in Perciformes fish species, the nucleotide sequence and molecular organization of 5S rDNA in five species of the genus Channa were analyzed in the present study. Molecular analyses revealed several variants of 5S rDNA repeats (four types of NTS), and networks created by a neighbor-net algorithm for each type of sequence (I, II, III and IV) did not show clear clustering in a species-specific manner. The stable secondary structure was predicted, and upstream and downstream conserved regulatory elements were characterized. Sequence analyses also showed the presence of two putative pseudogenes in Channa marulius. The present study supports the view that 5S rDNA repeats in the genus Channa evolved by a birth-and-death process.
Arabidopsis intragenomic conserved noncoding sequence
Thomas, Brian C.; Rapaka, Lakshmi; Lyons, Eric; Pedersen, Brent; Freeling, Michael
2007-01-01
After the most recent tetraploidy in the Arabidopsis lineage, most gene pairs lost one, but not both, of their duplicates. We manually inspected the 3,179 retained gene pairs and their surrounding gene space still present in the genome using a custom-made viewer application. The display of these pairs allowed us to define intragenic conserved noncoding sequences (CNSs), identify exon annotation errors, and discover potentially new genes. Using a strict algorithm to sort high-scoring pair sequences from the bl2seq data, we created a database of 14,944 intragenomic Arabidopsis CNSs. The mean CNS length is 31 bp, ranging from 15 to 285 bp. There are ≈1.7 CNSs associated with a typical gene, and Arabidopsis CNSs are found in all areas around exons, most frequently in the 5′ upstream region. Gene ontology classifications related to transcription, regulation, or “response to …” external or endogenous stimuli, especially hormones, tend to be significantly overrepresented among genes containing a large number of CNSs, whereas protein localization, transport, and metabolism are common among genes with no CNSs. There is a 1.5% overlap between these CNSs and the 218,982 putative RNAs in the Arabidopsis Small RNA Project database, allowing for two mismatches. These CNSs provide a unique set of noncoding sequences enriched for function. CNS function is implied by evolutionary conservation and independently supported because CNS-richness predicts regulatory gene ontology categories. PMID:17301222
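The abstract above reports overlap between CNSs and small RNAs "allowing for two mismatches" but does not say how the comparison was carried out. As a hedged illustration of mismatch-tolerant matching in general, the sketch below uses an ungapped sliding-window Hamming comparison; this choice, the function names, and the toy strings are assumptions and not the authors' method.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_with_mismatches(query, target, max_mismatches=2):
    """Return start positions where query aligns to target, ungapped,
    with at most max_mismatches mismatching bases."""
    n = len(query)
    return [
        i
        for i in range(len(target) - n + 1)
        if hamming(query, target[i:i + n]) <= max_mismatches
    ]


# Toy example: a hypothetical 10-nt small RNA scanned against a hypothetical CNS.
small_rna = "ACGTTGCAAC"
cns = "TTTTACGATGCTACGGGG"
print(find_with_mismatches(small_rna, cns, max_mismatches=2))
```

This brute-force scan costs time proportional to the product of the two lengths, which is fine for short CNSs but would need an indexed search to compare hundreds of thousands of small RNAs against a whole genome.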
Martín, Juan F; Rodríguez-García, Antonio; Liras, Paloma
2017-05-01
Phosphate limitation is important for production of antibiotics and other secondary metabolites in Streptomyces. Phosphate control is mediated by the two-component system PhoR-PhoP. Following phosphate depletion, PhoP stimulates expression of genes involved in scavenging, transport and mobilization of phosphate, and represses the utilization of nitrogen sources. PhoP reduces expression of genes for aerobic respiration and activates nitrate respiration genes. PhoP activates genes for teichuronic acid formation and reduces expression of genes for phosphate-rich teichoic acid biosynthesis. In Streptomyces coelicolor, PhoP repressed several differentiation and pleiotropic regulatory genes, which affects development and indirectly antibiotic biosynthesis. A new bioinformatics analysis of the putative PhoP-binding sequences in Streptomyces avermitilis was made. Many sequences in the S. avermitilis genome showed high weight values and were classified according to the available genetic information. The associated genes encode phosphate scavenging proteins, phosphate transporters and nitrogen metabolism proteins. Among the genes highlighted in the new studies were aveR, located in the avermectin gene cluster and encoding a LAL-type regulator, and afsS, which is regulated by PhoP and AfsR. The sequence logo for S. avermitilis PHO boxes is similar to that of S. coelicolor, with differences in the weight value for specific nucleotides in the sequence.
Konami, Y; Yamamoto, K; Osawa, T; Irimura, T
1995-04-01
The complete amino acid sequence of a lactose-binding Cytisus sessilifolius anti-H(O) lectin II (CSA-II) was determined using a protein sequencer. After digestion of CSA-II with endoproteinase Lys-C or Asp-N, the resulting peptides were purified by reversed-phase high performance liquid chromatography (HPLC) and then subjected to sequence analysis. Comparison of the complete amino acid sequence of CSA-II with the sequences of other leguminous seed lectins revealed regions of extensive homology. The amino acid sequence of a putative carbohydrate-binding domain of CSA-II was found to be similar to those of several anti-H(O) leguminous lectins, especially to that of the L-fucose-binding Ulex europaeus lectin I (UEA-I).
Torres-Cortés, Gloria; Ghignone, Stefano; Bonfante, Paola; Schüßler, Arthur
2015-06-23
For more than 450 million years, arbuscular mycorrhizal fungi (AMF) have formed intimate, mutualistic symbioses with the vast majority of land plants and are major drivers in almost all terrestrial ecosystems. The obligate plant-symbiotic AMF host additional symbionts, so-called Mollicutes-related endobacteria (MRE). To uncover putative functional roles of these widespread but yet enigmatic MRE, we sequenced the genome of DhMRE living in the AMF Dentiscutata heterogama. Multilocus phylogenetic analyses showed that MRE form a previously unidentified lineage sister to the hominis group of Mycoplasma species. DhMRE possesses a strongly reduced metabolic capacity with 55% of the proteins having unknown function, which reflects unique adaptations to an intracellular lifestyle. We found evidence for transkingdom gene transfer between MRE and their AMF host. At least 27 annotated DhMRE proteins show similarities to nuclear-encoded proteins of the AMF Rhizophagus irregularis, which itself lacks MRE. Nuclear-encoded homologs could moreover be identified for another AMF, Gigaspora margarita, and surprisingly, also the non-AMF Mortierella verticillata. Our data indicate a possible origin of the MRE-fungus association in ancestors of the Glomeromycota and Mucoromycotina. The DhMRE genome encodes an arsenal of putative regulatory proteins with eukaryotic-like domains, some of them encoded in putative genomic islands. MRE are highly interesting candidates to study the evolution and interactions between an ancient, obligate endosymbiotic prokaryote with its obligate plant-symbiotic fungal host. Our data moreover may be used for further targeted searches for ancient effector-like proteins that may be key components in the regulation of the arbuscular mycorrhiza symbiosis.
Sharan, Malvika; Förstner, Konrad U; Eulalio, Ana; Vogel, Jörg
2017-06-20
RNA-binding proteins (RBPs) have been established as core components of several post-transcriptional gene regulation mechanisms. Experimental techniques such as cross-linking and co-immunoprecipitation have enabled the large-scale identification of RBPs, RNA-binding domains (RBDs) and their regulatory roles in eukaryotic species such as human and yeast. In contrast, our knowledge of the number and potential diversity of RBPs in bacteria is poorer due to the technical challenges associated with the existing global screening approaches. We introduce APRICOT, a computational pipeline for the sequence-based identification and characterization of proteins using RBDs known from experimental studies. The pipeline identifies functional motifs in protein sequences using position-specific scoring matrices and Hidden Markov Models of the functional domains and statistically scores them based on a series of sequence-based features. Subsequently, APRICOT identifies putative RBPs and characterizes them by several biological properties. Here we demonstrate the application and adaptability of the pipeline on large-scale protein sets, including the bacterial proteome of Escherichia coli. APRICOT showed better performance on various datasets compared to other existing tools for the sequence-based prediction of RBPs by achieving an average sensitivity and specificity of 0.90 and 0.91, respectively. The command-line tool and its documentation are available at https://pypi.python.org/pypi/bio-apricot. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
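The sensitivity and specificity figures quoted in the abstract above follow the usual confusion-matrix definitions. The sketch below shows those definitions only; it does not use or reproduce the APRICOT tool, and the function name and toy labels are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): fraction of true RBPs that were predicted as RBPs
    specificity = TN / (TN + FP): fraction of non-RBPs that were predicted as non-RBPs
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels: 1 = RNA-binding protein, 0 = not (made up for illustration).
truth = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Reporting both values matters because a predictor that labels every protein as an RBP reaches a sensitivity of 1.0 with a specificity of 0.0.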
Molecular Diagnosis of Putative Stargardt Disease by Capture Next Generation Sequencing
Shi, Wei; Huang, Ping; Min, Qingjie; Li, Minghan; Yu, Xinping; Wu, Yaming; Zhao, Guangyu; Tong, Yi; Jin, Zi-Bing; Qu, Jia; Gu, Feng
2014-01-01
Stargardt Disease (STGD) is the commonest genetic form of juvenile or early adult onset macular degeneration, which is a genetically heterogeneous disease. Molecular diagnosis of STGD remains a challenge in a significant proportion of cases. To address this, seven patients from five putative STGD families were recruited. We performed capture next generation sequencing (CNGS) of the probands and searched for potentially disease-causing genetic variants in previously identified retinal or macular dystrophy genes. Seven disease-causing mutations in ABCA4 and two in PROM1 were identified by CNGS, which provides a confident genetic diagnosis in these five families. We also provided a genetic basis to explain the differences among putative STGD due to various mutations in different genes. Meanwhile, we show for the first time that compound heterozygous mutations in PROM1 gene could cause cone-rod dystrophy. Our findings support the enormous potential of CNGS in putative STGD molecular diagnosis. PMID:24763286
Chertkova, Aleksandra A; Schiffman, Joshua S; Nuzhdin, Sergey V; Kozlov, Konstantin N; Samsonova, Maria G; Gursky, Vitaly V
2017-02-07
Cis-regulatory sequences are often composed of many low-affinity transcription factor binding sites (TFBSs). Determining the evolutionary and functional importance of regulatory sequence composition is impeded without a detailed knowledge of the genotype-phenotype map. We simulate the evolution of regulatory sequences involved in Drosophila melanogaster embryo segmentation during early development. Natural selection evaluates gene expression dynamics produced by a computational model of the developmental network. We observe a dramatic decrease in the total number of transcription factor binding sites through the course of evolution. Despite a decrease in average sequence binding energies through time, the regulatory sequences tend towards organisations containing increased high affinity transcription factor binding sites. Additionally, the binding energies of separate sequence segments demonstrate ubiquitous mutual correlations through time. Fewer than 10% of initial TFBSs are maintained throughout the entire simulation, deemed 'core' sites. These sites have increased functional importance as assessed under wild-type conditions and their binding energy distributions are highly conserved. Furthermore, TFBSs within close proximity of core sites exhibit increased longevity, reflecting functional regulatory interactions with core sites. In response to elevated mutational pressure, evolution tends to sample regulatory sequence organisations with fewer, albeit on average, stronger functional transcription factor binding sites. These organisations are also shaped by the regulatory interactions among core binding sites with sites in their local vicinity.
Miguel, Célia; Simões, Marta; Oliveira, Maria Margarida; Rocheta, Margarida
2008-11-01
Retroviruses differ from retrotransposons due to their infective capacity, which depends critically on the encoded envelope. Some plant retroelements contain domains reminiscent of the env of animal retroviruses but the number of such elements described to date is restricted to angiosperms. We show here the first evidence of the presence of putative env-like gene sequences in a gymnosperm species, Pinus pinaster (maritime pine). Using a degenerate primer approach for conserved domains of the RNaseH gene, three clones from putative envelope-like retrotransposons (PpRT2, PpRT3, and PpRT4) were identified. The env-like sequences of P. pinaster clones are predicted to encode proteins with transmembrane domains. These sequences showed identity scores of up to 30% with env-like sequences belonging to different organisms. A phylogenetic analysis based on protein alignment of deduced amino acid sequences revealed that these clones clustered with env-containing plant retrotransposons, as well as with retrotransposons from invertebrate organisms. The differences found among the sequences of maritime pine clones isolated here suggest the existence of different putative classes of env-like retroelements. The identification for the first time of env-like genes in a gymnosperm species may support the ancestrality of retroviruses among plants, shedding light on their role in plant evolution.
Wang, Wei; Liu, Ji-Hong
2015-01-25
Polyamine oxidases (PAOs) are FAD-dependent enzymes associated with polyamine catabolism. In plants, increasing evidence supports the view that PAO genes play essential roles in abiotic and biotic stress responses. In this study, six putative PAO genes (CsPAO1-CsPAO6) were identified in sweet orange (Citrus sinensis) using the released citrus genome sequences. A total of 203 putative cis-regulatory elements involved in hormone and stress response were predicted in the 1.5-kb promoter regions upstream of the CsPAOs. The CsPAOs can be divided into four major groups, with organizations similar to those of their counterparts in Arabidopsis thaliana. Transcripts of CsPAOs were detected in leaf, stem, cotyledon, and root, with the highest levels detected in the roots. The CsPAOs displayed various responses to exogenous treatments with polyamines and ABA and were differentially altered by abiotic stresses, including cold, salt, and mannitol. Overexpression of CsPAO3 in tobacco demonstrated that spermidine and spermine were decreased in the transgenic line, while putrescine was significantly enhanced, implying a potential role of this gene in polyamine back conversion. These data provide valuable knowledge for understanding the roles of the PAO genes in the future. Copyright © 2014 Elsevier B.V. All rights reserved.
Signaling coupled epigenomic regulation of gene expression.
Kumar, R; Deivendran, S; Santhoshkumar, T R; Pillai, M R
2017-10-26
Inheritance of genomic information independent of the DNA sequence, the epigenetics, as well as gene transcription are profoundly shaped by serine/threonine and tyrosine signaling kinases and components of the chromatin remodeling complexes. To precisely respond to a changing external milieu, human cells efficiently translate upstream signals into post-translational modifications (PTMs) on histones and coregulators such as corepressors, coactivators, DNA-binding factors and PTM modifying enzymes. Because a protein with multiple residues for putative PTMs is expected to undergo more than one PTM in cells stimulated with growth factors, the outcome of combinational PTM codes on histones and coregulators is profoundly shaped by regulatory interplays between PTMs. The genomic functions of signaling kinases in cancer cells are manifested by the downstream effectors of cytoplasmic signaling cascades as well as translocation of the cytoplasmic signaling kinases to the nucleus. Signaling-mediated phosphorylation of histones serves as a regulatory switch for other PTMs, and connects chromatin remodeling complexes into gene transcription and gene activity. Here, we will discuss the recent advances in signaling-dependent epigenomic regulation of gene transcription using a few representative cancer-relevant serine/threonine and tyrosine kinases and their interplay with chromatin remodeling factors in cancer cells.
Kadowaki, Marco A S; Müller-Santos, Marcelo; Rego, Fabiane G M; Souza, Emanuel M; Yates, Marshall G; Monteiro, Rose A; Pedrosa, Fabio O; Chubatsu, Leda S; Steffens, Maria B R
2011-10-14
Herbaspirillum seropedicae SmR1 is a nitrogen fixing endophyte associated with important agricultural crops. It produces polyhydroxybutyrate (PHB) which is stored intracellularly as granules. However, PHB metabolism and regulatory control is not yet well studied in this organism. In this work we describe the characterization of the PhbF protein from H. seropedicae SmR1 which was purified and characterized after expression in E. coli. The purified PhbF protein was able to bind to eleven putative promoters of genes involved in PHB metabolism in H. seropedicae SmR1. In silico analyses indicated a probable DNA-binding sequence which was shown to be protected in DNA footprinting assays using purified PhbF. Analyses using lacZ fusions showed that PhbF can act as a repressor protein controlling the expression of PHB metabolism-related genes. Our results indicate that H. seropedicae SmR1 PhbF regulates expression of phb-related genes by acting as a transcriptional repressor. The knowledge of the PHB metabolism of this plant-associated bacterium may contribute to the understanding of the plant-colonizing process and the organism's resistance and survival in planta.
Chowdhury, Shomeek; Zhang, Jian; Kurgan, Lukasz
2018-05-28
Deciphering a complete landscape of protein-RNA interactions in the human proteome remains an elusive challenge. We computationally elucidate RNA binding proteins (RBPs) using an approach that complements previous efforts. We employ two modern complementary sequence-based methods that provide accurate predictions from the structured and the intrinsically disordered sequences, even in the absence of sequence similarity to the known RBPs. We generate and analyze putative RNA binding residues on the whole proteome scale. Using a conservative setting that ensures a low 5% false positive rate, we identify 1511 putative RBPs that include 281 known RBPs and 166 RBPs that were previously predicted. We empirically demonstrate that these overlaps are statistically significant. We also validate the putative RBPs based on two major hallmarks of their RNA binding residues: high levels of evolutionary conservation and enrichment in charged amino acids. Moreover, we show that the novel RBPs are significantly under-annotated functionally, which coincides with the fact that they were not yet found to interact with RNAs. We provide two examples of our novel putative RBPs for which there is recent evidence of their interactions with RNAs. The dataset of novel putative RBPs and RNA binding residues for future hypothesis generation is provided in the Supporting Information.
Itoh, S; Abe, Y; Kubo, A; Okuda, M; Shimoji, M; Nakayama, K; Kamataki, T
1997-02-07
An 11.5 kb fragment of the mouse Cyp3a16 gene containing the 5' flanking region was isolated from the lambda DASHII mouse genomic library. A part of the 5' flanking region and the first exon of the Cyp3a16 gene were sequenced. S1 mapping analysis showed the presence of two transcriptional initiation sites. The first exon was completely identical to Cyp3a16 cDNA. The identity of 5' flanking sequences between Cyp3a16 and Cyp3a11 genes was about 69%. A typical TATA box and a basic transcription element (BTE) were found, as seen with other CYP3A genes from various animal species. Moreover, some putative transcriptional regulatory elements were also found in addition to the sequence motif seen for the formation of Z-type DNA. To examine the transcriptional activity of Cyp3a11 gene, DNA fragments in the 5'-flanking region of the gene were inserted in front of the luciferase structural gene, and the constructs were transfected into primary hepatocytes. The analysis of the luciferase activity indicated that the region between -146 and -56 was necessary for the transcription of the Cyp3a16 gene.
Schjørring, Susanne; Stegger, Marc; Kjelsø, Charlotte; Lilje, Berit; Bangsborg, Jette M; Petersen, Randi F; David, Sophia; Uldum, Søren A
2017-01-01
Between July and November 2014, 15 community-acquired cases of Legionnaires' disease (LD), including four with Legionella pneumophila serogroup 1 sequence type (ST) 82, were diagnosed in Northern Zealand, Denmark. An outbreak was suspected. No ST82 isolates were found in environmental samples and no external source was established. Four putative-outbreak ST82 isolates were retrospectively subjected to whole genome sequencing (WGS) followed by phylogenetic analyses with epidemiologically unrelated ST82 sequences. The four putative-outbreak ST82 sequences fell into two clades. The two clades were separated by ca 1,700 single nucleotide polymorphisms (SNPs) when recombination regions were included but only by 12 to 21 SNPs when these were removed. A single putative-outbreak ST82 isolate sequence segregated in the first clade. The other three clustered in the second clade, where all included sequences had < 5 SNP differences between them. Intriguingly, this clade also comprised epidemiologically unrelated isolate sequences from the UK and Denmark dating back as early as 2011. The study confirms that recombination plays a major role in L. pneumophila evolution. On the other hand, strains belonging to the same ST can have only a few SNP differences despite being sampled over both large timespans and geographic distances. These are two important factors to consider in outbreak investigations. PMID:28662761
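As a rough illustration of the SNP comparison described in the abstract above, the following sketch counts pairwise differences between already-aligned sequences. It is not the authors' pipeline: the function names, the toy isolate labels, and the choice to skip gap and N positions are all assumptions made for this example, and recombination filtering is not attempted.

```python
from itertools import combinations

# Characters treated as "no call" and excluded from the count (an assumption).
SKIP = {"-", "N"}


def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in SKIP and y not in SKIP
    )


def pairwise_snp_distances(seqs: dict) -> dict:
    """Return the SNP distance for every unordered pair of named sequences."""
    return {
        (i, j): snp_distance(seqs[i], seqs[j])
        for i, j in combinations(sorted(seqs), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy alignment; real isolates would span a whole genome.
    toy = {"iso1": "ACGTACGT", "iso2": "ACGTACGA", "iso3": "TCGTNCGA"}
    for pair, dist in pairwise_snp_distances(toy).items():
        print(pair, dist)
```

A threshold such as the "< 5 SNP differences" reported for the second clade could then be applied to the resulting distances, bearing in mind the abstract's caution that a small distance alone does not establish an epidemiological link.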
The noncoding human genome and the future of personalised medicine.
Cowie, Philip; Hay, Elizabeth A; MacKenzie, Alasdair
2015-01-30
Non-coding cis-regulatory sequences act as the 'eyes' of the genome and their role is to perceive, organise and relay cellular communication information to RNA polymerase II at gene promoters. The evolution of these sequences, which include enhancers, silencers, insulators and promoters, has progressed in multicellular organisms to the extent that cis-regulatory sequences make up as much as 10% of the human genome. Parallel evidence suggests that 75% of polymorphisms associated with heritable disease occur within predicted cis-regulatory sequences and effectively alter the 'perception' of cis-regulatory sequences or render them blind to cell communication cues. Cis-regulatory sequences also act as major functional targets of epigenetic modification, thus representing an important conduit through which changes in DNA methylation affect disease susceptibility. The objectives of the current review are (1) to describe what has been learned about identifying and characterising cis-regulatory sequences since the sequencing of the human genome; (2) to discuss their role in interpreting cell signalling pathways; and (3) to outline how this role may be altered by polymorphisms and epigenetic changes. We argue that the importance of the cis-regulatory genome for the interpretation of cellular communication pathways cannot be overstated and understanding its role in health and disease will be critical for the future development of personalised medicine.
Bain, Peter A; Papanicolaou, Alexie; Kumar, Anupama
2015-01-01
Murray-Darling rainbowfish (Melanotaenia fluviatilis [Castelnau, 1878]; Atheriniformes: Melanotaeniidae) is a small-bodied teleost currently under development in Australasia as a test species for aquatic toxicological studies. To date, efforts towards the development of molecular biomarkers of contaminant exposure have been hindered by the lack of available sequence data. To address this, we sequenced messenger RNA from brain, liver and gonads of mature male and female fish and generated a high-quality draft transcriptome using a de novo assembly approach. 149,742 clusters of putative transcripts were obtained, encompassing 43,841 non-redundant protein-coding regions. Deduced amino acid sequences were annotated by functional inference based on similarity with sequences from manually curated protein sequence databases. The draft assembly contained protein-coding regions homologous to 95.7% of the complete cohort of predicted proteins from the taxonomically related species, Oryzias latipes (Japanese medaka). The mean length of rainbowfish protein-coding sequences relative to their medaka homologues was 92.1%, indicating that despite the limited number of tissues sampled a large proportion of the total expected number of protein-coding genes was captured in the study. Because of our interest in the effects of environmental contaminants on endocrine pathways, we manually curated subsets of coding regions for putative nuclear receptors and steroidogenic enzymes in the rainbowfish transcriptome, revealing 61 candidate nuclear receptors encompassing all known subfamilies, and 41 putative steroidogenic enzymes representing all major steroidogenic enzymes occurring in teleosts. The transcriptome presented here will be a valuable resource for researchers interested in biomarker development, protein structure and function, and contaminant-response genomics in Murray-Darling rainbowfish.
Are plant formins integral membrane proteins?
Cvrcková, F
2000-01-01
The formin family of proteins has been implicated in signaling pathways of cellular morphogenesis in both animals and fungi; in the latter case, at least, they participate in communication between the actin cytoskeleton and the cell surface. Nevertheless, they appear to be cytoplasmic or nuclear proteins, and it is not clear whether they communicate with the plasma membrane, and if so, how. Because nothing is known about formin function in plants, I performed a systematic search for putative Arabidopsis thaliana formin homologs. I found eight putative formin-coding genes in the publicly available part of the Arabidopsis genome sequence and analyzed their predicted protein sequences. Surprisingly, some of them lack parts of the conserved formin-homology 2 (FH2) domain and the majority of them seem to have signal sequences and putative transmembrane segments that are not found in yeast or animal formins. Plant formins define a distinct subfamily. The presence in most Arabidopsis formins of sequence motifs typical of transmembrane proteins suggests a mechanism of membrane attachment that may be specific to plant formins, and indicates an unexpected evolutionary flexibility of the conserved formin domain.
Genome sequence of Plasmopara viticola and insight into the pathogenic mechanism
Yin, Ling; An, Yunhe; Qu, Junjie; Li, Xinlong; Zhang, Yali; Dry, Ian; Wu, Huijuan; Lu, Jiang
2017-01-01
Plasmopara viticola causes downy mildew disease of grapevine, which is one of the most devastating diseases of viticulture worldwide. Here we report a 101.3 Mb whole genome sequence of P. viticola isolate ‘JL-7-2’ obtained by a combination of Illumina and PacBio sequencing technologies. The P. viticola genome contains 17,014 putative protein-coding genes and has ~26% repetitive sequences. A total of 1,301 putative secreted proteins, including 100 putative RXLR effectors and 90 CRN effectors, were identified in this genome. In the secretome, 261 potential pathogenicity genes and 95 carbohydrate-active enzymes were predicted. Transcriptional analysis revealed that most of the RXLR effectors, pathogenicity genes and carbohydrate-active enzymes were significantly up-regulated during infection. Comparative genomic analysis revealed that P. viticola evolved independently from the Arabidopsis downy mildew pathogen Hyaloperonospora arabidopsidis. The availability of the P. viticola genome not only provides a valuable resource for comparative genomic analysis and evolutionary studies among oomycetes, but also enhances our knowledge of the mechanism of interactions between this biotrophic pathogen and its host. PMID:28417959
Oyserman, Ben O.; Noguera, Daniel R.; del Rio, Tijana Glavina; ...
2015-11-10
Previous studies on enhanced biological phosphorus removal (EBPR) have focused on reconstructing genomic blueprints for the model polyphosphate-accumulating organism Candidatus Accumulibacter phosphatis. Here, a time series metatranscriptome generated from enrichment cultures of Accumulibacter was used to gain insight into anaerobic/aerobic metabolism and regulatory mechanisms within an EBPR cycle. Co-expressed gene clusters were identified displaying ecologically relevant trends consistent with batch cycle phases. Transcripts displaying increased abundance during anaerobic acetate contact were functionally enriched in energy production and conversion, including upregulation of both cytoplasmic and membrane-bound hydrogenases, demonstrating the importance of transcriptional regulation to manage energy and electron flux during anaerobic acetate contact. We hypothesized and demonstrated hydrogen production after anaerobic acetate contact, a previously unknown strategy for Accumulibacter to maintain redox balance. Genes involved in anaerobic glycine utilization were identified and phosphorus release after anaerobic glycine contact demonstrated, suggesting that Accumulibacter routes diverse carbon sources to acetyl-CoA formation via previously unrecognized pathways. A comparative genomics analysis of sequences upstream of co-expressed genes identified two statistically significant putative regulatory motifs. One palindromic motif was identified upstream of genes involved in PHA synthesis and acetate activation and is hypothesized to be a phaR binding site, hence representing a hypothetical PHA modulon. A second motif was identified ~35 base pairs (bp) upstream of a large and diverse array of genes and hence may represent a sigma factor binding site. As a result, this analysis provides a basis and framework for further investigations into Accumulibacter metabolism and the reconstruction of regulatory networks in uncultured organisms.
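The abstract above mentions a palindromic motif upstream of PHA-related genes. In the DNA sense, a palindrome is a sequence equal to its own reverse complement. The sketch below shows one simple way to test that property; the function names and the optional mismatch tolerance are assumptions for illustration, and the example sequences are generic, not the motif reported in the study.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str, max_mismatches: int = 0) -> bool:
    """True if seq matches its reverse complement within max_mismatches positions."""
    s = seq.upper()
    rc = reverse_complement(s)
    mismatches = sum(1 for x, y in zip(s, rc) if x != y)
    return mismatches <= max_mismatches


if __name__ == "__main__":
    # GAATTC (the EcoRI recognition site) is a textbook DNA palindrome.
    print(is_palindromic("GAATTC"))
    print(is_palindromic("GATTAC"))
```

Allowing a few mismatches is a common relaxation because many regulator binding sites are only approximately symmetric, although the appropriate tolerance depends on the motif length.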
Lin, Shao-Yu; Chooi, Yit-Heng; Solomon, Peter S
2018-05-03
To investigate effector gene regulation in the wheat pathogenic fungus Parastagonospora nodorum, the promoter and expression of Tox3 were characterised through a series of complementary approaches. Promoter deletion and DNase I footprinting experiments identified a 25 bp region in the Tox3 promoter as being required for transcription. Subsequent yeast one-hybrid analysis using the DNA sequence as bait identified the interacting partner as the C2H2 zinc finger transcription factor PnCon7, a putative master regulator of pathogenesis. Silencing of PnCon7 resulted in the down-regulation of Tox3, demonstrating that the transcription factor has a positive regulatory role on gene expression. Analysis of Tox3 expression in the PnCon7-silenced strains revealed a strong correlation with PnCon7 transcript levels, supportive of a direct regulatory role. Subsequent pathogenicity assays using PnCon7-silenced isolates revealed that the transcription factor was required for Tox3-mediated disease. The expression of two other necrotrophic effectors (ToxA and Tox1) was also affected but in a non-dose-dependent manner, suggesting that the regulatory role of PnCon7 on these genes was indirect. Collectively, these data have advanced our fundamental understanding of the Con7 master regulator of pathogenesis by demonstrating its positive regulatory role on the Tox3 effector in P. nodorum through direct interaction.
Zhao, Wenchao; Yang, Xueyong; Yu, Hongjun; Jiang, Weijie; Sun, Na; Liu, Xiaoran; Liu, Xiaolin; Zhang, Xiaomeng; Wang, Yan; Gu, Xingfang
2015-03-01
Nitrogen (N) is both an important macronutrient and a signal for plant growth and development. However, the early regulatory mechanism of plants in response to N starvation is not well understood, especially in cucumber, an economically important crop that normally consumes excessive N during production. In this study, the early time-course transcriptome response of cucumber leaves under N deficiency was monitored using RNA sequencing (RNA-Seq). More than 23,000 transcripts were examined in cucumber leaves, of which 364 genes were differentially expressed in response to N deficiency. Based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, gene ontology (GO) and protein-protein interaction analysis, 64 signaling-related N-deficiency-responsive genes were identified. Furthermore, the potential regulatory mechanisms of anthocyanin accumulation, Chl decline and cell wall remodeling were assessed at the transcription level. Increased ascorbic acid synthesis was identified in cucumber seedlings and fruit under N-deficient conditions, and a new corresponding regulatory hypothesis has been proposed. A data cross-comparison between model plants and cucumber was made, and some common and specific N-deficient response mechanisms were found in the present study. Our study provides novel insights into the responses of cucumber to nitrogen starvation at the global transcriptome level, which are expected to be highly useful for dissecting the N response pathways in this major vegetable and for improving N fertilization practices.
Chen, Yan ping; Pettis, Jeffery S; Zhao, Yan; Liu, Xinyue; Tallon, Luke J; Sadzewicz, Lisa D; Li, Renhua; Zheng, Huoqing; Huang, Shaokang; Zhang, Xuan; Hamilton, Michele C; Pernal, Stephen F; Melathopoulos, Andony P; Yan, Xianghe; Evans, Jay D
2013-07-05
The microsporidian parasite Nosema contributes to the steep global decline of honey bees, which are critical pollinators of food crops. Two species of Nosema have been found to infect honey bees, Nosema apis and N. ceranae. Genome sequencing of N. apis and comparative genome analysis with N. ceranae, a fully sequenced microsporidian species, reveal novel insights into host-parasite interactions underlying the parasite infections. We applied the whole-genome shotgun sequencing approach to sequence and assemble the genome of N. apis, which has an estimated size of 8.5 Mbp. We predicted 2,771 protein-coding genes and predicted the function of each putative protein using the Gene Ontology. The comparative genomic analysis led to identification of 1,356 orthologs that are conserved between the two Nosema species and genes that are unique characteristics of the individual species, thereby providing a list of virulence factors and new genetic tools for studying host-parasite interactions. We also identified a highly abundant motif in the upstream promoter regions of N. apis genes. This motif is also conserved in N. ceranae and other microsporidia species and likely plays a role in gene regulation across the microsporidia. The availability of the N. apis genome sequence is a significant addition to the rapidly expanding body of microsporidian genomic data, which has been improving our understanding of eukaryotic genome diversity and evolution in a broad sense. The predicted virulence genes and transcriptional regulatory elements are potential targets for innovative therapeutics to break down the life cycle of the parasite.
SwiSpot: modeling riboswitches by spotting out switching sequences.
Barsacchi, Marco; Novoa, Eva Maria; Kellis, Manolis; Bechini, Alessio
2016-11-01
Riboswitches are cis-regulatory elements in mRNA, mostly found in Bacteria, which exhibit two main secondary structure conformations. Although one of them prevents the gene from being expressed, the other conformation allows its expression, and this switching process is typically driven by the presence of a specific ligand. Although there are a handful of known riboswitches, our knowledge in this field has been greatly limited due to our inability to identify their alternate structures from their sequences. Indeed, current methods are not able to predict the presence of the two functionally distinct conformations just from the knowledge of the plain RNA nucleotide sequence. Whether this would be possible, for which cases, and what prediction accuracy can be achieved, are currently open questions. Here we show that the two alternate secondary structures of riboswitches can be accurately predicted once the 'switching sequence' of the riboswitch has been properly identified. The proposed SwiSpot approach is capable of identifying the switching sequence inside a putative, complete riboswitch sequence, on the basis of pairing behaviors, which are evaluated on proper sets of configurations. Moreover, it is able to model the switching behavior of riboswitches whose generated ensemble covers both alternate configurations. Beyond structural predictions, the approach can also be paired to homology-based riboswitch searches. SwiSpot software, along with the reference dataset files, is available at: http://www.iet.unipi.it/a.bechini/swispot/. Supplementary data are available at Bioinformatics online. Contact: a.bechini@ing.unipi.it.
Complete nucleotide sequence and annotation of the temperate corynephage ϕ16 genome.
Lobanova, Juliya S; Gak, Evgueni R; Andreeva, Irina G; Rybak, Konstantin V; Krylov, Alexander A; Mashko, Sergey V
2017-08-01
The complete genome of ϕ16, a temperate corynephage from Corynebacterium glutamicum ATCC 21792, was sequenced and annotated (GenBank: KY250482). The electron microscopy study of ϕ16 virion confirmed that it belongs to the family Siphoviridae. The ϕ16 genome consists of a linear double-stranded DNA molecule of 58,200 bp (G+C = 52.2%) with protruding cohesive 3'-ends of 14 nt. Four major structural proteins were separated by SDS-PAGE and identified by peptide mass fingerprinting technique. Using bioinformatics analysis, 101 putative ORFs and 5 tRNA genes were predicted. Only 27 putative gene products could be assigned to known biological functions. The ϕ16 genome was divided into functional modules. Seven putative promoters and eight putative unidirectional intrinsic terminators were predicted. One site of putative «-1» programmed ribosomal frameshifting was proposed in the phage tail assembly genome region. C. glutamicum genetic tools could be broadened by exploiting the known integrase gene (gp33) and the newly identified excisionase gene (gp47), participating in site-specific recombination between ϕ16-attP/attB.
de Castro, Minique Hilda; de Klerk, Daniel; Pienaar, Ronel; Rees, D Jasper G; Mans, Ben J
2017-08-10
Ticks secrete a diverse mixture of secretory proteins into the host to evade its immune response and facilitate blood-feeding, making secretory proteins attractive targets for the production of recombinant anti-tick vaccines. The largely neglected tick species, Rhipicephalus zambeziensis, is an efficient vector of Theileria parva in southern Africa but its available sequence information is limited. Next generation sequencing has advanced sequence availability for ticks in recent years and has assisted the characterisation of secretory proteins. This study focused on the de novo assembly and annotation of the salivary gland transcriptome of R. zambeziensis and the temporal expression of secretory protein transcripts in female and male ticks, before the onset of feeding and during early and late feeding. The sialotranscriptome of R. zambeziensis yielded 23,631 transcripts from which 13,584 non-redundant proteins were predicted. Eighty-six percent of these contained a predicted start and stop codon and were estimated to be putatively full-length proteins. A fifth (2569) of the predicted proteins were annotated as putative secretory proteins and explained 52% of the expression in the transcriptome. Expression analyses revealed that 2832 transcripts were differentially expressed among feeding time points and 1209 between the tick sexes. The expression analyses further indicated that 57% of the annotated secretory protein transcripts were differentially expressed. Dynamic expression profiles of secretory protein transcripts were observed during feeding of female ticks: a number of transcripts were upregulated during early feeding, presumably for feeding site establishment, and 52% of these were then downregulated during late feeding, indicating that transcripts were required at specific feeding stages. This suggested that secretory proteins are under stringent transcriptional regulation that fine-tunes their expression in salivary glands during feeding.
No open reading frames were predicted for 7947 transcripts. This class represented 17% of the differentially expressed transcripts, suggesting a potential transcriptional regulatory function of long non-coding RNA in tick blood-feeding. The assembled sialotranscriptome greatly expands the sequence availability of R. zambeziensis, assists in our understanding of the transcription of secretory proteins during blood-feeding and will be a valuable resource for future vaccine candidate selection.
Chassaing, Nicolas; Vigouroux, Adeline; Calvas, Patrick
2009-06-01
Microphthalmia and anophthalmia are at the severe end of the spectrum of abnormalities in ocular development. A few genes (SOX2, OTX2, RAX, and CHX10) have been implicated in isolated micro/anophthalmia, but causative mutations of these genes explain less than a quarter of these developmental defects. A specifically conserved SOX2/OTX2-mediated RAX expression regulatory sequence has recently been identified. We postulated that mutations in this sequence could lead to micro/anophthalmia, and thus we performed molecular screening of this regulatory element in patients suffering from micro/anophthalmia. Fifty-one patients suffering from nonsyndromic microphthalmia (n = 40) or anophthalmia (n = 11) were included in this study after negative molecular screening for SOX2, OTX2, RAX, and CHX10 mutations. Mutation screening of the RAX regulatory sequence was performed by direct sequencing for these patients. No mutations were identified in the highly conserved RAX regulatory sequence in any of the 51 patients. Mutations in the newly identified RAX regulatory sequence do not represent a frequent cause of nonsyndromic micro/anophthalmia.
Dover, Nir; Barash, Jason R.; Burke, Julianne N.; ...
2014-05-22
Botulinum neurotoxin (BoNT) is the most poisonous substance known and its eight toxin types (A to H) are distinguished by the inability of polyclonal antibodies that neutralize one toxin type to neutralize any of the other seven toxin types. Infant botulism, an intestinal toxemia orphan disease, is the most common form of human botulism in the United States. It results from swallowed spores of Clostridium botulinum (or rarely, neurotoxigenic Clostridium butyricum or Clostridium baratii) that germinate and temporarily colonize the lumen of the large intestine, where, as vegetative cells, they produce botulinum toxin. Botulinum neurotoxin is encoded by the bont gene that is part of a toxin gene cluster that includes several accessory genes. In this paper, we sequenced for the first time the complete botulinum neurotoxin gene cluster of nonproteolytic C. baratii type F7. Like the type E and the nonproteolytic type F6 botulinum toxin gene clusters, the C. baratii type F7 had an orfX toxin gene cluster that lacked the regulatory botR gene, which is found in proteolytic C. botulinum strains and codes for an alternative σ factor. In the absence of botR, we identified a putative alternative regulatory gene located upstream of the C. baratii type F7 toxin gene cluster. This putative regulatory gene codes for a predicted σ factor that contains DNA-binding-domain homologues to the DNA-binding domains both of BotR and of other members of the TcdR-related group 5 of the σ70 family that are involved in the regulation of toxin gene expression in clostridia. We showed that this TcdR-related protein in association with RNA polymerase core enzyme specifically binds to the C. baratii type F7 botulinum toxin gene cluster promoters. Finally, this TcdR-related protein may therefore be involved in regulating the expression of the genes of the botulinum toxin gene cluster in neurotoxigenic C. baratii.
Wang, Min; Hancock, Timothy P; Chamberlain, Amanda J; Vander Jagt, Christy J; Pryce, Jennie E; Cocks, Benjamin G; Goddard, Mike E; Hayes, Benjamin J
2018-05-24
Topological association domains (TADs) are chromosomal domains characterised by frequent internal DNA-DNA interactions. The transcription factor CTCF binds to conserved DNA sequence patterns called CTCF binding motifs to either prohibit or facilitate chromosomal interactions. TADs and CTCF binding motifs control gene expression, but they are not yet well defined in the bovine genome. In this paper, we sought to improve the annotation of bovine TADs and CTCF binding motifs, and assess whether the new annotation can reduce the search space for cis-regulatory variants. We used genomic synteny to map TADs and CTCF binding motifs from humans, mice, dogs and macaques to the bovine genome. We found that our mapped TADs exhibited the same hallmark properties of those sourced from experimental data, such as housekeeping genes, transfer RNA genes, CTCF binding motifs, short interspersed elements, H3K4me3 and H3K27ac. We showed that runs of genes with the same pattern of allele-specific expression (ASE) (either favouring paternal or maternal allele) were often located in the same TAD or between the same conserved CTCF binding motifs. Analyses of variance showed that when averaged across all bovine tissues tested, TADs explained 14% of ASE variation (standard deviation, SD: 0.056), while CTCF explained 27% (SD: 0.078). Furthermore, we showed that the quantitative trait loci (QTLs) associated with gene expression variation (eQTLs) or ASE variation (aseQTLs), which were identified from mRNA transcripts from 141 lactating cows' white blood and milk cells, were highly enriched at putative bovine CTCF binding motifs. The linearly-furthermost, and most-significant aseQTL and eQTL for each genic target were located within the same TAD as the gene more often than expected (Chi-Squared test P-value < 0.001). 
Our results suggest that genomic synteny can be used to functionally annotate conserved transcriptional components, and provides a tool to reduce the search space for causative regulatory variants in the bovine genome.
Li, Fu-Gui; Chen, Jie; Jiang, Xia-Yun; Zou, Shu-Ming
2015-01-01
The blunt snout bream (Megalobrama amblycephala) is an important freshwater aquaculture species, but it is sensitive to hypoxia. No transcriptome data related to growth and hypoxia response are available for this species. In this study, we performed de novo transcriptome sequencing for the liver and gills of the fast-growth family and slow-growth family derived from ‘Pujiang No.1’ F10 blunt snout bream that were under hypoxic stress and normoxia, respectively. The fish were divided into the following 4 groups: fast-growth family under hypoxic stress, FH; slow-growth family under hypoxic stress, SH; fast-growth family under normoxia, FN; and slow-growth family under normoxia, SN. A total of 185 million high-quality reads were obtained from the normalized cDNA of the pooled samples, which were assembled into 465,582 contigs and 237,172 transcripts. A total of 31,338 transcripts from the same locus (unigenes) were annotated and assigned to 104 functional groups, and 23,103 unigenes were classified into seven main categories, including 45 secondary KEGG pathways. A total of 22,255 (71%) known putative unigenes were found to be shared across the genomes of five model fish species and mammals, and a substantial number (9.4%) of potentially novel genes were identified. When 6,639 unigenes were used in the analysis of differential expression (DE) genes, the number of putative DE genes related to growth pathways in FH, SH, SN and FN was 159, 118, 92 and 65 in both the liver and gills, respectively, and the number of DE genes related to hypoxic response was 57, 33, 23 and 21 in FH, FN, SH and SN, respectively. Our results suggest that growth performance of the fast-growth family should be due to complex mutual gene regulatory mechanisms of these putative DE genes between growth and hypoxia. PMID:26554582
Han, Wei; Zou, Jianmin; Wang, Kehua; Su, Yijun; Zhu, Yunfen; Song, Chi; Li, Guohui; Qu, Liang; Zhang, Huiyong; Liu, Honglin
2015-01-01
Onset of rapid gonad growth is a milestone in sexual development that involves many genes and regulatory factors. Observations in model organisms and mammals, including humans, have shown a potential link between miRNAs and developmental timing. To determine whether miRNAs play roles in this process in the chicken (Gallus gallus), Solexa deep sequencing was performed to analyze the profiles of miRNA expression in the hypothalamus of hens from two different pubertal stages, before onset of the rapid gonad development (BO) and after onset of the rapid gonad development (AO). 374 conserved and 46 novel miRNAs were identified as hypothalamus-expressed miRNAs in the chicken. 144 conserved miRNAs were shown to be differentially expressed (reads > 10, P < 0.05) during the transition from BO to AO. Five differentially expressed miRNAs were validated by real-time quantitative RT-PCR (qRT-PCR). 2013 putative genes were predicted as the targets of the 15 most differentially expressed miRNAs (fold-change > 4.0, P < 0.01). Of these genes, 7 putative circadian clock genes, Per2, Bmal1/2, Clock, Cry1/2, and Star, were found to be targeted multiple times by the miRNAs. qRT-PCR revealed that the basic transcription levels of these clock genes were much higher (P < 0.01) in AO than in BO. Further functional analysis suggested that these 15 miRNAs play important roles in transcriptional regulation and signal transduction pathways. The results provide new insights into miRNA functions in timing the rapid development of chicken gonads. Considering the functional conservation of miRNAs, the results will contribute to research on puberty onset in humans.
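The two thresholds quoted in the abstract above (reads > 10 with P < 0.05, then fold-change > 4.0 with P < 0.01) can be read as a two-stage filter. The following is a minimal sketch under stated assumptions: the record type, function names, and example rows are hypothetical and are not data from the study, and treating fold-change symmetrically (up or down) is an assumption, since the abstract does not say how direction was handled.

```python
import math
from dataclasses import dataclass


@dataclass
class MirnaResult:
    """One hypothetical row of a miRNA expression comparison (BO vs. AO)."""
    name: str
    reads: int          # sequencing read count
    fold_change: float  # AO / BO expression ratio, must be positive
    p_value: float


def differentially_expressed(rows, min_reads=10, alpha=0.05):
    """First-stage filter: reads > min_reads and P < alpha."""
    return [r for r in rows if r.reads > min_reads and r.p_value < alpha]


def strongly_changed(rows, min_fold=4.0, alpha=0.01):
    """Second-stage filter: fold-change > min_fold in either direction
    (an assumption) and P < alpha."""
    cutoff = math.log2(min_fold)
    return [
        r for r in rows
        if abs(math.log2(r.fold_change)) > cutoff and r.p_value < alpha
    ]


# Made-up example rows, for illustration only.
rows = [
    MirnaResult("hypothetical-miR-1", 250, 6.0, 0.001),
    MirnaResult("hypothetical-miR-2", 40, 0.2, 0.004),
    MirnaResult("hypothetical-miR-3", 8, 9.0, 0.0005),
    MirnaResult("hypothetical-miR-4", 120, 1.5, 0.03),
]

de = differentially_expressed(rows)
top = strongly_changed(de)
print([r.name for r in de])
print([r.name for r in top])
```

Applying the stricter filter only to rows that already passed the first stage mirrors the order in which the abstract reports the two sets.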
Template Based Design of Anti-Metastatic Drugs from the Active Conformation of Laminin Peptide II
2001-01-01
p40 (LBP/p40) gene Maeda, M., Kawasaki, K., Mu, Y., Kamada, H., during sea urchin development. Exp. Cell Res. 221, Tsutsumi, Y., Smith, T. J. & Mayumi...represents the average of six replicates + SEM. minance of putative heparin-binding phage recovered from elution with peptide 11. Putative heparin...scrambled sequence peptide, WAQADSTPE, was used as a sequence specificity control. The data shown is the average of six replicate wells ± SEM. Statistics were
Sánchez, Cecilia Castaño; Smith, Timothy P L; Wiedmann, Ralph T; Vallejo, Roger L; Salem, Mohamed; Yao, Jianbo; Rexroad, Caird E
2009-11-25
To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time consuming methodology of SNP identification in rainbow trout, therefore not suitable for high throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6 fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. 
In addition, 2% of the sequences from the validated markers were associated with rainbow trout transcripts. The use of reduced representation libraries and pyrosequencing technology proved to be an effective strategy for the discovery of a high number of putative SNPs in rainbow trout; however, modifications to the technique to decrease the false discovery rate resulting from the evolutionarily recent genome duplication would be desirable.
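The coverage and validation figures quoted in the abstract above follow from simple arithmetic. The sketch below only re-derives them from the numbers given there; the variable names are illustrative and not taken from the study.

```python
# Figures quoted in the abstract (variable names are illustrative).
reads = 2_000_000         # pyrosequencing reads
fragment_ends = 300_000   # 150,000 unique restriction fragments x 2 ends
genotyped_snps = 384      # putative SNPs selected for genotyping
validated_snps = 183      # putative SNPs that validated
mapped_snps = 167         # validated SNPs placed on the linkage map

coverage = reads / fragment_ends                   # reads per fragment end
validation_rate = validated_snps / genotyped_snps  # fraction validated
mapped_fraction = mapped_snps / validated_snps     # fraction of validated SNPs mapped

print(f"reads per fragment end: {coverage:.1f}")
print(f"validation rate: {validation_rate:.1%}")
print(f"validated SNPs placed on the map: {mapped_fraction:.1%}")
```

The ratio of reads to fragment ends comes out slightly above the "average 6 fold coverage" stated in the abstract, and 183 of 384 rounds to the reported 48%.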
O’Keeffe, Triona; Hill, Colin; Ross, R. Paul
1999-01-01
Enterocin A is a small, heat-stable, antilisterial bacteriocin produced by Enterococcus faecium DPC1146. The sequence of a 10,879-bp chromosomal region containing at least 12 open reading frames (ORFs), 7 of which are predicted to play a role in enterocin biosynthesis, is presented. The genes entA, entI, and entF encode the enterocin A prepeptide, the putative immunity protein, and the induction factor prepeptide, respectively. The deduced proteins EntK and EntR resemble the histidine kinase and response regulator proteins of two-component signal transducing systems of the AgrC-AgrA type. The predicted proteins EntT and EntD are homologous to ABC (ATP-binding cassette) transporters and accessory factors, respectively, of several other bacteriocin systems and to proteins implicated in the signal-sequence-independent export of Escherichia coli hemolysin A. Immediately downstream of the entT and entD genes are two ORFs, the product of one of which, ORF4, is very similar to the product of the yteI gene of Bacillus subtilis and to E. coli protease IV, a signal peptide peptidase known to be involved in outer membrane lipoprotein export. Another potential bacteriocin is encoded in the opposite direction to the other genes in the enterocin cluster. This putative bacteriocin-like peptide is similar to LafX, one of the components of the lactacin F complex. A deletion which included one of two direct repeats upstream of the entA gene abolished enterocin A activity, immunity, and ability to induce bacteriocin production. Transposon insertion upstream of the entF gene also had the same effect, but this mutant could be complemented by exogenously supplied induction factor. The putative EntI peptide was shown to be involved in the immunity to enterocin A. 
Cloning of a 10.5-kb amplicon comprising all predicted ORFs and regulatory regions resulted in heterologous production of enterocin A and induction factor in Enterococcus faecalis, while a four-gene construct (entAITD) under the control of a constitutive promoter resulted in heterologous enterocin A production in both E. faecalis and Lactococcus lactis. PMID:10103244
Identification of flowering genes in strawberry, a perennial SD plant
Mouhu, Katriina; Hytönen, Timo; Folta, Kevin; Rantanen, Marja; Paulin, Lars; Auvinen, Petri; Elomaa, Paula
2009-01-01
Background We are studying the regulation of flowering in perennial plants by using diploid wild strawberry (Fragaria vesca L.) as a model. Wild strawberry is a facultative short-day plant with an obligatory short-day requirement at temperatures above 15°C. At lower temperatures, however, flowering induction occurs irrespective of photoperiod. In addition to short-day genotypes, everbearing forms of wild strawberry are known. In 'Baron Solemacher' recessive alleles of an unknown repressor, SEASONAL FLOWERING LOCUS (SFL), are responsible for continuous flowering habit. Although flower induction has a central effect on the cropping potential, the molecular control of flowering in strawberries has not been studied and the genetic flowering pathways are still poorly understood. The comparison of everbearing and short-day genotypes of wild strawberry could facilitate our understanding of fundamental molecular mechanisms regulating perennial growth cycle in plants. Results We have searched homologs for 118 Arabidopsis flowering time genes from Fragaria by EST sequencing and bioinformatics analysis and identified 66 gene homologs that by sequence similarity, putatively correspond to genes of all known genetic flowering pathways. The expression analysis of 25 selected genes representing various flowering pathways did not reveal large differences between the everbearing and the short-day genotypes. However, putative floral identity and floral integrator genes AP1 and LFY were co-regulated during early floral development. AP1 mRNA was specifically accumulating in the shoot apices of the everbearing genotype, indicating its usability as a marker for floral initiation. Moreover, we showed that flowering induction in everbearing 'Baron Solemacher' and 'Hawaii-4' was inhibited by short-day and low temperature, in contrast to short-day genotypes. 
Conclusion We have shown that many central genetic components of the flowering pathways in Arabidopsis can be identified from strawberry. However, novel regulatory mechanisms exist, like SFL that functions as a switch between short-day/low temperature and long-day/high temperature flowering responses between the short-day genotype and the everbearing 'Baron Solemacher'. The identification of putative flowering gene homologs and AP1 as potential marker gene for floral initiation will strongly facilitate the exploration of strawberry flowering pathways. PMID:19785732
G = MAT: linking transcription factor expression and DNA binding data.
Tretyakov, Konstantin; Laur, Sven; Vilo, Jaak
2011-01-31
Transcription factors are proteins that bind to motifs on the DNA and thus affect gene expression regulation. The qualitative description of the corresponding processes is therefore important for a better understanding of essential biological mechanisms. However, wet lab experiments targeted at the discovery of the regulatory interplay between transcription factors and binding sites are expensive. We propose a new, purely computational method for finding putative associations between transcription factors and motifs. This method is based on a linear model that combines sequence information with expression data. We present various methods for model parameter estimation and show, via experiments on simulated data, that these methods are reliable. Finally, we examine the performance of this model on biological data and conclude that it can indeed be used to discover meaningful associations. The developed software is available as a web tool and Scilab source code at http://biit.cs.ut.ee/gmat/.
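The title's equation G = MAT suggests a bilinear model in which an unknown association matrix A links motifs to transcription factors. The sketch below is a minimal illustration, not the authors' software: it assumes G is a genes x arrays expression matrix, M a genes x motifs matrix, and T a factors x arrays matrix of transcription factor expression (these shapes are assumptions, since the abstract does not define them), and it uses a plain least-squares estimate via pseudoinverses rather than any of the estimation methods the paper presents. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def estimate_association_matrix(G, M, T):
    """Least-squares estimate of A in G ≈ M @ A @ T.

    G: genes x arrays expression matrix (assumed layout)
    M: genes x motifs matrix, e.g. motif scores per promoter (assumed layout)
    T: factors x arrays transcription factor expression (assumed layout)
    Returns A with shape motifs x factors.
    """
    # The minimum-norm minimiser of ||G - M A T||_F is pinv(M) @ G @ pinv(T).
    return np.linalg.pinv(M) @ G @ np.linalg.pinv(T)


# Synthetic, noise-free data for illustration only.
rng = np.random.default_rng(0)
M = rng.random((50, 4))          # 50 genes, 4 motifs
T = rng.normal(size=(3, 20))     # 3 factors, 20 arrays
A_true = rng.normal(size=(4, 3))
G = M @ A_true @ T

A_hat = estimate_association_matrix(G, M, T)
print(np.allclose(A_hat, A_true))
```

On noise-free data with full-rank M and T the estimate recovers A_true; with real, noisy data a regularised estimator would likely be needed, which this sketch does not attempt.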
Caste development and reproduction: a genome-wide analysis of hallmarks of insect eusociality
Cristino, A S; Nunes, F M F; Lobo, C H; Bitondi, M M G; Simões, Z L P; Da Fontoura Costa, L; Lattorff, H M G; Moritz, R F A; Evans, J D; Hartfelder, K
2006-01-01
The honey bee queen and worker castes are a model system for developmental plasticity. We used established expressed sequence tag information for a Gene Ontology based annotation of genes that are differentially expressed during caste development. Metabolic regulation emerged as a major theme, with a caste-specific difference in the expression of oxidoreductases vs. hydrolases. Motif searches in upstream regions revealed group-specific motifs, providing an entry point to cis-regulatory network studies on caste genes. For genes putatively involved in reproduction, meiosis-associated factors came out as highly conserved, whereas some determinants of embryonic axes either do not have clear orthologs (bag of marbles, gurken, torso), or appear to be lacking (trunk) in the bee genome. Our results are the outcome of a first genome-based initiative to provide an annotated framework for trends in gene regulation during female caste differentiation (representing developmental plasticity) and reproduction. PMID:17069641
Legault, Boris A; Lopez-Lopez, Arantxa; Alba-Casado, Jose Carlos; Doolittle, W Ford; Bolhuis, Henk; Rodriguez-Valera, Francisco; Papke, R Thane
2006-01-01
Background Mature saturated brine (crystallizers) communities are largely dominated (>80% of cells) by the square halophilic archaeon "Haloquadratum walsbyi". The recent cultivation of the strain HBSQ001 and the sequencing of its genome allows comparison with the metagenome of this taxonomically simplified environment. Similar studies carried out in other extreme environments have revealed very little diversity in gene content among the cell lineages present. Results The metagenome of the microbial community of a crystallizer pond has been analyzed by end sequencing a 2000 clone fosmid library and comparing the sequences obtained with the genome sequence of "Haloquadratum walsbyi". The genome of the sequenced strain was retrieved nearly complete within this environmental DNA library. However, many ORFs that could be ascribed to the "Haloquadratum" metapopulation by common genome characteristics or scaffolding to the strain genome were not present in the specific sequenced isolate. Particularly, three regions of the sequenced genome were associated with multiple rearrangements and the presence of different genes from the metapopulation. Many transposition and phage related genes were found within this pool which, together with the associated atypical GC content in these areas, supports lateral gene transfer mediated by these elements as the most probable genetic cause of this variability. Additionally, these sequences were highly enriched in putative regulatory and signal transduction functions. Conclusion These results point to a large pan-genome (total gene repertoire of the genus/species) even in this highly specialized extremophile and at a single geographic location. The extensive gene repertoire is what might be expected of a population that exploits a diverse nutrient pool, resulting from the degradation of biomass produced at lower salinities. PMID:16820057
Yan, Hong-Bin; Lou, Zhong-Zi; Li, Li; Brindley, Paul J; Zheng, Yadong; Luo, Xuenong; Hou, Junling; Guo, Aijiang; Jia, Wan-Zhong; Cai, Xuepeng
2014-06-04
Cysticercosis remains a major neglected tropical disease of humanity in many regions, especially in sub-Saharan Africa, Central America and elsewhere. Owing to the emerging drug resistance and the inability of current drugs to prevent re-infection, identification of novel vaccines and chemotherapeutic agents against Taenia solium and related helminth pathogens is a public health priority. The T. solium genome and the predicted proteome were reported recently, providing a wealth of information from which new interventional targets might be identified. In order to characterize and classify the entire repertoire of protease-encoding genes of T. solium, which play fundamental biological roles in all life processes, we analyzed the predicted proteins of this cestode through a combination of bioinformatics tools. Functional annotation was performed to yield insights into the signaling processes relevant to the complex developmental cycle of this tapeworm and to highlight a suite of the proteases as potential intervention targets. Within the genome of this helminth parasite, we identified 200 open reading frames encoding proteases from five clans, which correspond to 1.68% of the 11,902 protein-encoding genes predicted to be present in its genome. These proteases include calpains, cytosolic, mitochondrial signal peptidases, ubiquitylation related proteins, and others. Many not only show significant similarity to proteases in the Conserved Domain Database but have conserved active sites and catalytic domains. KEGG Automatic Annotation Server (KAAS) analysis indicated that ~60% of these proteases share strong sequence identities with proteins of the KEGG database, which are involved in human disease, metabolic pathways, genetic information processes, cellular processes, environmental information processes and organismal systems. Also, we identified signal peptides and transmembrane helices through comparative analysis with classes of important regulatory proteases. 
Phylogenetic analysis using Bayes approach provided support for inferring functional divergence among regulatory cysteine and serine proteases. Numerous putative proteases were identified for the first time in T. solium, and important regulatory proteases have been predicted. This comprehensive analysis not only complements the growing knowledge base of proteolytic enzymes, but also provides a platform from which to expand knowledge of cestode proteases and to explore their biochemistry and potential as intervention targets.
Tao, Wenjing; Sun, Lina; Shi, Hongjuan; Cheng, Yunying; Jiang, Dongneng; Fu, Beide; Conte, Matthew A; Gammerdinger, William J; Kocher, Thomas D; Wang, Deshou
2016-05-04
MicroRNAs (miRNAs) represent a second regulatory network that has important effects on gene expression and protein translation during biological processes. However, the possible role of miRNAs in the early stages of fish sex differentiation is not well understood. In this study, we carried out an integrated analysis of miRNA and mRNA expression profiles to explore their possible regulatory patterns at the critical stage of sex differentiation in tilapia. We identified 279 pre-miRNA genes in the tilapia genome, which were highly conserved in other fish species. Based on small RNA library sequencing, we identified 635 mature miRNAs in tilapia gonads, in which 62 and 49 miRNAs showed higher expression in XX and XY gonads, respectively. The predicted targets of these sex-biased miRNAs (e.g., miR-9, miR-21, miR-30a, miR-96, miR-200b, miR-212 and miR-7977) included genes encoding key enzymes in steroidogenic pathways (Cyp11a1, Hsd3b, Cyp19a1a, Hsd11b) and key molecules involved in vertebrate sex differentiation (Foxl2, Amh, Star1, Sf1, Dmrt1, and Gsdf). These genes also showed sex-biased expression in tilapia gonads at 5 dah. Some miRNAs (e.g., miR-96 and miR-737) targeted multiple genes involved in steroid synthesis, suggesting a complex miRNA regulatory network during early sex differentiation in this fish. The sequence and expression patterns of most miRNAs in tilapia are conserved in fishes, indicating the basic functions of vertebrate miRNAs might share a common evolutionary origin. This comprehensive analysis of miRNA and mRNA at the early stage of molecular sex differentiation in tilapia XX and XY gonads led to the discovery of differentially expressed miRNAs and their putative targets, which will facilitate studies of the regulatory network of molecular sex determination and differentiation in fishes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mangelsen, Elke; Kilian, Joachim; Berendzen, Kenneth W.
2008-02-01
WRKY proteins belong to the WRKY-GCM1 superfamily of zinc finger transcription factors that have been subject to a large plant-specific diversification. For the cereal crop barley (Hordeum vulgare), three different WRKY proteins have been characterized so far, as regulators in sucrose signaling, in pathogen defense, and in response to cold and drought, respectively. However, their phylogenetic relationship remained unresolved. In this study, we used the available sequence information to identify a minimum number of 45 barley WRKY transcription factor (HvWRKY) genes. According to their structural features the HvWRKY factors were classified into the previously defined polyphyletic WRKY subgroups 1 to 3. Furthermore, we could assign putative orthologs of the HvWRKY proteins in Arabidopsis and rice. While in most cases clades of orthologous proteins were formed within each group or subgroup, other clades were composed of paralogous proteins for the grasses and Arabidopsis only, which is indicative of specific gene radiation events. To gain insight into their putative functions, we examined expression profiles of WRKY genes from publicly available microarray data resources and found group specific expression patterns. While putative orthologs of the HvWRKY transcription factors have been inferred from phylogenetic sequence analysis, we performed a comparative expression analysis of WRKY genes in Arabidopsis and barley. Indeed, highly correlative expression profiles were found between some of the putative orthologs. HvWRKY genes have not only undergone radiation in monocot or dicot species, but exhibit evolutionary traits specific to grasses. HvWRKY proteins exhibited not only sequence similarities between orthologs with Arabidopsis, but also relatedness in their expression patterns. This correlative expression is indicative for a putative conserved function of related WRKY proteins in mono- and dicot species.
McBride, David J.; Buckle, Adam; van Heyningen, Veronica; Kleinjan, Dirk A.
2011-01-01
The PAX6 gene plays a crucial role in development of the eye, brain, olfactory system and endocrine pancreas. Consistent with its pleiotropic role the gene exhibits a complex developmental expression pattern which is subject to strict spatial, temporal and quantitative regulation. Control of expression depends on a large array of cis-elements residing in an extended genomic domain around the coding region of the gene. The minimal essential region required for proper regulation of this complex locus has been defined through analysis of human aniridia-associated breakpoints and YAC transgenic rescue studies of the mouse smalleye mutant. We have carried out a systematic DNase I hypersensitive site (HS) analysis across 200 kb of this critical region of mouse chromosome 2E3 to identify putative regulatory elements. Mapping the identified HSs onto a percent identity plot (PIP) shows many HSs correspond to recognisable genomic features such as evolutionarily conserved sequences, CpG islands and retrotransposon derived repeats. We then focussed on a region previously shown to contain essential long range cis-regulatory information, the Pax6 downstream regulatory region (DRR), allowing comparison of mouse HS data with previous human HS data for this region. Reporter transgenic mice for two of the HS sites, HS5 and HS6, show that they function as tissue specific regulatory elements. In addition we have characterised enhancer activity of an ultra-conserved cis-regulatory region located near Pax6, termed E60. All three cis-elements exhibit multiple spatio-temporal activities in the embryo that overlap between themselves and other elements in the locus. Using a deletion set of YAC reporter transgenic mice we demonstrate functional interdependence of the elements. 
Finally, we use the HS6 enhancer as a marker for the migration of precerebellar neuro-epithelium cells to the hindbrain precerebellar nuclei along the posterior and anterior extramural streams allowing visualisation of migratory defects in both pathways in Pax6Sey/Sey mice. PMID:22220192
Xuxia, Wang; Jie, Chen; Bo, Wang; Lijun, Liu; Hui, Jiang; Diluo, Tang; Dingxiang, Peng
2012-01-01
For the purpose of screening putative anthracnose resistance-related genes of ramie (Boehmeria nivea L. Gaud), a cDNA library was constructed by suppression subtractive hybridization using anthracnose-resistant cultivar Huazhu no. 4. The cDNAs from Huazhu no. 4, which were infected with Colletotrichum gloeosporioides, were used as the tester and cDNAs from uninfected Huazhu no. 4 as the driver. Sequencing analysis and homology searching showed that these clones represented 132 single genes, which were assigned to functional categories, including 14 putative cellular functions, according to categories established for Arabidopsis. These 132 genes included 35 disease resistance and stress tolerance-related genes including putative heat-shock protein 90, metallothionein, PR-1.2 protein, catalase gene, WRKY family genes, and proteinase inhibitor-like protein. Partial disease-related genes were further analyzed by reverse transcription PCR and RNA gel blot. These expressed sequence tags are the first anthracnose resistance-related expressed sequence tags reported in ramie.
Putative Porin of Bradyrhizobium sp. (Lupinus) Bacteroids Induced by Glyphosate
de María, Nuria; Guevara, Ángeles; Serra, M. Teresa; García-Luque, Isabel; González-Sama, Alfonso; de Lacoba, Mario García; de Felipe, M. Rosario; Fernández-Pascual, Mercedes
2007-01-01
Application of glyphosate (N-[phosphonomethyl] glycine) to Bradyrhizobium sp. (Lupinus)-nodulated lupin plants caused modifications in the protein pattern of bacteroids. The most significant change was the presence of a 44-kDa polypeptide in bacteroids from plants treated with the higher doses of glyphosate employed (5 and 10 mM). The polypeptide has been characterized by the amino acid sequencing of its N terminus and the isolation and nucleic acid sequencing of its encoding gene. It is putatively encoded by a single gene, and the protein has been identified as a putative porin. Protein modeling revealed the existence of several domains sharing similarity to different porins, such as a transmembrane beta-barrel. The protein has been designated BLpp, for Bradyrhizobium sp. (Lupinus) putative porin, and would be the first porin described in Bradyrhizobium sp. (Lupinus). In addition, a putative conserved domain of porins has been identified which consists of 87 amino acids, located in the BLpp sequence 30 amino acids downstream of the N-terminal region. In bacteroids, mRNA of the BLpp gene shows a basal constitutive expression that increases under glyphosate treatment, and the expression of the gene is seemingly regulated at the transcriptional level. By contrast, in free-living bacteria glyphosate treatment leads to an inhibition of BLpp mRNA accumulation, indicating a different effect of glyphosate on BLpp gene expression in bacteroids and free-living bacteria. The possible role of BLpp in a metabolite interchange between Bradyrhizobium and lupin is discussed. PMID:17557843
Han, Xiaolong; Chakrabortti, Alolika; Zhu, Jindong; Liang, Zhao-Xun; Li, Jinming
2016-08-15
Aspergillus westerdijkiae produces ochratoxin A (OTA) in Aspergillus section Circumdati. It is responsible for the contamination of agricultural crops, fruits, and food commodities, as its secondary metabolite OTA poses a potential threat to animals and humans. As a member of the filamentous fungi family, its capacity for enzymatic catalysis and secondary metabolite production is valuable in industrial production and medicine. To understand the genetic factors underlying its pathogenicity, enzymatic degradation, and secondary metabolism, we analysed the whole genome of A. westerdijkiae and compared it with eight other sequenced Aspergillus species. We sequenced the complete genome of A. westerdijkiae and assembled approximately 36 Mb of its genomic DNA, in which we identified 10,861 putative protein-coding genes. We constructed a phylogenetic tree of A. westerdijkiae and eight other sequenced Aspergillus species and found that the sister group of A. westerdijkiae was the A. oryzae - A. flavus clade. By searching the associated databases, we identified 716 cytochrome P450 enzymes, 633 carbohydrate-active enzymes, and 377 proteases. By combining comparative analysis with Kyoto Encyclopaedia of Genes and Genomes (KEGG), Conserved Domains Database (CDD), and Pfam annotations, we predicted 228 potential carbohydrate-active enzymes related to plant polysaccharide degradation (PPD). We found a large number of secondary biosynthetic gene clusters, which suggested that A. westerdijkiae had a remarkable capacity to produce secondary metabolites. Furthermore, we obtained two more reliable and integrated gene sequences containing the reported portions of OTA biosynthesis and identified their respective secondary metabolite clusters. We also systematically annotated these two hybrid t1pks-nrps gene clusters involved in OTA biosynthesis. 
These two clusters were separate in the genome, and one of them encoded a couple of GH3 and AA3 enzyme genes involved in sucrose and glucose metabolism. The genomic information obtained in this study is valuable for understanding the life cycle and pathogenicity of A. westerdijkiae. We identified numerous enzyme genes that are potentially involved in host invasion and pathogenicity, and we provided a preliminary prediction for each putative secondary metabolite (SM) gene cluster. In particular, for the OTA-related SM gene clusters, we delivered their components with domain and pathway annotations. This study sets the stage for experimental verification of the biosynthetic and regulatory mechanisms of OTA and for the discovery of new secondary metabolites.
Niskanen, Einari A; Hytönen, Vesa P; Grapputo, Alessandro; Nordlund, Henri R; Kulomaa, Markku S; Laitinen, Olli H
2005-01-01
Background A chicken egg contains several biotin-binding proteins (BBPs), whose complete DNA and amino acid sequences are not known. In order to identify and characterise these genes and proteins we studied chicken cDNAs and genes available in the NCBI database and chicken genome database using the reported N-terminal amino acid sequences of chicken egg-yolk BBPs as search strings. Results Two separate hits showing significant homology for these N-terminal sequences were discovered. For one of these hits, the chromosomal location in the immediate proximity of the avidin gene family was found. Both of these hits encode proteins having high sequence similarity with avidin suggesting that chicken BBPs are paralogous to avidin family. In particular, almost all residues corresponding to biotin binding in avidin are conserved in these putative BBP proteins. One of the found DNA sequences, however, seems to encode a carboxy-terminal extension not present in avidin. Conclusion We describe here the predicted properties of the putative BBP genes and proteins. Our present observations link BBP genes together with avidin gene family and shed more light on the genetic arrangement and variability of this family. In addition, comparative modelling revealed the potential structural elements important for the functional and structural properties of the putative BBP proteins. PMID:15777476
King, Lanikea B.; Walum, Hasse; Inoue, Kiyoshi; Eyrich, Nicholas W.; Young, Larry J.
2015-01-01
Background Oxytocin (OXT) modulates several aspects of social behavior. Intranasal OXT is a leading candidate for treating social deficits in autism spectrum disorder (ASD) and common genetic variants in the human oxytocin receptor (OXTR) are associated with emotion recognition, relationship quality and ASD. Animal models have revealed that individual differences in Oxtr expression in the brain drive social behavior variation. Our understanding of how genetic variation contributes to brain OXTR expression is very limited. Methods We investigated Oxtr expression in monogamous prairie voles, which have a well characterized OXT system. We quantified brain region-specific levels of Oxtr mRNA and OXTR protein with established neuroanatomical methods. We used pyrosequencing to investigate allelic imbalance of Oxtr mRNA, a molecular signature of polymorphic genetic regulatory elements. We performed next-generation sequencing to discover variants in and near the Oxtr gene. We investigated social attachment using the partner preference test. Results Our allelic imbalance data demonstrate that genetic variants contribute to individual differences in Oxtr expression, but only in particular brain regions, including the nucleus accumbens (NAcc), where OXTR signaling facilitates social attachment. Next-generation sequencing identified one polymorphism in the Oxtr intron, near a putative cis-regulatory element, explaining 74% of the variance in striatal Oxtr expression specifically. Males homozygous for the high expressing allele display enhanced social attachment. Discussion Taken together, these findings provide convincing evidence for robust genetic influence on Oxtr expression and provide novel insights into how non-coding polymorphisms in the OXTR might influence individual differences in human social cognition and behavior. PMID:26893121
Howden, Benjamin P.; McEvoy, Christopher R. E.; Allen, David L.; Chua, Kyra; Gao, Wei; Harrison, Paul F.; Bell, Jan; Coombs, Geoffrey; Bennett-Wood, Vicki; Porter, Jessica L.; Robins-Browne, Roy; Davies, John K.; Seemann, Torsten; Stinear, Timothy P.
2011-01-01
Antimicrobial resistance in Staphylococcus aureus is a major public health threat, compounded by emergence of strains with resistance to vancomycin and daptomycin, both last line antimicrobials. Here we have performed high throughput DNA sequencing and comparative genomics for five clinical pairs of vancomycin-susceptible (VSSA) and vancomycin-intermediate ST239 S. aureus (VISA); each pair isolated before and after vancomycin treatment failure. These comparisons revealed a frequent pattern of mutation among the VISA strains within the essential walKR two-component regulatory locus involved in control of cell wall metabolism. We then conducted bi-directional allelic exchange experiments in our clinical VSSA and VISA strains and showed that single nucleotide substitutions within either walK or walR lead to co-resistance to vancomycin and daptomycin, and caused the typical cell wall thickening observed in resistant clinical isolates. Ion Torrent genome sequencing confirmed no additional regulatory mutations had been introduced into either the walR or walK VISA mutants during the allelic exchange process. However, two potential compensatory mutations were detected within putative transport genes for the walK mutant. The minimal genetic changes in either walK or walR also attenuated virulence, reduced biofilm formation, and led to consistent transcriptional changes that suggest an important role for this regulator in control of central metabolism. This study highlights the dramatic impacts of single mutations that arise during persistent S. aureus infections and demonstrates the role played by walKR to increase drug resistance, control metabolism and alter the virulence potential of this pathogen. PMID:22102812
DArT Markers Effectively Target Gene Space in the Rye Genome
Gawroński, Piotr; Pawełkowicz, Magdalena; Tofil, Katarzyna; Uszyński, Grzegorz; Sharifova, Saida; Ahluwalia, Shivaksh; Tyrka, Mirosław; Wędzony, Maria; Kilian, Andrzej; Bolibok-Brągoszewska, Hanna
2016-01-01
Large genome size and complexity considerably hamper genomics research in the species concerned. Rye (Secale cereale L.) has one of the largest genomes among cereal crops and repetitive sequences account for over 90% of its length. Diversity Arrays Technology is a high-throughput genotyping method, in which a preferential sampling of gene-rich regions is achieved through the use of methylation sensitive restriction enzymes. We obtained sequences of 6,177 rye DArT markers and, following a redundancy analysis, assembled them into 3,737 non-redundant sequences, which were then used in homology searches against five Pooideae sequence sets. In total, 515 DArT sequences could be incorporated into publicly available rye genome zippers, providing a starting point for the integration of DArT- and transcript-based genomics resources in rye. Using the Blast2Go pipeline we attributed putative gene functions to 1101 (29.4%) of the non-redundant DArT marker sequences, including 132 sequences with putative disease resistance-related functions, which were found to be preferentially located in the 4RL and 6RL chromosomes. Comparative analysis based on the DArT sequences revealed obvious inconsistencies between two recently published high density consensus maps of rye. Furthermore, we demonstrated that DArT marker sequences can be a source of SSR polymorphisms. The data obtained demonstrate that DArT markers effectively target gene space in the large, complex, and repetitive rye genome. Through the annotation of putative gene functions and the alignment of DArT sequences relative to reference genomes, we obtained information that will complement the results of studies in which DArT genotyping was deployed, by simplifying the gene ontology- and microcolinearity-based identification of candidate genes. PMID:27833625
Romay, Gustavo; Chirinos, Dorys T; Geraud-Pouey, Francis; Gillis, Annika; Mahillon, Jacques; Bragard, Claude
2018-02-01
At least six begomovirus species have been reported infecting tomato in Venezuela. In this study the complete genomes of two tomato-infecting begomovirus isolates (referred to as Trujillo-427 and Zulia-1084) were cloned and sequenced. Both isolates showed the typical genome organization of New World bipartite begomoviruses, with DNA-A genomic components displaying 88.8% and 90.3% similarity with established begomoviruses, for isolates Trujillo-427 and Zulia-1084, respectively. In accordance to the guidelines for begomovirus species demarcation, the Trujillo-427 isolate represents a putative new species and the name "Tomato wrinkled mosaic virus" is proposed. Meanwhile, Zulia-1084 represents a putative new strain classifiable within species Tomato chlorotic leaf distortion virus, for which a recombinant origin is suggested.
Xin, Min; Zhang, Peipei; Liu, Wenwen; Ren, Yingdang; Cao, Mengji; Wang, Xifeng
2017-10-01
The complete nucleotide sequence of a novel positive single-stranded (+ss) RNA virus, tentatively named watermelon virus A (WVA), was determined using a combination of three methods: RNA sequencing, small RNA sequencing, and Sanger sequencing. The full genome of WVA is comprised of 8,372 nucleotides (nt), excluding the poly (A) tail, and contains four open reading frames (ORFs). The largest ORF, ORF1 encodes a putative replication-associated polyprotein (RP) with three conserved domains. ORF2 and ORF4 encode a movement protein (MP) and coat protein (CP), respectively. The putative product encoded by ORF3, of an estimated molecular mass of 25 kDa, has no significant similarity with other proteins. Identity and phylogenetic analysis indicate that WVA is a new virus, closely related to members of the family Betaflexiviridae. However, the final taxonomic allocation of WVA within the family is yet to be determined.
Gamo, F J; Lafuente, M J; Casamayor, A; Ariño, J; Aldea, M; Casas, C; Herrero, E; Gancedo, C
1996-06-15
We report the sequence of a 15.5 kb DNA segment located near the left telomere of chromosome XV of Saccharomyces cerevisiae. The sequence contains nine open reading frames (ORFs) longer than 300 bp. Three of them are internal to other ones. One corresponds to the gene LGT3 that encodes a putative sugar transporter. Three adjacent ORFs were separated by two stop codons in frame. These ORFs presented homology with the gene CPS1 that encodes carboxypeptidase S. The stop codons were not found in the same sequence derived from another yeast strain. Two other ORFs without significant homology in databases were also found. One of them, O0420, is very rich in serine and threonine and presents a series of repeated or similar amino acid stretches along the sequence.
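The ORF count in the record above depends on a length cutoff (300 bp). As a rough illustration only, the following Python sketch scans all six reading frames for ATG-to-stop ORFs above a minimum length. It is not the method the authors used, which the abstract does not describe; the function names, the standard stop codons, and the choice to start each ORF at the first ATG after a stop are all assumptions made for this example.

```python
from typing import List, Tuple

STOPS = {"TAA", "TAG", "TGA"}               # standard stop codons (assumed)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_len: int = 300) -> List[Tuple[str, int, int]]:
    """Return (strand, start, end) for ATG..stop ORFs of at least min_len bp.

    Coordinates are 0-based, end-exclusive, and relative to the strand
    that was scanned. Only the first ATG after each stop is used, so ORFs
    nested in the same frame are not reported separately.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    if i + 3 - start >= min_len:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
    print(find_orfs(toy, min_len=9))
```

ORFs that lie inside another ORF but in a different frame or on the other strand are still reported, which is one way "internal" ORFs like those mentioned in the abstract can arise.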
Mapping of aldose reductase gene sequences to human chromosomes 1, 3, 7, 9, 11, and 13
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bateman, J.B.; Kojis, T.; Heinzmann, C.
1993-09-01
Aldose reductase (alditol:NAD(P)+ 1-oxidoreductase; EC 1.1.1.21) (AR) catalyzes the reduction of several aldehydes, including that of glucose, to the corresponding sugar alcohol. Using a complementary DNA clone encoding human AR, the authors mapped the gene sequences to human chromosomes 1, 3, 7, 9, 11, 13, 14, and 18 by somatic cell hybridization. By in situ hybridization analysis, sequences were localized to human chromosomes 1q32-q43, 3p12, 7q31-q35, 9q22, 11p14-p15, and 13q14-q21. As a putative functional AR gene has been mapped to chromosome 7 and a putative pseudogene to chromosome 3, the sequences on the other seven chromosomes may represent other active genes, non-aldose reductase homologous sequences, or pseudogenes. 24 refs., 3 figs., 2 tabs.
Stone, David M; Kerr, Rose C; Hughes, Margaret; Radford, Alan D; Darby, Alistair C
2013-11-01
The complete coding sequences were determined for four putative vesiculoviruses isolated from fish. Sequence alignment and phylogenetic analysis based on the predicted amino acid sequences of the five main proteins assigned tench rhabdovirus and grass carp rhabdovirus together with spring viraemia of carp and pike fry rhabdovirus to a lineage that was distinct from the mammalian vesiculoviruses. Perch rhabdovirus, eel virus European X, lake trout rhabdovirus 903/87 and sea trout virus were placed in a second lineage that was also distinct from the recognised genera in the family Rhabdoviridae. Establishment of two new rhabdovirus genera, "Perhabdovirus" and "Sprivivirus", is discussed.
Biotype-specific tcpA genes in Vibrio cholerae.
Iredell, J R; Manning, P A
1994-08-01
The tcpA gene, encoding the structural subunit of the toxin-coregulated pilus, has been isolated from a variety of clinical isolates of Vibrio cholerae, and the nucleotide sequence determined. Strict biotype-specific conservation within both the coding and putative regulatory regions was observed, with important differences between the El Tor and classical biotypes. V. cholerae O139 Bengal strains appear to have El Tor-type tcpA genes. Environmental O1 and non-O1 isolates have sequences that bind an El Tor-specific tcpA DNA probe and that are weakly and variably amplified by tcpA-specific polymerase chain reaction primers, under conditions of reduced stringency. The data presented allow the selection of primer pairs to help distinguish between clinical and environmental isolates, and to distinguish El Tor (and Bengal) biotypes from classical biotypes of V. cholerae. While the role of TcpA in cholera vaccine preparations remains unclear, the data strongly suggest that TcpA-containing vaccines directed at O1 strains need include only the two forms of TcpA, and that such vaccines directed at (O139) Bengal strains should include the TcpA of El Tor biotype.
Comparative analysis of prophages in Streptococcus mutans genomes
Fu, Tiwei; Fan, Xiangyu; Long, Quanxin; Deng, Wanyan; Song, Jinlin
2017-01-01
Prophages have been considered genetic units that have an intimate association with novel phenotypic properties of bacterial hosts, such as pathogenicity and genomic variation. Little is known about the genetic information of prophages in the genome of Streptococcus mutans, a major pathogen of human dental caries. In this study, we identified 35 prophage-like elements in S. mutans genomes and performed a comparative genomic analysis. Comparative genomic and phylogenetic analyses of prophage sequences revealed that the prophages could be classified into three main large clusters: Cluster A, Cluster B, and Cluster C. The S. mutans prophages in each cluster were compared. The genomic sequences of phismuN66-1, phismuNLML9-1, and phismu24-1 all shared similarities with the previously reported S. mutans phages M102, M102AD, and ϕAPCM01. The genomes were organized into seven major gene clusters according to the putative functions of the predicted open reading frames: packaging and structural modules, integrase, host lysis modules, DNA replication/recombination modules, transcriptional regulatory modules, other protein modules, and hypothetical protein modules. Moreover, an integrase gene was only identified in phismuNLML9-1 prophages. PMID:29158986
GBshape: a genome browser database for DNA shape annotations
Chiu, Tsu-Pei; Yang, Lin; Zhou, Tianyin; Main, Bradley J.; Parker, Stephen C.J.; Nuzhdin, Sergey V.; Tullius, Thomas D.; Rohs, Remo
2015-01-01
Many regulatory mechanisms require a high degree of specificity in protein-DNA binding. Nucleotide sequence does not provide an answer to the question of why a protein binds only to a small subset of the many putative binding sites in the genome that share the same core motif. Whereas higher-order effects, such as chromatin accessibility, cooperativity and cofactors, have been described, DNA shape recently gained attention as another feature that fine-tunes the DNA binding specificities of some transcription factor families. Our Genome Browser for DNA shape annotations (GBshape; freely available at http://rohslab.cmb.usc.edu/GBshape/) provides minor groove width, propeller twist, roll, helix twist and hydroxyl radical cleavage predictions for the entire genomes of 94 organisms. Additional genomes can easily be added using the GBshape framework. GBshape can be used to visualize DNA shape annotations qualitatively in a genome browser track format, and to download quantitative values of DNA shape features as a function of genomic position at nucleotide resolution. As biological applications, we illustrate the periodicity of DNA shape features that are present in nucleosome-occupied sequences from human, fly and worm, and we demonstrate structural similarities between transcription start sites in the genomes of four Drosophila species. PMID:25326329
Gbadegesin, M A; Beeching, J R
2011-06-07
Cassava can be cultivated on impoverished soils with minimum inputs, and its storage roots are a staple food for millions in Africa. However, these roots are low in bioavailable nutrients and in protein content, contain cyanogenic glycosides, and suffer from a very short post-harvest shelf-life, and the plant is susceptible to viral and bacterial diseases prevalent in Africa. The demand for improvement of cassava with respect to these traits comes from both farmers and national agricultural institutions. Genetic improvement of cassava cultivars by molecular biology techniques requires the availability of appropriate genes, a system to introduce these genes into cassava, and the use of suitable gene promoters. Cassava root-specific promoter for auxin-repressed protein was isolated using the gene walking approach, starting with a cDNA sequence. In silico analysis of promoter sequences revealed putative cis-acting regulatory elements, including root-specific elements, which may be required for gene expression in vascular tissues. Research on the activities of this promoter is continuing, with the development of plant expression cassettes for transformation into major African elite lines and farmers' preferred cassava cultivars to enable testing of tissue-specific expression patterns in the field.
Alu-derived cis-element regulates tumorigenesis-dependent gastric expression of GASDERMIN B (GSDMB).
Komiyama, Hiromitsu; Aoki, Aya; Tanaka, Shigekazu; Maekawa, Hiroshi; Kato, Yoriko; Wada, Ryo; Maekawa, Takeo; Tamura, Masaru; Shiroishi, Toshihiko
2010-02-01
GASDERMIN B (GSDMB) belongs to the novel gene family GASDERMIN (GSDM). All GSDM family members are located in amplicons, genomic regions often amplified during cancer development. Given that GSDMB is highly expressed in cancerous cells and the locus resides in an amplicon, GSDMB may be involved in cancer development and/or progression. However, only limited information is available on GSDMB expression in tissues, normal and cancerous, from cancer patients. Furthermore, the molecular mechanisms that regulate GSDMB expression in gastric tissues are poorly understood. We investigated the spatiotemporal expression patterns of GSDMB in gastric cancer patients and the 5' regulatory sequences upstream of GSDMB. GSDMB was not expressed in the majority of normal gastric-tissue samples, and the expression level was very low in the few normal samples with GSDMB expression. Most pre-cancer samples showed moderate GSDMB expression, and most cancerous samples showed augmented GSDMB expression. Analysis of genome sequences revealed that an Alu element resides in the 5' region upstream of GSDMB. Reporter assays using intact, deleted, and mutated Alu elements clearly showed that this Alu element positively regulates GSDMB expression and that a putative IKZF binding motif in this element is crucial to upregulate GSDMB expression.
RNAmutants: a web server to explore the mutational landscape of RNA secondary structures
Waldispühl, Jerome; Devadas, Srinivas; Berger, Bonnie; Clote, Peter
2009-01-01
The history and mechanism of molecular evolution in DNA have been greatly elucidated by contributions from genetics, probability theory and bioinformatics; indeed, mathematical developments such as Kimura's neutral theory, Kingman's coalescent theory and efficient software such as BLAST, ClustalW, Phylip, etc., provide the foundation for modern population genetics. In contrast to DNA, the function of most noncoding RNA depends on tertiary structure, experimentally known to be largely determined by secondary structure, for which dynamic programming can efficiently compute the minimum free energy secondary structure. For this reason, understanding the effect of pointwise mutations in RNA secondary structure could reveal fundamental properties of structural RNA molecules and improve our understanding of molecular evolution of RNA. The web server RNAmutants provides several efficient tools to compute the ensemble of low-energy secondary structures for all k-mutants of a given RNA sequence, where k is bounded by a user-specified upper bound. As we have previously shown, these tools can be used to predict putative deleterious mutations and to analyze regulatory sequences from the hepatitis C and human immunodeficiency genomes. The web server is available at http://bioinformatics.bc.edu/clotelab/RNAmutants/, and downloadable binaries at http://rnamutants.csail.mit.edu/. PMID:19531740
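To make the notion of "k-mutants" concrete, the sketch below enumerates every sequence at Hamming distance exactly k from a given RNA and counts them with the closed form C(n, k)·3^k. It shows only how large the search space is. It is not the RNAmutants algorithm and computes no structures or energies; the function names are hypothetical.

```python
from itertools import combinations, product
from math import comb


def k_mutants(seq: str, k: int, alphabet: str = "ACGU"):
    """Yield every sequence at Hamming distance exactly k from seq."""
    for positions in combinations(range(len(seq)), k):
        # At each chosen position, allow every base except the original one.
        choices = [[b for b in alphabet if b != seq[p]] for p in positions]
        for subs in product(*choices):
            mutant = list(seq)
            for p, b in zip(positions, subs):
                mutant[p] = b
            yield "".join(mutant)


def count_k_mutants(n: int, k: int, alphabet_size: int = 4) -> int:
    """Number of k-mutants of a length-n sequence: C(n, k) * (a - 1)^k."""
    return comb(n, k) * (alphabet_size - 1) ** k


if __name__ == "__main__":
    print(sorted(k_mutants("GAU", 1)))
    print(count_k_mutants(100, 3))
```

Because the count grows combinatorially with k, brute-force enumeration of this kind is only feasible for short sequences and small k.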
Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo
2003-01-01
To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
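The percentages in this abstract can be checked against the counts it reports. The short Python sketch below recomputes the redundancy implied by going from 43,141 assembled transcripts to 33,620 putative unique genes, and the fraction of assemblies containing a full-length clone. The variable names are illustrative; the numbers are taken from the abstract.

```python
# Counts reported in the abstract above.
transcripts = 43_141     # assembled putative transcripts
unique_genes = 33_620    # putative unique genes after reassembly
full_length = 14_409     # assemblies with at least one full-length clone

# Redundancy: share of assemblies that collapsed on reassembly.
redundancy = (transcripts - unique_genes) / transcripts

# Share of assemblies containing a full-length insert.
full_length_fraction = full_length / transcripts

print(f"redundancy: {redundancy:.1%}")
print(f"full-length fraction: {full_length_fraction:.1%}")
```

Both values agree with the rounded figures quoted in the abstract (22% and 33%).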
Li, De-Zhu; Guo, Zhen-Hua
2012-01-01
Background Transcriptome sequencing can be used to determine gene sequences and transcript abundance in non-model species, and the advent of next-generation sequencing (NGS) technologies has greatly decreased the cost and time required for this process. Transcriptome data are especially desirable in bamboo species, as certain members constitute an economically and culturally important group of mostly semelparous plants with remarkable flowering features, yet little bamboo genomic research has been performed. Here we present, for the first time, extensive sequence and transcript abundance data for the floral transcriptome of a key bamboo species, Dendrocalamus latiflorus, obtained using the Illumina GAII sequencing platform. Our further goal was to identify patterns of gene expression during bamboo flower development. Results Approximately 96 million sequencing reads were generated and assembled de novo, yielding 146,395 high quality unigenes with an average length of 461 bp. Of these, 80,418 were identified as putative homologs of annotated sequences in the public protein databases, of which 290 were associated with the floral transition and 47 were related to flower development. Digital abundance analysis identified 26,529 transcripts differentially enriched between two developmental stages, young flower buds and older developing flowers. Unigenes found at each stage were categorized according to their putative functional categories. These sequence and putative function data comprise a resource for future investigation of the floral transition and flower development in bamboo species. Conclusions Our results present the first broad survey of a bamboo floral transcriptome. Although it will be necessary to validate the functions carried out by these genes, these results represent a starting point for future functional research on D. latiflorus and related species. PMID:22916120
RSAT 2015: Regulatory Sequence Analysis Tools
Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A.; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M.; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques
2015-01-01
RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632
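One of the tasks listed above is extracting sequences from a list of coordinates. The sketch below shows the general idea in plain Python, assuming BED-style input (0-based, half-open intervals) and a genome held in memory as a dictionary. It is not RSAT's fetch-sequences program and does not reproduce its interface; the function name and data layout are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def fetch_sequences(genome: Dict[str, str],
                    bed_lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Slice sequences out of an in-memory genome using BED-style lines.

    Each line is 'chrom<whitespace>start<whitespace>end', with 0-based,
    end-exclusive coordinates (an assumption of this sketch). Blank lines
    and lines starting with '#' are skipped.
    """
    results = []
    for line in bed_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end = line.split()[:3]
        start, end = int(start), int(end)
        results.append((f"{chrom}:{start}-{end}", genome[chrom][start:end]))
    return results


if __name__ == "__main__":
    toy_genome = {"chr1": "ACGTACGTAC"}
    print(fetch_sequences(toy_genome, ["# toy region", "chr1\t2\t6"]))
```

A real pipeline would read the genome from an indexed FASTA file rather than a dictionary, and would handle strand and out-of-range coordinates.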
Schneider, Ralf F; Li, Yuanhao; Meyer, Axel; Gunter, Helen M
2014-09-01
Phenotypic plasticity is the ability of organisms with a given genotype to develop different phenotypes according to environmental stimuli, resulting in individuals that are better adapted to local conditions. In spite of their ecological importance, the developmental regulatory networks underlying plastic phenotypes often remain uncharacterized. We examined the regulatory basis of diet-induced plasticity in the lower pharyngeal jaw (LPJ) of the cichlid fish Astatoreochromis alluaudi, a model species in the study of adaptive plasticity. Through raising juvenile A. alluaudi on either a hard or soft diet (hard-shelled or pulverized snails) for between 1 and 8 months, we gained insight into the temporal regulation of 19 previously identified candidate genes during the early stages of plasticity development. Plasticity in LPJ morphology was first detected between 3 and 5 months of diet treatment. The candidate genes, belonging to various functional categories, displayed dynamic expression patterns that consistently preceded the onset of morphological divergence and putatively contribute to the initiation of the plastic phenotypes. Within functional categories, we observed striking co-expression, and transcription factor binding site analysis was used to examine the prospective basis of their coregulation. We propose a regulatory network of LPJ plasticity in cichlids, presenting evidence for regulatory crosstalk between bone and muscle tissues, which putatively facilitates the development of this highly integrated trait. Through incorporating a developmental time-course into a phenotypic plasticity study, we have identified an interconnected, environmentally responsive regulatory network that shapes the development of plasticity in a key innovation of East African cichlids. © 2014 John Wiley & Sons Ltd.
Nucleotide sequence of the gene encoding the nitrogenase iron protein of Thiobacillus ferrooxidans
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pretorius, I.M.; Rawlings, D.E.; O'Neill, E.G.
1987-01-01
The DNA sequence was determined for the cloned Thiobacillus ferrooxidans nifH and part of the nifD genes. The DNA chains were radiolabeled with [α-³²P]dCTP (3000 Ci/mmol) or [α-³⁵S]dCTP (400 Ci/mmol). A putative T. ferrooxidans nifH promoter was identified whose sequences showed perfect consensus with those of the Klebsiella pneumoniae nif promoter. Two putative consensus upstream activator sequences were also identified. The amino acid sequence was deduced from the DNA sequence. In a comparison of nifH DNA sequences from T. ferrooxidans and eight other nitrogen-fixing microbes, a Rhizobium sp. isolated from Parasponia andersonii showed the greatest homology (74%) and Clostridium pasteurianum (nifH1) showed the least homology (54%). In the comparison of the amino acid sequences of the Fe proteins, the Rhizobium sp. and Rhizobium japonicum showed the greatest homology (both 86%) and C. pasteurianum (nifH1 gene product) demonstrated the least homology (56%) to the T. ferrooxidans Fe protein.
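The "homology" percentages above appear to be pairwise similarity scores between aligned sequences. The abstract does not say how they were calculated, so the sketch below shows only one common definition: percent identity over aligned columns in which neither sequence has a gap. The function name and the gap handling are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns where either sequence has a gap ('-') are ignored, which is
    one of several conventions in use; other conventions give different
    percentages for the same alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(percent_identity("ACGT-A", "ACCTGA"))
```

The function works for nucleotide or amino acid alignments alike, since it only compares characters column by column.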
EphB4 localises to the nucleus of prostate cancer cells
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mertens-Walker, Inga, E-mail: inga.mertenswalker@qut.edu.au; Australian Prostate Cancer Research Centre—Queensland, Translational Research Institute, 37 Kent Street, Woolloongabba 4102, QLD; Lisle, Jessica E.
2015-04-10
The EphB4 receptor tyrosine kinase is over-expressed in a variety of different epithelial cancers including prostate where it has been shown to be involved in survival, migration and angiogenesis. We report here that EphB4 also resides in the nucleus of prostate cancer cell lines. We used in silico methods to identify a bipartite nuclear localisation signal (NLS) in the extracellular domain and a monopartite NLS sequence in the intracellular kinase domain of EphB4. To determine whether both putative NLS sequences were functional, fragments of the EphB4 sequence containing each NLS were cloned to create EphB4NLS-GFP fusion proteins. Localisation of both NLS-GFP proteins to the nuclei of transfected cells was observed, demonstrating that EphB4 contains two functional NLS sequences. Mutation of the key amino residues in both NLS sequences resulted in diminished nuclear accumulation. As nuclear translocation is often dependent on importins we confirmed that EphB4 and importin-α can interact. To assess if nuclear EphB4 could be implicated in gene regulatory functions potential EphB4-binding genomic loci were identified using chromatin immunoprecipitation and Lef1 was confirmed as a potential target of EphB4-mediated gene regulation. These novel findings add further complexity to the biology of this important cancer-associated receptor. Highlights: • The EphB4 protein can be found in the nucleus of prostate cancer cell lines. • EphB4 contains two functional nuclear localisation signals. • Chromatin immunoprecipitation has identified potential genome sequences to which EphB4 binds. • Lef1 is a confirmed target for EphB4-mediated gene regulation.
PuTmiR: A database for extracting neighboring transcription factors of human microRNAs
2010-01-01
Background Some of the recent investigations in systems biology have revealed the existence of a complex regulatory network between genes, microRNAs (miRNAs) and transcription factors (TFs). In this paper, we focus on TF to miRNA regulation and provide a novel interface for extracting the list of putative TFs for human miRNAs. A putative TF of an miRNA is defined here as one that binds within the close genomic locality of that miRNA with respect to its starting or ending base pair on the chromosome. Recent studies suggest that these putative TFs are possible regulators of those miRNAs. Description The interface is built around two datasets that consist of the exhaustive lists of putative TFs binding respectively in the 10 kb upstream region (USR) and downstream region (DSR) of human miRNAs. A web server, named PuTmiR, was designed. It provides an option for extracting the putative TFs for human miRNAs, as required by the user, based on genomic locality, i.e., any upstream or downstream region of interest less than 10 kb. The degree distributions of the number of putative TFs and miRNAs against each other for the 10 kb USR and DSR are analyzed from the data and reveal some interesting results. We also report a significant regulatory activity of the YY1 protein over a set of oncomiRNAs related to colon cancer. Conclusion The interface provided by the PuTmiR web server is an important resource for analyzing the direct and indirect regulation of human miRNAs. While it is already an established fact that miRNAs are regulated by TFs binding to their USR, this database might help to study whether an miRNA can also be regulated by the TFs binding to its DSR. PMID:20398296
Transcriptional analysis of the bglP gene from Streptococcus mutans.
Cote, Christopher K; Honeyman, Allen L
2006-04-21
An open reading frame encoding a putative antiterminator protein, LicT, was identified in the genomic sequence of Streptococcus mutans. A potential ribonucleic antitermination (RAT) site to which the LicT protein would potentially bind has been identified immediately adjacent to this open reading frame. The licT gene and RAT site are both located 5' to a beta-glucoside PTS regulon previously described in S. mutans that is responsible for esculin utilization in the presence of glucose. It was hypothesized that antitermination is the regulatory mechanism that is responsible for the control of the bglP gene expression, which encodes an esculin-specific PTS enzyme II. To localize the promoter activity associated with the bglP locus, a series of transcriptional lacZ gene fusions was formed on a reporter shuttle vector using various DNA fragments from the bglP promoter region. Subsequent beta-galactosidase assays in S. mutans localized the bglP promoter region and identified putative -35 and -10 promoter elements. Primer extension analysis identified the bglP transcriptional start site. In addition, a terminated bglP transcript formed by transcriptional termination was identified via transcript mapping experiments. The physical location of these genetic elements, the RAT site and the promoter regions, and the identification of a short terminated mRNA support the hypothesis that antitermination regulates the bglP transcript.
Transcriptional analysis of the bglP gene from Streptococcus mutans
Cote, Christopher K; Honeyman, Allen L
2006-01-01
Background An open reading frame encoding a putative antiterminator protein, LicT, was identified in the genomic sequence of Streptococcus mutans. A potential ribonucleic antitermination (RAT) site to which the LicT protein would potentially bind has been identified immediately adjacent to this open reading frame. The licT gene and RAT site are both located 5' to a beta-glucoside PTS regulon previously described in S. mutans that is responsible for esculin utilization in the presence of glucose. It was hypothesized that antitermination is the regulatory mechanism that is responsible for the control of the bglP gene expression, which encodes an esculin-specific PTS enzyme II. Results To localize the promoter activity associated with the bglP locus, a series of transcriptional lacZ gene fusions was formed on a reporter shuttle vector using various DNA fragments from the bglP promoter region. Subsequent beta-galactosidase assays in S. mutans localized the bglP promoter region and identified putative -35 and -10 promoter elements. Primer extension analysis identified the bglP transcriptional start site. In addition, a terminated bglP transcript formed by transcriptional termination was identified via transcript mapping experiments. Conclusion The physical location of these genetic elements, the RAT site and the promoter regions, and the identification of a short terminated mRNA support the hypothesis that antitermination regulates the bglP transcript. PMID:16630357
Doblas, Verónica G; Amorim-Silva, Vítor; Posé, David; Rosado, Abel; Esteban, Alicia; Arró, Montserrat; Azevedo, Herlander; Bombarely, Aureliano; Borsani, Omar; Valpuesta, Victoriano; Ferrer, Albert; Tavares, Rui M; Botella, Miguel A
2013-02-01
The 3-hydroxy-3-methylglutaryl-CoA reductase (HMGR) enzyme catalyzes the major rate-limiting step of the mevalonic acid (MVA) pathway from which sterols and other isoprenoids are synthesized. In contrast with our extensive knowledge of the regulation of HMGR in yeast and animals, little is known about this process in plants. To identify regulatory components of the MVA pathway in plants, we performed a genetic screen for second-site suppressor mutations of the Arabidopsis thaliana highly drought-sensitive drought hypersensitive2 (dry2) mutant that shows decreased squalene epoxidase activity. We show that mutations in SUPPRESSOR OF DRY2 DEFECTS1 (SUD1) gene recover most developmental defects in dry2 through changes in HMGR activity. SUD1 encodes a putative E3 ubiquitin ligase that shows sequence and structural similarity to yeast Degradation of α factor (Doα10) and human TEB4, components of the endoplasmic reticulum-associated degradation C (ERAD-C) pathway. While in yeast and animals, the alternative ERAD-L/ERAD-M pathway regulates HMGR activity by controlling protein stability, SUD1 regulates HMGR activity without apparent changes in protein content. These results highlight similarities, as well as important mechanistic differences, among the components involved in HMGR regulation in plants, yeast, and animals.
Różycka, Mirosława; Wojtas, Magdalena; Jakób, Michał; Stigloher, Christian; Grzeszkowiak, Mikołaj; Mazur, Maciej; Ożyhar, Andrzej
2014-01-01
Fish otoliths, biominerals composed of calcium carbonate with a small amount of organic matrix, are involved in the functioning of the inner ear. Starmaker (Stm) from zebrafish (Danio rerio) was the first protein found to be capable of controlling the formation of otoliths. Recently, a gene was identified encoding the Starmaker-like (Stm-l) protein from medaka (Oryzias latipes), a putative homologue of Stm and human dentine sialophosphoprotein. Although there is no sequence similarity between Stm-l and Stm, Stm-l was suggested to be involved in the biomineralization of otoliths, as had been observed for Stm even before. The molecular properties and functioning of Stm-l as a putative regulatory protein in otolith formation have not been characterized yet. A comprehensive biochemical and biophysical analysis of recombinant Stm-l, along with in silico examinations, indicated that Stm-l exhibits properties of a coil-like intrinsically disordered protein. Stm-l possesses an elongated and pliable structure that is able to adopt a more ordered and rigid conformation under the influence of different factors. An in vitro assay of the biomineralization activity of Stm-l indicated that Stm-l affected the size, shape and number of calcium carbonate crystals. The functional significance of intrinsically disordered properties of Stm-l and the possible role of this protein in controlling the formation of calcium carbonate crystals is discussed. PMID:25490041
Doblas, Verónica G.; Amorim-Silva, Vítor; Posé, David; Rosado, Abel; Esteban, Alicia; Arró, Montserrat; Azevedo, Herlander; Bombarely, Aureliano; Borsani, Omar; Valpuesta, Victoriano; Ferrer, Albert; Tavares, Rui M.; Botella, Miguel A.
2013-01-01
The 3-hydroxy-3-methylglutaryl-CoA reductase (HMGR) enzyme catalyzes the major rate-limiting step of the mevalonic acid (MVA) pathway from which sterols and other isoprenoids are synthesized. In contrast with our extensive knowledge of the regulation of HMGR in yeast and animals, little is known about this process in plants. To identify regulatory components of the MVA pathway in plants, we performed a genetic screen for second-site suppressor mutations of the Arabidopsis thaliana highly drought-sensitive drought hypersensitive2 (dry2) mutant that shows decreased squalene epoxidase activity. We show that mutations in SUPPRESSOR OF DRY2 DEFECTS1 (SUD1) gene recover most developmental defects in dry2 through changes in HMGR activity. SUD1 encodes a putative E3 ubiquitin ligase that shows sequence and structural similarity to yeast Degradation of α factor (Doα10) and human TEB4, components of the endoplasmic reticulum–associated degradation C (ERAD-C) pathway. While in yeast and animals, the alternative ERAD-L/ERAD-M pathway regulates HMGR activity by controlling protein stability, SUD1 regulates HMGR activity without apparent changes in protein content. These results highlight similarities, as well as important mechanistic differences, among the components involved in HMGR regulation in plants, yeast, and animals. PMID:23404890
Zhong, Y D; Sun, X Y; Liu, E Y; Li, Y Q; Gao, Z; Yu, F X
2016-06-24
Liriodendron hybrids (Liriodendron chinense x L. tulipifera) are important landscaping and afforestation hardwood trees. To date, little genomic research on adventitious rooting has been reported in these hybrids, as well as in the genus Liriodendron. In the present study, we used adventitious roots to construct the first cDNA library for Liriodendron hybrids. A total of 5176 expressed sequence tags (ESTs) were generated and clustered into 2921 unigenes. Among these unigenes, 2547 had significant homology to the non-redundant protein database representing a wide variety of putative functions. Homologs of these genes regulated many aspects of adventitious rooting, including those for auxin signal transduction and root hair development. Results of quantitative real-time polymerase chain reaction showed that AUX1, IRE, and FB1 were highly expressed in adventitious roots and the expression of AUX1, ARF1, NAC1, RHD1, and IRE increased during the development of adventitious roots. Additionally, 181 simple sequence repeats were identified from 166 ESTs and more than 91.16% of these were dinucleotide and trinucleotide repeats. To the best of our knowledge, the present study reports the identification of the genes associated with adventitious rooting in the genus Liriodendron for the first time and provides a valuable resource for future genomic studies. Expression analysis of selected genes could allow us to identify regulatory genes that may be essential for adventitious rooting.
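The simple sequence repeat (SSR) screen mentioned in the abstract above can be illustrated with a minimal sketch. The function name `find_ssrs`, the minimum repeat counts, and the test sequence are all hypothetical choices for illustration; the abstract does not state which tool or thresholds the authors used.

```python
import re

def find_ssrs(seq, min_di=6, min_tri=5):
    """Return (start, unit, repeat_count) for perfect di- and trinucleotide repeats.

    The thresholds are illustrative assumptions, not the study's criteria.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_reps in ((2, min_di), (3, min_tri)):
        # One captured unit followed by at least (min_reps - 1) exact copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_reps - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if len(set(unit)) == 1:
                # Skip homopolymer runs such as AAAAAA, which are not di/tri repeats.
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)

# Made-up EST fragment containing one (AG)7 and one (GAT)5 repeat.
est = "TTC" + "AG" * 7 + "CCA" + "GAT" * 5 + "C"
ssrs = find_ssrs(est)
```

Only perfect repeats are reported here; compound or interrupted repeats would need a more permissive pattern.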
Kream, Richard M; Sheehan, Melinda; Cadet, Patrick; Mantione, Kirk J; Zhu, Wei; Casares, Federico; Stefano, George B
2007-12-01
Biochemical, molecular and pharmacological evidence has been shown for two unique six-transmembrane helical (TMH) domain opiate receptors expressed from the μ opioid receptor (MOR) gene. Designated μ3 and μ4 receptors, both protein species are Class A rhodopsin-like members of the superfamily of G-protein coupled receptors but are selectively tailored to mediate the cellular regulatory effects of endogenous morphine and related morphinan alkaloids via stimulation of nitric oxide (NO) production and release. Both μ3 and μ4 receptors lack an amino acid sequence of approximately 90 amino acids that constitute the extracellular N-terminal and TMH1 domains and part of the first intracellular loop of the μ1 receptor, but retain the empirically defined ligand binding pocket distributed across conserved TMH2, TMH3, and TMH7 domains of the μ1 sequence. Additionally, the receptor proteins are terminated by unique intracellular C-terminal amino acid sequences that serve as putative coupling or docking domains required for constitutive NO synthase activation. Because the recognition profile of μ3 and μ4 receptors is restricted to rigid benzylisoquinoline alkaloids typified by morphine and its extended family of chemical congeners, it is hypothesized that conformational stabilization provided by interaction of extended extracellular N-terminal protein domains and the extracellular loops is required for binding of endogenous opioid peptides as well as synthetic flexible opiate alkaloids.
Davlieva, Milya; Shi, Yiwen; Leonard, Paul G.; ...
2015-04-19
LiaR is a ‘master regulator’ of the cell envelope stress response in enterococci and many other Gram-positive organisms. Mutations to liaR can lead to resistance to a variety of antibiotics including the cyclic lipopeptide daptomycin. LiaR is phosphorylated in response to membrane stress to regulate downstream target operons. Using DNA footprinting of the regions upstream of the liaXYZ and liaFSR operons we show that LiaR binds an extended stretch of DNA that extends beyond the proposed canonical consensus sequence suggesting a more complex level of regulatory control of target operons. We go on to determine the biochemical and structural basis for increased resistance to daptomycin by the adaptive mutation to LiaR (D191N) first identified from the pathogen Enterococcus faecalis S613. LiaR D191N increases oligomerization of LiaR to form a constitutively activated tetramer that has high affinity for DNA even in the absence of phosphorylation leading to increased resistance. The crystal structures of the LiaR DNA binding domain complexed to the putative consensus sequence as well as an adjoining secondary sequence show that upon binding, LiaR induces DNA bending that is consistent with increased recruitment of RNA polymerase to the transcription start site and upregulation of target operons.
Evidence for free-living Bacteroides in Cladophora along the shores of the Great Lakes
Whitman, Richard L.; Byappanahalli, Muruleedhara; Spoljaric, Ashley; Przybyla-Kelly, Katarzyna; Shively, Dawn A.; Nevers, Meredith
2014-01-01
Bacteroides is assumed to be restricted to the alimentary canal of animals and humans and is considered to be non-viable in ambient environments. We hypothesized that Bacteroides could persist and replicate within beach-stranded Cladophora glomerata mats in southern Lake Michigan, USA. Mean Bacteroides concentration (per GenBac3 Taqman quantitative PCR assay) during summer 2012 at Jeorse Park Beach was 5.2 log calibrator cell equivalents (CCE) g⁻¹ dry weight (dw), ranging from 3.7 to 6.7. We monitored a single beach-stranded mat for 3 wk; bacterial concentrations increased by 1.6 log CCE g⁻¹ dw and correlated significantly with ambient temperature (p = 0.003). Clonal growth was evident, as observed by >99% nucleotide sequence similarity among clones. In in vitro studies, Bacteroides concentrations increased by 5.5 log CCE g⁻¹ after 7 d (27°C) in fresh Cladophora collected from rocks. Partial sequencing of the 16S rRNA gene of 36 clones from the incubation experiment showed highly similar genotypes (≥97% sequence overlap). The closest enteric Bacteroides spp. from the National Center for Biotechnology Information database were only 87 to 91% similar. Genomic similarity, clonality, growth, and persistence collectively suggest that putative, free-living Bacteroides inhabit Cladophora mats of southern Lake Michigan. These findings may have important biological, medical, regulatory, microbial source tracking, and public health implications.
Biedrzycka, Aleksandra; Kloch, Agnieszka; Migalska, Magdalena; Bielański, Wojciech
2013-05-01
We characterized partial sequences of 18S rDNA from sedge warblers infected with a parasite described previously as Hepatozoon kabeeni. Prevalence was 47% in sampled birds. We detected 3 parasite haplotypes in 62 sequenced samples from infected animals. In phylogenetic analyses, 2 of the putative Hepatozoon haplotypes closely resembled Lankesterella minima and L. valsainensis. The third haplotype grouped in a wider clade composed of Caryospora and Eimeria. None of the haplotypes showed resemblance to sequences of Hepatozoon from reptiles and mammals. Molecular detection results were consistent with those from microscopy of stained blood smears, confirming that the primers indeed amplified the parasite sequences. Here we provide evidence that the avian Hepatozoon-like parasites are most likely Lankesterella, supporting the suggestion that the systematic position of avian Hepatozoon-like species needs to be revised.
A putative peroxidase cDNA from turnip and analysis of the encoded protein sequence.
Romero-Gómez, S; Duarte-Vázquez, M A; García-Almendárez, B E; Mayorga-Martínez, L; Cervantes-Avilés, O; Regalado, C
2008-12-01
A putative peroxidase cDNA was isolated from turnip roots (Brassica napus L. var. purple top white globe) by reverse transcriptase-polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE). Total RNA extracted from mature turnip roots was used as a template for RT-PCR, using a degenerate primer designed to amplify the highly conserved distal motif of plant peroxidases. The resulting partial sequence was used to design the rest of the specific primers for 5' and 3' RACE. Two cDNA fragments were purified, sequenced, and aligned with the partial sequence from RT-PCR, and a complete overlapping sequence was obtained and labeled as BnPA (GenBank Accession No. AY423440, named as podC). The full-length cDNA is 1167 bp long and contains a 1077 bp open reading frame (ORF) encoding a deduced peroxidase polypeptide of 358 amino acids. The putative peroxidase (BnPA) showed a calculated Mr of 34 kDa and an isoelectric point (pI) of 4.5, with no significant identity with other reported turnip peroxidases. Sequence alignment showed that only three peroxidases have a significant identity with BnPA, namely AtP29a (84%) and AtPA2 (81%) from Arabidopsis thaliana, and HRPA2 (82%) from horseradish (Armoracia rusticana). Work is in progress to clone this gene into an adequate host to study the specific role and possible biotechnological applications of this alternative peroxidase source.
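The ORF length and polypeptide length reported in the abstract above can be cross-checked with a line of arithmetic. This is a small sketch that assumes the 1077 bp ORF figure includes the stop codon; the variable names are illustrative only.

```python
# Figures taken from the abstract; the stop-codon assumption is ours.
orf_bp = 1077          # open reading frame length in base pairs
codons = orf_bp // 3   # 359 codons, since 1077 is divisible by 3
residues = codons - 1  # subtract the stop codon, which encodes no amino acid
in_frame = orf_bp % 3 == 0
```

Under that assumption the ORF yields 358 residues, which agrees with the deduced polypeptide length given in the abstract.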
Guerrero-Vargas, Jimmy A.; Mourão, Caroline B. F.; Quintero-Hernández, Verónica; Possani, Lourival D.; Schwartz, Elisabeth F.
2012-01-01
Background Colombia and Brazil are affected by severe cases of scorpionism. In Colombia the most dangerous accidents are caused by Tityus pachyurus, which is widely distributed around the country. In the Brazilian Amazonian region scorpion stings are a common event caused by Tityus obscurus. The main objective of this work was to perform the molecular cloning of the putative Na+-channel scorpion toxins (NaScTxs) from T. pachyurus and T. obscurus venom glands and to analyze their phylogenetic relationship with other known NaScTxs from Tityus species. Methodology/Principal Findings cDNA libraries from venom glands of these two species were constructed and five nucleotide sequences from T. pachyurus were identified as putative modulators of Na+-channels, and were named Tpa4, Tpa5, Tpa6, Tpa7 and Tpa8; the latter being the first anti-insect excitatory β-class NaScTx in Tityus scorpion venom to be described. Fifteen sequences from T. obscurus were identified as putative NaScTxs, among which three had been previously described, and the others were named To4 to To15. The peptides Tpa4, Tpa5, Tpa6, To6, To7, To9, To10 and To14 are closely related to the α-class NaScTxs, whereas Tpa7, Tpa8, To4, To8, To12 and To15 sequences are more related to the β-class NaScTxs. To5 is possibly an arthropod-specific toxin. To11 and To13 share sequence similarities with both α and β NaScTxs. By means of phylogenetic analysis using the Maximum Parsimony method and the known NaScTxs from Tityus species, these toxins were clustered into 14 distinct groups. Conclusions/Significance This communication describes new putative NaScTxs from T. pachyurus and T. obscurus and their phylogenetic analysis. The results indicate clear geographic separation between scorpions of the genus Tityus inhabiting the Amazonian and Andean mountain regions and those distributed south of the Amazonian rainforest. Based on the consensus sequences for the different clusters, a new nomenclature for the NaScTxs is proposed. PMID:22355312
Alvarez-Martin, Pablo; Fernández, Matilde; O'Connell-Motherway, Mary; O'Connell, Kerry Joan; Sauvageot, Nicolas; Fitzgerald, Gerald F; MacSharry, John; Zomer, Aldert; van Sinderen, Douwe
2012-08-01
This work reports on the identification and molecular characterization of the two-component regulatory system (2CRS) PhoRP, which controls the response to inorganic phosphate (P(i)) starvation in Bifidobacterium breve UCC2003. The response regulator PhoP was shown to bind to the promoter region of pstSCAB, specifying a predicted P(i) transporter system, as well as that of phoU, which encodes a putative P(i)-responsive regulatory protein. This interaction is assumed to cause transcriptional modulation under conditions of P(i) limitation. Our data suggest that the phoRP genes are subject to positive autoregulation and, together with pstSCAB and presumably phoU, represent the complete regulon controlled by the phoRP-encoded 2CRS in B. breve UCC2003. Determination of the minimal PhoP binding region combined with bioinformatic analysis revealed the probable recognition sequence of PhoP, designated here as the PHO box, which together with phoRP is conserved among many high-GC-content Gram-positive bacteria. The importance of the phoRP 2CRS in the response of B. breve to P(i) starvation conditions was confirmed by analysis of a B. breve phoP insertion mutant which exhibited decreased growth under phosphate-limiting conditions compared to its parent strain UCC2003.
Alvarez-Martin, Pablo; Fernández, Matilde; O'Connell-Motherway, Mary; O'Connell, Kerry Joan; Sauvageot, Nicolas; Fitzgerald, Gerald F.; MacSharry, John; Zomer, Aldert
2012-01-01
This work reports on the identification and molecular characterization of the two-component regulatory system (2CRS) PhoRP, which controls the response to inorganic phosphate (Pi) starvation in Bifidobacterium breve UCC2003. The response regulator PhoP was shown to bind to the promoter region of pstSCAB, specifying a predicted Pi transporter system, as well as that of phoU, which encodes a putative Pi-responsive regulatory protein. This interaction is assumed to cause transcriptional modulation under conditions of Pi limitation. Our data suggest that the phoRP genes are subject to positive autoregulation and, together with pstSCAB and presumably phoU, represent the complete regulon controlled by the phoRP-encoded 2CRS in B. breve UCC2003. Determination of the minimal PhoP binding region combined with bioinformatic analysis revealed the probable recognition sequence of PhoP, designated here as the PHO box, which together with phoRP is conserved among many high-GC-content Gram-positive bacteria. The importance of the phoRP 2CRS in the response of B. breve to Pi starvation conditions was confirmed by analysis of a B. breve phoP insertion mutant which exhibited decreased growth under phosphate-limiting conditions compared to its parent strain UCC2003. PMID:22635988
2011-01-01
Background Herbaspirillum seropedicae SmR1 is a nitrogen fixing endophyte associated with important agricultural crops. It produces polyhydroxybutyrate (PHB) which is stored intracellularly as granules. However, PHB metabolism and regulatory control is not yet well studied in this organism. Results In this work we describe the characterization of the PhbF protein from H. seropedicae SmR1 which was purified and characterized after expression in E. coli. The purified PhbF protein was able to bind to eleven putative promoters of genes involved in PHB metabolism in H. seropedicae SmR1. In silico analyses indicated a probable DNA-binding sequence which was shown to be protected in DNA footprinting assays using purified PhbF. Analyses using lacZ fusions showed that PhbF can act as a repressor protein controlling the expression of PHB metabolism-related genes. Conclusions Our results indicate that H. seropedicae SmR1 PhbF regulates expression of phb-related genes by acting as a transcriptional repressor. The knowledge of the PHB metabolism of this plant-associated bacterium may contribute to the understanding of the plant-colonizing process and the organism's resistance and survival in planta. PMID:21999748
Deep Sequencing of 71 Candidate Genes to Characterize Variation Associated with Alcohol Dependence.
Clark, Shaunna L; McClay, Joseph L; Adkins, Daniel E; Kumar, Gaurav; Aberg, Karolina A; Nerella, Srilaxmi; Xie, Linying; Collins, Ann L; Crowley, James J; Quackenbush, Corey R; Hilliard, Christopher E; Shabalin, Andrey A; Vrieze, Scott I; Peterson, Roseann E; Copeland, William E; Silberg, Judy L; McGue, Matt; Maes, Hermine; Iacono, William G; Sullivan, Patrick F; Costello, Elizabeth J; van den Oord, Edwin J
2017-04-01
Previous genomewide association studies (GWASs) have identified a number of putative risk loci for alcohol dependence (AD). However, only a few loci have replicated and these replicated variants only explain a small proportion of AD risk. Using an innovative approach, the goal of this study was to generate hypotheses about potentially causal variants for AD that can be explored further through functional studies. We employed targeted capture of 71 candidate loci and flanking regions followed by next-generation deep sequencing (mean coverage 78X) in 806 European Americans. Regions included in our targeted capture library were genes identified through published GWAS of alcohol, all human alcohol and aldehyde dehydrogenases, reward system genes including dopaminergic and opioid receptors, prioritized candidate genes based on previous associations, and genes involved in the absorption, distribution, metabolism, and excretion of drugs. We performed single-locus tests to determine if any single variant was associated with AD symptom count. Sets of variants that overlapped with biologically meaningful annotations were tested for association in aggregate. No single, common variant was significantly associated with AD in our study. We did, however, find evidence for association with several variant sets. Two variant sets were significant at the q-value <0.10 level: a genic enhancer for ADHFE1 (p = 1.47 × 10⁻⁵; q = 0.019), an alcohol dehydrogenase, and ADORA1 (p = 5.29 × 10⁻⁵; q = 0.035), an adenosine receptor that belongs to a G-protein-coupled receptor gene family. To our knowledge, this is the first sequencing study of AD to examine variants in entire genes, including flanking and regulatory regions. We found that in addition to protein coding variant sets, regulatory variant sets may play a role in AD. From these findings, we have generated initial functional hypotheses about how these sets may influence AD. Copyright © 2017 by the Research Society on Alcoholism.
Laurie, Andrew D.; Lloyd-Jones, Gareth
1999-01-01
Cloning and molecular ecological studies have underestimated the diversity of polycyclic aromatic hydrocarbon (PAH) catabolic genes by emphasizing classical nah-like (nah, ndo, pah, and dox) sequences. Here we report the description of a divergent set of PAH catabolic genes, the phn genes, which although isofunctional to the classical nah-like genes, show very low homology. This phn locus, which contains nine open reading frames (ORFs), was isolated on an 11.5-kb HindIII fragment from phenanthrene-degrading Burkholderia sp. strain RP007. The phn genes are significantly different in sequence and gene order from previously characterized genes for PAH degradation. They are transcribed by RP007 when grown at the expense of either naphthalene or phenanthrene, while in Escherichia coli the recombinant phn enzymes have been shown to be capable of oxidizing both naphthalene and phenanthrene to predicted metabolites. The locus encodes iron sulfur protein α and β subunits of a PAH initial dioxygenase but lacks the ferredoxin and reductase components. The dihydrodiol dehydrogenase of the RP007 pathway, PhnB, shows greater similarity to analogous dehydrogenases from described biphenyl pathways than to those characterized from naphthalene/phenanthrene pathways. An unusual extradiol dioxygenase, PhnC, shows no similarity to other extradiol dioxygenases for naphthalene or biphenyl oxidation but is the first member of the recently proposed class III extradiol dioxygenases that is specific for polycyclic arene diols. Upstream of the phn catabolic genes are two putative regulatory genes, phnR and phnS. Sequence homology suggests that phnS is a LysR-type transcriptional activator and that phnR, which is divergently transcribed with respect to phnSFECDAcAdB, is a member of the σ54-dependent family of positive transcriptional regulators. Reverse transcriptase PCR experiments suggest that this gene cluster is coordinately expressed and is under regulatory control which may involve PhnR and PhnS. PMID:9882667
USDA-ARS's Scientific Manuscript database
Lipase (lip) and lipase-specific foldase (lif) genes of a biodegradable polyhydroxyalkanoate- (PHA-) synthesizing Pseudomonas resinovorans NRRL B-2649 were cloned using primers based on consensus sequences, followed by PCR-based genome walking. Sequence analyses showed a putative Lip gene-product (...
Tasaki, E; Hirayama, J; Tazumi, A; Hayashi, K; Hara, Y; Ueno, H; Moore, J E; Millar, B C; Matsuda, M
2012-02-01
A novel clustered regularly interspaced short palindromic repeats (CRISPR) locus [7,500 base pairs (bp) in length] was found in the urease-positive thermophilic Campylobacter (UPTC) Japanese isolate CF89-12. The 7,500 bp locus consisted of the 5'-methylaminomethyl-2-thiouridylate methyltransferase gene, a putative CRISPR-associated gene (p-Cas), putative open reading frames, Cas1 and Cas2, a leader sequence region (146 bp), 12 CRISPR consensus sequence repeats (each 36 bp) separated by non-repetitive unique spacer regions of similar length (26-31 bp), and the phosphatidyl glycerophosphatase A gene. When the CRISPR loci of UPTC CF89-12 and five C. jejuni isolates were compared, all six isolates contained p-Cas, Cas1 and Cas2 within the locus. Four to 12 CRISPR consensus sequence repeats separated by non-repetitive unique spacer regions occurred in the six isolates, and the nucleotide sequences of those repeats showed approximately 92-100% similarity to each other. However, no sequence similarity was found among the unique spacer regions of these isolates. A putative σ(70) transcriptional promoter and hypothetical ρ-independent terminator structures for the CRISPRs and Cas were detected. No in vivo transcription of p-Cas, Cas1 and Cas2 was confirmed in the UPTC cells.
Identification of (R)-selective ω-aminotransferases by exploring evolutionary sequence space.
Kim, Eun-Mi; Park, Joon Ho; Kim, Byung-Gee; Seo, Joo-Hyun
2018-03-01
Several (R)-selective ω-aminotransferases (R-ωATs) have been reported. Additional R-ωATs with sequence characteristics different from those already known are expected to exist. In addition, it is generally accepted that R-ωATs are variants of aminotransferase group III. On this basis, sequences in the RefSeq database were scored using family profiles of branched-chain amino acid aminotransferase (BCAT) and d-alanine aminotransferase (DAT) to predict and identify putative R-ωATs. Each sequence was plotted in a two-dimensional score space according to its two profile analysis scores. Candidates with relatively similar scores in both BCAT and DAT profiles (i.e., profile analysis score using BCAT profile was similar to profile analysis score using DAT profile) were selected. Experimental results for selected candidates showed that putative R-ωATs from Saccharopolyspora erythraea (R-ωAT_Sery), Bacillus cellulosilyticus (R-ωAT_Bcel), and Bacillus thuringiensis (R-ωAT_Bthu) had R-ωAT activity. Additional experiments revealed that R-ωAT_Sery also possessed DAT activity while R-ωAT_Bcel and R-ωAT_Bthu had BCAT activity. Selecting putative R-ωATs from regions with similar profile analysis scores identified potential R-ωATs. Therefore, R-ωATs could be efficiently identified by using simple family profile analysis and exploring evolutionary sequence space. Copyright © 2017 Elsevier Inc. All rights reserved.
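The selection step described in this abstract (keeping sequences whose BCAT and DAT profile scores are similar) can be illustrated with a minimal sketch. The abstract gives no thresholds or tooling, so the names `ProfileScores` and `select_candidates`, the parameters `max_gap` and `min_score`, and all score values below are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileScores:
    """Two profile analysis scores for one sequence (hypothetical container)."""
    seq_id: str
    bcat: float  # score against the BCAT family profile
    dat: float   # score against the DAT family profile


def select_candidates(scores, max_gap, min_score):
    """Keep sequences that score at least min_score on both profiles
    and whose two scores differ by at most max_gap.

    Both cut-offs are assumptions; the abstract only says the two
    scores should be "relatively similar".
    """
    return [
        s for s in scores
        if min(s.bcat, s.dat) >= min_score and abs(s.bcat - s.dat) <= max_gap
    ]


# Made-up scores, not data from the study.
example = [
    ProfileScores("seq_A", 310.0, 295.0),  # similar and high on both profiles
    ProfileScores("seq_B", 420.0, 120.0),  # BCAT-like only
    ProfileScores("seq_C", 90.0, 85.0),    # similar, but weak on both profiles
]

picked = select_candidates(example, max_gap=30.0, min_score=200.0)
print([s.seq_id for s in picked])
```

With these made-up values only `seq_A` passes both conditions. In practice the two scores would come from a profile-search tool, and the cut-offs would need to be calibrated against known R-ωATs.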
2012-01-01
Background The potential contribution of upstream sequence variation to the unique features of orthologous genes is just beginning to be unraveled. A core subset of stress-associated bZIP transcription factors from rice (Oryza sativa) formed ten clusters of orthologous groups (COG) with genes from the monocot sorghum (Sorghum bicolor) and dicot Arabidopsis (Arabidopsis thaliana). The total cis-regulatory information content of each stress-associated COG was examined by phylogenetic footprinting to reveal ortholog-specific, lineage-specific and species-specific conservation patterns. Results The most apparent pattern observed was the occurrence of spatially conserved ‘core modules’ among the COGs but not among paralogs. These core modules are comprised of various combinations of two to four putative transcription factor binding site (TFBS) classes associated with either developmental or stress-related functions. Outside the core modules are specific stress (ABA, oxidative, abiotic, biotic) or organ-associated signals, which may be functioning as ‘regulatory fine-tuners’ and further define lineage-specific and species-specific cis-regulatory signatures. Orthologous monocot and dicot promoters have distinct TFBS classes involved in disease and oxidative-regulated expression, while the orthologous rice and sorghum promoters have distinct combinations of root-specific signals, a pattern that is not particularly conserved in Arabidopsis. Conclusions Patterns of cis-regulatory conservation imply that each ortholog has distinct signatures, further suggesting that they are potentially unique in a regulatory context despite the presumed conservation of broad biological function during speciation. 
Based on the observed patterns of conservation, we postulate that core modules are likely primary determinants of basal developmental programming, which may be integrated with and further elaborated by additional intrinsic or extrinsic signals in conjunction with lineage-specific or species-specific regulatory fine-tuners. This synergy may be critical for finer-scale spatio-temporal regulation, hence unique expression profiles of homologous transcription factors from different species with distinct zones of ecological adaptation such as rice, sorghum and Arabidopsis. The patterns revealed from these comparisons set the stage for further empirical validation by functional genomics. PMID:22992304
Partial DNA sequencing of Douglas-fir cDNAs used in RFLP mapping
K.D. Jermstad; D.L. Bassoni; C.S. Kinlaw; D.B. Neale
1998-01-01
DNA sequences from 87 Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) cDNA RFLP probes were determined. Sequences were submitted to the GenBank dbEST database and searched for similarity against nucleotide and protein databases using the BLASTn and BLASTx programs. Twenty-one sequences (24%) were assigned putative functions; 18 of which...
Almeida, Tânia; Menéndez, Esther; Capote, Tiago; Ribeiro, Teresa; Santos, Conceição; Gonçalves, Sónia
2013-01-15
The molecular processes associated with cork development in Quercus suber L. are poorly understood. A previous molecular approach identified a list of genes potentially important for cork formation and differentiation, providing a new basis for further molecular studies. This report is the first molecular characterization of one of these candidate genes, QsMYB1, coding for an R2R3-MYB transcription factor. The R2R3-MYB gene sub-family has been described as being involved in the phenylpropanoid and lignin pathways, both involved in cork biosynthesis. The results showed that the expression of QsMYB1 is putatively mediated by an alternative splicing (AS) mechanism that originates two different transcripts (QsMYB1.1 and QsMYB1.2), differing only in the 5'-untranslated region, due to retention of the first intron in one of the variants. Moreover, within the retained intron, a simple sequence repeat (SSR) was identified. The upstream regulatory region of QsMYB1 was extended by a genome walking approach, which allowed the identification of the putative gene promoter region. The relative expression pattern of QsMYB1 transcripts determined by reverse transcription quantitative polymerase chain reaction (RT-qPCR) revealed that both transcripts were up-regulated in cork tissues; the detected expression was several times higher in newly formed cork harvested from trees producing virgin, second or reproduction cork when compared with wood. Moreover, the expression analysis of QsMYB1 in several Q. suber organs showed very low expression in young branches and roots, whereas in leaves, immature acorns or male flowers, no expression was detected. These preliminary results suggest that QsMYB1 may be related to secondary growth and, in particular, with the cork biosynthesis process with a possible alternative splicing mechanism associated with its regulatory function. Copyright © 2012 Elsevier GmbH. All rights reserved.
Hyndman, Timothy H; Marschang, Rachel E; Wellehan, James F X; Nicholls, Philip K
2012-10-01
This paper describes the isolation and molecular identification of a novel paramyxovirus found during an investigation of an outbreak of neurorespiratory disease in a collection of Australian pythons. Using Illumina® high-throughput sequencing, a 17,187 nucleotide sequence was assembled from RNA extracts from infected viper heart cells (VH2) displaying widespread cytopathic effects in the form of multinucleate giant cells. The sequence appears to contain all the coding regions of the genome, including the following predicted paramyxoviral open reading frames (ORFs): 3'--Nucleocapsid (N)--putative Phosphoprotein (P)--Matrix (M)--Fusion (F)--putative attachment protein--Polymerase (L)--5'. There is also a 540 nucleotide ORF between the N and putative P genes that may be an additional coding region. Phylogenetic analyses of the complete N, M, F and L genes support the clustering of this virus within the family Paramyxoviridae but outside both of the current subfamilies: Paramyxovirinae and Pneumovirinae. We propose to name this new virus, Sunshine virus, after the geographic origin of the first isolate--the Sunshine Coast of Queensland, Australia. Copyright © 2012 Elsevier B.V. All rights reserved.
Regulatory protein BBD18 of the lyme disease spirochete: essential role during tick acquisition?
Hayes, Beth M; Dulebohn, Daniel P; Sarkar, Amit; Tilly, Kit; Bestor, Aaron; Ambroggio, Xavier; Rosa, Patricia A
2014-04-01
The Lyme disease spirochete Borrelia burgdorferi senses and responds to environmental cues as it transits between the tick vector and vertebrate host. Failure to properly adapt can block transmission of the spirochete and persistence in either vector or host. We previously identified BBD18, a novel plasmid-encoded protein of B. burgdorferi, as a putative repressor of the host-essential factor OspC. In this study, we investigate the in vivo role of BBD18 as a regulatory protein, using an experimental mouse-tick model system that closely resembles the natural infectious cycle of B. burgdorferi. We show that spirochetes that have been engineered to constitutively produce BBD18 can colonize and persist in ticks but do not infect mice when introduced by either tick bite or needle inoculation. Conversely, spirochetes lacking BBD18 can persistently infect mice but are not acquired by feeding ticks. Through site-directed mutagenesis, we have demonstrated that abrogation of spirochete infection in mice by overexpression of BBD18 occurs only with bbd18 alleles that can suppress OspC synthesis. Finally, we demonstrate that BBD18-mediated regulation does not utilize a previously described ospC operator sequence required by B. burgdorferi for persistence in immunocompetent mice. These data lead us to conclude that BBD18 does not represent the putative repressor utilized by B. burgdorferi for the specific downregulation of OspC in the mammalian host. Rather, we suggest that BBD18 exhibits features more consistent with those of a global regulatory protein whose critical role occurs during spirochete acquisition by feeding ticks. IMPORTANCE Lyme disease, caused by Borrelia burgdorferi, is the most common arthropod-borne disease in North America. B. burgdorferi is transmitted to humans and other vertebrate hosts by ticks as they take a blood meal. Transmission between vectors and hosts requires the bacterium to sense changes in the environment and adapt. 
However, the mechanisms involved in this process are not well understood. By determining how B. burgdorferi cycles between two very different environments, we can potentially establish novel ways to interfere with transmission and limit infection of this vector-borne pathogen. We are studying a regulatory protein called BBD18 that we recently described. We found that too much BBD18 interferes with the spirochete's ability to establish infection in mice, whereas too little BBD18 appears to prevent colonization in ticks. Our study provides new insight into key elements of the infectious cycle of the Lyme disease spirochete.
Alpert, Carl-Alfred; Crutz-Le Coq, Anne-Marie; Malleret, Christine; Zagorec, Monique
2003-01-01
The complete nucleotide sequence of the 13-kb plasmid pRV500, isolated from Lactobacillus sakei RV332, was determined. Sequence analysis enabled the identification of genes coding for a putative type I restriction-modification system, two genes coding for putative recombinases of the integrase family, and a region likely involved in replication. The structural features of this region, comprising a putative ori segment containing 11- and 22-bp repeats and a repA gene coding for a putative initiator protein, indicated that pRV500 belongs to the pUCL287 subfamily of theta-type replicons. A 3.7-kb fragment encompassing this region was fused to an Escherichia coli replicon to produce the shuttle vector pRV566 and was observed to be functional in L. sakei for plasmid replication. The L. sakei replicon alone could not support replication in E. coli. Plasmid pRV500 and its derivative pRV566 were determined to be at very low copy numbers in L. sakei. pRV566 was maintained at a reasonable rate over 20 generations in several lactobacilli, such as Lactobacillus curvatus, Lactobacillus casei, and Lactobacillus plantarum, in addition to L. sakei, making it an interesting basis for developing vectors. Sequence relationships with other plasmids are described and discussed. PMID:12957947
RSAT 2015: Regulatory Sequence Analysis Tools.
Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques
2015-07-01
RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Sequence evaluation of four specific cDNA libraries for developmental genomics of sunflower.
Tamborindeguy, C; Ben, C; Liboz, T; Gentzbittel, L
2004-04-01
Four different cDNA libraries were constructed from sunflower protoplasts growing under embryogenic and non-embryogenic conditions: one standard library from each condition and two subtractive libraries in opposite sense. A total of 22,876 cDNA clones were obtained and 4800 ESTs were sequenced, giving rise to 2479 high quality ESTs representing a unigene set of 1502 sequences. This set was compared with ESTs represented in public databases using the programs BLASTN and BLASTX, and its members were classified according to putative function using the catalog in the Kyoto Encyclopedia of Genes and Genomes (KEGG). Some 33% of sequences failed to align with existing plant ESTs and therefore represent putative novel genes. The libraries show a low level of redundancy and, on average, 50% of the present ESTs have not been previously reported for sunflower. Several potentially interesting genes were identified, based on their homology with genes involved in animal zygotic division or plant embryogenesis. We also identified two ESTs that show significantly different levels of expression under embryogenic and non-embryogenic conditions. The libraries described here represent an original and valuable resource for the discovery of yet unknown genes putatively involved in dicot embryogenesis and improving our knowledge of the mechanisms involved in polarity acquisition by plant embryos.
Yeoh, Keat-Ai; Othman, Abrizah; Meon, Sariah; Abdullah, Faridah; Ho, Chai-Ling
2012-10-15
Glucanases are enzymes that hydrolyze a variety of β-d-glucosidic linkages. Plant β-1,3-glucanases are able to degrade fungal cell walls and promote the release of cell-wall-derived fungal elicitors. In this study, three full-length cDNA sequences encoding oil palm (Elaeis guineensis) glucanases were analyzed. Sequence analyses of the cDNA sequences suggested that EgGlc1-1 is a putative β-d-glucan exohydrolase belonging to glycosyl hydrolase (GH) family 3 while EgGlc5-1 and EgGlc5-2 are putative glucan endo-1,3-β-glucosidases belonging to GH family 17. The transcript abundance of these genes in the roots and leaves of oil palm seedlings treated with Ganoderma boninense and Trichoderma harzianum was profiled to investigate the involvement of these glucanases in oil palm during fungal infection. The gene expression of EgGlc1-1 in the root of oil palm seedlings was increased by T. harzianum but suppressed by G. boninense, while the gene expression of both EgGlc5-1 and EgGlc5-2 in the roots of oil palm seedlings was suppressed by G. boninense and/or T. harzianum. Copyright © 2012 Elsevier GmbH. All rights reserved.
Gueli Alletti, Gianpiero; Eigenbrod, Marina; Carstens, Eric B; Kleespies, Regina G; Jehle, Johannes A
2017-06-01
The European isolate Agrotis segetum granulovirus DA (AgseGV-DA) is a slow killing, type I granulovirus due to low dose-mortality responses within seven days post infection and a tissue tropism of infection restricted solely to the fat body of infected Agrotis segetum host larvae. The genome of AgseGV-DA was completely sequenced and compared to the whole genome sequences of the Chinese isolates AgseGV-XJ and AgseGV-L1. All three isolates share highly conserved genomes. The AgseGV-DA genome is 131,557 bp in length and encodes 149 putative open reading frames, including 37 baculovirus core genes and the per os infectivity factor ac110. Comprehensive investigations of repeat regions identified one putative non-hr like origin of replication in AgseGV-DA. Phylogenetic analysis based on concatenated amino acid alignments of 37 baculovirus core genes as well as pairwise distances based on the nucleotide alignments of partial granulin, lef-8 and lef-9 sequences with deposited betabaculoviruses confirmed AgseGV-DA, AgseGV-XJ and AgseGV-L1 as representative isolates of the same Betabaculovirus species. AgseGV encodes a distinct putative enhancin, distantly related to enhancins from other granuloviruses. Copyright © 2017. Published by Elsevier Inc.
de Kloet, E; de Kloet, S R
2004-12-01
A study was made of the phylogenetic relationships between fifteen complete nucleotide sequences as well as 43 nucleotide sequences of the putative coat protein gene of different strains belonging to the virus species Beak and feather disease virus obtained from 39 individuals of 16 psittacine species. The species included, among others, cockatoos (Cacatuini), African grey parrots (Psittacus erithacus) and peach-faced lovebirds (Agapornis roseicollis), which were infected at different geographical locations, within and outside Australia, the native origin of the virus. The derived amino acid sequences of the putative coat protein were highly diverse, with differences between some strains amounting to 50 of the 250 amino acids. Phylogenetic analysis demonstrated that the putative coat gene sequences form six clusters which show a varying degree of psittacine species specificity. Most, but not all, strains infecting African grey parrots formed a single cluster, as did the strains infecting the cockatoos. Strains infecting the lovebirds clustered with those infecting such Australasian species as Eclectus roratus, Psittacula kramerii and Psephotus haematogaster. Although individual birds included in this study were, where studied, often infected by closely related strains, infection by highly diverged strains was also detected. The possible relationship between BFD viral strains and clinical disease signs is discussed.
Lin, Runmao; He, Liye; He, Jiayu; Qin, Peigang; Wang, Yanran; Deng, Qiming; Yang, Xiaoting; Li, Shuangcheng; Wang, Shiquan; Wang, Wenming; Liu, Huainian; Li, Ping; Zheng, Aiping
2016-07-03
MicroRNAs (miRNAs) are ∼22 nucleotide non-coding RNAs that regulate gene expression by targeting mRNAs for degradation or inhibiting protein translation. To investigate whether miRNAs regulate pathogenesis in the necrotrophic fungus Rhizoctonia solani AG1 IA, which causes significant yield loss in major economically important crops, and to determine the regulatory mechanism occurring during pathogenesis, we constructed hyphal small RNA libraries from six different infection periods of the rice leaf. Through sequencing and analysis, 177 miRNA-like small RNAs (milRNAs) were identified, including 15 candidate pathogenic novel milRNAs predicted by functional annotations of their target mRNAs and expression patterns of milRNAs and mRNAs during infection. Reverse transcription-quantitative polymerase chain reaction results for randomly selected milRNAs demonstrated that our novel comprehensive predictions had a high level of accuracy. In our predicted pathogenic protein-protein interaction network of R. solani, we added the related regulatory milRNAs of these core coding genes into the network, and could understand the relationships among these regulatory factors more clearly at the systems level. Furthermore, the putative pathogenic Rhi-milR-16, which negatively regulates target gene expression, was experimentally validated to have regulatory functions by a dual-luciferase reporter assay. Additionally, 23 candidate rice miRNAs that may be involved in plant immunity against R. solani were discovered. This first study of novel pathogenic milRNAs of R. solani AG1 IA and their target genes involved in pathogenicity, together with rice miRNAs that participate in defence against R. solani, could provide new insights into the pathogenic mechanisms of the severe rice sheath blight disease. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Sela, Noa; Lachman, Oded; Reingold, Victoria; Dombrovsky, Aviv
2013-10-01
A novel virus was detected in watermelon plants (Citrullus lanatus Thunb.) infected with Melon necrotic spot virus (MNSV) using SOLiD next-generation sequence analysis. In addition to the expected MNSV genome, two double-stranded RNA (dsRNA) segments of 1,312 and 1,118 bp were also identified and sequenced from the purified virus preparations. These two dsRNA segments encode two putative partitivirus-related proteins, an RNA-dependent RNA polymerase (RdRP) and a capsid protein, which were sequenced. Genomic-sequence analysis and analysis of phylogenetic relationships indicate that these two dsRNAs together make up the genome of a novel Partitivirus. This virus was found to be closely related to the Pepper cryptic virus 1 and Raphanus sativus cryptic virus. It is suggested that this novel virus putatively named Citrullus lanatus cryptic virus be considered as a new member of the family Partitiviridae.
Ringwald, M; Schuh, R; Vestweber, D; Eistetter, H; Lottspeich, F; Engel, J; Dölz, R; Jähnig, F; Epplen, J; Mayer, S
1987-01-01
We have determined the amino acid sequence of the Ca2+-dependent cell adhesion molecule uvomorulin as it appears on the cell surface. The extracellular part of the molecule exhibits three internally repeated domains of 112 residues which are most likely generated by gene duplication. Each of the repeated domains contains two highly conserved units which could represent putative Ca2+-binding sites. Secondary structure predictions suggest that the putative Ca2+-binding units are located in external loops at the surface of the protein. The protein sequence exhibits a single membrane-spanning region and a cytoplasmic domain. Sequence comparison reveals extensive homology to the chicken L-CAM. Both uvomorulin and L-CAM are identical in 65% of their entire amino acid sequence suggesting a common origin for both CAMs. PMID:3501370
Hempel, Niels; Görisch, Helmut; Mern, Demissew S
2013-09-01
Several two-component regulatory systems are known to be involved in the signal transduction pathway of the ethanol oxidation system in Pseudomonas aeruginosa ATCC 17933. These sensor kinases and response regulators are organized in a hierarchical manner. In addition, a cytoplasmic putative iron-containing alcohol dehydrogenase (Fe-ADH) encoded by ercA (PA1991) has been identified to play an essential role in this regulatory network. The gene ercA (PA1991) is located next to ercS, which encodes a sensor kinase. Inactivation of ercA (PA1991) by insertion of a kanamycin resistance cassette created mutant NH1. NH1 showed poor growth on various alcohols. On ethanol, NH1 grew only with an extremely extended lag phase. During the induction period on ethanol, transcription of structural genes exa and pqqABCDEH, encoding components of initial ethanol oxidation in P. aeruginosa, was drastically reduced in NH1, which indicates the regulatory function of ercA (PA1991). However, transcription in the extremely delayed logarithmic growth phase was comparable to that in the wild type. To date, the involvement of an Fe-ADH in signal transduction processes has not been reported.
Hempel, Niels; Görisch, Helmut
2013-01-01
Several two-component regulatory systems are known to be involved in the signal transduction pathway of the ethanol oxidation system in Pseudomonas aeruginosa ATCC 17933. These sensor kinases and response regulators are organized in a hierarchical manner. In addition, a cytoplasmic putative iron-containing alcohol dehydrogenase (Fe-ADH) encoded by ercA (PA1991) has been identified to play an essential role in this regulatory network. The gene ercA (PA1991) is located next to ercS, which encodes a sensor kinase. Inactivation of ercA (PA1991) by insertion of a kanamycin resistance cassette created mutant NH1. NH1 showed poor growth on various alcohols. On ethanol, NH1 grew only with an extremely extended lag phase. During the induction period on ethanol, transcription of structural genes exa and pqqABCDEH, encoding components of initial ethanol oxidation in P. aeruginosa, was drastically reduced in NH1, which indicates the regulatory function of ercA (PA1991). However, transcription in the extremely delayed logarithmic growth phase was comparable to that in the wild type. To date, the involvement of an Fe-ADH in signal transduction processes has not been reported. PMID:23813731
Brown, D P; Idler, K B; Katz, L
1990-01-01
The 18.1-kilobase plasmid pSE211 integrates into the chromosome of Saccharopolyspora erythraea at a specific attB site. Restriction analysis of the integrated plasmid, pSE211int, and adjacent chromosomal sequences allowed identification of attP, the plasmid attachment site. Nucleotide sequencing of attP, attB, attL, and attR revealed a 57-base-pair sequence common to all sites with no duplications of adjacent plasmid or chromosomal sequences in the integrated state, indicating that integration takes place through conservative, reciprocal strand exchange. An analysis of the sequences indicated the presence of a putative gene for Phe-tRNA at attB which is preserved at attL after integration has occurred. A comparison of the attB site for a number of actinomycete plasmids is presented. Integration at attB was also observed when a 2.4-kilobase segment of pSE211 containing attP and the adjacent plasmid sequence was used to transform a pSE211- host. Nucleotide sequencing of this segment revealed the presence of two complete open reading frames (ORFs) and a segment of a third ORF. The ORF adjacent to attP encodes a putative polypeptide 437 amino acids in length that shows similarity, at its C-terminal domain, to sequences of site-specific recombinases of the integrase family. The adjacent ORF encodes a putative 98-amino-acid basic polypeptide that contains a helix-turn-helix motif at its N terminus which corresponds to domains in the Xis proteins of a number of bacteriophages. A proposal for the function of this polypeptide is presented. The deduced amino acid sequence of the third ORF did not reveal similarities to polypeptide sequences in the current data banks. PMID:2180909
Nomura, M; Tsujimura, A; Begum, N A; Matsumoto, M; Wabiko, H; Toyoshima, K; Seya, T
2000-01-01
The murine membrane cofactor protein (CD46) gene is expressed exclusively in testis, in contrast to human CD46, which is expressed ubiquitously. To elucidate the mechanism of differential CD46 gene expression among species, we cloned entire murine CD46 genomic DNA and possible regulatory regions were placed in the flanking region of the luciferase reporter gene. The reporter gene assay revealed a silencing activity not in the promoter, but in the 3'-flanking region of the gene and the silencer-like element was identified within a 0.2-kb region between 0.6 and 0.8 kb downstream of the stop codon. This silencer-like element was highly similar to that of the pig MHC class-I gene. The introduction of a mutation into this putative silencer element of murine CD46 resulted in an abrogation of the silencing effect. Electrophoretic mobility-shift assay indicated the presence of the binding molecule(s) for this silencer sequence in murine cell lines and tissues. A size difference of the protein-silencer-element complex was observed depending upon the solubilizers used for preparation of the nuclear extracts. A mutated silencer sequence failed to interact with the binding molecules. The level of the binding factor was lower in the testicular germ cells compared with other organs. Thus the silencer element and its binding factor may play a role in transcriptional regulation of murine CD46 gene expression. These results imply that the effects of the CD46 silencer element encompass the innate immune and reproductive systems, and in mice may determine the testicular germ-cell-dominant expression of CD46. PMID:11023821
Functional Organization of hsp70 Cluster in Camel (Camelus dromedarius) and Other Mammals
Garbuz, David G.; Astakhova, Lubov N.; Zatsepina, Olga G.; Arkhipova, Irina R.; Nudler, Eugene; Evgen'ev, Michael B.
2011-01-01
Heat shock protein 70 (Hsp70) is a molecular chaperone providing tolerance to heat and other challenges at the cellular and organismal levels. We sequenced a genomic cluster containing three hsp70 family genes linked with major histocompatibility complex (MHC) class III region from an extremely heat tolerant animal, camel (Camelus dromedarius). Two hsp70 family genes comprising the cluster contain heat shock elements (HSEs), while the third gene lacks HSEs and should not be induced by heat shock. Comparison of the camel hsp70 cluster with the corresponding regions from several mammalian species revealed similar organization of genes forming the cluster. Specifically, the two heat inducible hsp70 genes are arranged in tandem, while the third constitutively expressed hsp70 family member is present in inverted orientation. Comparison of regulatory regions of hsp70 genes from camel and other mammals demonstrates that transcription factor matches with highest significance are located in the highly conserved 250-bp upstream region and correspond to HSEs followed by NF-Y and Sp1 binding sites. The high degree of sequence conservation leaves little room for putative camel-specific regulatory elements. Surprisingly, RT-PCR and 5′/3′-RACE analysis demonstrated that all three hsp70 genes are expressed in camel's muscle and blood cells not only after heat shock, but under normal physiological conditions as well, and may account for tolerance of camel cells to extreme environmental conditions. A high degree of evolutionary conservation observed for the hsp70 cluster always linked with MHC locus in mammals suggests an important role of such organization for coordinated functioning of these vital genes. PMID:22096537
2011-01-01
Background Stenospermocarpy is a mechanism through which certain genotypes of Vitis vinifera L. such as Sultanina produce berries with seeds reduced in size. Stenospermocarpy has not yet been characterized at the molecular level. Results Genetic and physical maps were integrated with the public genomic sequence of Vitis vinifera L. to improve QTL analysis for seedlessness and berry size in experimental progeny derived from a cross of two seedless genotypes. Major QTLs co-positioning for both traits on chromosome 18 defined a 92-kb confidence interval. Functional information from model species including Vitis suggested that VvAGL11, included in this confidence interval, might be the main positional candidate gene responsible for seed and berry development. Characterization of VvAGL11 at the sequence level in the experimental progeny identified several SNPs and INDELs in both regulatory and coding regions. In association analyses performed over three seasons, these SNPs and INDELs explained up to 78% and 44% of the phenotypic variation in seed and berry weight, respectively. Moreover, genetic experiments indicated that the regulatory region has a larger effect on the phenotype than the coding region. Transcriptional analysis lent additional support to the putative role of VvAGL11's regulatory region, as its expression is abolished in seedless genotypes at key stages of seed development. These results transform VvAGL11 into a functional candidate gene for further analyses based on genetic transformation. For breeding purposes, intragenic markers were tested individually for marker assisted selection, and the best markers were those closest to the transcription start site. Conclusion We propose that VvAGL11 is the major functional candidate gene for seedlessness, and we provide experimental evidence suggesting that the seedless phenotype might be caused by variations in its promoter region. 
Current knowledge of the function of its orthologous genes, its expression profile in Vitis varieties and the strong association between its sequence variation and the degree of seedlessness together indicate that the D-lineage MADS-box gene VvAGL11 corresponds to the Seed Development Inhibitor locus described earlier as a major locus for seedlessness. These results provide new hypotheses for further investigations of the molecular mechanisms involved in seed and berry development. PMID:21447172
Grassi, Angela; Di Camillo, Barbara; Ciccarese, Francesco; Agnusdei, Valentina; Zanovello, Paola; Amadori, Alberto; Finesso, Lorenzo; Indraccolo, Stefano; Toffolo, Gianna Maria
2016-03-12
Inference of gene regulation from expression data may help to unravel regulatory mechanisms involved in complex diseases or in the action of specific drugs. A challenging task for many researchers working in the field of systems biology is to design an experiment on a limited budget that produces a dataset suitable for reconstructing putative regulatory modules worthy of biological validation. Here, we focus on small-scale gene expression screens and we introduce a novel experimental set-up and a customized method of analysis to make inference on regulatory modules starting from genetic perturbation data, e.g. knockdown and overexpression data. To illustrate the utility of our strategy, it was applied to produce and analyze a dataset of quantitative real-time RT-PCR data, in which interferon-α (IFN-α) transcriptional response in endothelial cells is investigated by RNA silencing of two candidate IFN-α modulators, STAT1 and IFIH1. A putative regulatory module was reconstructed by our method, revealing an intriguing feed-forward loop, in which STAT1 regulates IFIH1 and they both negatively regulate IFNAR1. STAT1 regulation of IFNAR1 was the object of experimental validation at the protein level. A detailed description of the experimental set-up and of the analysis procedure is reported, with the intent of guiding other scientists who want to carry out similar experiments to reconstruct gene regulatory modules starting from perturbations of possible regulators. Application of our approach to the study of IFN-α transcriptional response modulators in endothelial cells has led to many interesting novel findings and new biological hypotheses worthy of validation.
Ruggiero, Maria Valeria; Procaccini, Gabriele
2004-01-01
Halophila stipulacea is a dioecious marine angiosperm, widely distributed along the western coasts of the Indian Ocean and the Red Sea. This species is thought to be a Lessepsian immigrant that entered the Mediterranean Sea from the Red Sea after the opening of the Suez Canal (1869). Previous studies have revealed both high phenotypic and genetic variability in Halophila stipulacea populations from the western Mediterranean basin. In order to test the hypothesis of a Lessepsian introduction, we compare genetic polymorphism between putative native (Red Sea) and introduced (Mediterranean) populations through rDNA ITS region (ITS1-5.8S-ITS2) sequence analysis. A high degree of intraindividual variability of ITS sequences was found. Most of the intragenomic polymorphism was due to pseudogenic sequences, present in almost all individuals. Features of ITS functional sequences and pseudogenes are described. Possible causes for the lack of homogenization of ITS paralogues within individuals are discussed.
Nguyen, Quan H; Tellam, Ross L; Naval-Sanchez, Marina; Porto-Neto, Laercio R; Barendse, William; Reverter, Antonio; Hayes, Benjamin; Kijas, James; Dalrymple, Brian P
2018-01-01
Abstract Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets. PMID:29618048
Nguyen, Quan H; Tellam, Ross L; Naval-Sanchez, Marina; Porto-Neto, Laercio R; Barendse, William; Reverter, Antonio; Hayes, Benjamin; Kijas, James; Dalrymple, Brian P
2018-03-01
Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets.
Serine protease-related proteins in the malaria mosquito, Anopheles gambiae.
Cao, Xiaolong; Gulati, Mansi; Jiang, Haobo
2017-09-01
Insect serine proteases (SPs) and serine protease homologs (SPHs) participate in digestion, defense, development, and other physiological processes. In mosquitoes, some clip-domain SPs and SPHs (i.e. CLIPs) have been investigated for possible roles in antiparasitic responses. In a recent test aimed at improving quality of gene models in the Anopheles gambiae genome using RNA-seq data, we observed various discrepancies between gene models in AgamP4.5 and corresponding sequences selected from those modeled by Cufflinks, Trinity and Bridger. Here we report a comparative analysis of the 337 SP-related proteins in A. gambiae by examining their domain structures, sequence diversity, chromosomal locations, and expression patterns. One hundred and ten CLIPs contain 1 to 5 clip domains in addition to their protease domains (PDs) or non-catalytic, protease-like domains (PLDs). They are divided into five subgroups: CLIPAs (22) are clip(1-5)-PLD; CLIPBs (29), CLIPCs (12) and CLIPDs (14) are mainly clip-PD; most CLIPEs (33) have a domain structure of PD/PLD-PLD-clip-PLD(0-1). While expression of the CLIP genes in group-1 is generally low and detected in various tissue- and stage-specific RNA-seq libraries, some putative GPs/GPHs (i.e. single domain gut SPs/SPHs) in group-2 are highly expressed in midgut, whole larva or whole adult libraries. In comparison, 46 SPs, 26 SPHs, and 37 multi-domain SPs/SPHs (i.e. PD/PLD-PLD(≥1)) in group-3 do not seem to be specifically expressed in the digestive tract. There are 16 SPs and 2 SPHs containing other types of putative regulatory domains (e.g. LDLa, CUB, Gd). Of the 337 SP and SPH genes, 159 were sorted into 46 groups (2-8 members/group) based on similar phylogenetic tree position, chromosomal location, and expression profile. This information and analysis, including improved gene models and protein sequences, constitute a solid foundation for functional analysis of the SP-related proteins in A. gambiae. Copyright © 2017 Elsevier Ltd. All rights reserved.
Serotype IV Sequence Type 468 Group B Streptococcus Neonatal Invasive Disease, Minnesota, USA.
Teatero, Sarah; Ferrieri, Patricia; Fittipaldi, Nahuel
2016-11-01
To further understand the emergence of serotype IV group B Streptococcus (GBS) invasive disease, we used whole-genome sequencing to characterize 3 sequence type 468 strains isolated from neonates in Minnesota, USA. We found that strains of tetracycline-resistant sequence type 468 GBS have acquired virulence genes from a putative clonal complex 17 GBS donor by recombination.
Detecting false positive sequence homology: a machine learning approach.
Fujimoto, M Stanley; Suvorov, Anton; Jensen, Nicholas O; Clement, Mark J; Bybee, Seth M
2016-02-24
Accurate detection of homologous relationships of biological sequences (DNA or amino acid) amongst organisms is an important and often difficult task that is essential to various evolutionary studies, ranging from building phylogenies to predicting functional gene annotations. There are many existing heuristic tools, most commonly based on bidirectional BLAST searches, that are used to identify homologous genes and combine them into two fundamentally distinct classes: orthologs and paralogs. Because they use only heuristic filtering based on significance score cutoffs and have no cluster post-processing tools available, these methods can often produce multiple clusters consisting of unrelated (non-homologous) sequences. Therefore, sequencing data extracted from incomplete genome/transcriptome assemblies that originate from low-coverage sequencing or are produced by de novo processes without a reference genome are susceptible to high false positive rates of homology detection. In this paper we develop biologically informative features that can be extracted from multiple sequence alignments of putative homologous genes (orthologs and paralogs) and further utilized in the context of guided experimentation to verify false positive outcomes. We demonstrate that our machine learning method, trained on both known homology clusters obtained from OrthoDB and randomly generated sequence alignments (non-homologs), successfully determines apparent false positives inferred by heuristic algorithms, especially among proteomes recovered from low-coverage RNA-seq data. Approximately 42% and 25% of putative homologies predicted by InParanoid and HaMStR, respectively, were classified as false positives on an experimental data set. Our process increases the quality of output from other clustering algorithms by providing a novel post-processing method that is both fast and efficient at removing low quality clusters of putative homologous genes recovered by heuristic-based approaches.
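The bidirectional BLAST heuristic mentioned in this abstract can be illustrated with a minimal reciprocal-best-hit sketch. It is not the authors' method or the InParanoid/HaMStR implementation; the function names, the `(query, subject, score)` tuple layout, and the example identifiers are all hypothetical, and the scores stand in for bit scores from two opposite-direction searches.

```python
def best_hits(hits):
    """Map each query to its single highest-scoring subject.

    `hits` is an iterable of (query, subject, score) tuples; this layout
    is an assumption made for illustration, not a BLAST output format.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical hit tables for proteomes A and B.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0),
          ("a2", "b2", 90.0), ("a3", "b1", 80.0)]
b_vs_a = [("b1", "a1", 198.0), ("b2", "a1", 140.0), ("b2", "a2", 95.0)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

Only pairs that agree in both directions survive; one-directional best hits such as `a2`/`b2` and `a3`/`b1` are dropped. Score cutoffs alone, as the abstract notes, do not guarantee that the surviving pairs are true homologs.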
Transcriptomics of the Bed Bug (Cimex lectularius)
Rajarapu, Swapna P.; Jones, Susan C.; Mittapalli, Omprakash
2011-01-01
Background Bed bugs (Cimex lectularius) are blood-feeding insects poised to become one of the major pests in households throughout the United States. Resistance of C. lectularius to insecticides/pesticides is one factor thought to be involved in its sudden resurgence. Despite its high-impact status, scant knowledge exists at the genomic level for C. lectularius. Hence, we subjected the C. lectularius transcriptome to 454 pyrosequencing in order to identify potential genes involved in pesticide resistance. Methodology and Principal Findings Using 454 pyrosequencing, we obtained a total of 216,419 reads with 79,596,412 bp, which were assembled into 35,646 expressed sequence tags (3902 contigs and 31744 singletons). Nearly 85.9% of the C. lectularius sequences showed similarity to insect sequences, but 44.8% of the deduced proteins of C. lectularius did not show similarity with sequences in the GenBank non-redundant database. KEGG analysis revealed putative members of several detoxification pathways involved in pesticide resistance. Lamprin domains, Protein Kinase domains, Protein Tyrosine Kinase domains and cytochrome P450 domains were among the top Pfam domains predicted for the C. lectularius sequences. An initial assessment of putative defense genes, including a cytochrome P450 and a glutathione-S-transferase (GST), revealed high transcript levels for the cytochrome P450 (CYP9) in pesticide-exposed versus pesticide-susceptible C. lectularius populations. A significant number of single nucleotide polymorphisms (296) and microsatellite loci (370) were predicted in the C. lectularius sequences. Furthermore, 59 putative sequences of Wolbachia were retrieved from the database. Conclusions To our knowledge this is the first study to elucidate the genetic makeup of C. lectularius. This pyrosequencing effort provides clues to the identification of potential detoxification genes involved in pesticide resistance of C. lectularius and lays the foundation for future functional genomics studies. PMID:21283830
Manku, H K; Dhanoa, J K; Kaur, S; Arora, J S; Mukhopadhyay, C S
2017-10-01
MicroRNAs (miRNAs) are small (19-25 base long), non-coding RNAs that regulate post-transcriptional gene expression by cleaving targeted mRNAs in several eukaryotes. The miRNAs play vital roles in multiple biological and metabolic processes, including developmental timing, signal transduction, cell maintenance and differentiation, diseases and cancers. Experimental identification of microRNAs is expensive and labor-intensive. Alternatively, computational approaches for predicting putative miRNAs from genomic or exomic sequences rely on features of miRNAs, viz. secondary structure, sequence conservation, minimum free energy index (MFEI), etc. To date, not a single miRNA has been identified in buffalo (Bubalus bubalis), which is an economically important livestock species. The present study aims at predicting the putative miRNAs of buffalo using a comparative computational approach from buffalo whole genome shotgun sequencing data (INSDC: AWWX00000000.1). The sequences were blasted against the known mammalian miRNAs. The obtained miRNAs were then passed through a series of filtration criteria to obtain the set of predicted (putative and novel) bubaline miRNAs. Eight miRNAs were selected based on lowest E-value and validated by real time PCR (SYBR green chemistry) using RNU6 as endogenous control. The results from different trials of real time PCR show that, out of the 8 selected miRNAs, only 2 (hsa-miR-1277-5p; bta-miR-2285b) are not expressed in bubaline PBMCs. The potential target genes were then predicted using miRanda, based on their sequence complementarity. This work is the first report on prediction of bubaline miRNAs from whole genome sequencing data followed by experimental validation. The findings could pave the way to future studies of economically important traits in buffalo. Copyright © 2017 Elsevier Ltd. All rights reserved.
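The abstract names the minimum free energy index (MFEI) as a filtering feature but does not define it. The sketch below assumes the definition commonly used in the plant and animal miRNA prediction literature: adjusted MFE (AMFE) is the absolute folding energy per 100 nucleotides, and MFEI is AMFE divided by the G+C percentage. The function names are hypothetical, and the folding energy is taken as an input (it would normally come from an RNA folding program, which is not called here).

```python
def gc_percent(seq):
    """G+C content of a nucleotide sequence, as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def mfei(seq, mfe_kcal_per_mol):
    """Minimum free energy index under the assumed definition:

        AMFE = |MFE| / length * 100
        MFEI = AMFE / (G+C)%
    """
    gc = gc_percent(seq)
    if gc == 0:
        raise ValueError("MFEI is undefined for a sequence without G or C")
    amfe = abs(mfe_kcal_per_mol) / len(seq) * 100.0
    return amfe / gc


# Hypothetical 80-nt precursor with 50% G+C and a folding energy of -40 kcal/mol.
precursor = "GCGCAUAU" * 10
index = mfei(precursor, -40.0)
```

With these made-up inputs, AMFE is 50 and the G+C content is 50%, so the index is 1.0. Any cutoff applied to such a value is a choice of the individual pipeline and is not stated in this abstract.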
Glubb, Dylan M.; Johnatty, Sharon E.; Quinn, Michael C.J.; O’Mara, Tracy A.; Tyrer, Jonathan P.; Gao, Bo; Fasching, Peter A.; Beckmann, Matthias W.; Lambrechts, Diether; Vergote, Ignace; Velez Edwards, Digna R.; Beeghly-Fadiel, Alicia; Benitez, Javier; Garcia, Maria J.; Goodman, Marc T.; Thompson, Pamela J.; Dörk, Thilo; Dürst, Matthias; Modungo, Francesmary; Moysich, Kirsten; Heitz, Florian; du Bois, Andreas; Pfisterer, Jacobus; Hillemanns, Peter; Karlan, Beth Y.; Lester, Jenny; Goode, Ellen L.; Cunningham, Julie M.; Winham, Stacey J.; Larson, Melissa C.; McCauley, Bryan M.; Kjær, Susanne Krüger; Jensen, Allan; Schildkraut, Joellen M.; Berchuck, Andrew; Cramer, Daniel W.; Terry, Kathryn L.; Salvesen, Helga B.; Bjorge, Line; Webb, Penny M.; Grant, Peter; Pejovic, Tanja; Moffitt, Melissa; Hogdall, Claus K.; Hogdall, Estrid; Paul, James; Glasspool, Rosalind; Bernardini, Marcus; Tone, Alicia; Huntsman, David; Woo, Michelle; Group, AOCS; deFazio, Anna; Kennedy, Catherine J.; Pharoah, Paul D.P.; MacGregor, Stuart; Chenevix-Trench, Georgia
2017-01-01
We previously identified associations with ovarian cancer outcome at five genetic loci. To identify putatively causal genetic variants and target genes, we prioritized two ovarian outcome loci (1q22 and 19p12) for further study. Bioinformatic and functional genetic analyses indicated that MEF2D and ZNF100 are targets of candidate outcome variants at 1q22 and 19p12, respectively. At 19p12, the chromatin interaction of a putative regulatory element with the ZNF100 promoter region correlated with candidate outcome variants. At 1q22, putative regulatory elements enhanced MEF2D promoter activity and haplotypes containing candidate outcome variants modulated these effects. In a public dataset, MEF2D and ZNF100 expression were both associated with ovarian cancer progression-free or overall survival time. In an extended set of 6,162 epithelial ovarian cancer patients, we found that functional candidates at the 1q22 and 19p12 loci, as well as other regional variants, were nominally associated with patient outcome; however, no associations reached our threshold for statistical significance (p < 1×10⁻⁵). Larger patient numbers will be needed to convincingly identify any true associations at these loci. PMID:29029385
Yao, Shaolun; Jiang, Chuan; Huang, Ziyue; Torres-Jerez, Ivone; Chang, Junil; Zhang, Heng; Udvardi, Michael; Liu, Renyi; Verdier, Jerome
2016-10-01
Legume research and cultivar development are important for sustainable food production, especially of high-protein seed. Thanks to the development of deep-sequencing technologies, crop species have been taken to the front line, even without completion of their genome sequences. Black-eyed pea (Vigna unguiculata) is a legume species widely grown in semi-arid regions, which has high potential to provide stable seed protein production in a broad range of environments, including drought conditions. The black-eyed pea reference genotype has been used to generate a gene expression atlas of the major plant tissues (i.e. leaf, root, stem, flower, pod and seed), with a developmental time series for pods and seeds. From these various organs, 27 cDNA libraries were generated and sequenced, resulting in more than one billion reads. Following filtering, these reads were de novo assembled into 36 529 transcript sequences that were annotated and quantified across the different tissues. A set of 24 866 unique transcript sequences, called Unigenes, was identified. All the information related to transcript identification, annotation and quantification were stored into a gene expression atlas webserver (http://vugea.noble.org), providing a user-friendly interface and necessary tools to analyse transcript expression in black-eyed pea organs and to compare data with other legume species. Using this gene expression atlas, we inferred details of molecular processes that are active during seed development, and identified key putative regulators of seed maturation. Additionally, we found evidence for conservation of regulatory mechanisms involving miRNA in plant tissues subjected to drought and seeds undergoing desiccation. © 2016 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
Jain, Mukesh; Chevala, V V S Narayana; Garg, Rohini
2014-11-01
MicroRNAs (miRNAs) are essential components of complex gene regulatory networks that orchestrate plant development. Although several genomic resources have been developed for the legume crop chickpea, miRNAs have not been discovered until now. For genome-wide discovery of miRNAs in chickpea (Cicer arietinum), we sequenced the small RNA content from seven major tissues/organs employing Illumina technology. About 154 million reads were generated, which represented more than 20 million distinct small RNA sequences. We identified a total of 440 conserved miRNAs in chickpea based on sequence similarity with known miRNAs in other plants. In addition, 178 novel miRNAs were identified using a miRDeep pipeline with plant-specific scoring. Some of the conserved and novel miRNAs with significant sequence similarity were grouped into families. The chickpea miRNAs targeted a wide range of mRNAs involved in diverse cellular processes, including transcriptional regulation (transcription factors), protein modification and turnover, signal transduction, and metabolism. Our analysis revealed several miRNAs with differential spatial expression. Many of the chickpea miRNAs were expressed in a tissue-specific manner. The conserved and differential expression of members of the same miRNA family in different tissues was also observed. Some of the same family members were predicted to target different chickpea mRNAs, which suggested the specificity and complexity of miRNA-mediated developmental regulation. This study, for the first time, reveals a comprehensive set of conserved and novel miRNAs along with their expression patterns and putative targets in chickpea, and provides a framework for understanding regulation of developmental processes in legumes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology.
Sebaihia, Mohammed; Preston, Andrew; Maskell, Duncan J.; Kuzmiak, Holly; Connell, Terry D.; King, Natalie D.; Orndorff, Paul E.; Miyamoto, David M.; Thomson, Nicholas R.; Harris, David; Goble, Arlette; Lord, Angela; Murphy, Lee; Quail, Michael A.; Rutter, Simon; Squares, Robert; Squares, Steven; Woodward, John; Parkhill, Julian; Temple, Louise M.
2006-01-01
Bordetella avium is a pathogen of poultry and is phylogenetically distinct from Bordetella bronchiseptica, Bordetella pertussis, and Bordetella parapertussis, which are other species in the Bordetella genus that infect mammals. In order to understand the evolutionary relatedness of Bordetella species and further the understanding of pathogenesis, we obtained the complete genome sequence of B. avium strain 197N, a pathogenic strain that has been extensively studied. With 3,732,255 base pairs of DNA and 3,417 predicted coding sequences, it has the smallest genome and gene complement of the sequenced bordetellae. In this study, the presence or absence of previously reported virulence factors from B. avium was confirmed, and the genetic bases for growth characteristics were elucidated. Over 1,100 genes present in B. avium but not in B. bronchiseptica were identified, and most were predicted to encode surface or secreted proteins that are likely to define an organism adapted to the avian rather than the mammalian respiratory tracts. These include genes coding for the synthesis of a polysaccharide capsule, hemagglutinins, a type I secretion system adjacent to two very large genes for secreted proteins, and unique genes for both lipopolysaccharide and fimbrial biogenesis. Three apparently complete prophages are also present. The BvgAS virulence regulatory system appears to have polymorphisms at a poly(C) tract that is involved in phase variation in other bordetellae. A number of putative iron-regulated outer membrane proteins were predicted from the sequence, and this regulation was confirmed experimentally for five of these. PMID:16885469
Draft Genome Sequence of Aeromonas caviae Strain 429865 INP, Isolated from a Mexican Patient
Padilla, Juan Carlos A.; Bustos, Patricia; Sánchez-Varela, Alejandro; Palma-Martinez, Ingrid; Arzate-Barbosa, Patricia; García-Pérez, Carlos A.; López-López, María de Jesús; González, Víctor
2015-01-01
Aeromonas caviae is an emerging human pathogen. Here, we report the draft genome sequence of Aeromonas caviae strain 429865 INP which shows the presence of various putative virulence-related genes. PMID:26494682
Lijun Liu; Trevor Ramsay; Matthew S. Zinkgraf; David Sundell; Nathaniel Robert Street; Vladimir Filkov; Andrew Groover
2015-01-01
Identifying transcription factor target genes is essential for modeling the transcriptional networks underlying developmental processes. Here we report a chromatin immunoprecipitation sequencing (ChIP-seq) resource consisting of genome-wide binding regions and associated putative target genes for four Populus homeodomain transcription factors...
Adamczuk, Marcin; Dziewit, Lukasz
2017-01-01
The draft genome of multidrug-resistant Aeromonas sp. ARM81 isolated from a wastewater treatment plant in Warsaw (Poland) was obtained. Sequence analysis revealed multiple genes conferring resistance to aminoglycosides, β-lactams or tetracycline. Three different β-lactamase genes were identified, including an extended-spectrum β-lactamase gene blaPER-1. The antibiotic susceptibility was experimentally tested. Genome sequencing also allowed us to investigate the plasmidome and transposable mobilome of ARM81. Four plasmids, of which two carry phenotypic modules (i.e., genes encoding a zinc transporter ZitB and a putative glucosyltransferase), and 28 putative transposase genes were identified. The mobility of three insertion sequences (isoforms of previously identified elements ISAs12, ISKpn9 and ISAs26) was confirmed using trap plasmids.
Network analysis of transcriptomics expands regulatory landscapes in Synechococcus sp. PCC 7002
DOE Office of Scientific and Technical Information (OSTI.GOV)
McClure, Ryan S.; Overall, Christopher C.; McDermott, Jason E.
Cyanobacterial regulation of gene expression must contend with a genome organization that lacks apparent functional context, as the majority of cellular processes and metabolic pathways are encoded by genes found at disparate locations across the genome. In addition, the fact that coordinated regulation of cyanobacterial cellular machinery takes place with significantly fewer transcription factors, compared to other Eubacteria, suggests the involvement of post-transcriptional mechanisms and regulatory adaptations which are not fully understood. Global transcript abundance from model cyanobacterium Synechococcus sp. PCC 7002 grown under 42 different conditions was analyzed using context-likelihood of relatedness. The resulting 903-gene network, which was organized into 11 modules, not only allowed classification of cyanobacterial responses to specific environmental variables but provided insight into the transcriptional network topology and led to the expansion of predicted regulons. When used in conjunction with genome sequence, the global transcript abundance allowed identification of putative post-transcriptional changes in expression as well as novel potential targets of both DNA binding proteins and asRNA regulators. The results offer a new perspective into the multi-level regulation that governs cellular adaptations of fast-growing physiologically robust cyanobacterium Synechococcus sp. PCC 7002 to changing environmental variables. It also extends a methodological knowledge-based framework for studying multi-scale regulatory mechanisms that operate in cyanobacteria. Finally, it provides valuable context for integrating systems-level data to enhance evidence-driven genomic annotation, especially in organisms where traditional context analyses cannot be implemented due to lack of operon-based functional organization.
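The context-likelihood of relatedness (CLR) step named in this abstract can be sketched as follows. This is a simplified illustration of the general CLR idea, not the authors' pipeline: it assumes a precomputed symmetric matrix of pairwise mutual-information values, excludes the diagonal when estimating each gene's background (implementations differ on this point), and uses a hypothetical function name and toy numbers.

```python
import math
from statistics import mean, pstdev


def clr_scores(mi):
    """Context-likelihood-of-relatedness scores from a symmetric MI matrix.

    For each gene i, the off-diagonal MI values in row i form its background.
    z(i, j) = max(0, (MI[i][j] - mean_i) / std_i), and the pair score is
    sqrt(z(i, j)**2 + z(j, i)**2).
    """
    n = len(mi)
    background = []
    for i in range(n):
        row = [mi[i][j] for j in range(n) if j != i]
        background.append((mean(row), pstdev(row)))

    def z(i, j):
        mu, sigma = background[i]
        if sigma == 0:
            return 0.0  # flat background: no evidence above context
        return max(0.0, (mi[i][j] - mu) / sigma)

    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            scores[i][j] = scores[j][i] = math.hypot(z(i, j), z(j, i))
    return scores


# Toy mutual-information matrix for four hypothetical genes.
toy_mi = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.2],
    [0.1, 0.1, 0.2, 0.0],
]
toy_scores = clr_scores(toy_mi)
```

In the toy matrix, the pair (2, 3) has a much lower raw MI than the pair (0, 1), yet both stand out equally against their own backgrounds. That background correction is the motivation for scoring edges relative to context rather than by raw similarity.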
Bhawna; Bonthala, V.S.; Gajula, MNV Prasad
2016-01-01
The common bean [Phaseolus vulgaris (L.)] is one of the essential proteinaceous vegetables grown in developing countries. However, its production is challenged by low yields caused by numerous biotic and abiotic stress conditions. Regulatory transcription factors (TFs) are a key component of the genome and are among the most significant targets for producing stress-tolerant crops; hence, functional genomic studies of these TFs are important. Therefore, we have constructed a web-accessible TF database for P. vulgaris, called PvTFDB, which contains 2370 putative TF gene models in 49 TF families. This database provides comprehensive information for each identified TF, including sequence data, functional annotation, SSRs with their primer sets, protein physical properties, chromosomal location, phylogeny, tissue-specific gene expression data, orthologues, cis-regulatory elements and gene ontology (GO) assignment. Altogether, this information can be used to expedite functional genomic studies of specific TFs of interest. The objectives of this database are to support functional genomics studies of common bean TFs and to help recognize the regulatory mechanisms underlying various stress responses, easing breeding strategies for variety production, through search interfaces (by gene ID and functional annotation) and browsing interfaces (by family and by chromosome). This database will also serve as a promising central repository for researchers as well as breeders who are working towards improvement of legume crops. In addition, this database provides unrestricted public access, and users can freely download all data in the database. Database URL: http://www.multiomics.in/PvTFDB/ PMID:27465131
AtmiRNET: a web-based resource for reconstructing regulatory networks of Arabidopsis microRNAs.
Chien, Chia-Hung; Chiang-Hsieh, Yi-Fan; Chen, Yi-An; Chow, Chi-Nga; Wu, Nai-Yun; Hou, Ping-Fu; Chang, Wen-Chi
2015-01-01
Compared with animal microRNAs (miRNAs), knowledge of how miRNAs are involved in significant biological processes in plants is still limited. AtmiRNET is a novel resource geared toward plant scientists for reconstructing regulatory networks of Arabidopsis miRNAs. By means of highlighted miRNA studies in target recognition, functional enrichment of target genes, promoter identification and detection of cis- and trans-elements, AtmiRNET allows users to explore mechanisms of transcriptional regulation and miRNA functions in Arabidopsis thaliana, which have rarely been investigated so far. High-throughput next-generation sequencing datasets from transcriptional start site (TSS)-relevant experiments as well as five core promoter elements were collected to establish a support vector machine-based prediction model for Arabidopsis miRNA TSSs. Then, high-confidence transcription factors that participate in transcriptional regulation of Arabidopsis miRNAs are provided based on a statistical approach. Furthermore, both experimentally verified and putative miRNA-target interactions, whose validity was supported by the correlations between the expression levels of miRNAs and their targets, are elucidated for functional enrichment analysis. The inferred regulatory networks give users an intuitive insight into the pivotal roles of Arabidopsis miRNAs through the crosstalk between miRNA transcriptional regulation (upstream) and miRNA-mediated (downstream) gene circuits. The valuable, visually oriented information in AtmiRNET supplements the scant understanding of plant miRNAs and will be useful (e.g. ABA-miR167c-auxin signaling pathway) for further research. Database URL: http://AtmiRNET.itps.ncku.edu.tw/ © The Author(s) 2015. Published by Oxford University Press.
Inferring causal genomic alterations in breast cancer using gene expression data
2011-01-01
Background One of the primary objectives in cancer research is to identify causal genomic alterations, such as somatic copy number variation (CNV) and somatic mutations, during tumor development. Many valuable studies lack genomic data to detect CNV; therefore, methods that are able to infer CNVs from gene expression data would help maximize the value of these studies. Results We developed a framework for identifying recurrent regions of CNV and distinguishing the cancer driver genes from the passenger genes in the regions. By inferring CNV regions across many datasets we were able to identify 109 recurrent amplified/deleted CNV regions. Many of these regions are enriched for genes involved in important processes associated with tumorigenesis and cancer progression. Genes in these recurrent CNV regions were then examined in the context of gene regulatory networks to prioritize putative cancer driver genes. The cancer driver genes uncovered by the framework include not only well-known oncogenes but also a number of novel cancer susceptibility genes validated via siRNA experiments. Conclusions To our knowledge, this is the first effort to systematically identify and validate drivers for expression-based CNV regions in breast cancer. The framework, in which wavelet analysis of expression-based copy number alteration is coupled with gene regulatory network analysis, provides a blueprint for leveraging genomic data to identify key regulatory components and gene targets. This integrative approach can be applied to many other large-scale gene expression studies and other novel types of cancer data such as next-generation sequencing-based expression (RNA-Seq) as well as CNV data. PMID:21806811
Genome-wide comparative analysis reveals human-mouse regulatory landscape and evolution.
Denas, Olgert; Sandstrom, Richard; Cheng, Yong; Beal, Kathryn; Herrero, Javier; Hardison, Ross C; Taylor, James
2015-02-14
Because species-specific gene expression is driven by species-specific regulation, understanding the relationship between sequence and function of the regulatory regions in different species will help elucidate how differences among species arise. Despite active experimental and computational research, relationships among sequence, conservation, and function are still poorly understood. We compared transcription factor occupied segments (TFos) for 116 human and 35 mouse TFs in 546 human and 125 mouse cell types and tissues from the Human and the Mouse ENCODE projects. We based the map between human and mouse TFos on a one-to-one nucleotide cross-species mapper, bnMapper, that utilizes whole genome alignments (WGA). Our analysis shows that TFos are under evolutionary constraint, but a substantial portion (25.1% of mouse and 25.85% of human on average) of the TFos does not have a homologous sequence on the other species; this portion varies among cell types and TFs. Furthermore, 47.67% and 57.01% of the homologous TFos sequence shows binding activity on the other species for human and mouse respectively. However, 79.87% and 69.22% is repurposed such that it binds the same TF in different cells or different TFs in the same cells. Remarkably, within the set of repurposed TFos, the corresponding genome regions in the other species are preferred locations of novel TFos. These events suggest exaptation of some functional regulatory sequences into new function. Despite TFos repurposing, we did not find substantial changes in their predicted target genes, suggesting that CRMs buffer evolutionary events allowing little or no change in the TFos - target gene associations. Thus, the small portion of TFos with strictly conserved occupancy underestimates the degree of conservation of regulatory interactions. We mapped regulatory sequences from an extensive number of TFs and cell types between human and mouse using WGA. 
A comparative analysis of this correspondence unveiled the extent of the shared regulatory sequence across TFs and cell types under study. Importantly, a large part of the shared regulatory sequence is repurposed on the other species. This sequence, fueled by turnover events, provides a strong case for exaptation in regulatory elements.
Wei, Dan-Dan; Chen, Er-Hu; Ding, Tian-Bo; Chen, Shi-Chun; Dou, Wei; Wang, Jin-Jun
2013-01-01
Background As a major stored-product pest insect, Liposcelis entomophila has developed high levels of resistance to various insecticides in grain storage systems. However, the molecular mechanisms underlying resistance and environmental stress have not been characterized. To date, there is a lack of genomic information for this species. Therefore, studies aimed at profiling the L. entomophila transcriptome would provide a better understanding of its biological functions at the molecular level. Methodology/Principal Findings We applied Illumina sequencing technology to sequence the transcriptome of L. entomophila. A total of 54,406,328 clean reads were obtained and de novo assembled into 54,220 unigenes, with an average length of 571 bp. Through a similarity search, 33,404 (61.61%) unigenes were matched to known proteins in the NCBI non-redundant (Nr) protein database. These unigenes were further functionally annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. A large number of genes potentially involved in insecticide resistance were manually curated, including 68 putative cytochrome P450 genes, 37 putative glutathione S-transferase (GST) genes, 19 putative carboxyl/cholinesterase (CCE) genes, and 126 other transcripts containing target site sequences or encoding detoxification genes representing eight types of resistance enzymes. Furthermore, to gain insight into the molecular basis of the L. entomophila response to thermal stress, 25 heat shock protein (Hsp) genes were identified. In addition, 1,100 SSRs and 57,757 SNPs were detected, and 231 pairs of SSR primers were designed for investigating genetic diversity in the future. Conclusions/Significance We developed a comprehensive transcriptomic database for L. entomophila.
These sequences and putative molecular markers would further promote our understanding of the molecular mechanisms underlying insecticide resistance or environmental stress, and will facilitate studies on population genetics for psocids, as well as providing useful information for functional genomic research in the future. PMID:24244605
Zhu, Yu-Cheng; Specht, Charles A; Dittmer, Neal T; Muthukrishnan, Subbaratnam; Kanost, Michael R; Kramer, Karl J
2002-11-01
Glycosyltransferases are enzymes that synthesize oligosaccharides, polysaccharides and glycoconjugates. One type of glycosyltransferase is chitin synthase, a very important enzyme in biology, which is utilized by insects, fungi, and other invertebrates to produce chitin, a polysaccharide of beta-1,4-linked N-acetylglucosamine. Chitin is an important component of the insect's exoskeletal cuticle and gut lining. To identify and characterize a chitin synthase gene of the tobacco hornworm, Manduca sexta, degenerate primers were designed from two highly conserved regions in fungal and nematode chitin synthase protein sequences and then used to amplify a similar region from Manduca cDNA. A full-length cDNA of 5152 nucleotides was assembled for the putative Manduca chitin synthase gene, MsCHS1, and sequencing of genomic DNA verified the contiguity of the sequence. The MsCHS1 cDNA has an ORF of 4692 nucleotides that encodes a transmembrane protein of 1564 amino acid residues with a mass of approximately 179 kDa (GenBank no. AY062175). It is most similar, over its entire length of protein sequence, to putative chitin synthases from other insects and nematodes, with 68% identity to enzymes from both the blow fly, Lucilia cuprina, and the fruit fly, Drosophila melanogaster. The similarity with fungal chitin synthases is restricted to the putative catalytic domain, and the MsCHS1 protein has, at equivalent positions, several amino acids that are essential for activity as revealed by mutagenesis of the fungal enzymes. A 5.3-kb transcript of MsCHS1 was identified by northern blot hybridization of RNA from larval epidermis, suggesting that the enzyme functions to make chitin deposited in the cuticle. Further examination by RT-PCR showed that MsCHS1 expression is regulated in the epidermis, with the amount of transcript increasing during phases of cuticle deposition.
On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions
NASA Astrophysics Data System (ADS)
Tarpine, Ryan; Istrail, Sorin
The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.
DNA sequence similarity recognition by hybridization to short oligomers
Milosavljevic, Aleksandar
1999-01-01
Methods are disclosed for the comparison of nucleic acid sequences. Data is generated by hybridizing sets of oligomers with target nucleic acids. The data thus generated is manipulated simultaneously with respect to both (i) matching between oligomers and (ii) matching between oligomers and putative reference sequences available in databases. Using data compression methods to manipulate this mutual information, sequences for the target can be constructed.
(Methyl)ammonium Transport in the Nitrogen-Fixing Bacterium Azospirillum brasilense
Van Dommelen, Anne; Keijers, Veerle; Vanderleyden, Jos; de Zamaroczy, Miklos
1998-01-01
An ammonium transporter of Azospirillum brasilense was characterized. In contrast to most previously reported putative prokaryotic NH4+ transporter genes, A. brasilense amtB is not part of an operon with glnB or glnZ which, in A. brasilense, encode nitrogen regulatory proteins PII and PZ, respectively. Sequence analysis predicts the presence of 12 transmembrane domains in the deduced AmtB protein and classifies AmtB as an integral membrane protein. Nitrogen regulates the transcription of the amtB gene in A. brasilense by the Ntr system. amtB is the first gene identified in A. brasilense whose expression is regulated by NtrC. The observation that ammonium uptake is still possible in mutants lacking the AmtB protein suggests the presence of a second NH4+ transport mechanism. Growth of amtB mutants at low ammonium concentrations is reduced compared to that of the wild type. This suggests that AmtB has a role in scavenging ammonium at low concentrations. PMID:9573149
Puthoff, D P; Neelam, A; Ehrenfried, M L; Scheffler, B E; Ballard, L; Song, Q; Campbell, K B; Cooper, B; Tucker, M L
2008-10-01
Hyphae, 2 to 8 days postinoculation (dpi), and haustoria, 5 dpi, were isolated from Uromyces appendiculatus infected bean leaves (Phaseolus vulgaris cv. Pinto 111) and a separate cDNA library prepared for each fungal preparation. Approximately 10,000 hyphae and 2,700 haustoria clones were sequenced from both the 5' and 3' ends. Assembly of all of the fungal sequences yielded 3,359 contigs and 927 singletons. The U. appendiculatus sequences were compared with sequence data for other rust fungi, Phakopsora pachyrhizi, Uromyces fabae, and Puccinia graminis. The U. appendiculatus haustoria library included a large number of genes with unknown cellular function; however, summation of sequences of known cellular function suggested that haustoria at 5 dpi had fewer transcripts linked to protein synthesis in favor of energy metabolism and nutrient uptake. In addition, open reading frames in the U. appendiculatus data set with an N-terminal signal peptide were identified and compared with other proteins putatively secreted from rust fungi. In this regard, a small family of putatively secreted RTP1-like proteins was identified in U. appendiculatus and P. graminis.
Johnson, Timothy J.; Siek, Kylie E.; Johnson, Sara J.; Nolan, Lisa K.
2006-01-01
ColV plasmids have long been associated with the virulence of Escherichia coli, despite the fact that their namesake trait, ColV production, does not appear to contribute to virulence. Such plasmids or their associated sequences appear to be quite common among avian pathogenic E. coli (APEC) and are strongly linked to the virulence of these organisms. In the present study, a 180-kb ColV plasmid was sequenced and analyzed. This plasmid, pAPEC-O2-ColV, possesses a 93-kb region containing several putative virulence traits, including iss, tsh, and four putative iron acquisition and transport systems. The iron acquisition and transport systems include those encoding aerobactin and salmochelin, the sit ABC iron transport system, and a putative iron transport system novel to APEC, eit. In order to determine the prevalence of the virulence-associated genes within this region among avian E. coli strains, 595 APEC and 199 avian commensal E. coli isolates were examined for genes of this region using PCR. Results indicate that genes contained within a portion of this putative virulence region are highly conserved among APEC and that the genes of this region occur significantly more often in APEC than in avian commensal E. coli. The region of pAPEC-O2-ColV containing genes that are highly prevalent among APEC appears to be a distinguishing trait of APEC strains. PMID:16385064
Enhancing gene regulatory network inference through data integration with markov random fields
DOE Office of Scientific and Technical Information (OSTI.GOV)
Banf, Michael; Rhee, Seung Y.
2017-02-01
A gene regulatory network links transcription factors to their target genes and represents a map of transcriptional regulation. Much progress has been made in deciphering gene regulatory networks computationally. However, gene regulatory network inference for most eukaryotic organisms remains challenging. To improve the accuracy of gene regulatory network inference and facilitate candidate selection for experimentation, we developed an algorithm called GRACE (Gene Regulatory network inference ACcuracy Enhancement). GRACE exploits biological a priori and heterogeneous data integration to generate high-confidence network predictions for eukaryotic organisms using Markov Random Fields in a semi-supervised fashion. GRACE uses a novel optimization scheme to integrate regulatory evidence and biological relevance. It is particularly suited for model learning with sparse regulatory gold standard data. We show GRACE’s potential to produce high-confidence regulatory networks compared to state-of-the-art approaches using Drosophila melanogaster and Arabidopsis thaliana data. In an A. thaliana developmental gene regulatory network, GRACE recovers cell cycle related regulatory mechanisms and further hypothesizes several novel regulatory links, including a putative control mechanism of vascular structure formation due to modifications in cell proliferation.
Transcriptional Regulatory Network Analysis of MYB Transcription Factor Family Genes in Rice.
Smita, Shuchi; Katiyar, Amit; Chinnusamy, Viswanathan; Pandey, Dev M; Bansal, Kailash C
2015-01-01
The MYB transcription factor (TF) family is one of the largest TF families and regulates defense responses to various stresses, hormone signaling, and many metabolic and developmental processes in plants. Understanding the regulatory hierarchies of gene expression networks in response to developmental and environmental cues is a major challenge due to the complex interactions between the genetic elements. Correlation analyses are useful for unraveling co-regulated gene pairs governing biological processes and for identifying new candidate hub genes in response to these complex processes. High-throughput expression profiling data are highly useful for the construction of co-expression networks. In the present study, we utilized transcriptome data for comprehensive regulatory network studies of MYB TFs by "top-down" and "guide-gene" approaches. More than 50% of OsMYBs were strongly correlated under 50 experimental conditions with 51 hub genes via the "top-down" approach. Further, clusters were identified using Markov Clustering (MCL). To maximize the clustering performance, parameter evaluation of the MCL inflation score (I) was performed in terms of enriched GO categories by measuring the F-score. Comparison of co-expressed clusters with clades from phylogenetic analysis indicates their evolutionarily conserved co-regulatory role. We utilized a compendium of known interactions and biological roles with Gene Ontology enrichment analysis to hypothesize the function of co-expressed OsMYBs. In the other part, the transcriptional regulatory network analysis by the "guide-gene" approach revealed 40 putative targets of 26 OsMYB TF hubs with high correlation values, utilizing 815 microarray datasets. Enrichment of MYB-binding cis-elements in the promoter regions of the putative targets, together with functional co-occurrence and nuclear localization, supports our findings.
Specifically, MYB-binding regions involved in drought inducibility were enriched, implying a regulatory role in the drought response in rice. Thus, the co-regulatory network analysis facilitated the identification of complex OsMYB regulatory networks and of candidate target regulon genes of selected guide MYB genes. The results contribute to candidate gene screening and provide experimentally testable hypotheses for potential regulatory MYB TFs and their targets under stress conditions.
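The correlation step behind a co-expression network of this kind can be illustrated with a minimal sketch. It assumes Pearson correlation and a fixed cutoff of 0.9; neither the measure nor the threshold is stated in the abstract, and the gene names and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles.

    Assumes neither profile is constant (a zero variance would divide by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return gene pairs whose absolute correlation reaches the threshold.

    `profiles` maps a gene name to its expression values across conditions.
    The threshold is an illustrative assumption, not a value from the study.
    """
    return [
        (g1, g2)
        for g1, g2 in combinations(sorted(profiles), 2)
        if abs(pearson(profiles[g1], profiles[g2])) >= threshold
    ]


# Hypothetical profiles across four conditions.
profiles = {
    "MYB_a": [1.0, 2.0, 3.0, 4.0],
    "target_b": [2.0, 4.0, 6.0, 8.0],
    "other_c": [1.0, -1.0, 1.0, -1.0],
}
edges = coexpression_edges(profiles)
```

In such a sketch, genes with many edges would be the hub candidates; clustering (for example with MCL) would operate on the resulting edge list.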
Sequetyping: Serotyping Streptococcus pneumoniae by a Single PCR Sequencing Strategy
Leung, Marcus H.; Bryson, Kevin; Freystatter, Kathrin; Pichon, Bruno; Edwards, Giles; Gillespie, Stephen H.
2012-01-01
The introduction of pneumococcal conjugate vaccines necessitates continued monitoring of circulating strains to assess vaccine efficacy and replacement serotypes. Conventional serological methods are costly, labor-intensive, and prone to misidentification, while current DNA-based methods have limited serotype coverage requiring multiple PCR primers. In this study, a computer algorithm was developed to interrogate the capsulation locus (cps) of vaccine serotypes to locate primer pairs in conserved regions that border variable regions and could differentiate between serotypes. In silico analysis of cps from 92 serotypes indicated that a primer pair spanning the regulatory gene cpsB could putatively amplify 84 serotypes and differentiate 46. This primer set was specific to Streptococcus pneumoniae, with no amplification observed for other species, including S. mitis, S. oralis, and S. pseudopneumoniae. One hundred thirty-eight pneumococcal strains covering 48 serotypes were tested. Of 23 vaccine serotypes included in the study, most (19/22, 86%) were identified correctly at least to the serogroup level, including all of the 13-valent conjugate vaccine and other replacement serotypes. Reproducibility was demonstrated by the correct sequetyping of different strains of a serotype. This novel sequence-based method employing a single PCR primer pair is cost-effective and simple. Furthermore, it has the potential to identify new serotypes that may evolve in the future. PMID:22553238
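The idea of placing primers in conserved regions that border a variable region can be sketched as follows. This is a simplified illustration, not the algorithm used in the study: it assumes pre-aligned sequences of equal length, requires perfectly invariant primer windows, and uses made-up toy sequences.

```python
def conserved_windows(aligned, width):
    """Start positions of windows in which every column is identical and gap-free."""
    length = len(aligned[0])
    invariant = [
        len({seq[i] for seq in aligned}) == 1 and aligned[0][i] != "-"
        for i in range(length)
    ]
    return [
        start
        for start in range(length - width + 1)
        if all(invariant[start:start + width])
    ]


def distinct_amplicons(aligned, left, right, width):
    """Number of distinct sequences between a left and a right primer window.

    `left` and `right` are window start positions; the region compared is the
    interior between the two windows.
    """
    inner = {seq[left + width:right] for seq in aligned}
    return len(inner)


# Hypothetical toy alignment of three sequences.
alignment = ["ACGTAAGGCT", "ACGTCAGGCT", "ACGTAAGGCT"]
windows = conserved_windows(alignment, 4)
groups = distinct_amplicons(alignment, 0, 6, 4)
```

A primer pair is useful for typing when both windows are conserved (so one pair amplifies many types) and the interior yields many distinct amplicons (so the sequence differentiates them).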
GBshape: a genome browser database for DNA shape annotations.
Chiu, Tsu-Pei; Yang, Lin; Zhou, Tianyin; Main, Bradley J; Parker, Stephen C J; Nuzhdin, Sergey V; Tullius, Thomas D; Rohs, Remo
2015-01-01
Many regulatory mechanisms require a high degree of specificity in protein-DNA binding. Nucleotide sequence does not provide an answer to the question of why a protein binds only to a small subset of the many putative binding sites in the genome that share the same core motif. Whereas higher-order effects, such as chromatin accessibility, cooperativity and cofactors, have been described, DNA shape recently gained attention as another feature that fine-tunes the DNA binding specificities of some transcription factor families. Our Genome Browser for DNA shape annotations (GBshape; freely available at http://rohslab.cmb.usc.edu/GBshape/) provides minor groove width, propeller twist, roll, helix twist and hydroxyl radical cleavage predictions for the entire genomes of 94 organisms. Additional genomes can easily be added using the GBshape framework. GBshape can be used to visualize DNA shape annotations qualitatively in a genome browser track format, and to download quantitative values of DNA shape features as a function of genomic position at nucleotide resolution. As biological applications, we illustrate the periodicity of DNA shape features that are present in nucleosome-occupied sequences from human, fly and worm, and we demonstrate structural similarities between transcription start sites in the genomes of four Drosophila species. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Aubin, Guillaume Ghislain; Lavigne, Jean-Philippe; Foucher, Yohan; Dellière, Sarah; Lepelletier, Didier; Gouin, François; Corvec, Stéphane
2017-10-01
The pathogenicity of Cutibacterium acnes in implant-associated infection is not always obvious. In this paper, we aimed to distinguish pathogenic from non-pathogenic C. acnes isolates. To reach this goal, we investigated the clonal complex (CC) of a large collection of C. acnes clinical isolates through Multi-Locus Sequence Typing (MLST), we established a Caenorhabditis elegans model to assess C. acnes virulence, and we investigated the presence of virulence factors in our collection. Our results showed that CC36 and CC53 C. acnes isolates were more frequently observed in prosthetic joint infections (PJI) than CC18 and CC28 C. acnes isolates (p = 0.021). The C. elegans model developed here showed two distinct virulence groups of C. acnes (p < 0.05). These groups were not correlated with CC or clinical origin. Whole genome sequencing allowed us to identify a putative gene linked to low-virulence strains. In conclusion, MLST remains a good method to screen pathogenic C. acnes isolates according to their clinical context, but the mechanisms of C. acnes virulence need to be assessed through transcriptomic analysis to investigate regulatory processes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Characterization of the reniform nematode genome by shotgun sequencing.
Nyaku, Seloame T; Sripathi, Venkateswara R; Kantety, Ramesh V; Cseke, Sarah B; Buyyarapu, Ramesh; Mc Ewan, Robert; Gu, Yong Q; Lawrence, Kathy; Senwo, Zachary; Sripathi, Padmini; George, Pheba; Sharma, Govind C
2014-04-01
The reniform nematode (RN), a major agricultural pest particularly on cotton in the United States, is among the major plant-parasitic nematodes for which limited genomic information exists. In this study, over 380 Mb of sequence data were generated from pooled DNA of four adult female RNs and assembled into 67,317 contigs, including 25,904 (38.5%) predicted coding contigs and 41,413 (61.5%) noncoding contigs. Most of the characterized repeats were of low complexity (88.9%), and 0.9% of the contigs matched with 53.2% of GenBank ESTs. The most frequent Gene Ontology (GO) terms for molecular function and biological process were protein binding (32%) and embryonic development (20%). Further analysis showed that 741 (1.1%), 94 (0.1%), and 169 (0.25%) RN genomic contigs matched with 1328 (13.9%), 1480 (5.4%), and 1330 (7.4%) supercontigs of Meloidogyne incognita, Brugia malayi, and Pristionchus pacificus, respectively. Chromosome 5 of Caenorhabditis elegans had the highest number of hits to the RN contigs. Seven putative detoxification genes and three carbohydrate-active enzymes (CAZymes) involved in cell wall degradation were studied in more detail. Additionally, kinases, G protein-coupled receptors, and neuropeptides functioning in physiological, developmental, and regulatory processes were identified in the RN genome.
Hendrix, Roger W.; Dedrick, Rebekah; Mitchell, Kaitlin; Ko, Ching-Chung; Russell, Daniel; Bell, Emma; Gregory, Matthew; Bibb, Maureen J.; Pethick, Florence; Jacobs-Sera, Deborah; Herron, Paul; Buttner, Mark J.; Hatfull, Graham F.
2013-01-01
The genome sequences of eight Streptomyces phages are presented, four of which were isolated for this study. Phages R4, TG1, ϕHau3, and SV1 were isolated previously and have been exploited as tools for understanding and genetically manipulating Streptomyces spp. We also extracted five apparently intact prophages from recent Streptomyces spp. genome projects and, together with six phage genomes in the database, we analyzed all 19 Streptomyces phage genomes with a view to understanding their relationships to each other and to other actinophages, particularly the mycobacteriophages. Fifteen of the Streptomyces phages group into four clusters of related genomes. Although the R4-like phages do not share nucleotide sequence similarity with other phages, they clearly have common ancestry with cluster A mycobacteriophages, sharing many protein homologues, common gene syntenies, and similar repressor-stoperator regulatory systems. The R4-like phage ϕHau3 and the prophage StrepC.1 (from Streptomyces sp. strain C) appear to have hijacked a unique adaptation of the streptomycetes, i.e., use of the rare UUA codon, to control translation of the essential phage protein, the terminase. The Streptomyces venezuelae generalized transducing phage SV1 was used to predict the presence of other generalized transducing phages for different Streptomyces species. PMID:23995638
Identification of G-quadruplex forming sequences in three manatee papillomaviruses
Zahin, Maryam; Dean, William L.; Ghim, Shin-je; Joh, Joongho; Gray, Robert D.; Khanal, Sujita; Bossart, Gregory D.; Mignucci-Giannoni, Antonio A.; Rouchka, Eric C.; Jenson, Alfred B.; Trent, John O.; Chaires, Jonathan B.
2018-01-01
The Florida manatee (Trichechus manatus latirostris) is a threatened aquatic mammal in United States coastal waters. Over the past decade, the appearance of papillomavirus-induced lesions and viral papillomatosis in manatees has been a concern for those involved in the management and rehabilitation of this species. To date, three manatee papillomaviruses (TmPVs) have been identified in Florida manatees, one forming cutaneous lesions (TmPV1) and two forming genital lesions (TmPV3 and TmPV4). We identified DNA sequences with the potential to form G-quadruplex structures (G4) across the three genomes. G4 were located on both DNA strands and across coding and non-coding regions on all TmPVs, offering multiple targets for viral control. Although G4 have been identified in several viral genomes, including human PVs, most research has focused on canonical structures comprised of three G-tetrads. In contrast, the vast majority of sequences we identified would allow the formation of non-canonical structures with only two G-tetrads. Our biophysical analysis confirmed the formation of G4 with parallel topology in three such sequences from the E2 region. Two of the structures appear comprised of multiple stacked two G-tetrad structures, perhaps serving to increase structural stability. Computational analysis demonstrated enrichment of G4 sequences on all TmPVs on the reverse strand in the E2/E4 region and on both strands in the L2 region. Several G4 sequences occurred at similar regional locations on all PVs, most notably on the reverse strand in the E2 region. In other cases, G4 were identified at similar regional locations only on PVs forming genital lesions. On all TmPVs, G4 sequences were located in the non-coding region near putative E2 binding sites. Together, these findings suggest that G4 are possible regulatory elements in TmPVs. PMID:29630682
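A scan for putative G4-forming sequences on both strands can be sketched with a regular expression. The sketch assumes the widely used folding rule of four G-runs separated by loops of 1 to 7 nucleotides; the abstract does not state which tool or rule the authors applied, so the pattern, the loop limit, and the example sequence are assumptions for illustration.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def g4_pattern(min_tetrads, max_loop=7):
    """Regex for four G-runs of at least `min_tetrads` Gs joined by short loops."""
    run = "G{%d,}" % min_tetrads
    loop = "[ACGT]{1,%d}" % max_loop
    return re.compile(run + (loop + run) * 3)


def find_g4(seq, min_tetrads=2):
    """List (strand, start, motif) for non-overlapping matches on both strands.

    Starts on the "-" strand are offsets within the reverse complement,
    not coordinates on the input sequence.
    """
    pattern = g4_pattern(min_tetrads)
    hits = []
    for strand, text in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for match in pattern.finditer(text):
            hits.append((strand, match.start(), match.group()))
    return hits


# Hypothetical sequence with one two-tetrad candidate on the forward strand.
example = "TTGGAGGAGGAGGTT"
two_tetrad_hits = find_g4(example, min_tetrads=2)
three_tetrad_hits = find_g4(example, min_tetrads=3)
```

Setting `min_tetrads=3` restricts the scan to canonical candidates, while `min_tetrads=2` also admits the two-tetrad sequences that the abstract reports as the majority. A regular-expression match only indicates folding potential; confirmation requires biophysical analysis such as that described above.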
Ficarelli, A; Tassi, F; Restivo, F M
1999-03-01
We have isolated two full-length cDNA clones encoding Nicotiana plumbaginifolia NADH-glutamate dehydrogenase. Both clones share amino acid boxes of homology corresponding to conserved GDH catalytic domains and a putative mitochondrial targeting sequence. One clone shows a putative EF-hand loop. The level of the two transcripts is affected differently by carbon source.
PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation
Portales-Casamar, Elodie; Kirov, Stefan; Lim, Jonathan; Lithwick, Stuart; Swanson, Magdalena I; Ticoll, Amy; Snoddy, Jay; Wasserman, Wyeth W
2007-01-01
PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation with associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR, located at http://www.pazar.info, is open for business. PMID:17916232
Scanning sequences after Gibbs sampling to find multiple occurrences of functional elements
Tharakaraman, Kannan; Mariño-Ramírez, Leonardo; Sheetlin, Sergey L; Landsman, David; Spouge, John L
2006-01-01
Background Many DNA regulatory elements occur as multiple instances within a target promoter. Gibbs sampling programs for finding DNA regulatory elements de novo can be prohibitively slow in locating all instances of such an element in a sequence set. Results We describe an improvement to the A-GLAM computer program, which predicts regulatory elements within DNA sequences with Gibbs sampling. The improvement adds an optional "scanning step" after Gibbs sampling. Gibbs sampling produces a position specific scoring matrix (PSSM). The new scanning step resembles an iterative PSI-BLAST search based on the PSSM. First, it assigns an "individual score" to each subsequence of appropriate length within the input sequences using the initial PSSM. Second, it computes an E-value from each individual score, to assess the agreement between the corresponding subsequence and the PSSM. Third, it permits subsequences with E-values falling below a threshold to contribute to the underlying PSSM, which is then updated using the Bayesian calculus. A-GLAM iterates its scanning step to convergence, at which point no new subsequences contribute to the PSSM. After convergence, A-GLAM reports predicted regulatory elements within each sequence in order of increasing E-values, so users have a statistical evaluation of the predicted elements in a convenient presentation. Thus, although the Gibbs sampling step in A-GLAM finds at most one regulatory element per input sequence, the scanning step can now rapidly locate further instances of the element in each sequence. Conclusion Datasets from experiments determining the binding sites of transcription factors were used to evaluate the improvement to A-GLAM. Typically, the datasets included several sequences containing multiple instances of a regulatory motif. The improvements to A-GLAM permitted it to predict the multiple instances. PMID:16961919
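The iterate-to-convergence structure of the scanning step can be sketched as follows. This is not A-GLAM's implementation: a fixed log-odds score threshold stands in for the E-value cutoff, flat pseudocounts stand in for the Bayesian update, and all function and parameter names are hypothetical. Input sequences are assumed to be upper-case A/C/G/T only.

```python
import math

ALPHABET = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PSSM (base 2) from equal-length sites with flat pseudocounts."""
    width = len(sites[0])
    pssm = []
    for i in range(width):
        counts = {b: pseudocount for b in ALPHABET}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pssm.append({b: math.log2(counts[b] / total / background) for b in ALPHABET})
    return pssm


def score(pssm, window):
    """Sum of per-column log-odds for one window of the PSSM's width."""
    return sum(column[base] for column, base in zip(pssm, window))


def iterative_scan(sequences, seeds, width, threshold, max_iter=50):
    """Scan, admit high-scoring windows, rebuild the PSSM, repeat.

    seeds: (sequence_index, start) pairs, e.g. one site per sequence
    from a Gibbs sampler. Stops when a pass admits no new window.
    """
    members = set(seeds)
    for _ in range(max_iter):
        pssm = build_pssm([sequences[i][s:s + width] for i, s in sorted(members)])
        admitted = set(members)
        for i, seq in enumerate(sequences):
            for s in range(len(seq) - width + 1):
                if score(pssm, seq[s:s + width]) >= threshold:
                    admitted.add((i, s))
        if admitted == members:  # convergence: nothing new contributed
            break
        members = admitted
    return sorted(members)
```

With seeds covering one site per sequence, a second pass over the same sequences can pick up further instances of the motif, which is the behavior the abstract attributes to the scanning step. The sketch does not exclude overlapping windows.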
Role of sequence encoded κB DNA geometry in gene regulation by Dorsal
Mrinal, Nirotpal; Tomar, Archana; Nagaraju, Javaregowda
2011-01-01
Many proteins of the Rel family can act as both transcriptional activators and repressors. However, the mechanism that discerns the ‘activator/repressor’ functions of Rel proteins such as Dorsal (Drosophila homologue of mammalian NFκB) is not understood. Using genomic, biophysical and biochemical approaches, we demonstrate that the underlying principle of this functional specificity lies in the ‘sequence-encoded structure’ of the κB-DNA. We show that Dorsal-binding motifs exist in distinct activator and repressor conformations. Molecular dynamics of DNA-Dorsal complexes revealed that repressor κB-motifs typically have an A-tract and a flexible conformation that facilitates interaction with co-repressors. The deformable structure of repressor motifs is due to changes in the hydrogen bonding in the A:T pair in the ‘A-tract’ core. The sixth nucleotide in the nonameric κB-motif, ‘A’ (A6) in the repressor motifs and ‘T’ (T6) in the activator motifs, is critical to confer this functional specificity, as the A6 → T6 mutation transformed the flexible repressor conformation into a rigid activator conformation. These results highlight that ‘sequence encoded κB DNA-geometry’ regulates gene expression by exerting an allosteric effect on binding of Rel proteins, which in turn regulates interaction with co-regulators. Further, we identified and characterized putative repressor motifs in Dl-target genes, which can potentially aid in functional annotation of the Dorsal gene regulatory network. PMID:21890896
Nadjar-Boger, Elisabeth; Maccatrozzo, Lisa; Radaelli, Giuseppe; Funkenstein, Bruria
2013-02-01
Myostatin (MSTN) is a member of the transforming growth factor-β superfamily, known as a negative regulator of skeletal muscle development and growth in mammals. In contrast to mammals, fish possess at least two paralogs of MSTN: MSTN-1 and MSTN-2. Here we describe the cloning and sequence analysis of spliced and precursor (unspliced) transcripts as well as the 5' flanking region of MSTN-2 from the marine fish Umbrina cirrosa (ucMSTN-2). In silico analysis revealed numerous putative cis-regulatory elements, including several E-boxes known as binding sites for myogenic transcription factors. Transient transfection experiments using non-muscle and muscle cell lines showed high transcriptional activity in muscle cells and in differentiated neural cells, in accordance with our previous findings on the MSTN-2 promoter from Sparus aurata. Comparative informatics analysis of MSTN-2 from several fish species revealed high conservation of the predicted amino acid sequence as well as the gene structure (exon length), although intron length varied between species. The proximal promoter of the MSTN-2 gene was found to be conserved among Perciforms. In conclusion, this study reinforces our conclusion that the MSTN-2 promoter is a very strong promoter, especially in muscle cells. In addition, we show that the MSTN-2 gene structure is highly conserved among fishes, as is the predicted amino acid sequence of the peptide. Copyright © 2012 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bell-Lelong, D.A.; Cusumano, J.C.; Meyer, K.
1997-03-01
Cinnamate-4-hydroxylase (C4H) is the first Cyt P450-dependent monooxygenase of the phenylpropanoid pathway. To study the expression of this gene in Arabidopsis thaliana, a C4H cDNA clone from the Arabidopsis expressed sequence tag database was identified and used to isolate its corresponding genomic clone. The entire C4H coding sequence plus 2.9 kb of its promoter were isolated on a 5.4-kb HindIII fragment of this cosmid. Inspection of the promoter sequence revealed the presence of a number of putative regulatory motifs previously identified in the promoters of other phenylpropanoid pathway genes. The expression of C4H was analyzed by RNA blot hybridization analysis and in transgenic Arabidopsis carrying a C4H-β-glucuronidase transcriptional fusion. C4H message accumulation was light-dependent, but was detectable even in dark-grown seedlings. Consistent with these data, C4H mRNA was accumulated to light-grown levels in etiolated det1-1 mutant seedlings. C4H is widely expressed in various Arabidopsis tissues, particularly in roots and cells undergoing lignification. The C4H-driven β-glucuronidase expression accurately reflected the tissue-specificity and wound-inducibility of the C4H promoter indicated by RNA blot hybridization analysis. A modest increase in C4H expression was observed in the tt8 mutant of Arabidopsis. 77 refs., 5 figs.
Negre, Bárbara; Casillas, Sònia; Suzanne, Magali; Sánchez-Herrero, Ernesto; Akam, Michael; Nefedov, Michael; Barbadilla, Antonio; de Jong, Pieter; Ruiz, Alfredo
2005-01-01
Homeotic (Hox) genes are usually clustered and arranged in the same order as they are expressed along the anteroposterior body axis of metazoans. The mechanistic explanation for this colinearity has been elusive, and it may well be that a single and universal cause does not exist. The Hox-gene complex (HOM-C) has been rearranged differently in several Drosophila species, producing a striking diversity of Hox gene organizations. We investigated the genomic and functional consequences of the two HOM-C splits present in Drosophila buzzatii. Firstly, we sequenced two regions of the D. buzzatii genome, one containing the genes labial and abdominal A, and another one including proboscipedia, and compared their organization with that of D. melanogaster and D. pseudoobscura in order to map precisely the two splits. Then, a plethora of conserved noncoding sequences, which are putative enhancers, were identified around the three Hox genes closer to the splits. The position and order of these enhancers are conserved, with minor exceptions, between the three Drosophila species. Finally, we analyzed the expression patterns of the same three genes in embryos and imaginal discs of four Drosophila species with different Hox-gene organizations. The results show that their expression patterns are conserved despite the HOM-C splits. We conclude that, in Drosophila, Hox-gene clustering is not an absolute requirement for proper function. Rather, the organization of Hox genes is modular, and their clustering seems the result of phylogenetic inertia more than functional necessity. PMID:15867430
González-Peñas, Javier; Arrojo, Manuel; Paz, Eduardo; Brenlla, Julio; Páramo, Mario; Costas, Javier
2015-10-01
Schizophrenia may be considered a human-specific disorder that arose as a maladaptive by-product of human-specific brain evolution. Therefore, genetic variants involved in susceptibility to schizophrenia may be identified among those genes related to acquisition of human-specific traits. NPAS3, a transcription factor involved in central nervous system development and neurogenesis, seems to be implicated in the evolution of the human brain, as it is the human gene with the most human-specific accelerated elements (HAEs), i.e., mammalian conserved regulatory sequences with accelerated evolution in the lineage leading to humans after the human-chimpanzee split. We hypothesize that any nucleotide variant at the NPAS3 HAEs may lead to altered susceptibility to schizophrenia. Twenty-one variants at these HAEs detected by the 1000 Genomes Project, as well as five additional variants taken from psychiatric genome-wide association studies, were genotyped in 538 schizophrenic patients and 539 controls from Galicia. Analyses at the haplotype level or based on the cumulative role of the variants assuming different susceptibility models did not find any significant association, in spite of enough power under several plausible scenarios regarding direction of effect and the specific role of rare and common variants. These results suggest that, contrary to our hypothesis, the special evolution of the NPAS3 HAEs in Homo relaxed the strong constraint on sequence that characterized these regions during mammalian evolution, allowing some sequence changes without any effect on schizophrenia risk. © 2015 Wiley Periodicals, Inc.
Global characterization of Artemisia annua glandular trichome transcriptome using 454 pyrosequencing
Wang, Wei; Wang, Yejun; Zhang, Qing; Qi, Yan; Guo, Dianjing
2009-01-01
Background Glandular trichomes produce a wide variety of commercially important secondary metabolites in many plant species. The most prominent anti-malarial drug artemisinin, a sesquiterpene lactone, is produced in glandular trichomes of Artemisia annua. However, only limited genomic information is currently available in this non-model plant species. Results We present a global characterization of A. annua glandular trichome transcriptome using 454 pyrosequencing. Sequencing runs using two normalized cDNA collections from glandular trichomes yielded 406,044 expressed sequence tags (average length = 210 nucleotides), which assembled into 42,678 contigs and 147,699 singletons. Performing a second sequencing run only increased the number of genes identified by ~30%, indicating that massively parallel pyrosequencing provides deep coverage of the A. annua trichome transcriptome. By BLAST search against the NCBI non-redundant protein database, putative functions were assigned to over 28,573 unigenes, including previously undescribed enzymes likely involved in sesquiterpene biosynthesis. Comparison with ESTs derived from trichome collections of other plant species revealed expressed genes in common functional categories across different plant species. RT-PCR analysis confirmed the expression of selected unigenes and novel transcripts in A. annua glandular trichomes. Conclusion The presence of contigs corresponding to enzymes for terpenoids and flavonoids biosynthesis suggests important metabolic activity in A. annua glandular trichomes. Our comprehensive survey of genes expressed in glandular trichome will facilitate new gene discovery and shed light on the regulatory mechanism of artemisinin metabolism and trichome function in A. annua. PMID:19818120
IRREGULAR POLLEN EXINE1 Is a Novel Factor in Anther Cuticle and Pollen Exine Formation
Chen, Xiaoyang; Zhang, Hua; Luo, Hongbing; Zhao, Li; Dong, Zhaobin; Yan, Shuangshuang; Liu, Renyi; Xu, Chunyan; Li, Song; Chen, Huabang
2017-01-01
Anther cuticle and pollen exine are protective barriers for pollen development and fertilization. Although several regulators have been identified for anther cuticle and pollen exine development in rice (Oryza sativa) and Arabidopsis (Arabidopsis thaliana), few genes have been characterized in maize (Zea mays) and the underlying regulatory mechanism remains elusive. Here, we report a novel male-sterile mutant in maize, irregular pollen exine1 (ipe1), which exhibited a glossy outer anther surface, abnormal Ubisch bodies, and defective pollen exine. Using map-based cloning, the IPE1 gene was isolated as a putative glucose-methanol-choline oxidoreductase targeted to the endoplasmic reticulum. Transcripts of IPE1 were preferentially accumulated in the tapetum during the tetrad and early uninucleate microspore stage. A biochemical assay indicated that ipe1 anthers had altered constituents of wax and a significant reduction of cutin monomers and fatty acids. RNA sequencing data revealed that genes implicated in wax and flavonoid metabolism, fatty acid synthesis, and elongation were differentially expressed in ipe1 mutant anthers. In addition, the analysis of transfer DNA insertional lines of the orthologous gene in Arabidopsis suggested that IPE1 and its orthologs have a partially conserved function in male organ development. Our results showed that IPE1 participates in the putative oxidative pathway of C16/C18 ω-hydroxy fatty acids and controls anther cuticle and pollen exine development together with MALE STERILITY26 and MALE STERILITY45 in maize. PMID:28049856
Assessment of the Requirements for Magnesium Transporters in Bacillus subtilis
Wakeman, Catherine A.; Goodson, Jonathan R.; Zacharia, Vineetha M.
2014-01-01
Magnesium is the most abundant divalent metal in cells and is required for many structural and enzymatic functions. For bacteria, at least three families of proteins function as magnesium transporters. In recent years, it has been shown that a subset of these transport proteins is regulated by magnesium-responsive genetic control elements. In this study, we investigated the cellular requirements for magnesium homeostasis in the model microorganism Bacillus subtilis. Putative magnesium transporter genes were mutationally disrupted, singly and in combination, in order to assess their general importance. Mutation of only one of these genes resulted in strong dependency on supplemental extracellular magnesium. Notably, this transporter gene, mgtE, is known to be under magnesium-responsive genetic regulatory control. This suggests that the identification of magnesium-responsive genetic mechanisms may generally denote primary transport proteins for bacteria. To investigate whether B. subtilis encodes yet additional classes of transport mechanisms, suppressor strains that permitted the growth of a transporter-defective mutant were identified. Several of these strains were sequenced to determine the genetic basis of the suppressor phenotypes. None of these mutations occurred in transport protein homologues; instead, they affected housekeeping functions, such as signal recognition particle components and ATP synthase machinery. From these aggregate data, we speculate that the mgtE protein provides the primary route of magnesium import in B. subtilis and that the other putative transport proteins are likely to be utilized for more-specialized growth conditions. PMID:24415722
Haase, B; Jude, R; Brooks, S A; Leeb, T
2008-06-01
The tobiano white-spotting pattern is one of several known depigmentation phenotypes in horses and is desired by many horse breeders and owners. The tobiano spotting phenotype is inherited as an autosomal dominant trait. Horses that are heterozygous or homozygous for the tobiano allele (To) are phenotypically indistinguishable. A SNP associated with To had previously been identified in intron 13 of the equine KIT gene and was used for an indirect gene test. The test was useful in several horse breeds. However, genotyping this sequence variant in the Lewitzer horse breed revealed that 14% of horses with the tobiano pattern did not show the polymorphism in intron 13 and consequently the test was not useful to identify putative homozygotes for To within this breed. Speculations were raised that an independent mutation might cause the tobiano spotting pattern in this breed. Recently, the putative causative mutation for To was described as a large chromosomal inversion on equine chromosome 3. One of the inversion breakpoints is approximately 70 kb downstream of the KIT gene and probably disrupts a regulatory element of the KIT gene. We obtained genotypes for the intron 13 SNP and the chromosomal inversion for 204 tobiano spotted horses and 24 control animals of several breeds. The genotyping data confirmed that the chromosomal inversion was perfectly associated with the To allele in all investigated horses. Therefore, the new test is suitable to discriminate heterozygous To/+ and homozygous To/To horses in the investigated breeds.
Umasuthan, Navaneethaiyer; Bathige, S D N K; Whang, Ilson; Lim, Bong-Soo; Choi, Cheol Young; Lee, Jehee
2015-04-01
As a pivotal signaling mediator of toll-like receptor (TLR) and interleukin (IL)-1 receptor (IL-1R) signaling cascades, the IL-1R-associated kinase 4 (IRAK4) is engaged in the activation of host immunity. This study investigates the molecular and expressional profiles of an IRAK4-like homolog from Oplegnathus fasciatus (OfIRAK4). The OfIRAK4 gene (8.2 kb) was structured with eleven exons and ten introns. A putative coding sequence (1395 bp) was translated to the OfIRAK4 protein of 464 amino acids. The deduced OfIRAK4 protein featured a bipartite domain structure composed of a death domain (DD) and a kinase domain (PKc). Teleost IRAK4 appears to be distinct and divergent from that of tetrapods in terms of its exon-intron structure and evolutionary relatedness. Analysis of the sequence upstream of the translation initiation site revealed the presence of putative regulatory elements, including NF-κB-binding sites, which are possibly involved in transcriptional control of OfIRAK4. Quantitative real-time PCR (qPCR) was employed to assess the transcriptional expression of OfIRAK4 in different juvenile tissues and after injection of different immunogens and pathogens. Basal mRNA expression was detected in all tissues examined, with the highest level in liver. In vivo flagellin (FLA) challenge significantly intensified its mRNA levels in intestine, liver and head kidney, indicating its role in FLA-induced signaling. Meanwhile, up-regulated expression was also determined in liver and head kidney of animals challenged with potent immunogens (LPS and poly I:C) and pathogens (Edwardsiella tarda, Streptococcus iniae, and rock bream iridovirus [RBIV]). Taken together, these data suggest that OfIRAK4 might be engaged in antibacterial and antiviral immunity in rock bream. Copyright © 2014 Elsevier Ltd. All rights reserved.
Smith, Robin P; Riesenfeld, Samantha J; Holloway, Alisha K; Li, Qiang; Murphy, Karl K; Feliciano, Natalie M; Orecchia, Lorenzo; Oksenberg, Nir; Pollard, Katherine S; Ahituv, Nadav
2013-07-18
Large-scale annotation efforts have improved our ability to coarsely predict regulatory elements throughout vertebrate genomes. However, it is unclear how complex spatiotemporal patterns of gene expression driven by these elements emerge from the activity of short, transcription factor binding sequences. We describe a comprehensive promoter extension assay in which the regulatory potential of all 6 base-pair (bp) sequences was tested in the context of a minimal promoter. To enable this large-scale screen, we developed algorithms that use a reverse-complement aware decomposition of the de Bruijn graph to design a library of DNA oligomers incorporating every 6-bp sequence exactly once. Our library multiplexes all 4,096 unique 6-mers into 184 double-stranded 15-bp oligomers, which is sufficiently compact for in vivo testing. We injected each multiplexed construct into zebrafish embryos and scored GFP expression in 15 tissues at two developmental time points. Twenty-seven constructs produced consistent expression patterns, with the majority doing so in only one tissue. Functional sequences are enriched near biologically relevant genes, match motifs for developmental transcription factors, and are required for enhancer activity. By concatenating tissue-specific functional sequences, we generated completely synthetic enhancers for the notochord, epidermis, spinal cord, forebrain and otic lateral line, and show that short regulatory sequences do not always function modularly. This work introduces a unique in vivo catalog of short, functional regulatory sequences and demonstrates several important principles of regulatory element organization. Furthermore, we provide resources for designing compact, reverse-complement aware k-mer libraries.
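What "reverse-complement aware" means for such a library can be illustrated with a small sketch: a k-mer and its reverse complement are treated as one item, because a double-stranded oligomer presents both. The code below only shows this bookkeeping and a completeness check; it does not reproduce the authors' de Bruijn graph decomposition, and the function names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Pick one representative for a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def kmers_covered(oligo, k=6):
    """Canonical k-mers readable from either strand of a double-stranded oligo."""
    return {canonical(oligo[i:i + k]) for i in range(len(oligo) - k + 1)}


def is_complete_library(oligos, k=6):
    """True if every k-mer (up to reverse complement) occurs in some oligo."""
    wanted = {canonical("".join(p)) for p in product("ACGT", repeat=k)}
    covered = set()
    for oligo in oligos:
        covered |= kmers_covered(oligo, k)
    return wanted <= covered
```

A check like `is_complete_library` tests coverage only; the "exactly once" property mentioned in the abstract would need an additional count per canonical k-mer.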
2012-01-01
Background Natrialba magadii is an aerobic chemoorganotrophic member of the Euryarchaeota and is a dual extremophile requiring alkaline conditions and hypersalinity for optimal growth. The genome sequence of Nab. magadii type strain ATCC 43099 was deciphered to obtain a comprehensive insight into the genetic content of this haloarchaeon and to understand the basis of some of the cellular functions necessary for its survival. Results The genome of Nab. magadii consists of four replicons with a total sequence of 4,443,643 bp and encodes 4,212 putative proteins, some of which contain peptide repeats of various lengths. Comparative genome analyses facilitated the identification of genes encoding putative proteins involved in adaptation to hypersalinity, stress response, glycosylation, and polysaccharide biosynthesis. A proton-driven ATP synthase and a variety of putative cytochromes and other proteins supporting aerobic respiration and electron transfer were encoded by one or more of Nab. magadii replicons. The genome encodes a number of putative proteases/peptidases as well as protein secretion functions. Genes encoding putative transcriptional regulators, basal transcription factors, signal perception/transduction proteins, and chemotaxis/phototaxis proteins were abundant in the genome. Pathways for the biosynthesis of thiamine, riboflavin, heme, cobalamin, coenzyme F420 and other essential co-factors were deduced by in depth sequence analyses. However, approximately 36% of Nab. magadii protein coding genes could not be assigned a function based on Blast analysis and have been annotated as encoding hypothetical or conserved hypothetical proteins. Furthermore, despite extensive comparative genomic analyses, genes necessary for survival in alkaline conditions could not be identified in Nab. magadii. Conclusions Based on genomic analyses, Nab. magadii is predicted to be metabolically versatile and it could use different carbon and energy sources to sustain growth. Nab. magadii has the genetic potential to adapt to its milieu by intracellular accumulation of inorganic cations and/or neutral organic compounds. The identification of Nab. magadii genes involved in coenzyme biosynthesis is a necessary step toward further reconstruction of the metabolic pathways in halophilic archaea and other extremophiles. The knowledge gained from the genome sequence of this haloalkaliphilic archaeon is highly valuable in advancing the applications of extremophiles and their enzymes. PMID:22559199
An insight into the sialotranscriptome of the seed-feeding bug, Oncopeltus fasciatus.
Francischetti, Ivo M B; Lopes, Angela H; Dias, Felipe A; Pham, Van M; Ribeiro, José M C
2007-09-01
The salivary transcriptome of the seed-feeding hemipteran, Oncopeltus fasciatus (milkweed bug), is described following assembly of 1025 expressed sequence tags (ESTs) into 305 clusters of related sequences. Inspection of these sequences reveals abundance of low complexity, putative secreted products rich in the amino acids (aa) glycine, serine or threonine, which might function as silk or mucins and assist food canal lubrication and sealing of the feeding site around the mouthparts. Several protease inhibitors were found, including abundant expression of cystatin transcripts that may inhibit cysteine proteases common in seeds that might injure the insect or induce plant apoptosis. Serine proteases and lipases are described that might assist digestion and liquefaction of seed proteins and oils. Finally, several novel putative proteins are described with no known function that might affect plant physiology or act as antimicrobials.
Liu, Ju; Li, Ruihua; Liu, Kun; Li, Liangliang; Zai, Xiaodong; Chi, Xiangyang; Fu, Ling; Xu, Junjie; Chen, Wei
2016-04-22
High-throughput sequencing of the antibody repertoire provides a large number of antibody variable region sequences that can be used to generate human monoclonal antibodies. However, current screening methods for identifying antigen-specific antibodies are inefficient. In the present study, we developed an antibody clone screening strategy based on clone dynamics and relative frequency, and used it to identify antigen-specific human monoclonal antibodies. Enzyme-linked immunosorbent assay showed that at least 52% of putative positive immunoglobulin heavy chains composed antigen-specific antibodies. Combining information on dynamics and relative frequency improved identification of positive clones and elimination of negative clones, and increased the credibility of putative positive clones. Therefore, the screening strategy could simplify the subsequent experimental screening and may facilitate the generation of antigen-specific antibodies. Copyright © 2016 Elsevier Inc. All rights reserved.
An, Z; Tang, Z; Ma, B; Mason, A S; Guo, Y; Yin, J; Gao, C; Wei, L; Li, J; Fu, D
2014-07-01
Although many studies have shown that transposable element (TE) activation is induced by hybridisation and polyploidisation in plants, much less is known on how different types of TE respond to hybridisation, and the impact of TE-associated sequences on gene function. We investigated the frequency and regularity of putative transposon activation for different types of TE, and determined the impact of TE-associated sequence variation on the genome during allopolyploidisation. We designed different types of TE primers and adopted the Inter-Retrotransposon Amplified Polymorphism (IRAP) method to detect variation in TE-associated sequences during the process of allopolyploidisation between Brassica rapa (AA) and Brassica oleracea (CC), and in successive generations of self-pollinated progeny. In addition, fragments with TE insertions were used to perform Blast2GO analysis to characterise the putative functions of the fragments with TE insertions. Ninety-two primers amplifying 548 loci were used to detect variation in sequences associated with four different orders of TE sequences. TEs could be classed in ascending frequency into LTR-REs, TIRs, LINEs, SINEs and unknown TEs. The frequency of novel variation (putative activation) detected for the four orders of TEs was highest from the F1 to F2 generations, and lowest from the F2 to F3 generations. Functional annotation of sequences with TE insertions showed that genes with TE insertions were mainly involved in metabolic processes and binding, and preferentially functioned in organelles. TE variation in our study severely disturbed the genetic compositions of the different generations, resulting in inconsistencies in genetic clustering. Different types of TE showed different patterns of variation during the process of allopolyploidisation. © 2013 German Botanical Society and The Royal Botanical Society of the Netherlands.
Didi, Jennifer; Lemée, Ludovic; Gibert, Laure; Pons, Jean-Louis
2014-01-01
Staphylococcus lugdunensis is an emergent virulent coagulase-negative staphylococcus responsible for severe infections similar to those caused by Staphylococcus aureus. To understand its potentially pathogenic capacity and have further detailed knowledge of the molecular traits of this organism, 93 isolates from various geographic origins were analyzed by multi-virulence-locus sequence typing (MVLST), targeting seven known or putative virulence-associated loci (atlLR2, atlLR3, hlb, isdJ, SLUG_09050, SLUG_16930, and vwbl). The polymorphisms of the putative virulence-associated loci were moderate and comparable to those of the housekeeping genes analyzed by multilocus sequence typing (MLST). However, the MVLST scheme generated 43 virulence types (VTs) compared to 20 sequence types (STs) based on MLST, indicating that MVLST was significantly more discriminating (Simpson's index [D], 0.943). No hypervirulent lineage or cluster specific to carriage strains was defined. The results of multilocus sequence analysis of known and putative virulence-associated loci are consistent with a clonal population structure for S. lugdunensis, suggesting a coevolution of these genes with housekeeping genes. Indeed, the nonsynonymous to synonymous evolutionary substitutions (dN/dS) ratio, the Tajima's D test, and Single-likelihood ancestor counting (SLAC) analysis suggest that all virulence-associated loci were under negative selection, even atlLR2 (AtlL protein) and SLUG_16930 (FbpA homologue), for which the dN/dS ratios were higher. In addition, this analysis of virulence-associated loci allowed us to propose a trilocus sequence typing scheme based on the intragenic regions of atlLR3, isdJ, and SLUG_16930, which is more discriminant than MLST for studying short-term epidemiology and further characterizing the lineages of the rare but highly pathogenic S. lugdunensis. PMID:25078912
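The discriminatory power reported above as Simpson's index can be computed from the type assigned to each isolate. The abstract does not give the formula; the sketch below assumes the Hunter-Gaston form commonly used for typing schemes, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where n_j is the number of isolates of type j and N is the total number of isolates. The function name is hypothetical.

```python
from collections import Counter


def simpsons_index(type_labels):
    """Simpson's index of diversity (Hunter-Gaston form, assumed here).

    type_labels: one type label per isolate, e.g. its VT or ST.
    Returns the probability that two isolates drawn without
    replacement belong to different types.
    """
    n = len(type_labels)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))
```

Applied once to the MVLST labels and once to the MLST labels of the same isolates, a function like this would give the two values being compared; the per-isolate assignments are not given in the abstract, so no figures are reproduced here.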
Identification and analysis of pig chimeric mRNAs using RNA sequencing data
2012-01-01
Background Gene fusion is ubiquitous over the course of evolution. It is expected to increase the diversity and complexity of transcriptomes and proteomes through chimeric sequence segments or altered regulation. However, chimeric mRNAs in pigs remain poorly characterized. Here we identified chimeric mRNAs in pigs and analyzed their expression across individuals and breeds using RNA-sequencing data. Results The present study identified 669 putative chimeric mRNAs in pigs, of which 251 chimeric candidates were detected in a set of RNA-sequencing data. A total of 618 candidates had clear trans-splicing sites, 537 of which obeyed the canonical GU-AG splice rule. Only two putative pig chimera variants whose fusion junction overlapped with that of a known human chimeric mRNA were found. A set of unique chimeric events showed moderate variance in expression across individuals and breeds, and non-significant variance between sexes. Furthermore, the genomic region of the 5′ partner gene shares a similar DNA sequence with that of the 3′ partner gene for 458 putative chimeric mRNAs. Eighty-one of those shared DNA sequences significantly matched known DNA-binding motifs in the JASPAR CORE database. Four DNA motifs shared in parental genomic regions had significant similarity with known human CTCF binding sites. Conclusions The present study provided detailed information on some pig chimeric mRNAs. We proposed a model in which trans-acting factors, such as CTCF, bring the parental genes into the same transcription factory so that they are coordinately transcribed to give rise to chimeric mRNAs. PMID:22925561
Povinelli, C M
1992-01-01
In order to detect sequence-based information predictive for the location of eukaryotic transcriptional regulatory domains, the frequencies and distributions of the 36 possible purine/pyrimidine reverse complement hexamer pairs were determined for test sets of real and random sequences. The distribution of one of the hexamer pairs (RRYYRR/YYRRYY, referred to as M1) was further examined in a larger set of sequences (> 32 genes, 230 kb). Predominant clusters of M1 and the locations of eukaryotic transcriptional regulatory domains were found to be associated and non-randomly distributed along the DNA consistent with a periodicity of approximately 1.2 kb. In the context of higher ordered chromatin this would align promoters, enhancers and the predominant clusters of M1 longitudinally along one face of a 30 nm fiber. Using only information about the distribution of the M1 motif, 50-70% of a sequence could be eliminated as being unlikely to contain transcriptional regulatory domains with an 87% recovery of the regulatory domains present.
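The count of 36 hexamer pairs follows from simple enumeration: there are 2^6 = 64 hexamers over the purine/pyrimidine (R/Y) alphabet, 8 of which are their own reverse complement, leaving (64 - 8) / 2 = 28 two-member pairs plus 8 self-paired hexamers. The sketch below reproduces that count; the function names are illustrative only and are not taken from the paper.

```python
from itertools import product

# In the two-letter alphabet, purine (R) pairs with pyrimidine (Y).
COMPLEMENT = {"R": "Y", "Y": "R"}


def reverse_complement(hexamer):
    """Reverse the string and swap R <-> Y."""
    return "".join(COMPLEMENT[base] for base in reversed(hexamer))


def hexamer_pairs():
    """Group all 64 R/Y hexamers with their reverse complements."""
    pairs = set()
    for letters in product("RY", repeat=6):
        hexamer = "".join(letters)
        # A frozenset collapses (h, rc) and (rc, h) into one class;
        # self-complementary hexamers form one-member classes.
        pairs.add(frozenset((hexamer, reverse_complement(hexamer))))
    return pairs


pairs = hexamer_pairs()
print(len(pairs))  # 36
print(frozenset(("RRYYRR", "YYRRYY")) in pairs)  # True (the M1 pair)
```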
Finding functional features in Saccharomyces genomes by phylogenetic footprinting.
Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin; Fulton, Lucinda; Fulton, Bob; Majors, John; Waterston, Robert; Cohen, Barak A; Johnston, Mark
2003-07-04
The sifting and winnowing of DNA sequence that occur during evolution cause nonfunctional sequences to diverge, leaving phylogenetic footprints of functional sequence elements in comparisons of genome sequences. We searched for such footprints among the genome sequences of six Saccharomyces species and identified potentially functional sequences. Comparison of these sequences allowed us to revise the catalog of yeast genes and identify sequence motifs that may be targets of transcriptional regulatory proteins. Some of these conserved sequence motifs reside upstream of genes with similar functional annotations or similar expression patterns or those bound by the same transcription factor and are thus good candidates for functional regulatory sequences.
Functional Analysis of Promoter Region from Eel Cytochrome P450 1A1 Gene in Transgenic Medaka.
Ogino; Itakura; Kato; Aoki; Sato
1999-07-01
Transcription of the CYP1A1 genes in mammals and fish is stimulated by polyaromatic hydrocarbons. DNA sequencing analysis revealed that CYP1A1 gene in eel (Anguilla japonica) contains two kinds of putative cis-acting regulatory elements, XRE (xenobiotic-responsive element) and ERE (estrogen-responsive element). XRE is known as the enhancer that is responsible for the inducibility of the genes of CYP1A1 and some other drug-metabolizing enzymes. In the eel CYP1A1 gene, XRE motifs are distributed as follows: five times in the region from -2136 to -1125 bp, XRE(-6) to (-2); once in the proximal basal promoter region, XRE(-1); and once in the first intron, XRE(+1). The region between XRE(-2) and XRE(-1) contains three ERE motifs. To investigate the function of the cis-acting regulatory elements in the eel CYP1A1 gene, recombinant plasmids prepared with its 5' upstream sequence and the structural gene for luciferase were microinjected into fertilized eggs of medaka at the one-cell stage. Hatched fry were treated with 3-methylcholanthrene, and the transcription efficiency was assayed using competitive polymerase chain reaction analysis. Deletion of the region containing the five XREs, XRE(-6) to XRE(-2), and the point mutation of XRE(-1) reduced the inducible expressions by 75% and 56%, respectively, showing apparent dependency of the drug induction on the XREs. Constitutive expression, however, was not significantly affected by deletion or disruption of the XREs. When the region between XRE(-2) and XRE(-1) containing no XREs but three ERE motifs was internally deleted, the inducible expression and the constitutive expression were reduced by 88% and 75%, respectively. Replacement of this region with a partial fragment of eel CYP1A1 complementary DNA, with slight alteration of the distance between the five XREs and XRE(-1), reduced the inducible expression and the constitutive expression by 91% and 60%, respectively. These results strongly suggest that not only XRE but also other regulatory elements, possibly ERE, play an important role in induced and constitutive expressions of the eel CYP1A1 gene.
Loohuis, Nikkie FM Olde; Kasri, Nael Nadif; Glennon, Jeffrey C; van Bokhoven, Hans; Hébert, Sébastien S; Kaplan, Barry B.; Martens, Gerard JM; Aschrafi, Armaz
2016-01-01
MicroRNAs (miRs) are small regulatory molecules, which orchestrate neuronal development and plasticity through modulation of complex gene networks. microRNA-137 (miR-137) is a brain-enriched RNA with a critical role in regulating brain development and in mediating synaptic plasticity. Importantly, mutations in this miR are associated with the pathoetiology of schizophrenia (SZ), and there is a widespread assumption that disruptions in miR-137 expression lead to aberrant expression of gene regulatory networks associated with SZ. To systematically identify the mRNA targets for this miR, we performed miR-137 gain- and loss-of-function experiments in primary rat hippocampal neurons and profiled differentially expressed mRNAs through next-generation sequencing. We identified 500 genes that were bidirectionally activated or repressed in their expression by the modulation of miR-137 levels. Gene ontology analysis using two independent software resources suggested functions for these miR-137-regulated genes in neurodevelopmental processes, neuronal maturation processes and cell maintenance, all of which are known to be critical for proper brain circuitry formation. Since many of the putative miR-137 targets identified here also have been previously shown to be associated with SZ, we propose that this miR acts as a critical gene network hub contributing to the pathophysiology of this neurodevelopmental disorder. PMID:26925706
Allen, Michael S.; Hurst, Gregory B.; Lu, Tse-Yuan S.; ...
2015-04-08
Rhodopseudomonas palustris encodes 16 extracytoplasmic function (ECF) σ factors. In this paper, to begin to investigate the regulatory network of one of these ECF σ factors, the whole proteome of R. palustris CGA010 was quantitatively analyzed by tandem mass spectrometry from cultures episomally expressing the ECF σ RPA4225 (ecfT) versus a WT control. Among the proteins with the greatest increase in abundance were catalase KatE, trehalose synthase, a DPS-like protein, and several regulatory proteins. Alignment of the cognate promoter regions driving expression of several upregulated proteins suggested a conserved binding motif in the -35 and -10 regions with the consensus sequence GGAAC-18N-TT. Additionally, the putative anti-σ factor RPA4224, whose gene is contained in the same predicted operon as RPA4225, was identified as interacting directly with the predicted response regulator RPA4223 by mass spectrometry of affinity-isolated protein complexes. Furthermore, another gene (RPA4226) coding for a protein that contains a cytoplasmic histidine kinase domain is located immediately upstream of RPA4225. The genomic organization of orthologs for these four genes is conserved in several other strains of R. palustris as well as in closely related α-Proteobacteria. Finally, taken together, these data suggest that ECF σ RPA4225 and the three additional genes make up a sigma factor mimicry system in R. palustris.
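A consensus written as GGAAC-18N-TT can be scanned for with a regular expression. The sketch below assumes the notation means GGAAC, then exactly 18 nucleotides of any identity, then TT; the abstract does not spell this out, and the function name and demo sequence are hypothetical.

```python
import re

# Assumed reading of "GGAAC-18N-TT": GGAAC + 18 arbitrary bases + TT.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(GGAAC[ACGT]{18}TT))")


def find_motif_starts(promoter):
    """Return 0-based start positions of the motif in a DNA string."""
    return [match.start() for match in MOTIF.finditer(promoter.upper())]


# Synthetic promoter fragment built for illustration only.
demo = "TTT" + "GGAAC" + "A" * 18 + "TT" + "CCC"
print(find_motif_starts(demo))  # [3]
```

A strict pattern like this misses sites with mismatches or variable spacing, so a position weight matrix would be the more usual choice for real promoter scans.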
Fernandes, Neil; Case, Rebecca J.; Longford, Sharon R.; Seyedsayamdost, Mohammad R.; Steinberg, Peter D.; Kjelleberg, Staffan; Thomas, Torsten
2011-01-01
Nautella sp. R11, a member of the marine Roseobacter clade, causes a bleaching disease in the temperate-marine red macroalga, Delisea pulchra. To begin to elucidate the molecular mechanisms underpinning the ability of Nautella sp. R11 to colonize, invade and induce bleaching of D. pulchra, we sequenced and analyzed its genome. The genome encodes several factors such as adhesion mechanisms, systems for the transport of algal metabolites, enzymes that confer resistance to oxidative stress, cytolysins, and global regulatory mechanisms that may allow for the switch of Nautella sp. R11 to a pathogenic lifestyle. Many virulence effectors common in phytopathogenic bacteria are also found in the R11 genome, such as the plant hormone indole acetic acid, cellulose fibrils, succinoglycan and nodulation protein L. Comparative genomics with non-pathogenic Roseobacter strains and a newly identified pathogen, Phaeobacter sp. LSS9, revealed a patchy distribution of putative virulence factors in all genomes, but also led to the identification of a quorum sensing (QS) dependent transcriptional regulator that was unique to pathogenic Roseobacter strains. This observation supports the model that a combination of virulence factors and QS-dependent regulatory mechanisms enables indigenous members of the host alga's epiphytic microbial community to switch to a pathogenic lifestyle, especially under environmental conditions when innate host defence mechanisms are compromised. PMID:22162749
Identification and characterization of cell-specific enhancer elements for the mouse ETF/Tead2 gene.
Tanoue, Y; Yasunami, M; Suzuki, K; Ohkubo, H
2001-12-21
We have identified and characterized by transient transfection assays the cell-specific 117-bp enhancer sequence in the first intron of the mouse ETF (Embryonic TEA domain-containing factor)/Tead2 gene required for transcriptional activation in ETF/Tead2 gene-expressing cells, such as P19 cells. The 117-bp enhancer contains one GC-rich sequence (5'-GGGGCGGGG-3'), termed the GC box, and two tandemly repeated GA-rich sequences (5'-GGGGGAGGGG-3'), termed the proximal and distal GA elements. Further analyses, including transfection studies and electrophoretic mobility shift assays using a series of deletion and mutation constructs, indicated that Sp1, a putative activator, may be required to predominate over its competition with another unknown putative repressor, termed the GA element-binding factor, for binding to both the GC box, which overlapped with the proximal GA element, and the distal GA element in the 117-bp sequence in order to achieve a full enhancer activity. We also discuss a possible mechanism underlying the cell-specific enhancer activity of the 117-bp sequence.
Khanna, Namita; Ghosh, Ananta Kumar; Huntemann, Marcel; Deshpande, Shweta; Han, James; Chen, Amy; Kyrpides, Nikos; Mavrommatis, Kostas; Szeto, Ernest; Markowitz, Victor; Ivanova, Natalia; Pagani, Ioanna; Pati, Amrita; Pitluck, Sam; Nolan, Matt; Woyke, Tanja; Teshima, Hazuki; Chertkov, Olga; Daligault, Hajnalka; Davenport, Karen; Gu, Wei; Munk, Christine; Zhang, Xiaojing; Bruce, David; Detter, Chris; Xu, Yan; Quintana, Beverly; Reitenga, Krista; Kunde, Yulia; Green, Lance; Erkkila, Tracy; Han, Cliff; Brambilla, Evelyne-Marie; Lang, Elke; Klenk, Hans-Peter; Goodwin, Lynne; Chain, Patrick; Das, Debabrata
2013-12-20
Enterobacter sp. IIT-BT 08 belongs to Phylum: Proteobacteria, Class: Gammaproteobacteria, Order: Enterobacteriales, Family: Enterobacteriaceae. The organism was isolated from the leaves of a local plant near the Kharagpur railway station, Kharagpur, West Bengal, India. It has been extensively studied for fermentative hydrogen production because of its high hydrogen yield. For further enhancement of hydrogen production by strain development, complete genome sequence analysis was carried out. Sequence analysis revealed that the genome was linear, 4.67 Mbp long and had a GC content of 56.01%. The genome properties encode 4,393 protein-coding and 179 RNA genes. Additionally, a putative pathway of hydrogen production was suggested based on the presence of formate hydrogen lyase complex and other related genes identified in the genome. Thus, in the present study we describe the specific properties of the organism and the generation, annotation and analysis of its genome sequence as well as discuss the putative pathway of hydrogen production by this organism.
Tanaka, Mizuki; Sakai, Yoshifumi; Yamada, Osamu; Shintani, Takahiro; Gomi, Katsuya
2011-01-01
To investigate 3′-end-processing signals in Aspergillus oryzae, we created a nucleotide sequence data set of the 3′-untranslated region (3′ UTR) plus 100 nucleotides (nt) sequence downstream of the poly(A) site using A. oryzae expressed sequence tags and genomic sequencing data. This data set comprised 1065 sequences derived from 1042 unique genes. The average 3′ UTR length in A. oryzae was 241 nt, which is greater than that in yeast but similar to that in plants. The 3′ UTR and 100 nt sequence downstream of the poly(A) site is notably U-rich, while the region located 15–30 nt upstream of the poly(A) site is markedly A-rich. The most frequently found hexanucleotide in this A-rich region is AAUGAA, although this sequence accounts for only 6% of all transcripts. These data suggested that A. oryzae has no highly conserved sequence element equivalent to AAUAAA, a mammalian polyadenylation signal. We identified that putative 3′-end-processing signals in A. oryzae, while less well conserved than those in mammals, comprised four sequence elements: the furthest upstream U-rich element, A-rich sequence, cleavage site, and downstream U-rich element flanking the cleavage site. Although these putative 3′-end-processing signals are similar to those in yeast and plants, some notable differences exist between them. PMID:21586533
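The hexanucleotide survey described above amounts to counting 6-mers inside a fixed window upstream of the poly(A) site. A minimal sketch follows, under the assumption that each input sequence ends exactly at its poly(A) site and that only hexamers lying entirely within the window 15 to 30 nt upstream are counted; the function name, parameters and example sequence are hypothetical.

```python
from collections import Counter


def window_hexamer_counts(utr_sequences, far=30, near=15, k=6):
    """Count k-mers lying wholly within [len - far, len - near).

    Each sequence is assumed to end at its poly(A) site, so the
    window covers the region 15-30 nt upstream by default.
    """
    counts = Counter()
    for seq in utr_sequences:
        lo = max(0, len(seq) - far)
        hi = len(seq) - near
        for start in range(lo, hi - k + 1):
            counts[seq[start:start + k]] += 1
    return counts


# Synthetic 50-nt 3' UTR with AAUGAA placed 30 nt upstream of the end.
example_utr = "C" * 20 + "AAUGAA" + "C" * 24
counts = window_hexamer_counts([example_utr])
print(counts["AAUGAA"])  # 1
```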
Experimental Evidence and In Silico Identification of Tryptophan Decarboxylase in Citrus Genus.
De Masi, Luigi; Castaldo, Domenico; Pignone, Domenico; Servillo, Luigi; Facchiano, Angelo
2017-02-11
Plant tryptophan decarboxylase (TDC) converts tryptophan into tryptamine, a precursor of indolealkylamine alkaloids. The recent finding of tryptamine metabolites in Citrus plants suggests the existence of TDC activity in this genus. Here, we report for the first time that, in Citrus x limon seedlings, deuterium-labeled tryptophan is decarboxylated into tryptamine, from which deuterated N,N,N-trimethyltryptamine is subsequently formed. These results provide evidence of TDC activity and of the subsequent methylation pathway of the tryptamine produced by tryptophan decarboxylation. In addition, with the aim of identifying the genetic basis for the presence of TDC, we carried out a sequence similarity search for TDC in the Citrus genomes using as a probe the TDC sequence reported for the plant Catharanthus roseus. We analyzed the genomes of both Citrus clementina and Citrus sinensis, available in public databases, and identified putative protein sequences of aromatic l-amino acid decarboxylase. Similarly, 42 aromatic l-amino acid decarboxylase sequences from 23 plant species were extracted from public databases. Potential sequence signatures for functional TDC were then identified. With this research, we propose for the first time a putative protein sequence for TDC in the genus Citrus.
Park, Yun-Jong; Koh, Jin; Gauna, Adrienne E.; Chen, Sixue; Cha, Seunghee
2014-01-01
Patients with Sjögren’s syndrome or head and neck cancer patients who have undergone radiation therapy suffer from severe dry mouth (xerostomia) due to salivary exocrine cell death. Regeneration of the salivary glands requires a better understanding of regulatory mechanisms by which stem cells differentiate into exocrine cells. In our study, bone marrow-derived mesenchymal stem cells were co-cultured with primary salivary epithelial cells from C57BL/6 mice. Co-cultured bone marrow-derived mesenchymal stem cells clearly resembled salivary epithelial cells, as confirmed by strong expression of salivary gland epithelial cell-specific markers, such as alpha-amylase, muscarinic type 3 receptor, aquaporin-5, and cytokeratin 19. To identify regulatory factors involved in this differentiation, transdifferentiated mesenchymal stem cells were analyzed temporally by two-dimensional-gel-electrophoresis, which detected 58 protein spots (>1.5 fold change, p<0.05) that were further categorized into 12 temporal expression patterns. Of those proteins only induced in differentiated mesenchymal stem cells, ankyrin-repeat-domain-containing-protein 56, high-mobility-group-protein 20B, and transcription factor E2a were selected as putative regulatory factors for mesenchymal stem cell transdifferentiation based on putative roles in salivary gland development. Induction of these molecules was confirmed by RT-PCR and western blotting on separate sets of co-cultured mesenchymal stem cells. In conclusion, our study is the first to identify differentially expressed proteins that are implicated in mesenchymal stem cell differentiation into salivary gland epithelial cells. Further investigation to elucidate regulatory roles of these three transcription factors in mesenchymal stem cell reprogramming will provide a critical foundation for a novel cell-based regenerative therapy for patients with xerostomia. PMID:25402494
Hanson, Sara J; Stelzer, Claus-Peter; Welch, David B Mark; Logsdon, John M
2013-06-19
Sexual reproduction is a widely studied biological process because it is critically important to the genetics, evolution, and ecology of eukaryotes. Despite decades of study on this topic, no comprehensive explanation has been accepted that explains the evolutionary forces underlying its prevalence and persistence in nature. Monogonont rotifers offer a useful system for experimental studies relating to the evolution of sexual reproduction due to their rapid reproductive rate and close relationship to the putatively ancient asexual bdelloid rotifers. However, little is known about the molecular underpinnings of sex in any rotifer species. We generated mRNA-seq libraries for obligate parthenogenetic (OP) and cyclical parthenogenetic (CP) strains of the monogonont rotifer, Brachionus calyciflorus, to identify genes specific to both modes of reproduction. Our differential expression analysis identified receptors with putative roles in signaling pathways responsible for the transition from asexual to sexual reproduction. Differential expression of a specific copy of the duplicated cell cycle regulatory gene CDC20 and specific copies of histone H2A suggest that such duplications may underlie the phenotypic plasticity required for reproductive mode switch in monogononts. We further identified differential expression of genes involved in the formation of resting eggs, a process linked exclusively to sex in this species. Finally, we identified transcripts from the bdelloid rotifer Adineta ricciae that have significant sequence similarity to genes with higher expression in CP strains of B. calyciflorus. Our analysis of global gene expression differences between facultatively sexual and exclusively asexual populations of B. calyciflorus provides insights into the molecular nature of sexual reproduction in rotifers. 
Furthermore, our results offer insight into the evolution of obligate asexuality in bdelloid rotifers and provide indicators important for the use of monogononts as a model system for investigating the evolution of sexual reproduction.
Characterization and mapping of cDNA encoding aspartate aminotransferase in rice, Oryza sativa L.
Song, J; Yamamoto, K; Shomura, A; Yano, M; Minobe, Y; Sasaki, T
1996-10-31
Fifteen cDNA clones, putatively identified as encoding aspartate aminotransferase (AST, EC 2.6.1.1.), were isolated and partially sequenced. Together with six previously isolated clones putatively identified to encode ASTs (Sasaki, et al. 1994, Plant Journal 6, 615-624), their sequences were characterized and classified into 4 cDNA species. Two of the isolated clones, C60213 and C2079, were full-length cDNAs, and their complete nucleotide sequences were determined. C60213 was 1612 bp long and its deduced amino acid sequence showed 88% homology with that of Panicum miliaceum L. mitochondrial AST. The C60213-encoded protein had an N-terminal amino acid sequence that was characteristic of a mitochondrial transit peptide. On the other hand, C2079 was 1546 bp long and had 91% amino acid sequence homology with P. miliaceum L. cytosolic AST but lacked the transit peptide sequence. The homologies of nucleotide sequences and deduced amino acid sequences of C2079 and C60213 were 54% and 52%, respectively. C2079 and C60213 were mapped on chromosomes 1 and 6, respectively, by restriction fragment length polymorphism linkage analysis. Northern blot analysis using C2079 as a probe revealed much higher transcript levels in callus and root than in green and etiolated shoots, suggesting tissue-specific variations of AST gene expression.
Evolutionary profiles from the QR factorization of multiple sequence alignments
Sethi, Anurag; O'Donoghue, Patrick; Luthey-Schulten, Zaida
2005-01-01
We present an algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of the homologous group. The method, based on the multidimensional QR factorization of numerically encoded multiple sequence alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. We observe a general trend that these smaller, more evolutionarily balanced profiles have comparable and, in many cases, better performance in database searches than conventional profiles containing hundreds of sequences, constructed in an iterative and computationally intensive procedure. For more diverse families or superfamilies, with sequence identity <30%, structural alignments, based purely on the geometry of the protein structures, provide better alignments than pure sequence-based methods. Merging the structure and sequence information allows the construction of accurate profiles for distantly related groups. These structure-based profiles outperformed other sequence-based methods for finding distant homologs and were used to identify a putative class II cysteinyl-tRNA synthetase (CysRS) in several archaea that eluded previous annotation studies. Phylogenetic analysis showed the putative class II CysRSs to be a monophyletic group and homology modeling revealed a constellation of active site residues similar to that in the known class I CysRS. PMID:15741270
Analysis of the Macaca mulatta transcriptome and the sequence divergence between Macaca and human.
Magness, Charles L; Fellin, P Campion; Thomas, Matthew J; Korth, Marcus J; Agy, Michael B; Proll, Sean C; Fitzgibbon, Matthew; Scherer, Christina A; Miner, Douglas G; Katze, Michael G; Iadonato, Shawn P
2005-01-01
We report the initial sequencing and comparative analysis of the Macaca mulatta transcriptome. Cloned sequences from 11 tissues, nine animals, and three species (M. mulatta, M. fascicularis, and M. nemestrina) were sampled, resulting in the generation of 48,642 sequence reads. These data represent an initial sampling of the putative rhesus orthologs for 6,216 human genes. Mean nucleotide diversity within M. mulatta and sequence divergence among M. fascicularis, M. nemestrina, and M. mulatta are also reported.
Glinsky, Gennadi V
2018-03-01
Transposable elements have made major evolutionary impacts on creation of primate-specific and human-specific genomic regulatory loci and species-specific genomic regulatory networks (GRNs). Molecular and genetic definitions of human-specific changes to GRNs contributing to development of unique to human phenotypes remain a highly significant challenge. Genome-wide proximity placement analysis of diverse families of human-specific genomic regulatory loci (HSGRL) identified topologically associating domains (TADs) that are significantly enriched for HSGRL and designated rapidly evolving in human TADs. Here, the analysis of HSGRL, hESC-enriched enhancers, super-enhancers (SEs), and specific sub-TAD structures termed super-enhancer domains (SEDs) has been performed. In the hESC genome, 331 of 504 (66%) of SED-harboring TADs contain HSGRL and 68% of SEDs co-localize with HSGRL, suggesting that emergence of HSGRL may have rewired SED-associated GRNs within specific TADs by inserting novel and/or erasing existing non-coding regulatory sequences. Consequently, markedly distinct features of the principal regulatory structures of interphase chromatin evolved in the hESC genome compared to mouse: the SED quantity is 3-fold higher and the median SED size is significantly larger. Concomitantly, the overall TAD quantity is increased by 42% while the median TAD size is significantly decreased (p = 9.11E-37) in the hESC genome. Present analyses illustrate a putative global role for transposable elements and HSGRL in shaping the human-specific features of the interphase chromatin organization and functions, which are facilitated by accelerated creation of novel transcription factor binding sites and new enhancers driven by targeted placement of HSGRL at defined genomic coordinates. 
A trend toward the convergence of TAD and SED architectures of interphase chromatin in the hESC genome may reflect changes of 3D-folding patterns of linear chromatin fibers designed to enhance both regulatory complexity and functional precision of GRNs by creating predominantly a single gene (or a set of functionally linked genes) per regulatory domain structures. Collectively, present analyses reveal critical evolutionary contributions of transposable elements and distal enhancers to creation of thousands of primate- and human-specific elements of a chromatin folding code, which defines the 3D context of interphase chromatin both restricting and facilitating biological functions of GRNs.
Generation and Analysis of Expressed Sequence Tags from Olea europaea L.
Ozdemir Ozgenturk, Nehir; Oruç, Fatma; Sezerman, Ugur; Kuçukural, Alper; Vural Korkut, Senay; Toksoz, Feriha; Un, Cemal
2010-01-01
Olive (Olea europaea L.), which originated in the Near East region, is an important source of edible oil. In this study, two cDNA libraries were constructed from young olive leaves and immature olive fruits to generate ESTs, discover novel genes, and investigate the function of unknown olive genes. A total of 3840 randomly selected colonies from both libraries were sequenced for EST collection. The 2228 readable sequences from olive leaf and 1506 from olive fruit were assembled into 205 and 69 contigs, respectively, whereas 2478 were singletons. Putative functions of all 2752 differentially expressed unique sequences were designated by gene homology based on BLAST and annotated using BLAST2GO. While 1339 ESTs show no homology to the database, 2024 ESTs have homology (under 80%) with hypothetical proteins, putative proteins, expressed proteins, and unknown proteins in NCBI-GenBank. Unique gene sequences for 635 ESTs were identified by over 80% homology to genes of known function in other species that had not previously been described in the Olea family. Only 3.1% of the total ESTs showed similarity with the olive database existing in NCBI. The EST data and consensus sequences generated here were submitted to NCBI as a valuable resource for functional genome studies of olive. PMID:21197085
The pine Pschi4 promoter directs wound-induced transcription
Haiguo Wu; Charles H. Michler; Liborio LaRussa; John M. Davis
1999-01-01
Mechanical wounding stimulates the accumulation of Pschi4 transcripts (encoding a putative extracellular chitinase) in pine trees. To gain insight into the transcriptional regulatory region(s) in this gymnosperm defense gene, the 5'-flanking region of Pschi4 was fused to the uidA reporter gene encoding -...
The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity
Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H.; Allen, Andrew S.; Goldstein, David B.
2015-01-01
Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene’s proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene’s regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen’s Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. 
In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance, ncCADD and ncGWAVA, and find both scores are significantly predictive of human dosage sensitive genes and appear to carry information beyond conservation, as assessed by ncGERP. These results highlight that the intolerance of noncoding sequence stretches in the human genome can provide a critical complementary tool to other genome annotation approaches to help identify the parts of the human genome increasingly likely to harbor mutations that influence risk of disease. PMID:26332131
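The abstract describes ncRVIS as a comparison of observed and predicted levels of standing variation. A minimal sketch of that general idea follows, assuming an ordinary least-squares fit of common-variant counts on total-variant counts and a standardized residual as the score. The regression form, the function name, and the toy counts are all assumptions for illustration and are not taken from the paper.

```python
from statistics import mean, pstdev

def residual_scores(total_variants, common_variants):
    """Standardized residuals of common ~ total (simple least squares).

    Assumption: a negative score means fewer common variants than the
    fit predicts, i.e. a more 'intolerant' regulatory region.
    """
    mx, my = mean(total_variants), mean(common_variants)
    sxy = sum((x - mx) * (y - my) for x, y in zip(total_variants, common_variants))
    sxx = sum((x - mx) ** 2 for x in total_variants)
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x)
                 for x, y in zip(total_variants, common_variants)]
    sd = pstdev(residuals)
    return [r / sd for r in residuals]

# Hypothetical counts for four genes (made-up numbers, not real data).
total = [10, 20, 30, 40]
common = [2, 4, 3, 8]
scores = residual_scores(total, common)
```

In this toy input the third gene has the most negative score, because it carries fewer common variants than the fitted line predicts for its total.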
Genome mining of ascomycetous fungi reveals their genetic potential for ergot alkaloid production.
Gerhards, Nina; Matuschek, Marco; Wallwey, Christiane; Li, Shu-Ming
2015-06-01
Ergot alkaloids are important as mycotoxins or as drugs. Naturally occurring ergot alkaloids as well as their semisynthetic derivatives have been used as pharmaceuticals in modern medicine for decades. We identified 196 putative ergot alkaloid biosynthetic genes belonging to at least 31 putative gene clusters in 31 fungal species by genome mining of the 360 available genome sequences of ascomycetous fungi with known proteins. Detailed analysis showed that these fungi belong to the families Aspergillaceae, Clavicipitaceae, Arthrodermataceae, Helotiaceae and Thermoascaceae. Within the identified families, only a small number of taxa are represented. A literature search revealed a large diversity of ergot alkaloid structures in different fungi of the phylum Ascomycota. However, ergot alkaloid accumulation was only observed in 15 of the sequenced species. Therefore, this study provides a genetic basis for further study of ergot alkaloid production in the sequenced strains.
Tian, Yunhong; Tian, Yunming; Luo, Xiaojun; Zhou, Tao; Huang, Zuoping; Liu, Ying; Qiu, Yihan; Hou, Bing; Sun, Dan; Deng, Hongyu; Qian, Shen; Yao, Kaitai
2014-09-03
MicroRNAs (miRNAs) are a new class of endogenous regulators of a broad range of physiological processes, which act by regulating gene expression post-transcriptionally. The brassica vegetable, broccoli (Brassica oleracea var. italica), is very popular with a wide range of consumers, but environmental stresses such as salinity are a problem worldwide in restricting its growth and yield. Little is known about the role of miRNAs in the response of broccoli to salt stress. In this study, broccoli subjected to salt stress and broccoli grown under control conditions were analyzed by high-throughput sequencing. Differential miRNA expression was confirmed by real-time reverse transcription polymerase chain reaction (RT-PCR). The prediction of miRNA targets was undertaken using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology (KO) database and Gene Ontology (GO)-enrichment analyses. Two libraries of small (or short) RNAs (sRNAs) were constructed and sequenced by high-throughput Solexa sequencing. A total of 24,511,963 and 21,034,728 clean reads, representing 9,861,236 (40.23%) and 8,574,665 (40.76%) unique reads, were obtained for control and salt-stressed broccoli, respectively. Furthermore, 42 putative known and 39 putative candidate miRNAs that were differentially expressed between control and salt-stressed broccoli were revealed by their read counts and confirmed by the use of stem-loop real-time RT-PCR. Amongst these, the putative conserved miRNAs, miR393 and miR855, and two putative candidate miRNAs, miR3 and miR34, were the most strongly down-regulated when broccoli was salt-stressed, whereas the putative conserved miRNA, miR396a, and the putative candidate miRNA, miR37, were the most up-regulated. Finally, analysis of the predicted gene targets of miRNAs using the GO and KO databases indicated that a range of metabolic and other cellular functions known to be associated with salt stress were up-regulated in broccoli treated with salt. 
A comprehensive study of broccoli miRNA in relation to salt stress has been performed. We report significant data on the miRNA profile of broccoli that will underpin further studies on stress responses in broccoli and related species. The differential regulation of miRNAs between control and salt-stressed broccoli indicates that miRNAs play an integral role in the regulation of responses to salt stress.
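The unique-read percentages quoted in the abstract can be rechecked directly from the read counts it reports. The short sketch below does only that arithmetic; the dictionary keys and the function name are illustrative.

```python
# Clean and unique read counts as reported in the abstract.
libraries = {
    "control": (24_511_963, 9_861_236),
    "salt_stressed": (21_034_728, 8_574_665),
}

def unique_pct(clean_reads, unique_reads):
    """Unique reads as a percentage of clean reads, rounded to 2 decimals."""
    return round(100 * unique_reads / clean_reads, 2)

percentages = {name: unique_pct(*counts) for name, counts in libraries.items()}
```

Both values agree with the 40.23% and 40.76% given in the abstract.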
Chromosome-encoded narrow-spectrum Ambler class A beta-lactamase GIL-1 from Citrobacter gillenii.
Naas, Thierry; Aubert, Daniel; Ozcan, Ayla; Nordmann, Patrice
2007-04-01
A novel beta-lactamase gene was cloned from the whole-cell DNA of an enterobacterial Citrobacter gillenii reference strain that displayed a weak narrow-spectrum beta-lactam-resistant phenotype and was expressed in Escherichia coli. It encoded a clavulanic acid-inhibited Ambler class A beta-lactamase, GIL-1, with a pI value of 7.5 and a molecular mass of ca. 29 kDa. GIL-1 had the highest percent amino acid sequence identity with TEM-1 and SHV-1, 77%, and 67%, respectively, and only 46%, 31%, and 32% amino acid sequence identity with CKO-1 (C. koseri), CdiA1 (C. diversus), and SED-1 (C. sedlaki), respectively. The substrate profile of the purified GIL-1 was similar to that of beta-lactamases TEM-1 and SHV-1. The blaGIL-1 gene was chromosomally located, as revealed by I-CeuI experiments, and was constitutively expressed at a low level in C. gillenii. No gene homologous to the regulatory ampR genes of chromosomal class C beta-lactamases was found upstream of the blaGIL-1 gene, which fits the noninducibility of beta-lactamase expression in C. gillenii. Rapid amplification of DNA 5' ends analysis of the promoter region revealed putative promoter sequences that diverge from what has been identified as the consensus sequence in E. coli. The blaGIL-1 gene was part of a 5.5-kb DNA fragment bracketed by a 9-bp duplication and inserted between the d-lactate dehydrogenase gene and the ydbH genes; this DNA fragment was absent in other Citrobacter species. This work further illustrates the heterogeneity of beta-lactamases in Citrobacter spp., which may indicate that the variability of Citrobacter species is greater than expected.
Analysis of plant-derived miRNAs in animal small RNA datasets
2012-01-01
Background Plants contain significant quantities of small RNAs (sRNAs) derived from various sRNA biogenesis pathways. Many of these sRNAs play regulatory roles in plants. Previous analysis revealed that numerous sRNAs in corn, rice and soybean seeds have high sequence similarity to animal genes. However, exogenous RNA is considered to be unstable within the gastrointestinal tract of many animals, thus limiting potential for any adverse effects from consumption of dietary RNA. A recent paper reported that putative plant miRNAs were detected in animal plasma and serum, presumably acquired through ingestion, and may have a functional impact in the consuming organisms. Results To address the question of how common this phenomenon could be, we searched for plant miRNA sequences in public sRNA datasets from various tissues of mammals, chicken and insects. Our analyses revealed that plant miRNAs were present in the animal sRNA datasets and that, notably, miR168 was extremely over-represented. Furthermore, all or nearly all (>96%) miR168 sequences were monocot derived for most datasets, including datasets for two insects reared on dicot plants in their respective experiments. To investigate if plant-derived miRNAs, including miR168, could accumulate and move systemically in insects, we conducted insect feeding studies for three insects including corn rootworm, which has been shown to be responsive to plant-produced long double-stranded RNAs. Conclusions Our analyses suggest that the observed plant miRNAs in animal sRNA datasets can originate in the process of sequencing, and that accumulation of plant miRNAs via dietary exposure is not universal in animals. PMID:22873950
Alcántara, Cristina; Sarmiento-Rubiano, Luz Adriana; Monedero, Vicente; Deutscher, Josef; Pérez-Martínez, Gaspar; Yebra, María J.
2008-01-01
Sequence analysis of the five genes (gutRMCBA) downstream from the previously described sorbitol-6-phosphate dehydrogenase-encoding Lactobacillus casei gutF gene revealed that they constitute a sorbitol (glucitol) utilization operon. The gutRM genes encode putative regulators, while the gutCBA genes encode the EIIC, EIIBC, and EIIA proteins of a phosphoenolpyruvate-dependent sorbitol phosphotransferase system (PTSGut). The gut operon is transcribed as a polycistronic gutFRMCBA messenger, the expression of which is induced by sorbitol and repressed by glucose. gutR encodes a transcriptional regulator with two PTS-regulated domains, a galactitol-specific EIIB-like domain (EIIBGat domain) and a mannitol/fructose-specific EIIA-like domain (EIIAMtl domain). Its inactivation abolished gut operon transcription and sorbitol uptake, indicating that it acts as a transcriptional activator. In contrast, cells carrying a gutB mutation expressed the gut operon constitutively, but they failed to transport sorbitol, indicating that EIIBCGut negatively regulates GutR. A footprint analysis showed that GutR binds to a 35-bp sequence upstream from the gut promoter. A sequence comparison with the presumed promoter region of gut operons from various firmicutes revealed a GutR consensus motif that includes an inverted repeat. The regulation mechanism of the L. casei gut operon is therefore likely to be operative in other firmicutes. Finally, gutM codes for a conserved protein of unknown function present in all sequenced gut operons. A gutM mutant, the first constructed in a firmicute, showed drastically reduced gut operon expression and sorbitol uptake, indicating a regulatory role also for GutM. PMID:18676710
Diversity of the P2 protein among nontypeable Haemophilus influenzae isolates.
Bell, J; Grass, S; Jeanteur, D; Munson, R S
1994-01-01
The genes for outer membrane protein P2 of four nontypeable Haemophilus influenzae strains were cloned and sequenced. The derived amino acid sequences were compared with the outer membrane protein P2 sequence from H. influenzae type b MinnA and the sequences of P2 from three additional nontypeable H. influenzae strains. The sequences were 76 to 94% identical. The sequences had regions with considerable variability separated by regions which were highly conserved. The variable regions mapped to putative surface-exposed loops of the protein. PMID:8188390
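The identity range reported above is the kind of figure obtained by comparing aligned sequences column by column. A minimal sketch follows, assuming two sequences that are already aligned to equal length with "-" as the gap character. The abstract does not state the authors' identity definition or alignment method, so the gap handling here is an assumption, and the toy sequences are made up.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical columns between two pre-aligned sequences.

    Assumption: columns where both sequences have a gap are ignored,
    and a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    matches = sum(x == y for x, y in columns)
    return 100 * matches / len(columns)

# Hypothetical aligned fragments, not real P2 sequences.
identity = percent_identity("MKTA-LV", "MKSAGLV")
```

Other conventions, such as dividing by the shorter ungapped length, give different percentages for the same alignment, so reported values are only comparable when the definition is the same.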
Fédrigo, Olivier; Haygood, Ralph; Mukherjee, Sayan; Wray, Gregory A.
2009-01-01
Variation in gene expression is an important contributor to phenotypic diversity within and between species. Although this variation often has a genetic component, identification of the genetic variants driving this relationship remains challenging. In particular, measurements of gene expression usually do not reveal whether the genetic basis for any observed variation lies in cis or in trans to the gene, a distinction that has direct relevance to the physical location of the underlying genetic variant, and which may also impact its evolutionary trajectory. Allelic imbalance measurements identify cis-acting genetic effects by assaying the relative contribution of the two alleles of a cis-regulatory region to gene expression within individuals. Identification of patterns that predict commonly imbalanced genes could therefore serve as a useful tool and also shed light on the evolution of cis-regulatory variation itself. Here, we show that sequence motifs, polymorphism levels, and divergence levels around a gene can be used to predict commonly imbalanced genes in a human data set. Reduction of this feature set to four factors revealed that only one factor significantly differentiated between commonly imbalanced and nonimbalanced genes. We demonstrate that these results are consistent between the original data set and a second published data set in humans obtained using different technical and statistical methods. Finally, we show that variation in the single allelic imbalance-associated factor is partially explained by the density of genes in the region of a target gene (allelic imbalance is less probable for genes in gene-dense regions), and, to a lesser extent, the evenness of expression of the gene across tissues and the magnitude of negative selection on putative regulatory regions of the gene. 
These results suggest that the genomic distribution of functional cis-regulatory variants in the human genome is nonrandom, perhaps due to local differences in evolutionary constraint. PMID:19506001
Zhao, Yinhe; Wang, Guoying; Zhang, Jinpeng; Yang, Junbo; Peng, Shang; Gao, Lianming; Li, Chengyun; Hu, Jinyong; Li, Dezhu; Gao, Lizhi
2006-07-01
Asarum caudigerum (Aristolochiaceae) is an important species of paleoherb in relation to understanding the origin and evolution of angiosperm flowers, due to its basal position in the angiosperms. The aim of this study was to isolate floral-related genes from A. caudigerum, and to infer evolutionary relationships among florally expression-related genes, to further illustrate the origin and diversification of flowers in angiosperms. A subtracted floral cDNA library was constructed from floral buds using suppression subtractive hybridization (SSH). The cDNA of floral buds and leaves at the seedling stage were used as a tester and a driver, respectively. To further identify the function of putative MADS-box transcription factors, phylogenetic trees were reconstructed in order to infer evolutionary relationships within the MADS-box gene family. In the forward-subtracted floral cDNA library, 1920 clones were randomly sequenced, from which 567 unique expressed sequence tags (ESTs) were obtained. Among them, 127 genes failed to show significant similarity to any published sequences in GenBank and thus are putatively novel genes. Phylogenetic analysis indicated that a total of 29 MADS-box transcription factors were members of the APETALA3(AP3) subfamily, while nine others were putative MADS-box transcription factors that formed a cluster with MADS-box genes isolated from Amborella, the basal-most angiosperm, and those from the gymnosperms. This suggests that the origin of A. caudigerum is intermediate between the angiosperms and gymnosperms.
Putative Monofunctional Type I Polyketide Synthase Units: A Dinoflagellate-Specific Feature?
Eichholz, Karsten; Beszteri, Bánk; John, Uwe
2012-01-01
Marine dinoflagellates (Alveolata) are microalgae, some of which cause harmful algal blooms and produce a broad variety of phycotoxins that are most likely derived from polyketide synthesis. Recently, novel polyketide synthase (PKS) transcripts have been described from the Florida red tide dinoflagellate Karenia brevis (Gymnodiniales) which are evolutionarily related to Type I PKS but were apparently expressed as monofunctional proteins, a feature typical of Type II PKS. Here, we investigated expression units of PKS I-like sequences in Alexandrium ostenfeldii (Gonyaulacales) and Heterocapsa triquetra (Peridiniales) at the transcript and protein level. The five full-length transcripts we obtained were all characterized by polyadenylation, a 3′ UTR and the dinoflagellate-specific spliced leader sequence at the 5′ end. Each of the five transcripts encoded a single ketoacylsynthase (KS) domain showing high similarity to K. brevis KS sequences. The monofunctional structure was also confirmed using dinoflagellate-specific KS antibodies in Western blots. In a maximum likelihood phylogenetic analysis of KS domains from diverse PKSs, dinoflagellate KSs formed a clade placed well within the protist Type I PKS clade between apicomplexa, haptophytes and chlorophytes. These findings indicate that the atypical PKS I structure, i.e., expression as putative monofunctional units, might be a dinoflagellate-specific feature. In addition, the sequenced transcripts harbored a previously unknown, apparently dinoflagellate-specific conserved N-terminal domain. We discuss the implications of this novel region with regard to the putative monofunctional organization of Type I PKS in dinoflagellates. PMID:23139807
Koebnik, Ralf; Krüger, Antje; Thieme, Frank; Urban, Alexander; Bonas, Ulla
2006-11-01
The pathogenicity of the plant-pathogenic bacterium Xanthomonas campestris pv. vesicatoria depends on a type III secretion system which is encoded by the 23-kb hrp (hypersensitive response and pathogenicity) gene cluster. Expression of the hrp operons is strongly induced in planta and in a special minimal medium and depends on two regulatory proteins, HrpG and HrpX. In this study, DNA affinity enrichment was used to demonstrate that the AraC-type transcriptional activator HrpX binds to a conserved cis-regulatory element, the plant-inducible promoter (PIP) box (TTCGC-N(15)-TTCGC), present in the promoter regions of four hrp operons. No binding of HrpX was observed when DNA fragments lacking a PIP box were used. HrpX also bound to a DNA fragment containing an imperfect PIP box (TTCGC-N(8)-TTCGT). Dinucleotide replacements in each half-site of the PIP box strongly decreased binding of HrpX, while simultaneous dinucleotide replacements in both half-sites completely abolished binding. Based on the complete genome sequence of Xanthomonas campestris pv. vesicatoria, putative plant-inducible promoters consisting of a PIP box and a -10 promoter motif were identified in the promoter regions of almost all HrpX-activated genes. Bioinformatic analyses and reverse transcription-PCR experiments revealed novel HrpX-dependent genes, among them a NUDIX hydrolase gene and several genes with a predicted role in the degradation of the plant cell wall. We conclude that HrpX is the most downstream component of the hrp regulatory cascade, which is proposed to directly activate most genes of the hrpX regulon via binding to corresponding PIP boxes.
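The PIP box consensus quoted above, TTCGC-N(15)-TTCGC, maps directly onto a regular expression. The sketch below scans a sequence for perfect PIP boxes only; it is an illustration and not the authors' search procedure. It assumes the forward strand only, ignores the imperfect variant and the -10 motif mentioned in the abstract, and uses a made-up toy sequence.

```python
import re

# Perfect PIP box from the abstract: TTCGC, 15 arbitrary bases, TTCGC.
# The lookahead lets overlapping occurrences be reported as well.
PIP_BOX = re.compile(r"(?=(TTCGC[ACGT]{15}TTCGC))")

def find_pip_boxes(seq):
    """Return (0-based start, matched text) for each perfect PIP box."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in PIP_BOX.finditer(seq)]

# Hypothetical promoter fragment, not a real Xanthomonas sequence.
toy_promoter = "GGA" + "TTCGC" + "A" * 15 + "TTCGC" + "CCT"
hits = find_pip_boxes(toy_promoter)
```

Scanning the reverse complement as well would be needed to catch boxes on the opposite strand.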
Bhawna; Bonthala, V S; Gajula, Mnv Prasad
2016-01-01
The common bean [Phaseolus vulgaris (L.)] is one of the essential proteinaceous vegetables grown in developing countries. However, its production is challenged by low yields caused by numerous biotic and abiotic stress conditions. Regulatory transcription factors (TFs) represent a key component of the genome and are the most significant targets for producing stress-tolerant crops; hence, functional genomic studies of these TFs are important. Therefore, we have constructed a web-accessible TF database for P. vulgaris, called PvTFDB, which contains 2370 putative TF gene models in 49 TF families. This database provides comprehensive information for each identified TF, including sequence data, functional annotation, SSRs with their primer sets, protein physical properties, chromosomal location, phylogeny, tissue-specific gene expression data, orthologues, cis-regulatory elements and gene ontology (GO) assignment. Altogether, this information can be used to expedite functional genomic studies of specific TFs of interest. The objectives of this database are to support functional genomics studies of common bean TFs and to help recognize the regulatory mechanisms underlying various stress responses, easing breeding strategies for variety production. It offers search interfaces (by gene ID and by functional annotation) and browsing interfaces (by family and by chromosome). This database will also serve as a promising central repository for researchers as well as breeders who are working towards crop improvement of legume crops. In addition, this database provides unrestricted public access, and users can freely download all data in the database. Database URL: http://www.multiomics.in/PvTFDB/. © The Author(s) 2016. Published by Oxford University Press.
Luis, Luis; Serrano, María Luisa; Hidalgo, Mariana; Mendoza-León, Alexis
2013-01-01
Differential susceptibility to microtubule agents has been demonstrated between mammalian cells and kinetoplastid organisms such as Leishmania spp. and Trypanosoma spp. The aims of this study were to identify and characterize the architecture of the putative colchicine binding site of Leishmania spp. and investigate the molecular basis of colchicine resistance. We cloned and sequenced the β-tubulin gene of Leishmania (Viannia) guyanensis and established the theoretical 3D model of the protein, using the crystallographic structure of the bovine protein as template. We identified mutations on the Leishmania β-tubulin gene sequences on regions related to the putative colchicine-binding pocket, which generate amino acid substitutions and changes in the topology of this region, blocking the access of colchicine. The same mutations were found in the β-tubulin sequence of kinetoplastid organisms such as Trypanosoma cruzi, T. brucei, and T. evansi. Using molecular modelling approaches, we demonstrated that conformational changes include an elongation and torsion of an α-helix structure and displacement to the inside of the pocket of one β-sheet that hinders access of colchicine. We propose that kinetoplastid organisms show resistance to colchicine due to amino acids substitutions that generate structural changes in the putative colchicine-binding domain, which prevent colchicine access. PMID:24083244
Quarta, Angela; Mita, Giovanni; Durante, Miriana; Arlorio, Marco; De Paolis, Angelo
2013-07-01
The polyphenol oxidase (PPO) enzyme, which can catalyze the oxidation of phenolics to quinones, has been reported to be involved in undesirable browning in many plant foods. This phenomenon is particularly severe in artichoke heads wounded during the manufacturing process. A full-length cDNA encoding for a putative polyphenol oxidase (designated as CsPPO) along with a 1432 bp sequence upstream of the starting ATG codon was characterized for the first time from [Cynara cardunculus var. scolymus (L.) Fiori]. The 1764 bp CsPPO sequence encodes a putative protein of 587 amino acids with a calculated molecular mass of 65,327 Da and an isoelectric point of 5.50. Analysis of the promoter region revealed the presence of cis-acting elements, some of which are putatively involved in the response to light and wounds. Expression analysis of the gene in wounded capitula indicated that CsPPO was significantly induced after 48 h, even though the browning process had started earlier. This suggests that the early browning event observed in artichoke heads was not directly related to de novo mRNA synthesis. Finally, we provide the complete gene sequence encoding for polyphenol oxidase and the upstream regulative region in artichoke. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
USDA-ARS's Scientific Manuscript database
Lipase gene (lip) of a biodegradable polyhydroxyalkanoate- (PHA-) synthesizing bacterium P. resinovorans NRRL B-2649 was cloned, sequenced and characterized by using consensus primers and PCR-based genome walking method. The ORF of the putative Lip (314 amino acids) and its active site (Ser111, Asp...
Neuropeptidomics of the Mosquito Aedes Aegypti
2010-01-01
translational processing (pyroglutamate formation) was detected for AST-C and CAPA-PVK-2. For the first time in insects, we succeeded in the direct...hormones, trace DNA sequences generated by TIGR and the Broad Institute were first searched by TBLASTN24 using amino acid sequences of candidate peptides...previously described.1 TBLASTN searches, using the amino acid sequences of putative Ae. aegypti neuropeptide and peptide hormone orthologs identified in
Draft genome sequence of Thermincola potens strain JR
DOE Office of Scientific and Technical Information (OSTI.GOV)
Byrne-Bailey, K.G.; Wrighton, K.C.; Melnyk, R.A.
'Thermincola potens' strain JR is one of the first Gram-positive dissimilatory metal-reducing bacteria (DMRB) for which there is a complete genome sequence. Consistent with the physiology of this organism, preliminary annotation revealed an abundance of multiheme c-type cytochromes that are putatively associated with the periplasm and cell surface in a Gram-positive bacterium. Here we report the complete genome sequence of strain JR.
Complete Genome Sequence of a Putative New Bacterial Strain, I507, Isolated from the Indian Ocean
Wang, Shu-yan; Wei, Jia-qiang
2018-01-01
ABSTRACT Bacterial strain I507 was isolated from the central Indian Ocean and may be a potential novel species, according to the 16S rRNA gene sequence. Here, we present its complete genome sequence and expect that it will provide researchers with valuable information to further understand its classification and function in the future. PMID:29674539
Hara, Yasushi; Hayashi, Kyohei; Nakajima, Takuya; Kagawa, Shizuko; Tazumi, Akihiro; Moore, John E; Matsuda, Motoo
2013-09-01
Clustered regularly interspaced short palindromic repeats (CRISPRs), of approximately 10,000 base pairs (bp) in length, were shown to occur in the Japanese Taylorella equigenitalis strain, EQ59. The locus was composed of the putative CRISPR-associated gene 5 (cas5), RAMP csd1, csd2, recB, cas1, a leader region, and 13 CRISPR consensus sequence repeats (each 32 bp; 5'-TCAGCCACGTTCGCGTGGCTGTGTGTTTAAAG-3'), which were in turn separated by 12 nonrepetitive unique spacer regions of similar length. In addition, a leader region, a transposase/IS protein, a leader region, and cas3 were also seen. All seven putative open reading frames carry their ribosome binding sites. Promoter consensus sequences at the -35 and -10 regions and putative intrinsic ρ-independent transcription terminator regions also occurred. A possible long overlap of 170 bp in length occurred between the recB and cas1 loci. Positive reverse transcription PCR signals of cas5, RAMP csd1, csd2-recB/cas1, and cas3 were generated. A putative secondary structure of the CRISPR consensus repeats was constructed. The CRISPR results for the T. equigenitalis EQ59 isolate were then compared with those from the Taylorella asinigenitalis MCE3 isolate.
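The array layout described above, 13 copies of a 32-bp repeat separated by 12 spacers, can be illustrated by splitting a sequence on exact copies of the repeat. The sketch below uses the consensus repeat quoted in the abstract and a synthetic array with made-up spacers. It assumes exact repeat matches and is not the authors' annotation method.

```python
# 32-bp consensus repeat quoted in the abstract.
REPEAT = "TCAGCCACGTTCGCGTGGCTGTGTGTTTAAAG"

def extract_spacers(array_seq, repeat=REPEAT):
    """Return the sequences lying between consecutive exact repeat copies."""
    parts = array_seq.split(repeat)
    # The first and last parts are flanking sequence, not spacers.
    return parts[1:-1]

# Hypothetical spacers (made-up 32-bp A/C strings, not real EQ59 spacers).
toy_spacers = ["A" * (i + 1) + "C" * (31 - i) for i in range(12)]
toy_array = REPEAT + "".join(spacer + REPEAT for spacer in toy_spacers)

recovered = extract_spacers(toy_array)
```

Real repeats often vary by a few bases, so exact string splitting would miss degenerate copies; dedicated CRISPR-finding tools handle that case.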
Audit, Benjamin; Zaghloul, Lamia; Vaillant, Cédric; Chevereau, Guillaume; d'Aubenton-Carafa, Yves; Thermes, Claude; Arneodo, Alain
2009-01-01
For years, progress in elucidating the mechanisms underlying replication initiation and its coupling to transcriptional activities and to local chromatin structure has been hampered by the small number (approximately 30) of well-established origins in the human genome and more generally in mammalian genomes. Recent in silico studies of compositional strand asymmetries revealed a high level of organization of human genes around 1000 putative replication origins. Here, by comparing with recently experimentally identified replication origins, we provide further support that these putative origins are active in vivo. We show that regions ∼300-kb wide surrounding most of these putative replication origins that replicate early in the S phase are hypersensitive to DNase I cleavage, hypomethylated and present a significant enrichment in genomic energy barriers that impair nucleosome formation (nucleosome-free regions). This suggests that these putative replication origins are specified by an open chromatin structure favored by the DNA sequence. We discuss how this distinctive attribute makes these origins, further qualified as ‘master’ replication origins, privileged loci for future research to decipher the human spatio-temporal replication program. Finally, we argue that these ‘master’ origins are likely to play a key role in genome dynamics during evolution and in pathological situations. PMID:19671527
Scolari, Francesca; Gomulski, Ludvik M.; Ribeiro, José M. C.; Siciliano, Paolo; Meraldi, Alice; Falchetto, Marco; Bonomi, Angelica; Manni, Mosè; Gabrieli, Paolo; Malovini, Alberto; Bellazzi, Riccardo; Aksoy, Serap; Gasperi, Giuliano; Malacrida, Anna R.
2012-01-01
Background Insect seminal fluid is a complex mixture of proteins, carbohydrates and lipids, produced in the male reproductive tract. This seminal fluid is transferred together with the spermatozoa during mating and induces post-mating changes in the female. Molecular characterization of seminal fluid proteins in the Mediterranean fruit fly, Ceratitis capitata, is limited, although studies suggest that some of these proteins are biologically active. Methodology/Principal Findings We report on the functional annotation of 5914 high quality expressed sequence tags (ESTs) from the testes and male accessory glands, to identify transcripts encoding putative secreted peptides that might elicit post-mating responses in females. The ESTs were assembled into 3344 contigs, of which over 33% produced no hits against the nr database, and thus may represent novel or rapidly evolving sequences. Extraction of the coding sequences resulted in a total of 3371 putative peptides. The annotated dataset is available as a hyperlinked spreadsheet. Four hundred peptides were identified with putative secretory activity, including odorant binding proteins, protease inhibitor domain-containing peptides, antigen 5 proteins, mucins, and immunity-related sequences. Quantitative RT-PCR-based analyses of a subset of putative secretory protein-encoding transcripts from accessory glands indicated changes in their abundance after one or more copulations when compared to virgin males of the same age. These changes in abundance, particularly evident after the third mating, may be related to the requirement to replenish proteins to be transferred to the female. Conclusions/Significance We have developed the first large-scale dataset for novel studies on functions and processes associated with the reproductive biology of Ceratitis capitata. The identified genes may help study genome evolution, in light of the high adaptive potential of the medfly. 
In addition, studies of male recovery dynamics in terms of accessory gland gene expression profiles and correlated remating inhibition mechanisms may permit the improvement of pest management approaches. PMID:23071645
Reinprecht, Yarmilla; Yadegari, Zeinab; Perry, Gregory E.; Siddiqua, Mahbuba; Wright, Lori C.; McClean, Phillip E.; Pauls, K. Peter
2013-01-01
Legumes contain a variety of phytochemicals derived from the phenylpropanoid pathway that have important effects on human health as well as seed coat color, plant disease resistance and nodulation. However, the information about the genes involved in this important pathway is fragmentary in common bean (Phaseolus vulgaris L.). The objectives of this research were to isolate genes that function in and control the phenylpropanoid pathway in common bean, determine their genomic locations in silico in common bean and soybean, and analyze sequences of the 4CL gene family in two common bean genotypes. Sequences of phenylpropanoid pathway genes available for common bean or other plant species were aligned, and the conserved regions were used to design sequence-specific primers. The PCR products were cloned and sequenced and the gene sequences along with common bean gene-based (g) markers were BLASTed against the Glycine max v.1.0 genome and the P. vulgaris v.1.0 (Andean) early release genome. In addition, gene sequences were BLASTed against the OAC Rex (Mesoamerican) genome sequence assembly. In total, fragments of 46 structural and regulatory phenylpropanoid pathway genes were characterized in this way and placed in silico on common bean and soybean sequence maps. The maps contain over 250 common bean g and SSR (simple sequence repeat) markers and identify the positions of more than 60 additional phenylpropanoid pathway gene sequences, plus the putative locations of seed coat color genes. The majority of cloned phenylpropanoid pathway gene sequences were mapped to one location in the common bean genome but had two positions in soybean. The comparison of the genomic maps confirmed previous studies, which show that common bean and soybean share genomic regions, including those containing phenylpropanoid pathway gene sequences, with conserved synteny. 
Indels identified in the comparison of Andean and Mesoamerican common bean 4CL gene sequences might be used to develop inter-pool phenylpropanoid pathway gene-based markers. We anticipate that the information obtained by this study will simplify and accelerate selections of common bean with specific phenylpropanoid pathway alleles to increase the contents of beneficial phenylpropanoids in common bean and other legumes. PMID:24046770
Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)
Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn
2009-01-01
Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping distinguish between duplicate genome regions and determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, to characterize salmon Kozak motifs and polyadenylation signal variation, and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). 
This suggests that the remaining cDNA libraries generated by SGP represent a valuable cCDS FLIc source. The conservation of 7-mers in 3'UTRs indicates that these motifs are functionally important. Identity between some of these 7-mers and miRNA target sequences suggests that they are miRNA targets in Salmo salar transcripts as well. PMID:19878547
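The abstract above does not say how the conserved 3'UTR 7-mers were found. As a rough illustration only, the sketch below counts how many sequences each 7-mer occurs in and keeps those above a cutoff. The function name `shared_kmers`, the `min_fraction` threshold and the toy sequences are all assumptions made for this example and are not taken from the study.

```python
from collections import Counter

def shared_kmers(utrs, k=7, min_fraction=0.5):
    """Return {k-mer: number of sequences containing it} for k-mers
    found in at least min_fraction of the input sequences.
    Hypothetical helper; not the method used in the paper."""
    presence = Counter()
    for seq in utrs:
        seq = seq.upper()
        # Use a set so each k-mer is counted once per sequence
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        presence.update(kmers)
    cutoff = min_fraction * len(utrs)
    return {kmer: n for kmer, n in presence.items() if n >= cutoff}

# Made-up 3'UTR fragments, for illustration only
toy_utrs = ["AAGCACTTTAGG", "CCGCACTTTACC", "TTTTTTTTTTTT"]
conserved = shared_kmers(toy_utrs, k=7, min_fraction=0.6)
print(conserved)
```

A real analysis would also need a background model to decide whether a shared 7-mer is more frequent than expected by chance; this sketch only shows the counting step.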
Previously unknown and highly divergent ssDNA viruses populate the oceans.
Labonté, Jessica M; Suttle, Curtis A
2013-11-01
Single-stranded DNA (ssDNA) viruses are economically important pathogens of plants and animals, and are widespread in oceans; yet, the diversity and evolutionary relationships among marine ssDNA viruses remain largely unknown. Here we present the results from a metagenomic study of composite samples from temperate (Saanich Inlet, 11 samples; Strait of Georgia, 85 samples) and subtropical (46 samples, Gulf of Mexico) seawater. Most sequences (84%) had no evident similarity to sequenced viruses. In total, 608 putative complete genomes of ssDNA viruses were assembled, almost doubling the number of ssDNA viral genomes in databases. These comprised 129 genetically distinct groups, each represented by at least one complete genome that had no recognizable similarity to each other or to other virus sequences. Given that the seven recognized families of ssDNA viruses have considerable sequence homology within them, this suggests that many of these genetic groups may represent new viral families. Moreover, nearly 70% of the sequences were similar to one of these genomes, indicating that most of the sequences could be assigned to a genetically distinct group. Most sequences fell within 11 well-defined gene groups, each sharing a common gene. Some of these encoded putative replication and coat proteins that had similarity to sequences from viruses infecting eukaryotes, suggesting that these were likely from viruses infecting eukaryotic phytoplankton and zooplankton.
Kwon, Andrew T.; Chou, Alice Yi; Arenillas, David J.; Wasserman, Wyeth W.
2011-01-01
We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs were validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences identifies nucleotide sequence composition as an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions. PMID:22144875
Weiserová, Marie; Ryu, Junichi
2008-06-27
Type I restriction-modification (R-M) systems are the most complex restriction enzymes discovered to date. Recent years have witnessed a renaissance of interest in Type I R-M enzymes. The massive ongoing sequencing programmes, which have so far led to the discovery of more than 1 000 putative enzymes in a broad range of microorganisms including pathogenic bacteria, have revealed that these enzymes are widely represented in nature. The aim of this study was the characterisation of a putative R-M system, EcoA0ORF42P, identified in the commensal Escherichia coli A0 34/86 (O83: K24: H31) strain, which is efficiently used at Czech paediatric clinics for prophylaxis and treatment of nosocomial infections and diarrhoea of preterm and newborn infants. We have characterised a restriction-modification system EcoA0ORF42P of the commensal Escherichia coli strain A0 34/86 (O83: K24: H31). This system, designated as EcoAO83I, is a new functional member of the Type IB family, whose specificity differs from those of known Type IB enzymes, as was demonstrated by an immunological cross-reactivity and a complementation assay. Using the plasmid transformation method and the RM search computer program, we identified the DNA recognition sequence of EcoAO83I as GGA(8N)ATGC. Consistent with the amino acid alignment data, the 3' TRD component of the recognition sequence is identical to the sequence recognized by the EcoEI enzyme. The A-T (modified adenine) distance is identical to that in the EcoAI and EcoEI recognition sites, which also indicates that this system is a Type IB member. Interestingly, the recognition sequence we determined here is identical to the previously reported prototype sequence for Eco377I and its isoschizomers. The putative restriction-modification system EcoA0ORF42P in the commensal Escherichia coli strain A0 34/86 (O83: K24: H31) was found to be a member of the Type IB family and was designated as EcoAO83I. 
Combination of the classical biochemical and bacterial genetics approaches with comparative genomics might contribute effectively to further classification of many other putative Type-I enzymes, especially in clinical samples.
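To make the reported recognition sequence GGA(8N)ATGC concrete, the following minimal sketch scans both strands of a DNA string for the pattern, reading 8N as any eight bases. It is not the RM search program mentioned in the abstract; the function names and the toy sequence are assumptions for illustration.

```python
import re

# GGA, then any 8 bases, then ATGC; the lookahead allows overlapping hits
SITE = re.compile(r"(?=(GGA[ACGT]{8}ATGC))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(seq):
    """Return (strand, 0-based start on the scanned strand, matched site)
    for every GGA(8N)ATGC occurrence on either strand."""
    seq = seq.upper()
    hits = []
    for strand, scanned in (("+", seq), ("-", reverse_complement(seq))):
        for m in SITE.finditer(scanned):
            hits.append((strand, m.start(), m.group(1)))
    return hits

# Made-up test sequence containing one site on the forward strand
toy = "TTGGAACGTACGTATGCAA"
print(find_sites(toy))
```

Positions on the "-" strand are reported relative to the reverse-complemented string, so they would need converting before being compared with forward-strand coordinates.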
Sun, Haiyue; Liu, Yushan; Gai, Yuzhuo; Geng, Jinman; Chen, Li; Liu, Hongdi; Kang, Limin; Tian, Youwen; Li, Yadong
2015-09-02
Cranberries (Vaccinium macrocarpon Ait.), renowned for their excellent health benefits, are an important berry crop. Here, we performed transcriptome sequencing of one cranberry cultivar, from fruits at two different developmental stages, on the Illumina HiSeq 2000 platform. Our main goals were to identify putative genes for major metabolic pathways of bioactive compounds and compare the expression patterns between white fruit (W) and red fruit (R) in cranberry. In this study, two cDNA libraries of W and R were constructed. Approximately 119 million raw sequencing reads were generated and assembled de novo, yielding 57,331 high quality unigenes with an average length of 739 bp. Using BLASTx, 38,460 unigenes were identified as putative homologs of annotated sequences in public protein databases, including NCBI NR, NT, Swiss-Prot, KEGG, COG and GO. Of these, 21,898 unigenes mapped to 128 KEGG pathways, with the metabolic pathways, secondary metabolites, glycerophospholipid metabolism, ether lipid metabolism, starch and sucrose metabolism, purine metabolism, and pyrimidine metabolism being well represented. Among them, many candidate genes were involved in flavonoid biosynthesis, transport and regulation. Furthermore, digital gene expression (DEG) analysis identified 3,257 unigenes that were differentially expressed between the two fruit developmental stages. In addition, 14,473 simple sequence repeats (SSRs) were detected. Our results present comprehensive gene expression information about the cranberry fruit transcriptome that could facilitate our understanding of the molecular mechanisms of fruit development in cranberries. Although it will be necessary to validate the functions carried out by these genes, these results could be used to improve the quality of breeding programs for the cranberry and related species.
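The abstract reports 14,473 SSRs but does not state the detection criteria. As a hedged illustration of what SSR detection involves, this sketch uses a regular expression with a backreference to find tandem di- and trinucleotide repeats. The minimum repeat count, the unit lengths and the helper name `find_ssrs` are assumptions, not the study's parameters.

```python
import re

def find_ssrs(seq, min_repeats=4, unit_lengths=(2, 3)):
    """Return (0-based start, repeat unit, repeat count) for tandem
    repeats of the given unit lengths. Thresholds are illustrative."""
    seq = seq.upper()
    hits = []
    for k in unit_lengths:
        # A k-base unit followed by at least (min_repeats - 1) more copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if len(set(unit)) == 1:
                continue  # skip homopolymer runs such as AAAAAAAA
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return hits

# Made-up sequence with five tandem copies of AG
toy_seq = "TTAGAGAGAGAGCC"
print(find_ssrs(toy_seq))
```

Dedicated SSR tools typically use different minimum repeat counts per unit length and also handle compound repeats, which this sketch ignores.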
Halmillawewa, Anupama P; Restrepo-Córdoba, Marcela; Perry, Benjamin J; Yost, Christopher K; Hynes, Michael F
2016-02-01
Bacteriophages may play an important role in regulating population size and diversity of the root nodule symbiont Rhizobium leguminosarum, as well as participating in horizontal gene transfer. Although phages that infect this species have been isolated in the past, our knowledge of their molecular biology, and especially of genome composition, is extremely limited, and this lack of information impacts on the ability to assess phage population dynamics and limits potential agricultural applications of rhizobiophages. To help address this deficit in available sequence and biological information, the complete genome sequence of the Myoviridae temperate phage PPF1 that infects R. leguminosarum biovar viciae strain F1 was determined. The genome is 54,506 bp in length with an average G+C content of 61.9 %. The genome contains 94 putative open reading frames (ORFs) and 74.5 % of these predicted ORFs share homology at the protein level with previously reported sequences in the database. However, putative functions could only be assigned to 25.5 % (24 ORFs) of the predicted genes. PPF1 was capable of efficiently lysogenizing its rhizobial host R. leguminosarum F1. The site-specific recombination system of the phage targets an integration site that lies within a putative tRNA-Pro (CGG) gene in R. leguminosarum F1. Upon integration, the phage is capable of restoring the disrupted tRNA gene, owing to the 50 bp homologous sequence (att core region) it shares with its rhizobial host genome. Phage PPF1 is the first temperate phage infecting members of the genus Rhizobium for which a complete genome sequence, as well as other biological data such as the integration site, is available.
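The genome statistics quoted above (G+C content, share of ORFs with an assigned function) are simple ratios. The sketch below shows how such figures could be computed and checks the reported 25.5 % against the 24 of 94 ORFs given in the abstract. The function name and the toy sequence are assumptions; the PPF1 sequence itself is not used here.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a DNA string."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)

# Made-up 10-base example: 7 of 10 bases are G or C
toy_genome = "GGCCGATACG"
print(gc_percent(toy_genome))

# Figures taken from the abstract: 24 of 94 ORFs have a putative function
orfs_total = 94
orfs_with_function = 24
function_share = round(100 * orfs_with_function / orfs_total, 1)
print(function_share)

# 74.5 % of 94 ORFs with database homologs corresponds to about 70 ORFs
orfs_with_homologs = round(0.745 * orfs_total)
print(orfs_with_homologs)
```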
Osato, Naoki
2018-01-19
Transcriptional target genes show functional enrichment of genes. However, how many and how significantly transcriptional target genes include functional enrichments are still unclear. To address these issues, I predicted human transcriptional target genes using open chromatin regions, ChIP-seq data and DNA binding sequences of transcription factors in databases, and examined functional enrichment and gene expression level of putative transcriptional target genes. Gene Ontology annotations showed four times larger numbers of functional enrichments in putative transcriptional target genes than gene expression information alone, independent of transcriptional target genes. To compare the number of functional enrichments of putative transcriptional target genes between cells or search conditions, I normalized the number of functional enrichment by calculating its ratios in the total number of transcriptional target genes. With this analysis, native putative transcriptional target genes showed the largest normalized number of functional enrichments, compared with target genes including 5-60% of randomly selected genes. The normalized number of functional enrichments was changed according to the criteria of enhancer-promoter interactions such as distance from transcriptional start sites and orientation of CTCF-binding sites. Forward-reverse orientation of CTCF-binding sites showed significantly higher normalized number of functional enrichments than the other orientations. Journal papers showed that the top five frequent functional enrichments were related to the cellular functions in the three cell types. The median expression level of transcriptional target genes changed according to the criteria of enhancer-promoter assignments (i.e. interactions) and was correlated with the changes of the normalized number of functional enrichments of transcriptional target genes. Human putative transcriptional target genes showed significant functional enrichments. 
Functional enrichments were related to the cellular functions. The normalized number of functional enrichments of human putative transcriptional target genes changed according to the criteria of enhancer-promoter assignments and correlated with the median expression level of the target genes. These analyses and characters of human putative transcriptional target genes would be useful to examine the criteria of enhancer-promoter assignments and to predict the novel mechanisms and factors such as DNA binding proteins and DNA sequences of enhancer-promoter interactions.
Primate-specific evolution of an LDLR enhancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Qian-Fei; Prabhakar, Shyam; Wang, Qianben
2005-12-01
Sequence changes in regulatory regions have often been invoked to explain phenotypic divergence among species, but molecular examples of this have been difficult to obtain. In this study we identified an anthropoid primate-specific sequence element that contributed to the regulatory evolution of the low-density lipoprotein receptor. Using a combination of close and distant species genomic sequence comparisons coupled with in vivo and in vitro studies, we found that a functional cholesterol-sensing sequence motif arose and was fixed within a pre-existing enhancer in the common ancestor of anthropoid primates. Our study demonstrates one molecular mechanism by which ancestral mammalian regulatory elements can evolve to perform new functions in the primate lineage leading to human.
The Evolution of Bony Vertebrate Enhancers at Odds with Their Coding Sequence Landscape.
Yousaf, Aisha; Sohail Raza, Muhammad; Ali Abbasi, Amir
2015-08-06
Enhancers lie at the heart of transcriptional and developmental gene regulation. Therefore, changes in enhancer sequences usually disrupt the target gene expression and result in disease phenotypes. Despite the well-established role of enhancers in development and disease, evolutionary sequence studies are lacking. The current study attempts to unravel the puzzle of bony vertebrates' conserved noncoding elements (CNE) enhancer evolution. Bayesian phylogenetics of enhancer sequences spotlights promising interordinal relationships among placental mammals, proposing a closer relationship between humans and laurasiatherians while placing rodents at the basal position. Clock-based estimates of enhancer evolution provided a dynamic picture of interspecific rate changes across the bony vertebrate lineage. Moreover, coelacanth in the study augmented our appreciation of the vertebrate cis-regulatory evolution during water-land transition. Intriguingly, we observed a pronounced upsurge in enhancer evolution in land-dwelling vertebrates. These novel findings triggered us to further investigate the evolutionary trend of coding as well as CNE nonenhancer repertoires, to highlight the relative evolutionary dynamics of diverse genomic landscapes. Surprisingly, the evolutionary rates of enhancer sequences were clearly at odds with those of the coding and the CNE nonenhancer sequences during vertebrate adaptation to land, with land vertebrates exhibiting significantly reduced rates of coding sequence evolution in comparison to their fast evolving regulatory landscape. The observed variation in tetrapod cis-regulatory elements caused the fine-tuning of associated gene regulatory networks. Therefore, the increased evolutionary rate of tetrapods' enhancer sequences might be responsible for the variation in developmental regulatory circuits during the process of vertebrate adaptation to land. © The Author(s) 2015. 
Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Castresana, C; Garcia-Luque, I; Alonso, E; Malik, V S; Cashmore, A R
1988-01-01
We have analyzed promoter regulatory elements from a photoregulated CAB gene (Cab-E) isolated from Nicotiana plumbaginifolia. These studies have been performed by introducing chimeric gene constructs into tobacco cells via Agrobacterium tumefaciens-mediated transformation. Expression studies on the regenerated transgenic plants have allowed us to characterize three positive and one negative cis-acting elements that influence photoregulated expression of the Cab-E gene. Within the upstream sequences we have identified two positive regulatory elements (PRE1 and PRE2) which confer maximum levels of photoregulated expression. These sequences contain multiple repeated elements related to the sequence -ACCGGCCCACTT-. We have also identified within the upstream region a negative regulatory element (NRE) extremely rich in AT sequences, which reduces the level of gene expression in the light. We have defined a light regulatory element (LRE) within the promoter region extending from -396 to -186 bp which confers photoregulated expression when fused to a constitutive nopaline synthase ('nos') promoter. Within this region there is a 132-bp element, extending from -368 to -234 bp, which on deletion from the Cab-E promoter reduces gene expression from high levels to undetectable levels. Finally, we have demonstrated for a full length Cab-E promoter conferring high levels of photoregulated expression, that sequences proximal to the Cab-E TATA box are not replaceable by corresponding sequences from a 'nos' promoter. This contrasts with the apparent equivalence of these Cab-E and 'nos' TATA box-proximal sequences in truncated promoters conferring low levels of photoregulated expression. PMID:2901343
A perchlorate sensitive iodide transporter in frogs
Carr, Deborah L.; Carr, James A.; Willis, Ray E.; Pressley, Thomas A.
2008-01-01
Nucleotide sequence comparisons have identified a gene product in the genome database of African clawed frogs (Xenopus laevis) as a probable member of the solute carrier family of membrane transporters. To confirm its identity as a putative iodide transporter, we examined the function of this sequence after heterologous expression in mammalian cells. A green monkey kidney cell line transfected with the Xenopus nucleotide sequence had significantly greater 125I uptake than sham-transfected control cells. The uptake in carrier-transfected cells was significantly inhibited in the presence of perchlorate, a competitive inhibitor of mammalian Na+/iodide symporter. Tissue distributions of the sequence were also consistent with a role in iodide uptake. The mRNA encoding the carrier was found to be expressed in the thyroid gland, stomach, and kidney of tadpoles from X. laevis, as well as the bullfrog Rana catesbeiana. The ovaries of adult X. laevis also were found to express the carrier. Phylogenetic analysis suggested that the putative X. laevis iodide transporter is orthologous to vertebrate Na+-dependent iodide symporters. We conclude that the amphibian sequence encodes a protein that is indeed a functional Na+/iodide symporter in Xenopus laevis, as well as Rana catesbeiana. PMID:18275962
Maldonado-Borges, Josefina Ines; Ku-Cauich, José Roberto; Escobedo-Graciamedrano, Rosa Maria
2013-01-01
cDNA-AFLP analysis was used to study the genes expressed in zygotic and somatic embryogenesis of Musa acuminata Colla ssp. malaccensis, and the resulting transcript-derived fragments (TDFs) were compared with the publicly available sequenced genome of the double haploid (DH) Pahang accession of the malaccensis subspecies. A total of 253 TDFs were detected with an apparent size of 100-4000 bp using 5 pairs of AFLP primers, of which 21 were differentially expressed during the different stages of banana embryogenesis; 15 of the sequences matched DH-Pahang chromosomes, with 7 of them being homologous to gene sequences encoding either known or putative protein domains of higher plants. Four TDF sequences were located in all Musa chromosomes, while the rest were located in one or two chromosomes. Their putative individual functions are briefly reviewed based on published information, and the potential roles of these genes in embryo development are discussed. Thus the availability of the Musa genome and the TDF sequence information presented here opens new possibilities for in-depth molecular and biochemical study of zygotic and somatic embryogenesis in Musa.
Spliced DNA Sequences in the Paramecium Germline: Their Properties and Evolutionary Potential
Catania, Francesco; McGrath, Casey L.; Doak, Thomas G.; Lynch, Michael
2013-01-01
Despite playing a crucial role in germline-soma differentiation, the evolutionary significance of developmentally regulated genome rearrangements (DRGRs) has received scant attention. An example of DRGR is DNA splicing, a process that removes segments of DNA interrupting genic and/or intergenic sequences. Perhaps best known for shaping immune-system genes in vertebrates, DNA splicing plays a central role in the life of ciliated protozoa, where thousands of germline DNA segments are eliminated after sexual reproduction to regenerate a functional somatic genome. Here, we identify and chronicle the properties of 5,286 sequences that putatively undergo DNA splicing (i.e., internal eliminated sequences [IESs]) across the genomes of three closely related species of the ciliate Paramecium (P. tetraurelia, P. biaurelia, and P. sexaurelia). The study reveals that these putative IESs share several physical characteristics. Although our results are consistent with excision events being largely conserved between species, episodes of differential IES retention/excision occur, may have a recent origin, and frequently involve coding regions. Our findings indicate interconversion between somatic (often coding) DNA sequences and noncoding IESs, and provide insights into the role of DNA splicing in creating potentially functional genetic innovation. PMID:23737328
Background: A large quantity of nitrogen (N) fertilizer is used for crop production to achieve high yields at a significant economic and environmental cost. Efforts have been directed to understanding the molecular basis of plant responses to N and to identifying N-responsive gen...
Wüthrich, Daniel; Bruggmann, Rémy; Berthoud, Hélène; Arias-Roth, Emmanuelle
2015-01-01
Clostridium tyrobutyricum is the main microorganism responsible for late blowing defect in cheeses. Here, we present the draft genome sequences of two C. tyrobutyricum strains isolated from a Swiss semihard red-smear cheese. The two draft genomes comprise 3.05 and 3.08 Mbp and contain 3,030 and 3,089 putative coding sequences, respectively. PMID:25767226
Identification of regulatory targets for the bacterial Nus factor complex.
Baniulyte, Gabriele; Singh, Navjot; Benoit, Courtney; Johnson, Richard; Ferguson, Robert; Paramo, Mauricio; Stringer, Anne M; Scott, Ashley; Lapierre, Pascal; Wade, Joseph T
2017-12-11
Nus factors are broadly conserved across bacterial species, and are often essential for viability. A complex of five Nus factors (NusB, NusE, NusA, NusG and SuhB) is considered to be a dedicated regulator of ribosomal RNA folding, and has been shown to prevent Rho-dependent transcription termination. Here, we identify an additional cellular function for the Nus factor complex in Escherichia coli: repression of the Nus factor-encoding gene, suhB. This repression occurs primarily by translation inhibition, followed by Rho-dependent transcription termination. Thus, the Nus factor complex can prevent or promote Rho activity depending on the gene context. Conservation of putative NusB/E binding sites upstream of Nus factor genes suggests that Nus factor autoregulation occurs in many bacterial species. Additionally, many putative NusB/E binding sites are also found upstream of other genes in diverse species, and we demonstrate Nus factor regulation of one such gene in Citrobacter koseri. We conclude that Nus factors have an evolutionarily widespread regulatory function beyond ribosomal RNA, and that they are often autoregulatory.
Comparative Genomics as a Foundation for Evo-Devo Studies in Birds.
Grayson, Phil; Sin, Simon Y W; Sackton, Timothy B; Edwards, Scott V
2017-01-01
Developmental genomics is a rapidly growing field, and high-quality genomes are a useful foundation for comparative developmental studies. A high-quality genome forms an essential reference onto which the data from numerous assays and experiments, including ChIP-seq, ATAC-seq, and RNA-seq, can be mapped. A genome also streamlines and simplifies the development of primers used to amplify putative regulatory regions for enhancer screens, cDNA probes for in situ hybridization, microRNAs (miRNAs) or short hairpin RNAs (shRNA) for RNA interference (RNAi) knockdowns, mRNAs for misexpression studies, and even guide RNAs (gRNAs) for CRISPR knockouts. Finally, much can be gleaned from comparative genomics alone, including the identification of highly conserved putative regulatory regions. This chapter provides an overview of laboratory and bioinformatics protocols for DNA extraction, library preparation, library quantification, and genome assembly, from fresh or frozen tissue to a draft avian genome. Generating a high-quality draft genome can provide a developmental research group with excellent resources for their study organism, opening the doors to many additional assays and experiments.
Upadhyay, Atul Kumar; Sowdhamini, Ramanathan
2016-01-01
3D-domain swapping is one of the mechanisms of protein oligomerization, and the proteins exhibiting this phenomenon have many biological functions. Proteins that undergo domain swapping have attracted much attention owing to their involvement in human diseases, such as conformational diseases, amyloidosis, serpinopathies, proteinopathies etc. Early recognition of the proteins in the whole human genome that retain a tendency to domain swap will enable many aspects of disease control management. Predictive models were developed by using machine learning approaches with an average accuracy of 78% (85.6% of sensitivity, 87.5% of specificity and an MCC value of 0.72) to predict putative domain swapping in protein sequences. These models were applied to many complete genomes with special emphasis on the human genome. Nearly 44% of the protein sequences in the human genome were predicted positive for domain swapping. Enrichment analysis was performed on the positively predicted sequences from the human genome for their domain distribution, disease association and functional importance based on Gene Ontology (GO). Enrichment analysis was also performed to gain a better understanding of the functional importance of these sequences. Finally, we developed hinge region prediction for a given putative domain-swapped sequence by using important physicochemical properties of amino acids.
2013-01-01
Background Isatis indigotica is a widely used herb for the clinical treatment of colds, fever, and influenza in Traditional Chinese Medicine (TCM). Various structural classes of compounds have been identified as effective ingredients. However, little is known at the genetic level about these active metabolites. In the present study, we performed de novo transcriptome sequencing for the first time to produce a comprehensive dataset of I. indigotica. Results A database of 36,367 unigenes (average length = 1,115.67 bases) was generated by performing transcriptome sequencing. Based on the gene annotation of the transcriptome, 104 unigenes were identified covering most of the catalytic steps in the general biosynthetic pathways of indole, terpenoid, and phenylpropanoid. Subsequently, the organ-specific expression patterns of the genes involved in these pathways, and their responses to methyl jasmonate (MeJA) induction, were investigated. Metabolite profiling of the effective phenylpropanoids showed that the accumulation patterns of secondary metabolites were mostly correlated with the transcription of their biosynthetic genes. Analysis of the UDP-dependent glycosyltransferase (UGT) family indicated that several flavonoids exist in I. indigotica, and these were further identified by metabolic profiling using UPLC/Q-TOF. Moreover, applying transcriptome co-expression analysis, nine new, putative UGTs were suggested as flavonol glycosyltransferases and lignan glycosyltransferases. Conclusions This database provides a pool of candidate genes involved in biosynthesis of effective metabolites in I. indigotica. Furthermore, the comprehensive analysis and characterization of the significant pathways are expected to give a better insight into the diversity of chemical composition, the synthetic characteristics, and the regulatory mechanisms that operate in this medicinal herb. PMID:24308360
Pedrini, Nicolás; Zhang, Shizhu; Juárez, M Patricia; Keyhani, Nemat O
2010-08-01
The insect epicuticle or waxy layer comprises a heterogeneous mixture of lipids that include abundant levels of long-chain alkanes, alkenes, wax esters and fatty acids. This structure represents the first barrier against microbial attack and for broad-host-range insect pathogens, such as Beauveria bassiana, it is the initial interface mediating the host-pathogen interaction, since these organisms do not require any specialized mode of entry and infect target hosts via the cuticle. B. bassiana is able to grow on straight chain alkanes up to n-C(33) as a sole source of carbon and energy. The cDNA and genomic sequences, including putative regulatory elements, for eight cytochrome P450 enzymes, postulated to be involved in alkane and insect epicuticle degradation, were isolated and characterized. Expression studies using a range of alkanes as well as an insect-derived epicuticular extract from the blood-sucking bug Triatoma infestans revealed a differential expression pattern for the P450 genes examined, and suggest that B. bassiana contains a series of hydrocarbon-assimilating enzymes with overlapping specificity in order to target the surface lipids of insect hosts. Phylogenetic analysis of the translated ORFs of the sequences revealed that the enzyme which displayed the highest levels of induction on both alkanes and the insect epicuticular extract represents the founding member of a new cytochrome P450 family, with three of the other sequences assigned as the first members of new P450 subfamilies. The remaining four proteins clustered with known P450 families whose members include alkane monooxygenases.
Regulatory elements in vivo in the promoter of the abscisic acid responsive gene rab17 from maize.
Busk, P K; Jensen, A B; Pagès, M
1997-06-01
The rab17 gene from maize is transcribed in late embryonic development and is responsive to abscisic acid and water stress in embryo and vegetative tissues. In vivo footprinting and transient transformation of rab17 were performed in embryos and vegetative tissues to characterize the cis-elements involved in regulation of the gene. By in vivo footprinting, protein binding was observed to nine elements in the promoter, which correspond to five putative ABREs (abscisic acid responsive elements) and four other sequences. The footprints indicated that distinct proteins interact with these elements in the two developmental stages. In transient transformation, six of the elements were important for high level expression of the rab17 promoter in embryos, whereas only three elements were important in leaves. The cis-acting sequences can be divided in embryo-specific, ABA-specific and leaf-specific elements on the basis of protein binding and the ability to confer expression of rab17. We found one positive, new element, called GRA, with the sequence CACTGGCCGCCC. This element was important for transcription in leaves but not in embryos. Two other non-ABRE elements that stimulated transcription from the rab17 promoter resemble previously described abscisic acid and drought-inducible elements. There were differences in protein binding and function of the five ABREs in the rab17 promoter. The possible reasons for these differences are discussed. The in vivo data obtained suggest that an embryo-specific pathway regulates transcription of the rab genes during development, whereas another pathway is responsible for induction in response to ABA and drought in vegetative tissues.
USDA-ARS's Scientific Manuscript database
Complementing quantitative methods with sequence data analysis is a major goal of the post-genome era of biology. In this study, we analyzed Illumina HiSeq sequence data derived from 11 US Holstein bulls in order to identify putative causal mutations associated with calving and conformation traits. ...
USDA-ARS's Scientific Manuscript database
Salmonid genomes are considered to be in a pseudo-tetraploid state as a result of an evolutionarily recent genome duplication event. This situation complicates single nucleotide polymorphism (SNP) discovery in rainbow trout as many putative SNPs are actually paralogous sequence variants (PSVs) and ...
USDA-ARS's Scientific Manuscript database
The phylogeny of Amaryllidaceae tribe Hippeastreae was inferred using chloroplast (3’ycf1, ndhF, trnL-F) and nuclear (ITS rDNA) sequence data under maximum parsimony and maximum likelihood frameworks. Network analyses were applied to resolve conflicting signals among data sets and putative scenarios...
USDA-ARS's Scientific Manuscript database
The complete genomic sequence of a novel putative member of the genus Potyvirus was detected from Callistephus chinensis (china aster) in South Korea. The genomic RNA consists of 9,859 nucleotides excluding the 3’ poly(A) tail. The Callistephus virus genome, which contains the typical open reading f...
USDA-ARS's Scientific Manuscript database
Multi-locus sequence analysis has been demonstrated to be a useful tool for identification of Streptomyces species and was previously applied to phylogenetically differentiate the type strains of species pathogenic on potatoes (Solanum tuberosum L.). The ARS Culture Collection (NRRL) contains 43 str...
PlantTFDB: a comprehensive plant transcription factor database
Guo, An-Yuan; Chen, Xin; Gao, Ge; Zhang, He; Zhu, Qi-Hui; Liu, Xiao-Chuan; Zhong, Ying-Fu; Gu, Xiaocheng; He, Kun; Luo, Jingchu
2008-01-01
Transcription factors (TFs) play key roles in controlling gene expression. Systematic identification and annotation of TFs, followed by construction of TF databases may serve as useful resources for studying the function and evolution of transcription factors. We developed a comprehensive plant transcription factor database PlantTFDB (http://planttfdb.cbi.pku.edu.cn), which contains 26 402 TFs predicted from 22 species, including five model organisms with available whole genome sequence and 17 plants with available EST sequences. To provide comprehensive information for those putative TFs, we made extensive annotation at both family and gene levels. A brief introduction and key references were presented for each family. Functional domain information and cross-references to various well-known public databases were available for each identified TF. In addition, we predicted putative orthologs of those TFs among the 22 species. PlantTFDB has a simple interface to allow users to search the database by IDs or free texts, to make sequence similarity search against TFs of all or individual species, and to download TF sequences for local analysis. PMID:17933783
Li, Jitao; Li, Jian; Chen, Ping; Liu, Ping; He, Yuying
2015-01-01
The ridgetail white prawn Exopalaemon carinicauda is one of the major economic mariculture species in eastern China. The lack of genomic and transcriptomic data has become a bottleneck for further research on its desirable traits. In the present study, 454 pyrosequencing was undertaken to investigate the transcriptome profiles of E. carinicauda. A collection of 1,028,710 sequence reads (459.59 Mb) obtained from cDNA prepared from eyestalk and hemocytes was assembled into 162,056 expressed sequence tags (ESTs). Of these, 29.88% of 48,428 contigs and 70.12% of 113,628 singlets possessed high similarities to sequences in the GenBank non-redundant database, with the most significant (E value < 1e-10) unigene matches occurring with crustacean and insect sequences. KEGG analysis of unigenes identified putative members of biological pathways related to growth and immunity. In addition, we obtained a total of 125,112 putative SNPs and 13,467 microsatellites. These results will contribute to the understanding of the genome makeup and provide useful information for future functional genomic research in E. carinicauda.
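The counts in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted above. The helper names are hypothetical, and the conversion of 1 Mb to 1,000,000 bases is an assumption. It shows that contigs plus singlets add up to the reported number of ESTs, and that the two quoted percentages coincide with the shares of contigs and singlets in that total. Whether the abstract intends the percentages that way is not stated.

```python
# Arithmetic cross-check of the assembly statistics quoted in the abstract.
# All counts are copied from the abstract; helper names are illustrative.

READS = 1_028_710
TOTAL_BASES_MB = 459.59
ESTS = 162_056
CONTIGS = 48_428
SINGLETS = 113_628


def percent(part, whole):
    """Share of `whole` made up by `part`, in percent, rounded to 2 decimals."""
    return round(100.0 * part / whole, 2)


def mean_read_length_bp(total_mb, n_reads):
    """Average read length in bp, assuming 1 Mb = 1,000,000 bases."""
    return total_mb * 1_000_000 / n_reads


if __name__ == "__main__":
    print("contigs + singlets == ESTs:", CONTIGS + SINGLETS == ESTS)
    print("contig share of ESTs (%):", percent(CONTIGS, ESTS))
    print("singlet share of ESTs (%):", percent(SINGLETS, ESTS))
    print("mean read length (bp):", round(mean_read_length_bp(TOTAL_BASES_MB, READS), 1))
```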
Peng, Jing; Peng, Futian; Zhu, Chunfu; Wei, Shaochong
2008-06-01
A putative isopentenyltransferase (IPT) encoding gene was identified from a pingyitiancha (Malus hupehensis Rehd.) expressed sequence tag database, and the full-length gene was cloned by RACE. Based on expression profile and sequence alignment, the nucleotide sequence of the clone, named MhIPT3, was most similar to AtIPT3, an IPT gene in Arabidopsis. The full-length cDNA contained a 963-bp open reading frame encoding a protein of 321 amino acids with a molecular mass of 37.3 kDa. Sequence analysis of genomic DNA revealed the absence of introns in the frame. Quantitative real-time PCR analysis demonstrated that the gene was expressed in roots, stems and leaves. Application of nitrate to roots of nitrogen-deprived seedlings strongly induced expression of MhIPT3 and was accompanied by the accumulation of cytokinins, whereas MhIPT3 expression was little affected by ammonium application to roots of nitrogen-deprived seedlings. Application of nitrate to leaves also up-regulated the expression of MhIPT3 and corresponded closely with the accumulation of isopentenyladenine and isopentenyladenosine in leaves.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ris-Stalpers, C.; Verleun-Mooijman, M.C.T.; Blaeij, T.J.P. de
1994-04-01
The analysis of the androgen receptor (AR) gene, mRNA, and protein in a subject with X-linked Reifenstein syndrome (partial androgen insensitivity) is reported. The presence of two mature AR transcripts in genital skin fibroblasts of the patient is established, and, by reverse transcriptase-PCR and RNase transcription analysis, the wild-type transcript and a transcript in which exon 3 sequences are absent without disruption of the translational reading frame are identified. Sequencing and hybridization analysis show a deletion of >6 kb in intron 2 of the human AR gene, starting 18 bp upstream of exon 3. The deletion includes the putative branch-point sequence (BPS) but not the acceptor splice site on the intron 2/exon 3 boundary. The deletion of the putative intron 2 BPS results in 90% inhibition of wild-type splicing. The mutant transcript encodes an AR protein lacking the second zinc finger of the DNA-binding domain. Western/immunoblotting analysis is used to show that the mutant AR protein is expressed in genital skin fibroblasts of the patient. The residual 10% wild-type transcript can be the result of the use of a cryptic BPS located 63 bp upstream of the intron 2/exon 3 boundary of the mutant AR gene. The mutated AR protein has no transcription-activating potential and does not influence the transactivating properties of the wild-type AR, as tested in cotransfection studies. It is concluded that the partial androgen-insensitivity syndrome of this patient is the consequence of the limited amount of wild-type AR protein expressed in androgen target cells, resulting from the deletion of the intron 2 putative BPS. 42 refs., 6 figs., 1 tab.
Henry, Kelli F.; Kawashima, Tomokazu; Goldberg, Robert B.
2015-03-22
Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the Scarlet Runner Bean (Phaseolus coccineus) G564 gene in order to understand how genes are activated specifically in the suspensor during early embryo development. Previously, we showed that a 54-bp fragment of the G564 upstream region is sufficient for suspensor transcription and contains at least three required cis-regulatory sequences, including the 10-bp motif (5'-GAAAAGCGAA-3'), the 10 bp-like motif (5'-GAAAAACGAA-3'), and Region 2 motif (partial sequence 5'-TTGGT-3'). Here, we use site-directed mutagenesis experiments in transgenic tobacco globular-stage embryos to identify two additional cis-regulatory elements within the 54-bp cis-regulatory module that are required for G564 suspensor transcription: the Fifth motif (5'-GAGTTA-3') and a third 10-bp-related sequence (5'-GAAAACCACA-3'). Further deletion of the 54-bp fragment revealed that a 47-bp fragment containing the five motifs (the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs) is sufficient for suspensor transcription, and represents a cis-regulatory module. A consensus sequence for each type of motif was determined by comparing motif sequences shown to activate suspensor transcription. Phylogenetic analyses suggest that the regulation of G564 is evolutionarily conserved. Lastly, a homologous cis-regulatory module was found upstream of the G564 ortholog in the Common Bean (Phaseolus vulgaris), indicating that the regulation of G564 is evolutionarily conserved in closely related bean species.
Henry, Kelli F; Kawashima, Tomokazu; Goldberg, Robert B
2015-06-01
Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the Scarlet Runner Bean (Phaseolus coccineus) G564 gene in order to understand how genes are activated specifically in the suspensor during early embryo development. Previously, we showed that a 54-bp fragment of the G564 upstream region is sufficient for suspensor transcription and contains at least three required cis-regulatory sequences, including the 10-bp motif (5'-GAAAAGCGAA-3'), the 10 bp-like motif (5'-GAAAAACGAA-3'), and Region 2 motif (partial sequence 5'-TTGGT-3'). Here, we use site-directed mutagenesis experiments in transgenic tobacco globular-stage embryos to identify two additional cis-regulatory elements within the 54-bp cis-regulatory module that are required for G564 suspensor transcription: the Fifth motif (5'-GAGTTA-3') and a third 10-bp-related sequence (5'-GAAAACCACA-3'). Further deletion of the 54-bp fragment revealed that a 47-bp fragment containing the five motifs (the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs) is sufficient for suspensor transcription, and represents a cis-regulatory module. A consensus sequence for each type of motif was determined by comparing motif sequences shown to activate suspensor transcription. Phylogenetic analyses suggest that the regulation of G564 is evolutionarily conserved. A homologous cis-regulatory module was found upstream of the G564 ortholog in the Common Bean (Phaseolus vulgaris), indicating that the regulation of G564 is evolutionarily conserved in closely related bean species.
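To make the relationship between the three 10-bp-type motifs concrete, the following sketch compares them position by position. The motif sequences are copied from the abstract. The ungapped alignment, the function names and the choice to report 1-based positions are assumptions for illustration. The abstract does not describe how the authors derived their consensus sequences, so this is not their procedure.

```python
# Minimal sketch: ungapped, position-by-position comparison of the three
# 10-bp-type motifs quoted in the abstract (5'->3'). Illustrative only.

MOTIFS = {
    "10-bp": "GAAAAGCGAA",
    "10-bp-like": "GAAAAACGAA",
    "10-bp-related": "GAAAACCACA",
}


def mismatch_positions(a, b):
    """Return the 1-based positions at which two equal-length motifs differ."""
    if len(a) != len(b):
        raise ValueError("motifs must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def column_bases(seqs):
    """For each alignment column, return the sorted set of bases observed."""
    return ["".join(sorted(set(col))) for col in zip(*seqs)]


if __name__ == "__main__":
    reference = MOTIFS["10-bp"]
    for name, seq in MOTIFS.items():
        print(name, seq, "differs from 10-bp at", mismatch_positions(reference, seq))
    print("bases per column:", column_bases(list(MOTIFS.values())))
```

Columns with a single observed base are invariant across the three motifs; columns with several bases are where a degenerate consensus symbol would be needed.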
BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations
Wang, Junbai; Batmanov, Kirill
2015-01-01
Sequence variations in regulatory DNA regions are known to cause functionally important consequences for gene expression. DNA sequence variations may have an essential role in determining phenotypes and may be linked to disease; however, their identification through analysis of massive genome-wide sequencing data is a great challenge. In this work, a new computational pipeline, a Bayesian method for protein–DNA interaction with binding affinity ranking (BayesPI-BAR), is proposed for quantifying the effect of sequence variations on protein binding. BayesPI-BAR uses biophysical modeling of protein–DNA interactions to predict single nucleotide polymorphisms (SNPs) that cause significant changes in the binding affinity of a regulatory region for transcription factors (TFs). The method includes two new parameters (TF chemical potentials or protein concentrations and direct TF binding targets) that are neglected by previous methods. The new method is verified on 67 known human regulatory SNPs, of which 47 (70%) have predicted true TFs ranked in the top 10. Importantly, the performance of BayesPI-BAR, which uses principal component analysis to integrate multiple predictions from various TF chemical potentials, is found to be better than that of existing programs, such as sTRAP and is-rSNP, when evaluated on the same SNPs. BayesPI-BAR is a publicly available tool and is able to carry out parallelized computation, which helps to investigate a large number of TFs or SNPs and to detect disease-associated regulatory sequence variations in the sea of genome-wide noncoding regions. PMID:26202972
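The general idea of scoring how a SNP changes a transcription factor's preference for a site can be illustrated with a much simpler model than the one described above. The sketch below is not BayesPI-BAR: it uses a plain log-odds position weight matrix instead of the biophysical binding model, and it ignores TF chemical potentials and the principal component integration. The matrix, background frequency and sequences are all hypothetical.

```python
# Simplified illustration (NOT BayesPI-BAR): change in the best log-odds
# PWM score of a short region when one base is substituted.
import math

# Hypothetical 4-position probability matrix; each column sums to 1.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform base composition


def log_odds(site):
    """Log2 odds of a site of PWM width under the motif vs. background."""
    return sum(math.log2(PWM[i][base] / BACKGROUND) for i, base in enumerate(site))


def best_score(seq):
    """Highest log-odds score over all forward-strand windows in seq."""
    width = len(PWM)
    return max(log_odds(seq[i:i + width]) for i in range(len(seq) - width + 1))


def snp_effect(ref_seq, pos, alt_base):
    """Best score with the alternate allele minus best score with the reference."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return best_score(alt_seq) - best_score(ref_seq)


if __name__ == "__main__":
    # Hypothetical region containing the preferred site AGCT at offset 2.
    print(snp_effect("TTAGCTTT", 3, "T"))  # negative: the SNP weakens the site
```

A negative value indicates that the alternate allele lowers the best match in the region. Ranking many TFs by the size of this change is the kind of output such tools report, though the scoring in BayesPI-BAR itself differs.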
IRREGULAR POLLEN EXINE1 Is a Novel Factor in Anther Cuticle and Pollen Exine Formation.
Chen, Xiaoyang; Zhang, Hua; Sun, Huayue; Luo, Hongbing; Zhao, Li; Dong, Zhaobin; Yan, Shuangshuang; Zhao, Cheng; Liu, Renyi; Xu, Chunyan; Li, Song; Chen, Huabang; Jin, Weiwei
2017-01-01
Anther cuticle and pollen exine are protective barriers for pollen development and fertilization. Although several regulators of anther cuticle and pollen exine development have been identified in rice (Oryza sativa) and Arabidopsis (Arabidopsis thaliana), few genes have been characterized in maize (Zea mays) and the underlying regulatory mechanism remains elusive. Here, we report a novel male-sterile mutant in maize, irregular pollen exine1 (ipe1), which exhibited a glossy outer anther surface, abnormal Ubisch bodies, and defective pollen exine. Using map-based cloning, the IPE1 gene was isolated as a putative glucose-methanol-choline oxidoreductase targeted to the endoplasmic reticulum. Transcripts of IPE1 were preferentially accumulated in the tapetum during the tetrad and early uninucleate microspore stage. A biochemical assay indicated that ipe1 anthers had altered constituents of wax and a significant reduction of cutin monomers and fatty acids. RNA sequencing data revealed that genes implicated in wax and flavonoid metabolism, fatty acid synthesis, and elongation were differentially expressed in ipe1 mutant anthers. In addition, the analysis of transfer DNA insertional lines of the orthologous gene in Arabidopsis suggested that IPE1 and its orthologs have a partially conserved function in male organ development. Our results showed that IPE1 participates in the putative oxidative pathway of C16/C18 ω-hydroxy fatty acids and controls anther cuticle and pollen exine development together with MALE STERILITY26 and MALE STERILITY45 in maize. © 2017 American Society of Plant Biologists. All Rights Reserved.
Identification and Cloning of gusA, Encoding a New β-Glucuronidase from Lactobacillus gasseri ADH
Russell, W. M.; Klaenhammer, T. R.
2001-01-01
The gusA gene, encoding a new β-glucuronidase enzyme, has been cloned from Lactobacillus gasseri ADH. This is the first report of a β-glucuronidase gene cloned from a bacterial source other than Escherichia coli. A plasmid library of L. gasseri chromosomal DNA was screened for complementation of an E. coli gus mutant. Two overlapping clones that restored β-glucuronidase activity in the mutant strain were sequenced and revealed three complete and two partial open reading frames. The largest open reading frame, spanning 1,797 bp, encodes a 597-amino-acid protein that shows 39% identity to β-glucuronidase (GusA) of E. coli K-12 (EC 3.2.1.31). The other two complete open reading frames, which are arranged to be separately transcribed, encode a putative bile salt hydrolase and a putative protein of unknown function with similarities to MerR-type regulatory proteins. Overexpression of GusA was achieved in a β-glucuronidase-negative L. gasseri strain by expressing the gusA gene, subcloned onto a low-copy-number shuttle vector, from the strong Lactobacillus P6 promoter. GusA was also expressed in E. coli from a pET expression system. Preliminary characterization of the GusA protein from crude cell extracts revealed that the enzyme was active across an acidic pH range and a broad temperature range. An analysis of other lactobacilli identified β-glucuronidase activity and gusA homologs in other L. gasseri isolates but not in other Lactobacillus species tested. PMID:11229918
Asha, Srinivasan; Soniya, E V
2017-02-01
Small RNAs derived from ribosomal RNAs (srRNAs) are rarely explored in the high-throughput data of plant systems. Here, we analyzed srRNAs from the deep-sequenced small RNA libraries of Piper nigrum, a unique magnoliid plant. The 5' end of the putative long form of 5.8S rRNA (5.8S L rRNA) was identified as the site for biogenesis of highly abundant srRNAs that are unique among the Piperaceae family of plants. A subsequent comparative analysis of the ninety-seven sRNAomes of diverse plants successfully uncovered the abundant existence and precise cleavage of unique rRF signature small RNAs upstream of a novel 5' consensus sequence of the 5.8S rRNA. The major cleavage process mapped identically among the different tissues of the same plant. The differential expression and cleavage of 5'5.8S srRNAs in Phytophthora capsici-infected P. nigrum tissues indicated the critical biological functions of these srRNAs during stress response. The non-canonical short hairpin precursor structure, the association with Argonaute proteins, and the potential targets of 5'5.8S srRNAs reinforced their regulatory role in the RNAi pathway in plants. In addition, these novel lineage-specific small RNAs may have tremendous biological potential in the taxonomic profiling of plants.
Asha, Srinivasan; Soniya, E. V.
2017-01-01
Small RNAs derived from ribosomal RNAs (srRNAs) are rarely explored in the high-throughput data of plant systems. Here, we analyzed srRNAs from the deep-sequenced small RNA libraries of Piper nigrum, a unique magnoliid plant. The 5′ end of the putative long form of 5.8S rRNA (5.8SLrRNA) was identified as the site for biogenesis of highly abundant srRNAs that are unique among the Piperaceae family of plants. A subsequent comparative analysis of the ninety-seven sRNAomes of diverse plants successfully uncovered the abundant existence and precise cleavage of unique rRF signature small RNAs upstream of a novel 5′ consensus sequence of the 5.8S rRNA. The major cleavage process mapped identically among the different tissues of the same plant. The differential expression and cleavage of 5′5.8S srRNAs in Phytophthora capsici-infected P. nigrum tissues indicated the critical biological functions of these srRNAs during stress response. The non-canonical short hairpin precursor structure, the association with Argonaute proteins, and the potential targets of 5′5.8S srRNAs reinforced their regulatory role in the RNAi pathway in plants. In addition, these novel lineage-specific small RNAs may have tremendous biological potential in the taxonomic profiling of plants. PMID:28145468
Hornung, Claudia; Poehlein, Anja; Haack, Frederike S.; Schmidt, Martina; Dierking, Katja; Pohlen, Andrea; Schulenburg, Hinrich; Blokesch, Melanie; Plener, Laure; Jung, Kirsten; Bonge, Andreas; Krohn-Molt, Ines; Utpatel, Christian; Timmermann, Gabriele; Spieck, Eva; Pommerening-Röser, Andreas; Bode, Edna; Bode, Helge B.; Daniel, Rolf; Schmeisser, Christel; Streit, Wolfgang R.
2013-01-01
Janthinobacteria commonly form biofilms on eukaryotic hosts and are known to synthesize antibacterial and antifungal compounds. Janthinobacterium sp. HH01 was recently isolated from an aquatic environment and its genome sequence was established. The genome consists of a single chromosome of 7.10 Mb, the largest janthinobacterial genome known so far. Approximately 80% of the 5,980 coding sequences (CDSs) present in the HH01 genome could be assigned putative functions. The genome encodes a wealth of secretory functions and several large clusters for polyketide biosynthesis. HH01 also encodes a remarkable number of proteins involved in resistance to drugs or heavy metals. Interestingly, the genome of HH01 apparently lacks the N-acylhomoserine lactone (AHL)-dependent signaling system and the AI-2-dependent quorum sensing regulatory circuit. Instead it encodes a homologue of the Legionella- and Vibrio-like autoinducer (lqsA/cqsA) synthase gene which we designated jqsA. The jqsA gene is linked to a cognate sensor kinase (jqsS) which is flanked by the response regulator jqsR. Here we show that a jqsA deletion has a strong impact on violacein biosynthesis in Janthinobacterium sp. HH01 and that a jqsA deletion mutant can be functionally complemented with the V. cholerae cqsA and the L. pneumophila lqsA genes. PMID:23405110
Evolution and expression analysis of the grape (Vitis vinifera L.) WRKY gene family.
Guo, Chunlei; Guo, Rongrong; Xu, Xiaozhao; Gao, Min; Li, Xiaoqin; Song, Junyang; Zheng, Yi; Wang, Xiping
2014-04-01
WRKY proteins comprise a large family of transcription factors that play important roles in plant defence regulatory networks, including responses to various biotic and abiotic stresses. To date, no large-scale study of WRKY genes has been undertaken in grape (Vitis vinifera L.). In this study, a total of 59 putative grape WRKY genes (VvWRKY) were identified and renamed on the basis of their respective chromosome distribution. A multiple sequence alignment analysis using the coding sequences of all predicted grape WRKY genes, together with those from Arabidopsis thaliana and tomato (Solanum lycopersicum), indicated that the 59 VvWRKY genes can be classified into three main groups (I-III). An evaluation of the duplication events suggested that several WRKY genes arose before the divergence of the grape and Arabidopsis lineages. Moreover, expression profiles derived from semiquantitative PCR and real-time quantitative PCR analyses showed distinct expression patterns in various tissues and in response to different treatments. Four VvWRKY genes showed a significantly higher expression in roots or leaves, 55 responded to varying degrees to at least one abiotic stress treatment, and the expression of 38 was altered following powdery mildew (Erysiphe necator) infection. Most VvWRKY genes were downregulated in response to abscisic acid or salicylic acid treatments, while the expression of a subset was upregulated by methyl jasmonate or ethylene treatments.